AI in public services

An AI agent is only as safe as the access you give it.

Better models and better prompts make mistakes rarer. They do not change what a mistake can reach when it happens. Public services need a harness: infrastructure that decides what any asker can discover, what it may ask, what it gets back, and what trace it leaves. Registry Stack is that harness for registry data.

Policy brief

The latest brief makes the frame wider.

The brief, Safeguarding Digital Public Infrastructure in the Age of AI, separates two pressures that are often blurred together: AI makes digital public infrastructure (DPI) easier to build, and DPI can feed AI with sensitive records. This page focuses on the second. Once a record leaves its source, AI can combine it, infer from it, and act on it at scale.

Read the v0.5 brief on Zenodo →
  • Decide first whether AI belongs in the task at all.
  • Make field meaning explicit before systems connect.
  • Scale safeguards and recourse with a system's real influence on outcomes, not its label.
  • Share answers, not records, when a program only needs a specific fact.

The harness

Built into the infrastructure once, inherited by every integration.

A harness governs every model, every vendor, and every well-meaning builder, including the ones who arrive after the safety review. Each request passes one answering point that checks authority, meaning, and criteria, then answers narrowly instead of exposing everything the registry holds. The harness is the same governance layer the rest of this site describes; AI only increases what a mistake can cost.

A worked example

Two ways to build the same integration.

A housing-benefit program needs to know whether applicants fall below an income threshold. Automation makes both versions of this integration cheap to build. Only one of them is safe to run.

Copy the records

The obvious integration asks the income registry for the household’s records, and gets them: income history, employer, every family member, copied into a second system. The copy is out of date the day it arrives. Nobody notices that the registry records gross monthly income while the rule needs net annual income. And when a family is wrongly refused, no one can explain how the copied data produced the denial.
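
The unit mismatch is easy to reproduce. The following is a minimal sketch, not taken from any real registry or scheme: the figures, the FieldDefinition type, and both function names are hypothetical. It shows how a bare number passes a naive comparison, while a check that carries field definitions refuses to compare at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDefinition:
    """Hypothetical semantic definition attached to an income figure."""
    basis: str   # "gross" or "net"
    period: str  # "monthly" or "annual"


# Illustrative assumptions: what the registry stores vs. what the rule needs.
REGISTRY_INCOME = FieldDefinition(basis="gross", period="monthly")
RULE_INCOME = FieldDefinition(basis="net", period="annual")

THRESHOLD = 18_000    # hypothetical scheme threshold, net annual
copied_value = 2_000  # hypothetical copied record, gross monthly


def naive_below_threshold(value: float, threshold: float) -> bool:
    # Compares bare numbers; the definitions never enter the check.
    return value < threshold


def checked_below_threshold(value: float, value_def: FieldDefinition,
                            threshold: float, threshold_def: FieldDefinition) -> bool:
    # Refuses to compare two figures whose definitions differ.
    if value_def != threshold_def:
        raise ValueError(f"definition mismatch: {value_def} vs {threshold_def}")
    return value < threshold
```

With these figures the naive check returns True, a result that says nothing about eligibility because one number is gross monthly and the other is net annual. The checked version raises an error instead of answering.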

Ask the question

The program asks what it actually needs to know: does this household sit below the threshold for this scheme? Authority is checked before any data is read, the fields are validated against a shared vocabulary, the criteria are evaluated where the data lives, and what comes back is a signed yes-or-no answer with a validity window. The family can see the exchange in the audit trail and challenge it.
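
The four steps in that exchange can be sketched in a few lines. This is an illustration under stated assumptions, not Registry Stack's implementation: every name, value, and the 30-day validity window is hypothetical, and HMAC from the Python standard library stands in for whatever signature scheme a real deployment would choose (most likely an asymmetric one, so that verifiers hold no signing key).

```python
import hashlib
import hmac
import json
from datetime import datetime, timedelta, timezone

# All names and values below are hypothetical, for illustration only.
SIGNING_KEY = b"demo-key-not-for-production"
GRANTS = {("housing-benefit", "income-below-threshold")}    # (caller, question)
VOCABULARY = {"net_annual_income"}                          # shared field names
REGISTRY = {"household-17": {"net_annual_income": 15_500}}  # stays at the source
THRESHOLDS = {"housing-benefit": 18_000}


def answer(caller: str, question: str, household: str, field: str,
           now: datetime) -> dict:
    # 1. Authority is checked before any data is read.
    if (caller, question) not in GRANTS:
        raise PermissionError("caller has no authority for this question")
    # 2. The requested field must exist in the shared vocabulary.
    if field not in VOCABULARY:
        raise ValueError(f"unknown field: {field}")
    # 3. The criterion is evaluated where the data lives.
    below = REGISTRY[household][field] < THRESHOLDS[caller]
    # 4. Only a yes-or-no claim with a validity window is returned, signed.
    claim = {
        "household": household,
        "question": question,
        "answer": below,
        "valid_from": now.isoformat(),
        "valid_until": (now + timedelta(days=30)).isoformat(),
    }
    payload = json.dumps(claim, sort_keys=True).encode()
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return {"claim": claim, "signature": signature}
```

The returned claim carries the answer and its validity window, but not the income figure itself. That is the difference from the copied-records version.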

The question shifts

From "can these systems connect?" to "is this request legitimate?"

AI-assisted integration lets teams build connections faster than they can understand the data, the authority for access, or the meaning of each field. "Can two systems connect?" is no longer a strong enough question. A registry surface should be able to answer these practical review questions for every caller, human or automated:

  • Which entity or evidence response is being requested?
  • Which purpose, policy, and scope apply?
  • Which fields or claims can leave the source boundary?
  • Which semantic definition is being used?
  • Which audit event records the access or evidence decision?
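
One way to make those review questions checkable is to give each one a field in the record kept for every request. The sketch below is an assumption about shape only: ReviewRecord, its field names, and the unanswered helper are hypothetical and do not describe a Registry Stack schema.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ReviewRecord:
    """Hypothetical per-request record: one field per review question."""
    evidence_type: str        # which entity or evidence response is requested
    purpose: str              # which purpose applies
    policy: str               # which policy applies
    scope: str                # which scope applies
    releasable_claims: tuple  # which fields or claims may leave the source
    semantic_definition: str  # which definition of the field is used
    audit_event_id: str       # which audit event records the decision


def unanswered(record: ReviewRecord) -> list:
    """Return the names of review questions the record leaves blank."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]
```

A reviewer, or an automated gate, could then refuse any request for which unanswered returns a non-empty list.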

Zero trust

The security world has a name for this approach.

Zero trust: no caller is trusted merely because of where it comes from or what credential it carries, and every request is verified at the moment of use. Recent joint guidance on adopting AI agents from the US, UK, Australian, Canadian, and New Zealand cyber security agencies reaches the same conclusion: govern agents inside existing security frameworks, give them only the access they need, and never grant them broad access to sensitive data or critical systems. Registry Stack is where that principle becomes enforceable for registry data, and an AI agent gets no special status: it is one more untrusted caller.

Never trust by default

A request arriving through a trusted exchange channel, from a known ministry system, or with a valid token is still a claim to be checked, not a fact. Authority, purpose, and scope are verified on every request, including requests from systems that were trusted yesterday.
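
In code, "a claim to be checked" means the decision is recomputed on every call. The following minimal sketch uses hypothetical names (Request, authorize, the grants set) and assumes credential validation happens elsewhere; its only point is that a valid token is necessary but never sufficient, and that nothing is remembered between requests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    caller: str
    token_valid: bool  # outcome of credential validation, assumed done elsewhere
    purpose: str
    scope: str


def authorize(request: Request, grants: set) -> bool:
    """Decide one request; no result is cached or carried over."""
    # A valid token alone does not authorize anything.
    if not request.token_valid:
        return False
    # Authority, purpose, and scope are checked against the current grants.
    return (request.caller, request.purpose, request.scope) in grants
```

Because the grants are consulted on each call, withdrawing a grant takes effect on the very next request, even for a caller that was authorized a moment earlier.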

Least privilege, by construction

The classic failure is the over-permissioned integration: a service account that can read everything in case it ever needs something. Here the only thing available is the narrow answer: a claim, an aggregate, an evidence response. A caller that is compromised, or simply wrong, can leak only what it was given, never the registry.

Assume failure, keep the proof

Zero trust plans for the day a caller misbehaves. Every answer is signed, scoped, and time-limited, and every decision leaves an audit record, so a reviewer can reconstruct exactly what left the registry, for whom, and under which policy, and the people affected can challenge it.
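
A rough sketch of what "signed, scoped, and time-limited" plus an audit record could look like follows. Everything in it is an assumption made for illustration: the function names, the in-memory AUDIT_LOG, the 30-day window, and the use of a shared-key HMAC in place of a production signature scheme.

```python
import hashlib
import hmac
import json
from datetime import datetime, timedelta, timezone

SIGNING_KEY = b"demo-key-not-for-production"  # hypothetical
AUDIT_LOG = {}  # stand-in for an append-only audit store, keyed by signature


def _sign(claim: dict) -> str:
    payload = json.dumps(claim, sort_keys=True).encode()
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()


def issue(answer: bool, caller: str, policy: str, scope: str,
          now: datetime) -> dict:
    claim = {
        "answer": answer,
        "scope": scope,
        "valid_until": (now + timedelta(days=30)).isoformat(),
    }
    signature = _sign(claim)
    # Record what left the registry, for whom, and under which policy.
    AUDIT_LOG[signature] = {"released": claim, "caller": caller,
                            "policy": policy, "at": now.isoformat()}
    return {"claim": claim, "signature": signature}


def is_usable(signed: dict, scope: str, now: datetime) -> bool:
    """Accept an answer only if it is authentic, in scope, and not expired."""
    claim = signed["claim"]
    authentic = hmac.compare_digest(_sign(claim), signed["signature"])
    in_scope = claim["scope"] == scope
    not_expired = now < datetime.fromisoformat(claim["valid_until"])
    return authentic and in_scope and not_expired
```

A reviewer holding a disputed answer can look up its signature in the audit store and read back the released claim, the caller, and the policy. An answer that has been altered, reused for another scheme, or kept past its window fails is_usable.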

What remains hard

Infrastructure can enforce safeguards. Institutions still carry accountability.

Registry Stack can check policy, meaning, scope, evidence, and audit trails at runtime. It cannot replace legal authority, public oversight, meaningful appeal, or the political choice of how much data should flow.

Recourse still needs an owner

A signed answer and audit trail can anchor an appeal, but the escalation path between the program and the registry still has to be staffed, funded, and understood by the people affected.

Federation keeps power bounded

One central gateway would see too much. The safer pattern is one answering point per registry, run by that registry's steward and federated with peers under explicit trust rules.
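
The federated pattern can be sketched as peers that each hold their own records and their own trust rules, with no shared gateway in between. The class and names below are hypothetical, and stored yes-or-no answers stand in for real criteria evaluation.

```python
class AnsweringPoint:
    """Hypothetical answering point run by one registry's steward."""

    def __init__(self, registry: str, records: dict) -> None:
        self.registry = registry
        self._records = records  # visible only to this steward
        self._trusted = set()    # explicit (peer, question) trust rules

    def trust(self, peer: str, question: str) -> None:
        # The steward decides which peer may ask which question.
        self._trusted.add((peer, question))

    def ask(self, peer: str, question: str, subject: str) -> bool:
        # Each registry enforces its own rules; no central party sees the call.
        if (peer, question) not in self._trusted:
            raise PermissionError(
                f"{peer} may not ask {self.registry} about {question}")
        return self._records[subject][question]
```

In this sketch a housing program trusted by the income registry learns nothing from the civil registry unless that steward adds a rule of its own.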

Audit trails need safeguards too

The fact that someone asked about a person can itself be sensitive. Audit logs need minimized access, usage policy, and oversight, just like the records they protect.

To be clear

Registry Stack is not an AI product.

AI is what makes safer registry surfaces urgent; it is not what Registry Stack sells. Registry Stack does not make eligibility decisions, automate governance, or replace semantic review, authorization policy, human accountability, appeal, or recourse. It gives every caller, human or automated, a controlled answer instead of a raw record.

What a pilot looks like →

Read the docs

See how policy becomes a runtime check.