[Figure: an AI-driven step. Data inputs flow into a model or agent, which asserts an output and executes an action.]

Why We Built Our AI Agent So It Can't Hold Your Data

Credential Network rebuilt its AI agent around one rule: agents reason, tools act, and data is referenced, never held. Bulk sensitive records stay out of the model's working memory by design, every action runs through a reviewed workflow that waits for human confirmation, and every task leaves a clean audit trail. The result is AI you can actually describe to a compliance officer.

Rob Werfelmann, Senior AI Engineer

June 1, 2026 · 7 min read

Agents reason. Tools act. Data is referenced, never held.

If you run a healthcare practice, you've probably been pitched a lot of AI lately. You've also probably read a few headlines that made you nervous: a chatbot that confidently gave a wrong answer, a system that took an action nobody authorized, sensitive records that ended up somewhere they shouldn't. These aren't hypotheticals: in April 2026, an AI agent at one software company deleted its entire production database and its volume-level backups in about nine seconds, acting on its own initiative during a routine task. The most recent surviving backup was three months old, and the company only got its data back days later with direct help from its infrastructure provider. The promise of AI is real, but so is the hesitation. When the data in question is provider credentials, patient-adjacent information, and the records your practice is legally accountable for, "move fast and break things" is not an option.

We spent April and May working through this problem with Eric Glover of AppliedIngenuity, and his framing changed how we build. His core argument is simple and, once you see it, hard to unsee: AI risk is an architecture problem, not a model problem. This post is about how we put that idea to work at Credential Network.

The insight: risk lives in the system, not the model

Most AI safety conversations focus on the model itself (pick a better one, write a smarter prompt, add "guardrails"). Eric's point is that those efforts help only at the margins. What actually determines your risk is the architecture around the model: what the AI is allowed to see, what its output flows into, and what it's permitted to do without anyone checking.

He breaks AI failure into three mechanisms that any executive can hold in their head:

Data: what the AI sees. The risk of information being exposed where it shouldn't be.
Output: what the AI says. The risk of a confident, plausible-sounding answer that happens to be wrong.
Action: what the AI does. The risk of an irreversible step taken automatically: a record written, a notice sent, a workflow triggered.

The dangerous part is that the same small error grows or shrinks depending entirely on the system around it. A misread number caught by a human is a minor correction. The same misread number wired straight into an automated action is an expensive mess. The AI didn't change. The architecture did.

That distinction is the whole game. And it's good news, because architecture problems have engineering answers.

How our agent used to work, and why we changed it

Our product is built around an AI agent that helps with credentialing work. The early version used a common pattern in the industry called "ReAct" (the agent reasons about what to do, then acts, then reasons again, in a loop). It's flexible and it demos beautifully. The agent decides, in the moment, what steps to take.
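For readers who want to see the shape of that loop, here is a minimal sketch in Python. It is an illustration of the general ReAct pattern, not our old code: the tool names, the `scripted_model` stand-in (which replaces a real model call), and the step limit are all assumptions made up for this example.

```python
from typing import Callable

# Hypothetical tools; a real system would call live services here.
TOOLS: dict[str, Callable[[str], str]] = {
    "search_providers": lambda q: f"3 providers match '{q}'",
    "delete_record": lambda rid: f"record {rid} deleted",
}

def scripted_model(history: list[str]) -> tuple[str, str]:
    """Stand-in for a model call: returns (tool_name, argument),
    or ("finish", answer) when it decides it is done."""
    if not any(h.startswith("observation:") for h in history):
        return ("search_providers", "Smith")
    return ("finish", "Found 3 providers named Smith.")

def react_loop(task: str, model=scripted_model, max_steps: int = 5) -> str:
    history = [f"task: {task}"]
    for _ in range(max_steps):
        tool, arg = model(history)       # reason: the model picks the next step
        if tool == "finish":
            return arg
        observation = TOOLS[tool](arg)   # act: runs immediately, no checkpoint
        history.append(f"observation: {observation}")  # tool output enters the context
    return "step limit reached"

print(react_loop("find providers named Smith"))
```

Two things stand out in the sketch: whatever tool the model names is executed on the spot, and every tool result is appended to the model's own history.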

The problem is that flexibility and accountability pull in opposite directions. When an agent freely decides its own actions, you can't promise a customer exactly what it will and won't do, because the answer is "it depends on what it decides this time." For a healthcare buyer who has to answer to a compliance officer, "it depends" is not an acceptable answer.

So we changed the architecture around a single principle:

Agents reason. Tools act. Data is referenced, never held.

What that means in practice

Agents reason. The AI's job is to interpret what you're asking for, plan the right approach, and choose which pre-built workflow fits. It is not the final authority on facts, and it does not reach into your systems to act on its own. Once it hands a task off to be executed, it steps back. It's the smart receptionist who understands what you need and routes it correctly, not the person personally pulling your records.

Tools act. Every real action (anything that changes a record or moves a process forward) runs through a defined, reviewed workflow: a fixed recipe of steps we built and tested in advance, not whatever the model improvises in the moment. The agent can only choose from a known library of these workflows; it cannot invent a step outside that set. And the workflows are designed to pause for explicit human confirmation before anything is committed (the equivalent of an "are you sure?" in front of any step that actually changes something). So the worst case isn't a surprise action. It's a proposed step that sits and waits for a yes.
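One way to picture "choose from a known library, then wait for a yes" is the sketch below. It is a simplified illustration under our own assumptions, not the production implementation: the `Workflow` and `Proposal` types, the `update_license_expiry` workflow, and the `prov:123` reference are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Workflow:
    name: str
    describe: Callable[[dict], str]  # human-readable summary of the proposed change
    commit: Callable[[dict], str]    # the step that actually changes something

# Hypothetical workflow library; the agent may only pick from these keys.
LIBRARY = {
    "update_license_expiry": Workflow(
        name="update_license_expiry",
        describe=lambda p: f"Set license expiry for {p['provider_ref']} to {p['expiry']}",
        commit=lambda p: f"updated {p['provider_ref']}",
    ),
}

class UnknownWorkflow(Exception):
    """Raised when the agent names a workflow outside the library."""

@dataclass
class Proposal:
    workflow: Workflow
    params: dict
    status: str = "pending"

def propose(workflow_name: str, params: dict) -> Proposal:
    # The agent's choice is only a lookup key; anything else is rejected.
    if workflow_name not in LIBRARY:
        raise UnknownWorkflow(workflow_name)
    return Proposal(LIBRARY[workflow_name], params)

def resolve(proposal: Proposal, confirmed: bool) -> str:
    # Nothing is committed until a person answers; each proposal resolves once.
    if proposal.status != "pending":
        raise ValueError("proposal already resolved")
    if not confirmed:
        proposal.status = "rejected"
        return "nothing changed"
    proposal.status = "committed"
    return proposal.workflow.commit(proposal.params)

p = propose("update_license_expiry", {"provider_ref": "prov:123", "expiry": "2027-01-31"})
print(p.workflow.describe(p.params))   # what the human is asked to confirm
print(resolve(p, confirmed=True))
```

In this sketch, a request for a workflow that is not in `LIBRARY` fails at `propose`, before any side effect is possible, and a proposal that never receives a confirmation simply stays `pending`.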

Data is referenced, never held. This is the piece healthcare leaders tend to care about most. When a workflow runs, the sensitive records (provider details, the contents of a search, the result of an onboarding) flow directly between our secured systems and your screen. They do not pass through the AI's working memory. The model deals in short references and brief summaries: a pointer that means "the five providers named Smith," not the five full records themselves. Because the bulky, sensitive data never enters the AI's context, a whole category of exposure is designed out: there is nothing in the context to be logged, cached, or sent to a third-party model and retained under its terms rather than ours. A useful way to picture it: the AI is the front door, not the hallway. It lets you in and points you the right way, then steps aside while the real work happens through channels built to protect it.

This design also keeps a single error from snowballing. Because each step passes a reference that resolves to the authoritative record, not a value the model has re-typed into its reasoning, the model never becomes the carrier of your data, and a wrong value has no clean path to travel from one step into the next.
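The reference-passing idea can be sketched in a few lines of Python. Treat it as an illustration under stated assumptions rather than a description of our internals: the in-memory `_SECURE_STORE`, the `DataRef` type, the `ref:` identifier format, and the sample records are all invented for the example.

```python
import uuid
from dataclasses import dataclass

# Stand-in for the secured system of record; contents here are made up.
_SECURE_STORE: dict[str, list[dict]] = {}

@dataclass(frozen=True)
class DataRef:
    ref_id: str
    summary: str  # the only descriptive text the model is given

def store_result(records: list[dict], summary: str) -> DataRef:
    """Keep the full records on the secure side and hand back a short handle."""
    ref_id = f"ref:{uuid.uuid4().hex[:8]}"
    _SECURE_STORE[ref_id] = records
    return DataRef(ref_id, summary)

def model_context(ref: DataRef) -> str:
    """What goes into the model's context: a handle and a summary, no record fields."""
    return f"{ref.ref_id} ({ref.summary})"

def resolve_for_display(ref: DataRef) -> list[dict]:
    """Called by the workflow or UI layer, not by the model, to fetch the real records."""
    return _SECURE_STORE[ref.ref_id]

records = [
    {"name": "A. Smith", "license": "XX-0001"},
    {"name": "B. Smith", "license": "XX-0002"},
]
ref = store_result(records, "2 providers named Smith")
print(model_context(ref))              # handle plus summary only
print(len(resolve_for_display(ref)))   # full records, fetched outside the model
```

Because a later step receives `ref` and resolves it against the store, it reads the authoritative records rather than anything the model re-typed.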

Why this builds trust, not just safety

In Eric's framework, this architecture does three things at once.

It bounds action risk. Eric calls how a system handles that category the single most impactful architectural decision in any AI system. The nine-second database deletion is the cautionary version of this: a capable model inside an architecture that let it find a stray credential and run a destructive command on its own, with no checkpoint between intent and consequence. Notably, that wasn't a weak model failing; it was a frontier model inside an architecture that gave it too much room, which is exactly Eric's point that the system, not the model, sets the risk. In our design, the same impulse goes nowhere. Because actions only happen through reviewed, deterministic workflows, and those workflows wait for explicit confirmation before committing anything, the worst case isn't "the AI did something irreversible." It's "the AI proposed a step that didn't pass our checks and got stopped."

It contains output risk by pairing the probabilistic model with deterministic verification. The agent can reason and suggest, but a confident-but-wrong answer (a hallucinated number, a misread field) doesn't flow straight into your records. It has to clear a check first, and harm requires both the model to err and the check to miss, which is a far smaller target than either failing alone.
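As a small illustration of what a deterministic check can look like, the sketch below validates a date the model has extracted before it is allowed anywhere near a record. The function name, the choice of a license expiry date, and the idea of comparing against a value read directly from the source system are assumptions for the example, not a description of our actual checks.

```python
from datetime import date

def verify_expiry(model_value: str, authoritative_value: str) -> tuple[bool, str]:
    """Deterministic gate: the model's value must be a real ISO date and must
    match the value read directly from the system of record."""
    try:
        parsed = date.fromisoformat(model_value)
    except ValueError:
        return (False, "not a valid ISO date")
    if parsed != date.fromisoformat(authoritative_value):
        return (False, "does not match the authoritative record")
    return (True, "ok")

print(verify_expiry("2027-01-31", "2027-01-31"))  # passes
print(verify_expiry("2027-01-13", "2027-01-31"))  # transposed digits are caught
print(verify_expiry("2027-13-01", "2027-01-31"))  # impossible month is caught
```

The arithmetic behind "a far smaller target" is simple. Purely for illustration, if the model erred 1 time in 100, the check missed 1 error in 100, and the two failures were independent, a wrong value would reach a record about 1 time in 10,000. The numbers are hypothetical; the point is that the two rates multiply.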

And it shrinks data risk to almost nothing for the most sensitive material, because that material is never handed to the model in the first place.

It also gives you a clean audit trail. Because every task runs as a tracked, step-by-step workflow rather than an improvised conversation, there's a complete record of what was asked, which steps ran, what a person confirmed, and what changed. When a workflow can't complete, it stops and preserves what happened rather than failing silently. For a practice that has to answer to a compliance officer or a regulator, "show me exactly what the AI did, and when" has a real answer.
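Here is a minimal sketch of a workflow runner that writes that kind of trail and stops on failure instead of failing silently. The `AuditTrail` class, the event names, and the example steps are hypothetical; a real trail would be written to durable, tamper-resistant storage rather than an in-memory list.

```python
from datetime import datetime, timezone
from typing import Callable, Optional

class AuditTrail:
    """Append-only list of timestamped events for one task."""
    def __init__(self, request: str):
        self.events: list[dict] = []
        self.record("requested", request)

    def record(self, kind: str, detail: str) -> None:
        self.events.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "kind": kind,
            "detail": detail,
        })

def run_workflow(request: str,
                 steps: list[tuple[str, Callable[[], str]]],
                 confirmed_by: Optional[str]) -> AuditTrail:
    trail = AuditTrail(request)
    if confirmed_by is None:
        trail.record("stopped", "no human confirmation")
        return trail
    trail.record("confirmed", confirmed_by)
    for name, step in steps:
        try:
            trail.record("step_ok", f"{name}: {step()}")
        except Exception as exc:
            # Preserve what happened and halt; later steps never run.
            trail.record("step_failed", f"{name}: {exc}")
            trail.record("stopped", "workflow halted; later steps not run")
            break
    else:
        trail.record("completed", "all steps ran")
    return trail

def failing_write() -> str:
    raise RuntimeError("registry unavailable")

trail = run_workflow(
    "onboard prov:123",
    [("validate", lambda: "ok"), ("write", failing_write), ("notify", lambda: "sent")],
    confirmed_by="office.manager",
)
for event in trail.events:
    print(event["kind"], "-", event["detail"])
```

In this run the trail shows the request, who confirmed it, the step that succeeded, the step that failed and why, and the halt. The `notify` step never runs.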

There's a practical dividend, too. Because the architecture (not a single all-knowing model) is doing the heavy lifting, we don't have to reach for the largest, most expensive model to get reliable results. That keeps responses fast and the service affordable, and it means your experience doesn't hinge on one AI vendor's pricing or availability. The intelligence is in the system, so the model doesn't have to carry all of it.

The payoff is reliability you can actually describe to a board or a compliance officer. We can tell you the complete list of things our agent can do. We can show you where every check sits. We can produce a clean account of how data flows. None of that is possible when an AI is free to improvise, and all of it is what earns the trust that a healthcare practice rightly demands before letting software near its credentialing.

The takeaway

AI doesn't have to be a leap of faith. The teams getting this right aren't the ones with the cleverest prompts. They're the ones who treated reliability as an architecture decision and built the model into a system that stays safe even when the model is wrong.

That's the bet we've made at Credential Network: an agent smart enough to be genuinely useful, inside an architecture disciplined enough to be genuinely trustworthy.

Thanks to Eric Glover for the framework that shaped this work. His full piece, “AI Risk Is an Architecture Problem,” is worth reading if you want to go deeper on how to think about AI risk in your own organization.

Frequently asked

Does Credential Network's AI agent store or see my sensitive data?

Sensitive records such as provider details and search results flow directly between our secured systems and your screen, never through the AI's working memory. The model only handles short references and brief summaries, so the bulky sensitive data is never in its context to be logged, cached, or sent to a third-party model.

Can the AI agent take an action on its own without approval?

Every action that changes a record or moves a process forward runs through a defined, reviewed workflow, and those workflows pause for explicit human confirmation before anything is committed. The agent can only choose from a known library of workflows and cannot invent a step outside that set.

Why does AI architecture matter more than the AI model itself?

Because the same error grows or shrinks depending on the system around the model. A misread value caught by a human is a minor correction, but the same value wired straight into an automated action can be an expensive mistake. Risk is determined by what the AI is allowed to see, what its output flows into, and what it can do without a check, not by the model alone.