Frameworks

Beyond code: how TRiSM redefines AI's promise

TRiSM has been around for three years and most people still think it means observability dashboards. It doesn't. Here's the framework you actually need before your AI agent meets an examiner.

2026-05-09 · 5 min read · Ashish K. Saxena

You shipped an AI agent. It works. Then someone on the risk team asks how you'd reconstruct a decision it made three months ago for an examiner.

The room goes quiet. That moment is what TRiSM was supposed to prevent.

TRiSM stands for Trust, Risk and Security Management for AI. Gartner coined it in 2023. Three years later, most teams who use the term still mean "we run an observability dashboard on our LLM calls."

That's not what it is. This post breaks the framework down into four parts and tells you what each one is actually for. By the end you'll be able to look at your AI deployment and say, with specifics, which of the four you have and which you don't.

What TRiSM actually covers

The framework has four pieces. They're sequential. Each one's output feeds the next.

  1. Pre-deployment validation. Did this agent ship with known failure modes mapped? Did anyone test it against the adversarial inputs your industry sees?
  2. Runtime capture. Is every decision, every prompt, every tool call being captured in a way you could reconstruct later?
  3. Governance documentation. Can you produce the artifact your specific regulator expects? Not "a report." The specific format that examiner reads.
  4. Independent attestation. Has a third party with no incentive in the outcome signed off?

Skip any one and the chain breaks. You can have the best runtime logging in the world and still fail an SR 11-7 examination because you don't have the documentation pack. You can have an immaculate model card and lose a CFPB inquiry because your runtime capture can't reconstruct individual decisions.

Why "observability" isn't TRiSM

Observability tools track latency, cost, error rates and maybe a vague notion of "quality." They were built for engineering teams who want to know if the system is up.

Examiners don't ask if the system is up. They ask things like:

  • For this specific transaction on this specific date, what was the agent's input, what was the model's response and what features drove the decision?
  • Across the last twelve months, what was the disparate impact across protected classes for this customer-facing agent?
  • What evidence do you have that the third-party model the agent uses was independently validated against your bank's risk standards?

If your observability stack can't answer those, it's not TRiSM. It's just dashboards.

The four-step gap in 2026

Here's what we see at banks with $10B+ in assets and at health systems planning clinical AI. The same picture, different acronyms.

| Stage | Common state | What it should be |
| --- | --- | --- |
| Pre-deploy | Manual red-team, results in a Google Doc | Versioned adversarial test suite, run on every model change, blocking deploy |
| Runtime | LangSmith or Datadog dashboards | Tamper-evident capture with feature snapshots, decision metadata and hash chain |
| Governance | Risk team copy-pastes into a Word template | Auto-assembled artifact in the specific format your regulator's examiner team uses |
| Sign-off | $500K Big 4 engagement, six months | Bonded auditor with documented independence, weeks not quarters |

Each cell on the left looks fine in a slide deck. Each cell on the right is what an OCC examiner or FDA reviewer actually needs to see.
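The "hash chain" in the runtime row is the one piece of this that fits in a few lines. Here's a minimal sketch of tamper-evident capture, assuming a simple design (each record carries the hash of the previous one); the function names are illustrative, not a reference to any specific logging product:

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash before the first record

def append_record(chain: list, record: dict) -> None:
    """Append a decision record whose hash covers the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    payload = json.dumps(record, sort_keys=True)
    digest = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    chain.append({"record": record, "prev_hash": prev_hash, "hash": digest})

def verify_chain(chain: list) -> bool:
    """Recompute every hash; any retroactive edit breaks all hashes after it."""
    for i, entry in enumerate(chain):
        prev_hash = chain[i - 1]["hash"] if i else GENESIS
        payload = json.dumps(entry["record"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return False
    return True

chain = []
append_record(chain, {"txn": "T-1001", "decision": "approve"})
append_record(chain, {"txn": "T-1002", "decision": "deny"})
assert verify_chain(chain)

chain[0]["record"]["decision"] = "deny"  # retroactive edit of an old decision
assert not verify_chain(chain)           # the edit is detectable
```

That last assertion is the whole argument against a plain dashboard: a metrics store can be backfilled silently, while a hash-chained log makes the edit visible to anyone who re-verifies.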

What "vertical TRiSM" means

Horizontal AI governance tools treat regulators as a generic abstraction. "We support compliance reporting." Vertical TRiSM means the artifact you ship is shaped exactly like the document the specific regulator reads.

SR 11-7's model risk documentation pack has a structure. The OCC examiner team knows that structure. Ship them something else and you'll spend the rest of the examination explaining why your evidence doesn't map.

FDA's 510(k) submission has a different structure. ECOA disparate impact analysis has yet another. EU AI Act Annex IV is a fourth. None of them are interchangeable.

The horizontal tools that try to be "compliance for everyone" end up being compliance for nobody specifically. You're left to assemble the artifact yourself, which defeats the purpose.

Where to start this quarter

If you're building or auditing an AI agent in a regulated industry, here's a one-quarter plan that actually moves the needle.

  1. Inventory. Catalog every AI agent in production. For each one, mark the risk tier and the regulator framework that applies.
  2. Pick the worst. Find the highest-risk agent. Treat it as your pattern. Build the four-step TRiSM stack on that one agent end to end.
  3. Replicate. Once the pattern holds on one agent, roll it out. Agents two through ten cost a fraction of agent one because the schema, the documentation template and the auditor relationship are reusable.
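Steps 1 and 2 amount to a small, sortable catalog. A minimal sketch, with hypothetical agent names and a made-up risk ordering, just to show that "pick the worst" is a one-liner once the inventory exists:

```python
# Hypothetical inventory: every production agent, its risk tier, and the
# regulator framework that applies. Names and tiers are illustrative.
RISK_ORDER = {"low": 0, "medium": 1, "high": 2}

agents = [
    {"name": "kyc-screening",  "risk": "high",   "framework": "SR 11-7"},
    {"name": "support-triage", "risk": "low",    "framework": "internal"},
    {"name": "loan-pricing",   "risk": "medium", "framework": "ECOA"},
]

# Step 2: the highest-risk agent becomes the pattern.
pattern_agent = max(agents, key=lambda a: RISK_ORDER[a["risk"]])
print(pattern_agent["name"])  # -> kyc-screening
```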

You don't need to solve everything. You need to ship one defensible agent and prove the pattern.

Examiners can be patient with progress. They can't be patient with absence.

What's next

Caventia exists to ship the four-step TRiSM stack as one platform: AgentGuard for pre-deploy, Audit Trail for runtime, Compliance Passport for governance documentation, Auditor Bridge for independent sign-off. One audit log spine running through all four.

The full SR 11-7 framework that drove the architecture is in the SR 11-7 whitepaper. The FDA Q-Sub equivalent for healthcare is in the FDA whitepaper. Both are free and both are written for the model risk officers and clinical AI leads who actually read them.

If your gauntlet is SR 11-7, FDA Q-Sub or something stranger, the same four-step pattern applies.

From the founder

If this resonates, talk to the founder directly.

Caventia is taking ten design partners in 2026. Conversations are with Ashish K. Saxena, not a sales team. Thirty minutes, your specific regulator gap, no purchase obligation.