§ Learn — Decision Provenance 101

The ideas, in plain language.

Decision provenance is a small set of ideas that, once seen, are hard to unsee. This primer explains them without jargon and without a sales pitch — the concepts are vendor-neutral, and the vendor questions at the end work on anyone, including us.

§ 01

What is a decision receipt?

When a store sells you something, it hands you a receipt: an independent record of what happened, created at the moment it happened, that both sides can rely on later. A decision receipt is the same idea applied to a machine-generated decision. It records what was decided, what evidence was considered, which policies were evaluated and what they concluded, and enough information to reproduce the decision — all bundled together, integrity-hashed, and signed.

The critical property is that the receipt is generated by an independent, trusted process, not by the AI system grading its own work. A system attesting to its own good behavior is not evidence; it is a claim.
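To make "bundled together, integrity-hashed, and signed" concrete, here is a minimal sketch in Python. The field names and the functions `make_receipt` and `verify_receipt` are hypothetical, chosen for illustration only. It uses an HMAC from the standard library as a stand-in signature; a real deployment would more likely use an asymmetric signature so that verifiers never hold the signing key.

```python
import hashlib
import hmac
import json


def canonical_bytes(obj) -> bytes:
    """Serialize to a stable byte form: sorted keys, no extra whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def make_receipt(decision: dict, evidence: list, verdicts: list, signing_key: bytes) -> dict:
    """Bundle the decision, evidence hashes, and policy verdicts, then hash and sign.

    `evidence` is a list of raw byte strings; only their hashes go in the receipt.
    """
    body = {
        "decision": decision,
        "evidence_hashes": [hashlib.sha256(item).hexdigest() for item in evidence],
        "policy_verdicts": verdicts,
    }
    digest = hashlib.sha256(canonical_bytes(body)).hexdigest()
    # Assumption: HMAC stands in for a real signature scheme in this sketch.
    signature = hmac.new(signing_key, digest.encode("ascii"), hashlib.sha256).hexdigest()
    return {"body": body, "digest": digest, "signature": signature}


def verify_receipt(receipt: dict, signing_key: bytes) -> bool:
    """Recompute the digest and signature; any change to the body makes this fail."""
    digest = hashlib.sha256(canonical_bytes(receipt["body"])).hexdigest()
    expected = hmac.new(signing_key, digest.encode("ascii"), hashlib.sha256).hexdigest()
    return digest == receipt["digest"] and hmac.compare_digest(expected, receipt["signature"])
```

In this sketch the signing key belongs to the receipt-generating process, not to the system that made the decision, which is what keeps the receipt from being a self-attestation.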

§ 02

What makes evidence admissible?

Courts worked this out long before computers: evidence counts when its origin is known, its handling is documented, and its integrity can be checked by someone who does not trust the person presenting it. Applied to machine decisions, admissible evidence has three properties:

Provenance — every artifact traces to a source, and the chain from source to decision is recorded, not inferred afterward.
Integrity — artifacts carry cryptographic hashes, so any alteration after the fact is detectable.
Independent verifiability — a third party with the same inputs can check the evidence without trusting whoever produced it.

Ordinary application logs usually fail all three. They are often assembled or summarized after the fact, the operator can edit them, and an outside party has no way to check them.
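One common way to make a log tamper-evident is to chain entries by hash, so that each entry commits to the one before it. The sketch below illustrates that general technique in Python; it is not described in this primer as any particular product's method, and the names `append_entry` and `verify_chain` are hypothetical.

```python
import hashlib

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def append_entry(chain: list, message: str) -> None:
    """Append a message whose hash covers the previous entry's hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    entry_hash = hashlib.sha256((prev + message).encode("utf-8")).hexdigest()
    chain.append({"message": message, "prev": prev, "hash": entry_hash})


def verify_chain(chain: list) -> bool:
    """Recompute every link; an edited, removed, or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        expected = hashlib.sha256((prev + entry["message"]).encode("utf-8")).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True
```

A chain like this only detects alteration if the latest hash is also held somewhere the operator cannot rewrite; otherwise the operator could recompute the whole chain after an edit.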

§ 03

Why does replay matter?

Replay means a decision can be reproduced from its recorded inputs — same evidence, same policy version, same result, bit for bit. It matters because explanation without reproduction is storytelling. Anyone can narrate why a system probably did something; replay demonstrates it.

Replay only works if the system was built for determinism: identifiers that contain no timestamps, hostnames, or random values, and recorded inputs complete enough to rerun the decision anywhere — including on a machine with no network connection. If a vendor says their system is explainable, the test is simple: can they rerun last month's decision today and get the identical answer?
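A small Python sketch of what deterministic identifiers and replay can look like. The decision rule, the field names, and the functions `decision_id`, `decide`, and `replay_matches` are all hypothetical; the point is only that the identifier is derived from content, and that rerunning the recorded inputs must give the identical result.

```python
import hashlib
import json


def decision_id(inputs: dict, policy_version: str) -> str:
    """Derive the identifier from content only: no clock, hostname, or randomness."""
    payload = json.dumps(
        {"inputs": inputs, "policy_version": policy_version},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def decide(inputs: dict, policy_version: str) -> dict:
    """Hypothetical pure decision function: output depends only on its arguments."""
    approved = inputs["amount"] <= inputs["limit"]
    return {
        "id": decision_id(inputs, policy_version),
        "outcome": "approve" if approved else "refer",
    }


def replay_matches(record: dict) -> bool:
    """Rerun the decision from the recorded inputs and compare with the recorded result."""
    rerun = decide(record["inputs"], record["policy_version"])
    return rerun == record["result"]
```

Because nothing here reads the network, the clock, or the environment, the same record replays identically on any machine, including an air-gapped one.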

§ 04

What does policy-as-code mean?

Most organizations have AI policies. Most of them are documents — which means they are enforced by memory, goodwill, and annual audit. Policy-as-code moves the rules into the execution path: every request is checked against machine-readable policy before it runs, every evaluation result is logged, and the safe default is deny — anything not explicitly permitted does not happen.

The difference shows up under pressure. A written policy violated under deadline leaves no trace until someone looks. Policy-as-code produces a record either way: the request, the rule, the verdict. The policy stops being a hope and becomes a property of the system.
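The pattern of checking every request, logging every verdict, and denying by default can be sketched in a few lines of Python. The `Rule` and `Verdict` types and the prefix-matching scheme are assumptions made for illustration, not a description of any specific policy engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    action: str           # e.g. "read"
    resource_prefix: str  # the rule permits resources starting with this prefix


@dataclass(frozen=True)
class Verdict:
    action: str
    resource: str
    rule: str      # name of the matching rule, or "default-deny"
    allowed: bool


def evaluate(rules: list, action: str, resource: str, audit_log: list) -> Verdict:
    """Allow only on an explicit match; record the verdict either way."""
    for rule in rules:
        if rule.action == action and resource.startswith(rule.resource_prefix):
            verdict = Verdict(action, resource, rule.name, True)
            break
    else:
        # No rule matched: anything not explicitly permitted is denied.
        verdict = Verdict(action, resource, "default-deny", False)
    audit_log.append(verdict)
    return verdict
```

For example, with a single rule `Rule("read-reports", "read", "/reports/")`, a read of `/reports/q3.pdf` is allowed and a write to the same path is denied, and both verdicts end up in `audit_log`.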

§ 05

Where do agents change the picture?

Everything above gets sharper when the AI does not just answer questions but takes actions — reading files, calling tools, writing to systems. Two ideas matter most:

Bounded autonomy — agents operate inside explicit limits enforced by the infrastructure, not requested in the prompt. Tool access is default-deny; capabilities are declared, not ambient.
Dangerous combinations — the well-documented failure pattern is one run that combines untrusted input, access to secrets, and a path to the outside world. Keeping those three apart is a structural rule, and a system either enforces it or it does not.
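Both ideas can be expressed as a check that runs before an agent starts, outside the prompt. The Python sketch below is illustrative only: `RunConfig`, `UnsafeRunError`, and `check_run` are hypothetical names, and a real system would enforce these limits in infrastructure rather than in a single function.

```python
from dataclasses import dataclass


class UnsafeRunError(Exception):
    """Raised when a run configuration breaks a structural rule."""


@dataclass(frozen=True)
class RunConfig:
    reads_untrusted_input: bool
    has_secret_access: bool
    has_external_network: bool
    tools: frozenset = frozenset()  # capabilities the run declares up front


def check_run(config: RunConfig, allowed_tools: frozenset) -> None:
    """Reject undeclared tools and the three-way dangerous combination."""
    # Default-deny tool access: every declared tool must be on the allow list.
    not_permitted = config.tools - allowed_tools
    if not_permitted:
        raise UnsafeRunError(f"tools not permitted: {sorted(not_permitted)}")
    # Untrusted input, secrets, and an outbound path must never share one run.
    if (
        config.reads_untrusted_input
        and config.has_secret_access
        and config.has_external_network
    ):
        raise UnsafeRunError("untrusted input, secrets, and external network in one run")
```

A run that has any two of the three properties passes this check; only the combination of all three is refused, which matches the rule as stated above.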

Memory is the quiet risk: if an agent remembers things users told it, those memories are input too, and they need the same quarantine and verification as anything else crossing a trust boundary.

§ 06

How to evaluate vendors

These questions are vendor-neutral by design. A credible vendor answers them specifically; an evasive answer is itself an answer.

Can you reproduce a specific past decision — bit-identical — from recorded inputs? Show me, on a decision from last quarter.
Who generates your evidence: the agent that made the decision, or an independent trusted process?
Are your evidence identifiers deterministic — free of timestamps, hostnames, and random values — so a third party can recompute them?
Is policy evaluated in the execution path, and is every verdict logged? Is the default deny?
Can untrusted input, secrets, and external network access ever occur in the same agent run? What enforces the answer?
Can a party who does not trust you verify a decision — and does everything still work air-gapped?

If a vendor can answer all six with specifics, you are in good hands — whoever they are. If they cannot, no dashboard will fix it.

See the concepts on a real workflow.

The 10-Day Decision Assurance Pilot applies everything on this page to one of your actual decision workflows — receipts, replay, and policy verdicts included.
