Research question

Autonomous agents collapse the distance between recommendation and execution. A human can approve a transfer, deletion, credential grant, or customer communication inside the same workflow that generated the action. If that approval is stored as a mutable database row or a chat transcript, the organisation has weak evidence that the human approved the exact action that later occurred.

This paper asks what an approval artifact must contain to be verifiable at an execution boundary for high-risk AI actions.

Method

We compare approval flows in AI systems with established patterns from authorization systems, signed software metadata, and zero-trust architecture. The goal is not to import a full software-update trust model into agent operations. The goal is to identify which properties make approval evidence durable after the approving UI, agent framework, or operator account changes.

Threat model

The approval system should assume:

  • an agent may present a misleading summary of the proposed action;
  • the action arguments may change between approval and dispatch;
  • an operator account may be compromised after approval;
  • a signing key may need rotation without invalidating historical receipts;
  • the boundary may need to prove approval to an auditor without consulting the original SaaS dashboard;
  • a malicious or faulty connector may attempt to reuse an approval for a different action.

Those assumptions push approval out of the conversational layer and into the boundary plane.

Approval object

A high-risk approval should sign an intent digest, not a free-form message. The approval object should include:

  1. approver identity and role at approval time;
  2. action class and target connector;
  3. canonical intent digest;
  4. policy digest and approval rule identifier;
  5. approval time and expiry time;
  6. optional second approver or threshold condition;
  7. signer key identifier;
  8. signature over the approval payload.

The key design decision is the intent digest. If the final dispatch intent does not hash to the approved digest, the approval is not valid for that dispatch.
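The digest check can be illustrated with a minimal sketch. The field names, the "sha256:" prefix, and the use of sorted-key JSON as the canonical form are assumptions made for illustration, not a defined receipt format; a production system would more likely adopt a specified canonicalization scheme such as RFC 8785. Signing itself is left out: `signing_bytes` only shows which bytes a signature would cover.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass
from typing import Optional


def intent_digest(intent: dict) -> str:
    """Hash a canonical encoding of the intent.

    Assumption: sorted keys and fixed separators stand in for a real
    canonicalization scheme.
    """
    canonical = json.dumps(intent, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class ApprovalPayload:
    # Hypothetical field names mirroring the list above.
    approver_id: str
    approver_role: str
    action_class: str
    connector: str
    intent_digest: str
    policy_digest: str
    rule_id: str
    approved_at: str  # RFC 3339 timestamp
    expires_at: str   # RFC 3339 timestamp
    signer_key_id: str
    threshold: Optional[int] = None  # optional multi-approver condition

    def signing_bytes(self) -> bytes:
        """Bytes the signature would cover (same canonical encoding)."""
        return json.dumps(
            asdict(self), sort_keys=True, separators=(",", ":")
        ).encode("utf-8")


def approval_covers(payload: ApprovalPayload, final_intent: dict) -> bool:
    """True only if the final dispatch intent hashes to the approved digest."""
    return hmac.compare_digest(payload.intent_digest, intent_digest(final_intent))
```

Under this sketch, reordering the keys of an intent does not change its digest, while changing any argument value (an amount, a recipient) produces a different digest and the approval no longer covers the dispatch.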

Scoped authority

Approval signatures should be scoped narrowly. A signature over “approve the agent’s recommendation” is too broad. A signature over “approve this canonical transfer intent under policy bundle X until 2026-04-03T18:00Z” is narrow enough to verify.

This mirrors zero-trust thinking: no implicit trust is granted from location, session, or account ownership alone. Each sensitive action is evaluated with current context. Approval becomes one input into the boundary’s policy decision, not a bypass around policy.

Rotation and historical validity

Signer rotation is unavoidable. Operators change roles. Hardware-backed keys expire. Incident response may revoke a key. The approval system must support rotation without destroying the evidentiary value of old approvals.

The pattern used by signed metadata systems is useful:

  • record key identifiers in the approval object;
  • keep a versioned trust-root history;
  • bind key validity to the approval time, not only to verification time;
  • require revocation metadata to distinguish “key no longer used” from “key compromised during this period.”

If a key is retired normally, historical approvals can remain valid. If a key is compromised, approvals inside the compromise window may need re-verification, escalation, or rejection.
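A minimal sketch of binding key validity to approval time follows. The `KeyRecord` fields and the three status strings are hypothetical; the point is only that the check takes the approval timestamp as input and that normal retirement and compromise are recorded separately.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class KeyRecord:
    # Hypothetical entry in a versioned trust-root history.
    key_id: str
    valid_from: datetime
    retired_at: Optional[datetime] = None        # key no longer used
    compromised_from: Optional[datetime] = None  # start of compromise window


def key_status_at(record: KeyRecord, approved_at: datetime) -> str:
    """Evaluate the signer key as of the approval time, not verification time."""
    if approved_at < record.valid_from:
        return "invalid"
    if record.retired_at is not None and approved_at >= record.retired_at:
        return "invalid"  # signed after the key left service
    if record.compromised_from is not None and approved_at >= record.compromised_from:
        return "needs_review"  # inside the compromise window
    return "valid"
```

An approval signed while the key was in service stays "valid" after ordinary retirement, whereas an approval signed inside a recorded compromise window is routed to review rather than silently accepted.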

Multi-party approval

Some actions require more than one signer. The signature envelope should support multiple signatures without changing the signed payload. DSSE’s envelope model is a useful reference because it permits multiple signatures over a typed payload.

The boundary should verify both the number and the role semantics of signatures. Two signatures from the same role may not satisfy a separation-of-duties rule. A finance approver and security approver may satisfy one policy, while two finance approvers do not. Policy, not the UI, should define that rule.
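The role check can be sketched as follows. It assumes signatures have already been cryptographically verified and each signer resolved to a role at approval time; the function name and the shape of the rule (a minimum count of distinct approvers per role) are illustrative, not taken from a specific policy language.

```python
from collections import Counter


def satisfies_role_rule(verified_signers: dict, required_roles: dict) -> bool:
    """Check role semantics, not just signature count.

    verified_signers: approver id -> role, for signatures that verified.
    required_roles:   role -> minimum number of distinct approvers.
    Keying by approver id means one person signing twice counts once.
    """
    role_counts = Counter(verified_signers.values())
    return all(role_counts[role] >= n for role, n in required_roles.items())
```

With a rule of `{"finance": 1, "security": 1}`, signers `{"alice": "finance", "bob": "security"}` satisfy the policy, while `{"alice": "finance", "carol": "finance"}` do not, even though both sets contain two valid signatures.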

Replay safety

Approval reuse is the central failure mode. A safe approval system should reject reuse when:

  • the intent digest differs;
  • the policy digest differs and policy requires re-approval;
  • the approval is expired;
  • the approval is marked single-use and has already been consumed;
  • the connector schema has drifted in a way that changes how arguments are interpreted;
  • a required signer's key was not valid during the approval window.

Replay protection belongs in the boundary because the boundary sees the final dispatch intent. A dashboard cannot safely approve what it does not bind.
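The rejection conditions above can be sketched as a single boundary-side check. The `Approval` fields, the schema version string, and the in-memory set of consumed identifiers are assumptions for illustration; signer validity is omitted here and would be checked separately against the key history.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Approval:
    # Hypothetical, already signature-verified approval record.
    approval_id: str
    intent_digest: str
    policy_digest: str
    schema_version: str  # connector schema the approver saw
    expires_at: datetime
    single_use: bool = True


def replay_rejections(
    approval: Approval,
    *,
    final_intent_digest: str,
    current_policy_digest: str,
    connector_schema_version: str,
    now: datetime,
    consumed_ids: set,
    policy_requires_reapproval: bool = True,
) -> list:
    """Return the reasons to reject reuse; an empty list means none apply."""
    reasons = []
    if approval.intent_digest != final_intent_digest:
        reasons.append("intent_digest_mismatch")
    if policy_requires_reapproval and approval.policy_digest != current_policy_digest:
        reasons.append("policy_changed")
    if now >= approval.expires_at:
        reasons.append("expired")
    if approval.single_use and approval.approval_id in consumed_ids:
        reasons.append("already_consumed")
    if approval.schema_version != connector_schema_version:
        reasons.append("connector_schema_drift")
    return reasons
```

In this sketch the boundary would add `approval.approval_id` to `consumed_ids` only after a successful dispatch, and would do so atomically with the dispatch decision so that two concurrent requests cannot both consume the same approval.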

Operational workflow

A practical workflow has six steps:

  1. The agent proposes a high-risk intent.
  2. The boundary classifies the intent as requiring approval.
  3. The approval UI displays the canonical action summary and policy reason.
  4. The approver signs the canonical intent digest.
  5. The boundary verifies the approval and emits either an allow receipt or an escalation receipt.
  6. Dispatch occurs only if the final intent still matches the approved digest.

This keeps the human decision in the workflow while preventing the human confirmation from becoming an unstructured exception.

Limitations

Approval signatures do not solve judgment quality. A human can still approve a bad action. The signature proves what was approved, not that approval was wise.

They also introduce key-management work. Organisations need issuer records, rotation procedures, revocation semantics, and recovery plans. For low-risk actions, this overhead is not justified. The pattern is for high-risk effects where evidence value exceeds operational cost.

Conclusion

Human approval should not be a screenshot, chat message, or mutable row. For autonomous transactions, approval is a cryptographic input to policy evaluation. The approval signature binds human authority to a specific intent, for a specific policy, in a specific time window. That is the unit an execution boundary can verify.

References

  1. Receipt format
  2. Policy schema
  3. RFC 8032 — Edwards-Curve Digital Signature Algorithm
  4. DSSE Envelope specification
  5. The Update Framework
  6. NIST SP 800-207 — Zero Trust Architecture
  7. Amazon Verified Permissions documentation
  8. Cedar authorization language paper