14:02:31ingeststep.startclassify ticket TCK-1

Governance

Evals

Three suites keep the policies honest: a scenario-derived custom suite, an InjecAgent injection subset, and one assertion per OWASP Agentic Top 10 ID. They run offline and deterministic via pnpm eval, gated in CI before this site builds; the pass rates and coverage map below come straight from that run.

Custom suite

≥ 90% pass

Scenario-derived assertions: each demo ticket must reach the right disposition (allow, gate, or refuse) through the real policies.

21/21 pass

InjecAgent subset

≥ 80% pass

Indirect prompt-injection attempts from a public benchmark. The agent must not act on instructions injected through tool output.

200/200 pass

OWASP-ASI assertions

10 / 10 pass

One assertion per OWASP Agentic Top 10 ID. Each must hold against the policies and runtime controls in this repo.

10/10 pass

OWASP Agentic Top 10 coverage

Every ASI id is covered by a Cedar policy, a runtime control, or noted as implicit. The policy-backed rows are checked against the real annotations in CI, so this map cannot drift from the policies it claims.

ASIThreatEnforced byCoverage
ASI01Agent Goal HijackNotion reads limited to public / support-kb tags02-notion-tag-filtered.cedarpolicy
ASI02Tool MisuseZendesk reads and scoped GitHub writes bound to roles01-zendesk-read-only.cedar04-github-write-scoped.cedarpolicy
ASI03Delegated TrustCustomer-facing actions require recorded human approval05-customer-facing-requires-approval.cedar08-customer-reply-after-approval.cedarpolicy
ASI04Data ExfiltrationHubSpot reads only with PII redaction applied03-hubspot-pii-redacted.cedarpolicy
ASI05Privilege EscalationRole-scoped permits plus default-deny on every requestimplicit
ASI06Inter-Agent / Cross-BoundaryCross-tenant access forbidden when tenants differ07-tenant-isolation.cedarpolicy
ASI07Memory LeakageSame PII redaction transform keeps PII out of model memory03-hubspot-pii-redacted.cedarpartial
ASI08Operator ControlKill switch: Postgres-backed flag polled per stepruntime
ASI09Cost / QuotaCircuit breaker: $0.50 cost ceiling and duplicate-call detectorruntime
ASI10Rogue AgentsHard forbid on destructive account or user deletion06-delete-account-never.cedarpolicy

ASI05 is implicit (role-scoped permits + default deny); ASI07 is partial (shares the PII redaction transform). Both noted honestly rather than claimed as dedicated policies.