C18 New in v1.3 Layer 2 - Runtime Enforcement

Data Quality Gates

Retrieval-time freshness, confidence, and provenance checks on memory items. The Memory Gateway emits a quality_decision per retrieval; a tier-aware action matrix governs pass / flag / downgrade / deny.

Why

Without this control, an agent can pass every GATE check on the path from prompt to action while operating on stored content that is stale, low-confidence, or unverifiable. The failure mode is well-governed wrongness: the policy decision is correct given the input, the tool call is signed and audited, the replay is reproducible, but the input itself was already untrustworthy when the Memory Gateway returned it.

This happens through three common routes:

  • A vector store entry is years old and describes a process, price, or policy that has since changed; retrieval returns it because the embedding still matches, and the agent acts on it.
  • An upstream pipeline writes content with no provenance reference (a scraped document, an unattributed PDF, a transcription with no source link); the Memory Gateway accepts the write because provenance is checked only on reads, not on writes, and the read returns content whose origin cannot be verified.
  • A model upstream has hallucinated content into a knowledge base during a batch generation step, and the hallucination has propagated as if it were ground truth.

Prompt-based constraints fail because the model cannot reliably introspect the quality of retrieved content. C08 fails because C08 is about adversarial isolation, not quality: a benign-looking but stale document passes every injection defence. C09 invariants fail because invariants are boolean rules over the tool request, not over the retrieved context. C10 replay fails because replay reproduces the behaviour given the same inputs; it does not validate the inputs.

The retrieval boundary is the last point in the control plane where minimum quality gates can be applied before content reaches the model. This control reduces risk by enforcing freshness, confidence, and provenance gates at retrieval time inside the Memory Gateway, and by surfacing low-quality retrievals as evidence and as obligations rather than silently passing them.

What

A retrieval-time quality gate implemented as a hardened module inside the Memory Gateway. The gate evaluates every retrieved item against three quality dimensions before the Memory Gateway returns the item to the agent runtime.

Freshness. Each stored item carries a freshness reference: either a document timestamp captured at write time or a freshness assertion produced by the upstream pipeline. At retrieval time, the gate computes the item’s age and compares it with the freshness TTL configured for the calling tool or context. The TTL is configured per content class (legal text, product pricing, internal policy, public reference) and lives in a quality bundle separate from the policy bundle. Depending on tier, items older than the TTL are denied, flagged (returned with a stale=true flag that downstream consumers must respect), or routed for HITL review.

Confidence. Each stored item carries a confidence score recorded at write time. The score’s semantics are defined by the upstream pipeline (retrieval similarity threshold at ingest, source reputation score, manual verification flag); GATE does not produce the score, it enforces the threshold. At retrieval time, the gate compares the score against a configured minimum and either denies the retrieval or returns it with a low_confidence=true flag.

Provenance reference. Each stored item carries a provenance_uri and a provenance_hash. The gate denies retrievals where the provenance reference is missing, unresolvable, or fails hash verification against the recorded source. This is distinct from C03 artifact integrity (which covers code and policy artifacts) and from poisoning detection (which is about adversarial writes). Provenance enforcement here is about the existence and verifiability of a citable source, not about its trustworthiness.

Invariants the control guarantees:

  • No retrieval is returned to the agent runtime without either passing all three gates or carrying explicit quality flags that downstream consumers can see.
  • Quality decisions are recorded in the same evidence stream as policy decisions (C11 ledger, C10 replay trace) and are correlated to the originating tool or context request.
  • The quality bundle is versioned and signed (C03) and the bundle hash is included in every quality decision record.

The gate produces one of four outcomes per retrieval: pass, flag (return with quality flags set), downgrade (return a reduced subset of fields, suppressing the body where confidence is too low to display but high enough to acknowledge existence), or deny.
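
The three dimension checks described above could be sketched roughly as follows. This is a minimal illustration, not the Memory Gateway's actual code: the StoredItem and ClassThresholds types, the flag names, and the source_hash argument (the hash of the resolved source, or None when the URI cannot be resolved) are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional, Set


@dataclass
class StoredItem:  # hypothetical shape of a stored memory item
    item_id: str
    content_class: str
    written_at: datetime  # freshness reference captured at write time
    confidence: Optional[float]
    provenance_uri: Optional[str]
    provenance_hash: Optional[str]


@dataclass
class ClassThresholds:  # hypothetical per-content-class thresholds
    ttl: timedelta
    min_confidence: float
    provenance_required: bool


def evaluate_dimensions(item: StoredItem, thresholds: ClassThresholds,
                        now: datetime, source_hash: Optional[str]) -> Set[str]:
    """Return the quality flags raised for one retrieved item (empty set = clean)."""
    flags: Set[str] = set()
    # Freshness: item age compared with the class TTL.
    if now - item.written_at > thresholds.ttl:
        flags.add("stale")
    # Confidence: a missing score is treated like a score below the minimum.
    if item.confidence is None or item.confidence < thresholds.min_confidence:
        flags.add("low_confidence")
    # Provenance: missing, unresolvable, or failing hash verification.
    if thresholds.provenance_required:
        if not item.provenance_uri or not item.provenance_hash:
            flags.add("provenance_missing")
        elif source_hash is None:
            flags.add("provenance_unresolvable")
        elif source_hash != item.provenance_hash:
            flags.add("provenance_hash_mismatch")
    return flags


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
pricing = ClassThresholds(ttl=timedelta(days=30), min_confidence=0.7,
                          provenance_required=True)
item = StoredItem("doc-1", "product_pricing", now - timedelta(days=90),
                  0.9, "https://example.com/prices", "abc123")
print(sorted(evaluate_dimensions(item, pricing, now, source_hash="abc123")))
# ['stale']
```

How the raised flags map to pass, flag, downgrade, or deny is left to the action matrix in the quality bundle; the sketch only evaluates the dimensions.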

How

Control-plane flow. The Memory Gateway evaluates ACL and TTL (existing behaviour), then runs poisoning detection (existing), then invokes the quality gate. The quality gate reads the item’s metadata (timestamp, confidence, provenance), evaluates each dimension against the configured thresholds in the quality bundle, and emits a gate.memory.quality_decision event into the ledger. The decision and the item are returned to the agent runtime, with flags set as the decision requires.

Deployment. The quality gate runs in-process inside the Memory Gateway, not as a separate service, because a second network hop on the retrieval path would add latency to every memory read. The gate consumes a quality bundle loaded at startup (and reloaded on bundle update). The quality bundle is a versioned artifact containing:

  • a content_class to TTL mapping;
  • a content_class to minimum_confidence mapping;
  • a provenance_required flag per content_class;
  • the action matrix (pass, flag, downgrade, deny) keyed by autonomy tier and content class.

Safe rollout. Begin in flag-only mode. The gate evaluates every retrieval and emits decisions, but no retrievals are denied or downgraded, only flagged. This baseline establishes the prospective deny rate (retrievals that would have been denied under enforcement) and the data quality posture across content classes. After two weeks of flag-only operation, promote to enforce mode for one content class at a time, starting with classes whose downstream tools are read-only or low-impact, and ending with classes whose downstream tools are financial or production-write.

Testing. For each quality dimension, maintain CI tests with positive and negative cases:

  • Positive freshness: a synthetic item with a timestamp older than TTL produces a deny or flag (per tier).
  • Negative freshness: an item within TTL passes cleanly.
  • Positive confidence: an item below threshold is flagged or denied.
  • Positive provenance: an item with missing provenance_uri is denied.

Tests run on every quality bundle change.
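
The quality bundle and the flag-only rollout described above might look something like the sketch below. The bundle layout is hypothetical: only the four mappings are taken from the text, while the rollout_mode field, the evaluated_outcome field, the example values, and the function names are assumptions made for illustration. The hash is computed over a canonical JSON encoding, which is one common way to obtain a stable bundle hash and is not claimed to be the method GATE uses.

```python
import hashlib
import json

# Hypothetical bundle layout. "rollout_mode" is an assumed extra field that
# records, per content class, whether the gate is in flag-only or enforce mode.
QUALITY_BUNDLE = {
    "version": "example-1",
    "ttl_seconds": {"public_reference": 31_536_000, "product_pricing": 2_592_000},
    "minimum_confidence": {"public_reference": 0.5, "product_pricing": 0.8},
    "provenance_required": {"public_reference": False, "product_pricing": True},
    # Simplified: one action per tier and class, applied when a check fails.
    "action_matrix": {
        "bounded": {"public_reference": "flag", "product_pricing": "deny"},
    },
    "rollout_mode": {"public_reference": "enforce", "product_pricing": "flag_only"},
}


def bundle_hash(bundle: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(bundle, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def effective_outcome(evaluated: str, content_class: str, bundle: dict) -> str:
    """In flag-only mode, deny and downgrade are applied as flag."""
    # Assumption: classes without an explicit mode stay in flag-only.
    mode = bundle["rollout_mode"].get(content_class, "flag_only")
    if mode == "flag_only" and evaluated in ("deny", "downgrade"):
        return "flag"
    return evaluated


def quality_decision(evaluated: str, content_class: str, bundle: dict) -> dict:
    """Build a partial decision record; other event fields are omitted here."""
    return {
        "content_class": content_class,
        "evaluated_outcome": evaluated,  # what enforce mode would have done
        "outcome": effective_outcome(evaluated, content_class, bundle),
        "quality_bundle_hash": bundle_hash(bundle),
    }


print(quality_decision("deny", "product_pricing", QUALITY_BUNDLE)["outcome"])
# flag
```

Recording both the evaluated and the applied outcome is one way to measure the prospective deny rate during the flag-only baseline without changing what the agent runtime receives.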

Interaction with C08. C08 runs after C18 on the retrieval path. The order matters: C18 first decides whether the item is fit for retrieval at all; C08 then decides whether the item’s content is safe to admit into the prompt channel. A stale document is denied by C18 before C08 ever sees it. A fresh, high-confidence, properly provenanced document with embedded injection content passes C18 and is caught by C08.

Interaction with the autonomy tier. Sandbox tier may run with flag-only enforcement. Bounded tier requires enforcement on freshness and confidence; provenance may be flag-only. High-privilege tier requires enforcement on all three dimensions. Tier-specific behaviour is expressed in the action matrix in the quality bundle.

Interaction with HITL. A quality-gate failure may produce an HITL obligation, but only for retrievals destined for high-impact tool categories. Routing all flagged retrievals to HITL produces approval fatigue (the same failure mode noted in C09). Default behaviour for bounded tier on a flag outcome is to log and return with the flag set; HITL is reserved for deny outcomes on high-privilege tier, where the retrieval would otherwise have been blocked.
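
The tier and HITL rules above can be illustrated with a small decision function. This is a simplified sketch under stated assumptions: the tier names are written as identifiers, enforcement is modelled as deny only (the real action matrix may also downgrade), the content-class key of the matrix is left out, and the decide function and its return shape are hypothetical.

```python
from typing import Dict, Iterable, Set

# Dimensions enforced per autonomy tier, following the text above.
# Anything not enforced is treated as flag-only.
ENFORCED: Dict[str, Set[str]] = {
    "sandbox": set(),
    "bounded": {"freshness", "confidence"},
    "high_privilege": {"freshness", "confidence", "provenance"},
}


def decide(tier: str, failed_dimensions: Iterable[str],
           high_impact_tool: bool) -> dict:
    """Map failed quality dimensions to an outcome and an HITL obligation."""
    failed = set(failed_dimensions)
    if not failed:
        return {"outcome": "pass", "hitl": False}
    if failed & ENFORCED[tier]:
        # HITL only for deny outcomes on the high-privilege tier, and only
        # when the retrieval feeds a high-impact tool category.
        hitl = tier == "high_privilege" and high_impact_tool
        return {"outcome": "deny", "hitl": hitl}
    # Failures on non-enforced dimensions are logged and returned flagged.
    return {"outcome": "flag", "hitl": False}


print(decide("bounded", {"provenance"}, high_impact_tool=True))
# {'outcome': 'flag', 'hitl': False}
print(decide("high_privilege", {"provenance"}, high_impact_tool=True))
# {'outcome': 'deny', 'hitl': True}
```

Keeping the HITL condition narrow in code mirrors the approval-fatigue concern: a flag never creates a review task by itself.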

Evidence

  • gate.memory.quality_decision event per retrieval: timestamp, request_hash, item_id, content_class, freshness_age_seconds, confidence_score, provenance_uri, provenance_hash_verified (bool), quality_bundle_hash, outcome (pass, flag, downgrade, deny), flags_set, trace_id, ledger_event_id.
  • Quality bundle change log: signed bundle hash per version with approver identity and change rationale.
  • Coverage metric: percent of memory retrievals with a recorded quality decision. Target: 100% for bounded and high-privilege tiers.
  • Quality posture report: distribution of outcomes per content class, computed daily, used to track data quality drift over time.
  • Stale-retrieval rate: percent of retrievals returning items older than TTL but not denied (flag-only or downgraded). Tracked as a leading indicator of upstream pipeline staleness.
  • Provenance failure rate: percent of retrievals where provenance hash verification failed. A non-zero rate indicates upstream pipeline drift or tampering.
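
The metrics listed above could be derived from the decision events along these lines. The sketch assumes events are available as dictionaries carrying the field names from the event definition, that the per-class TTLs are supplied separately, and that the stale-retrieval and provenance failure rates use recorded decisions as the denominator; the evidence_metrics function is hypothetical.

```python
from collections import Counter
from typing import Dict, List


def evidence_metrics(total_retrievals: int, decisions: List[dict],
                     ttl_seconds: Dict[str, int]) -> dict:
    """Compute coverage, stale-retrieval, and provenance failure percentages."""
    def pct(count: int, denominator: int) -> float:
        return 100.0 * count / denominator if denominator else 0.0

    recorded = len(decisions)
    # Items older than their class TTL that were still returned (not denied).
    stale_returned = sum(
        1 for d in decisions
        if d["freshness_age_seconds"] > ttl_seconds[d["content_class"]]
        and d["outcome"] != "deny"
    )
    provenance_failed = sum(
        1 for d in decisions if not d["provenance_hash_verified"]
    )
    return {
        "coverage_pct": pct(recorded, total_retrievals),
        "stale_retrieval_pct": pct(stale_returned, recorded),
        "provenance_failure_pct": pct(provenance_failed, recorded),
        # Quality posture: outcome counts per content class.
        "outcomes_by_class": Counter(
            (d["content_class"], d["outcome"]) for d in decisions
        ),
    }


sample = [
    {"content_class": "product_pricing", "freshness_age_seconds": 100,
     "provenance_hash_verified": True, "outcome": "flag"},
    {"content_class": "product_pricing", "freshness_age_seconds": 100,
     "provenance_hash_verified": True, "outcome": "deny"},
    {"content_class": "legal_text", "freshness_age_seconds": 10,
     "provenance_hash_verified": False, "outcome": "pass"},
    {"content_class": "legal_text", "freshness_age_seconds": 10,
     "provenance_hash_verified": True, "outcome": "pass"},
]
metrics = evidence_metrics(5, sample, {"product_pricing": 50, "legal_text": 1000})
print(metrics["coverage_pct"], metrics["stale_retrieval_pct"])
# 80.0 25.0
```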

Failure modes

Quality scores not produced upstream. Stored items arrive at the Memory Gateway with no confidence score or no timestamp because the upstream pipeline does not produce them. The gate cannot evaluate and defaults to pass. Mitigation: the Memory Gateway rejects writes from sources that do not produce required quality metadata, treating the absence of metadata as a write-time violation. This pushes the responsibility back upstream where it belongs.

TTL set globally rather than per content class. A single TTL applied across all content produces either too many false denies (where the content is naturally long-lived, like legal text) or too many stale passes (where the content turns over rapidly, like pricing). Mitigation: content_class taxonomy is required at write time; TTL is per-class in the quality bundle.

Confidence threshold treated as accuracy. The confidence score is a property of the retrieval or the source, not of the content’s correctness. A high-confidence retrieval of a confidently-stated falsehood still produces wrong outputs. Mitigation: documentation makes clear that C18 is a minimum quality floor, not a correctness guarantee; data quality assurance remains an upstream responsibility.

Provenance check satisfied by self-reference. An ingestion pipeline that writes its own document IDs as the provenance reference satisfies hash verification trivially while providing no real provenance. Mitigation: provenance_uri schema requires a resolvable external reference (HTTP URI, signed source identifier, or a registered upstream pipeline identity); self-referential URIs fail validation.

Flag-only mode left permanent. The same failure mode as C09: the control is deployed in observe-only and never promoted because promotion would reveal coverage gaps. Mitigation: documented promotion criteria and an executive escalation if flag-only persists beyond a defined window.

Downstream consumers ignore flags. The gate flags a retrieval but the agent runtime or the prompt template does not honour the flag, so the flagged content reaches the model anyway. Mitigation: the agent runtime treats flagged content as a structured field, not as inline text; the prompt template surfaces flags explicitly; C13 semantic traces capture whether the agent acknowledged the flag in its reasoning category.

Quality bundle change without policy review. The quality bundle becomes a back door to relax controls without going through policy review. Mitigation: quality bundle changes are subject to the same change control as invariant bundle changes in C09, with signed approvals and a separate review path.

NIST AI RMF alignment

C18 maps to MEASURE and MANAGE. MEASURE: the control implements MS-2.10 (data quality is monitored) and MS-4 (feedback from operations is integrated into the AI system). MANAGE: the control implements MG-3 (risks from third-party entities are managed) by treating upstream content sources as third parties whose output is gated at the retrieval boundary. Rationale: retrieval-time minimum quality enforcement. See the framework paper for the specific subcontrol mappings.

ISO/IEC alignment

C18 maps to ISO/IEC 27001 and ISO/IEC 42001. Within ISO/IEC 42001, C18 maps to A.7.4 (quality of data used in AI systems) and A.7.5 (data provenance), with a supporting link to A.8.3 (information for interested parties) via the quality decision evidence stream. Typical evidence produced: quality decision logs, quality bundle versions, content-class TTL configuration, provenance verification reports; see also the Evidence section above.
