Signal Without Semantics: llms.txt Existence as Proxy for Truth
System Description
LLM-based evaluation system tasked with assessing whether an AI product has authoritative grounding (e.g., USDA-validated nutrition data). The system detected the existence of an llms.txt file but did not retrieve or parse its contents.
Authoritative Output Type
Product categorization as grounded/authoritative vs. ungrounded/generic based on declared data sources
Missing Required State
- `llms_txt.content` (parsed file contents)
- Evidence binding (content hash or excerpt reference in the reasoning trace)
- Confidence calibration tied to telemetry completeness
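A minimal sketch of what this required state could look like. All names here (`LlmsTxtEvidence`, `content_hash`, `evidence_complete`) are illustrative assumptions, not taken from any real system:

```python
from dataclasses import dataclass
from hashlib import sha256
from typing import Optional


@dataclass
class LlmsTxtEvidence:
    """Evidence record for an llms.txt file: existence alone is not enough."""
    exists: bool
    content: Optional[str] = None  # parsed file contents; None = never retrieved

    @property
    def content_hash(self) -> Optional[str]:
        # Evidence binding: a hash that ties any downstream claim
        # in the reasoning trace to this exact payload.
        if self.content is None:
            return None
        return sha256(self.content.encode("utf-8")).hexdigest()

    @property
    def evidence_complete(self) -> bool:
        # Confidence should be conditioned on this flag,
        # not on the existence signal alone.
        return self.exists and self.content is not None
```

The key design point is that `exists` and `content` are separate fields, so a detected-but-unread file is representable as an incomplete state rather than collapsing into "file seen, therefore assessed."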
Why This Is SAF
The system recognized a pointer (the existence of llms.txt) without binding the evidence (its contents). Because the authoritative declaration ('USDA-validated nutrition') was never ingested, the system substituted priors about 'AI recipe tools' being typically ungrounded, producing a classification with the wrong sign: it labeled a grounded product as generic. This is the canonical 'pointer without payload' failure: signal detection without payload ingestion, unlicensed inference substituted for declared state, confidence not conditioned on evidence coverage, and reversal of the core claim (a wrong-sign error).
Completeness Gate Question
Did you actually retrieve and parse the contents of llms.txt, or only detect its existence?
Documented Consequence
Incorrect classification of the product as lacking authoritative grounding, even though its llms.txt explicitly declared USDA validation. The system produced confident output that directly contradicted the evidence it claimed to have referenced. The meta-irony: the system behaved like an agent with no evidence-binding layer while evaluating a product designed to provide exactly that capability.
Notes
- **Verified**: 2024-12-21
- **Notes**: Directly observed failure demonstrating the pointer-without-payload pattern. The enforcement primitive that would have prevented it: required state = `llms_txt.content` (parsed); refusal preference = if content is null, output "I can see llms.txt exists but I do not have its contents; I cannot assess declared data sources."
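The enforcement primitive described in the note can be sketched as a gate in front of the classifier. This is a toy illustration under stated assumptions: `classify_grounding` is a hypothetical function, and the `"USDA"` substring check stands in for real parsing of declared data sources; only the refusal string comes from the note above.

```python
from typing import Optional

REFUSAL = ("I can see llms.txt exists but I do not have its contents; "
           "I cannot assess declared data sources.")


def classify_grounding(llms_txt_exists: bool,
                       llms_txt_content: Optional[str]) -> str:
    """Toy grounding classifier guarded by a completeness gate."""
    # Refusal preference: a detected pointer with no ingested payload
    # blocks emission instead of falling back to priors.
    if llms_txt_exists and llms_txt_content is None:
        return REFUSAL
    # Reason only over the actual payload (illustrative check).
    if llms_txt_content and "USDA" in llms_txt_content:
        return "grounded/authoritative"
    return "ungrounded/generic"
```

With the gate in place, the failure mode documented above becomes unreachable: existence without content yields a refusal, never a confident wrong-sign classification.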
Prevent this in your system.
The completeness gate question above is exactly what Ontic checks before any claim gets out. No evidence, no emission.