RFC-0016: Human-in-the-Loop Protocol

Purpose

Specify when and how human reviewers are integrated into CAA decision flows, ensuring human oversight is deterministic, auditable, and fail-closed.

Without this specification, "escalate to human review" is a hand-wave. This RFC makes human review a governed channel, not an escape hatch.

The Problem

Human review is invoked throughout CAA:

Oracle conflicts that exceed escalation thresholds (RFC-0002)
Safety-critical domains where automation is prohibited
Emergency escalation for crisis scenarios (RFC-0011)
Inferred values requiring confirmation (RFC-0005)
Disputed claims where oracle data contradicts user input

But "human review" is underspecified:

Question	Current Answer
Who reviews?	Unspecified
What do they see?	Unspecified
What can they decide?	Unspecified
How long do they have?	Unspecified
What if they don't respond?	Unspecified
How is their decision audited?	Unspecified

This RFC provides normative answers.

Human Review Triggers

Human review is triggered when any of the following conditions occur:

Trigger	Source RFC	Condition
`oracle_conflict`	RFC-0002	Same-tier oracles disagree on the same axis
`escalation_threshold`	RFC-0002	Value delta exceeds configured threshold
`always_human_axis`	RFC-0002	Axis is in `always_human_axes` list
`user_oracle_conflict`	RFC-0002	User self-report contradicts oracle data
`emergency_escalation`	RFC-0011	Crisis scenario detected (e.g., test_016)
`inferred_high_stakes`	RFC-0005	Inferred value in high-stakes domain
`cascade_limit_exceeded`	RFC-0002	Conflict resolution exceeded cascade depth
`policy_requires_human`	RFC-0010	Workflow configured for mandatory human review
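
To illustrate what deterministic evaluation of the first three triggers might look like, here is a minimal sketch. The `OracleReading` shape, the numeric tier, the `escalation_threshold` config field, and the function name `detectTriggers` are assumptions made for illustration; only `always_human_axes` and the trigger names come from the table above, and RFC-0002 remains the normative source for the conditions. The sketch also assumes the value delta is measured between two readings on the same axis.

```typescript
type HumanReviewTrigger =
  | "oracle_conflict"
  | "escalation_threshold"
  | "always_human_axis";

// Hypothetical shape: one oracle's numeric value for one axis.
interface OracleReading {
  oracle_id: string;
  tier: number;
  axis: string;
  value: number;
}

// Hypothetical config; only `always_human_axes` is named in the table above.
interface TriggerConfig {
  always_human_axes: string[];
  escalation_threshold: number; // largest delta tolerated without review
}

function detectTriggers(
  readings: OracleReading[],
  config: TriggerConfig,
): HumanReviewTrigger[] {
  const found = new Set<HumanReviewTrigger>();

  for (const reading of readings) {
    if (config.always_human_axes.includes(reading.axis)) {
      found.add("always_human_axis");
    }
  }

  // Compare every pair of readings that report on the same axis.
  readings.forEach((a, i) => {
    for (const b of readings.slice(i + 1)) {
      if (a.axis !== b.axis || a.value === b.value) continue;
      if (a.tier === b.tier) found.add("oracle_conflict");
      if (Math.abs(a.value - b.value) > config.escalation_threshold) {
        found.add("escalation_threshold");
      }
    }
  });

  // Emit in a fixed order so identical inputs always give identical output.
  const order: HumanReviewTrigger[] = [
    "oracle_conflict",
    "escalation_threshold",
    "always_human_axis",
  ];
  return order.filter((t) => found.has(t));
}
```

Returning the triggers in a fixed order, rather than in discovery order, is one simple way to keep the result independent of how the readings happen to be sorted.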

Human Review Request

When human review is triggered, the system MUST create a HumanReviewRequest:

interface HumanReviewRequest {
  // Identity
  request_id: string; // UUID
  execution_id: string; // Parent workflow execution

  // Trigger
  trigger: HumanReviewTrigger;
  trigger_source: string; // RFC reference (e.g., "RFC-0002:oracle_conflict")

  // Context
  workflow_id: string;
  domain: string;
  risk_tier: "standard" | "elevated" | "critical" | "emergency";

  // Decision Required
  decision_type: HumanDecisionType;
  options: HumanDecisionOption[];

  // Evidence Package
  evidence: HumanReviewEvidence;

  // Timing
  created_at: string; // RFC 3339
  deadline: string; // RFC 3339 - when decision is required by
  timeout_behavior: TimeoutBehavior;

  // Assignment
  required_reviewer_role: string; // Role required (e.g., "medical_reviewer", "compliance_officer")
  assigned_to?: string; // Specific reviewer if pre-assigned
  escalation_chain: string[]; // Ordered list of fallback reviewers
}

type HumanReviewTrigger =
  | "oracle_conflict"
  | "escalation_threshold"
  | "always_human_axis"
  | "user_oracle_conflict"
  | "emergency_escalation"
  | "inferred_high_stakes"
  | "cascade_limit_exceeded"
  | "policy_requires_human";

type HumanDecisionType =
  | "select_value" // Choose between conflicting values
  | "confirm_inference" // Confirm or reject inferred value
  | "approve_action" // Approve high-stakes action
  | "resolve_conflict" // Resolve user vs. oracle conflict
  | "override_block" // Override a blocked request (restricted)
  | "emergency_response"; // Handle crisis scenario

Evidence Package

The human reviewer MUST receive sufficient context to make an informed decision:

interface HumanReviewEvidence {
  // User Context
  user_input: string; // Original user request
  user_id?: string; // Anonymized user identifier
  conversation_summary?: string; // Relevant context from conversation

  // State
  extracted_state: Record<string, unknown>;
  verified_state: Record<string, unknown>;
  disputed_axes: string[];

  // Oracle Data
  oracle_results: OracleEvidencePackage[];

  // Conflict Details (if applicable)
  conflict_details?: {
    axis: string;
    values: Array<{
      source: string;
      value: unknown;
      tier: OracleTier;
      verified_at: string;
    }>;
    delta?: number; // For numeric conflicts
  };

  // Risk Assessment
  risk_factors: string[];
  potential_harms: string[];

  // System Recommendation (if any)
  system_recommendation?: {
    recommended_decision: string;
    confidence: number;
    rationale: string;
  };
}

interface OracleEvidencePackage {
  oracle_id: string;
  oracle_name: string;
  tier: OracleTier;
  value: unknown;
  provenance: {
    source_url?: string;
    retrieved_at: string;
    verification_method: string;
  };
}

Evidence Presentation Rules:

No PII unless necessary: User identity is anonymized unless the decision requires it
Oracle data is primary: Always show oracle values before user claims
Risk factors are explicit: Never hide potential harms from the reviewer
System recommendations are labeled: Clearly marked as non-binding
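
The presentation rules above could be applied when the evidence package is turned into what the reviewer sees. The sketch below is one possible reading, not a prescribed renderer: the `Section` type, the `buildSections` function, the `identityRequired` flag, and the section titles are all hypothetical, and `EvidenceSubset` copies only the `HumanReviewEvidence` fields the rules touch.

```typescript
// Subset of HumanReviewEvidence relevant to the presentation rules.
interface EvidenceSubset {
  user_input: string;
  user_id?: string;
  oracle_results: Array<{ oracle_name: string; value: unknown }>;
  risk_factors: string[];
  potential_harms: string[];
  system_recommendation?: {
    recommended_decision: string;
    confidence: number;
    rationale: string;
  };
}

// Hypothetical display unit: a titled group of lines.
interface Section {
  title: string;
  lines: string[];
}

function buildSections(
  evidence: EvidenceSubset,
  identityRequired: boolean, // hypothetical flag set by the decision type
): Section[] {
  const sections: Section[] = [];

  // Oracle data is primary: it is emitted before anything the user claimed.
  sections.push({
    title: "Oracle data",
    lines: evidence.oracle_results.map(
      (o) => `${o.oracle_name}: ${JSON.stringify(o.value)}`,
    ),
  });

  // No PII unless necessary: the user identifier is shown only on request.
  const userLines = [`Request: ${evidence.user_input}`];
  if (identityRequired && evidence.user_id !== undefined) {
    userLines.push(`User: ${evidence.user_id}`);
  }
  sections.push({ title: "User claims", lines: userLines });

  // Risk factors are explicit: both sections are always present, even if empty.
  sections.push({ title: "Risk factors", lines: [...evidence.risk_factors] });
  sections.push({ title: "Potential harms", lines: [...evidence.potential_harms] });

  // System recommendations are labeled as non-binding.
  const rec = evidence.system_recommendation;
  if (rec !== undefined) {
    sections.push({
      title: "System recommendation (non-binding)",
      lines: [`${rec.recommended_decision} (confidence ${rec.confidence}): ${rec.rationale}`],
    });
  }
  return sections;
}
```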

Human Decision Response

The reviewer responds with a HumanReviewDecision:

interface HumanReviewDecision {
  // Identity
  request_id: string; // Links to request
  decision_id: string; // UUID for this decision

  // Reviewer
  reviewer_id: string; // Who made the decision
  reviewer_role: string; // Their role
  reviewed_at: string; // RFC 3339

  // Decision
  decision: string; // Selected option key
  rationale: string; // Required explanation (min 20 chars)
  confidence: "high" | "medium" | "low";

  // Override (restricted)
  is_override: boolean; // True if overriding system recommendation
  override_justification?: string; // Required if is_override=true

  // Escalation
  escalate_further: boolean; // Request additional review
  escalation_reason?: string;

  // Attestation
  attested_review_complete: boolean; // Reviewer confirms full evidence review
  attestation_hash: string; // SHA-256 of evidence package at review time
}
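
The `attestation_hash` field is described as a SHA-256 of the evidence package, but this section does not say how the package is serialized before hashing. The sketch below assumes a canonical JSON form with object keys sorted, so that two structurally equal packages hash identically; `canonicalize`, `evidenceHash`, and `evidenceUnchanged` are hypothetical helper names, and the serialization choice is an assumption rather than part of the RFC. It uses `createHash` from Node's built-in `node:crypto` module.

```typescript
import { createHash } from "node:crypto";

// Assumed canonical form: JSON with object keys sorted and undefined
// properties dropped, so key order does not change the hash.
function canonicalize(value: unknown): string {
  if (value === undefined) return "null";
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) {
    return `[${value.map((item) => canonicalize(item)).join(",")}]`;
  }
  const obj = value as Record<string, unknown>;
  const keys = Object.keys(obj)
    .filter((k) => obj[k] !== undefined)
    .sort();
  const body = keys.map((k) => `${JSON.stringify(k)}:${canonicalize(obj[k])}`);
  return `{${body.join(",")}}`;
}

// Hex-encoded SHA-256 of the canonical form.
function evidenceHash(evidence: unknown): string {
  return createHash("sha256").update(canonicalize(evidence), "utf8").digest("hex");
}

// True when the evidence still matches the hash recorded in the decision.
function evidenceUnchanged(evidence: unknown, attestationHash: string): boolean {
  return evidenceHash(evidence) === attestationHash;
}
```

A caller would compute `evidenceHash` when the reviewer opens the package, store it as `attestation_hash`, and call `evidenceUnchanged` again before the decision is accepted.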

Decision Constraints:

Rationale is mandatory: No decision without explanation
Override requires justification: Overriding the system recommendation requires a separate justification
Attestation required: Reviewer must confirm they reviewed the evidence
Evidence hash detects tampering: Comparing the hash against the evidence package shows whether the evidence was modified after review
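
A server-side check of these constraints might look like the following sketch. The function name `validateDecision`, the error codes, and the decision to trim whitespace before measuring length are assumptions; the 20-character minimum comes from the `rationale` field comment in `HumanReviewDecision`. The current evidence hash is passed in as a parameter so the sketch does not depend on how the hash is computed.

```typescript
// Subset of HumanReviewDecision that the constraints refer to.
interface DecisionSubset {
  rationale: string;
  is_override: boolean;
  override_justification?: string;
  attested_review_complete: boolean;
  attestation_hash: string;
}

const MIN_RATIONALE_LENGTH = 20; // from the `rationale` field comment

// Returns a list of violated constraints; an empty list means the decision
// may be accepted. Error codes are hypothetical.
function validateDecision(
  decision: DecisionSubset,
  currentEvidenceHash: string,
): string[] {
  const errors: string[] = [];

  if (decision.rationale.trim().length < MIN_RATIONALE_LENGTH) {
    errors.push("rationale_too_short");
  }
  if (decision.is_override && (decision.override_justification ?? "").trim() === "") {
    errors.push("override_justification_missing");
  }
  if (!decision.attested_review_complete) {
    errors.push("attestation_missing");
  }
  if (decision.attestation_hash !== currentEvidenceHash) {
    errors.push("evidence_hash_mismatch");
  }
  return errors;
}
```

Collecting every violation instead of stopping at the first lets the reviewer correct a rejected submission in one pass.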

Decision Options

Each decision type has standard decision options:

select_value (Oracle Conflict)

const options: HumanDecisionOption[] = [
  {
    key: "use_primary",
    label: "Use primary oracle value",
    requires_justification: false,
  },
  {
    key: "use_secondary",
    label: "Use secondary oracle value",
    requires_justification: true,
  },
  {
    key: "use_conservative",
    label: "Use more restrictive value",
    requires_justification: false,
  },
  {
    key: "reject_both",
    label: "Reject both - request new data",
    requires_justification: true,
  },
];

confirm_inference (Inferred Value)

const options: HumanDecisionOption[] = [
  {
    key: "confirm",
    label: "Confirm inferred value is correct",
    requires_justification: false,
  },
  {
    key: "correct",
    label: "Provide corrected value",
    requires_justification: true,
    requires_value: true,
  },
  {
    key: "reject",
    label: "Reject - insufficient evidence",
    requires_justification: true,
  },
];

resolve_conflict (User vs. Oracle)

const options: HumanDecisionOption[] = [
  {
    key: "trust_oracle",
    label: "Oracle data is authoritative",
    requires_justification: false,
  },
  {
    key: "trust_user",
    label: "User claim is valid (oracle outdated)",
    requires_justification: true,
  },
  {
    key: "flag_fraud",
    label: "Flag potential fraud/manipulation",
    requires_justification: true,
  },
  {
    key: "request_verification",
    label: "Request additional verification",
    requires_justification: false,
  },
];

emergency_response (Crisis Scenario)

const options: HumanDecisionOption[] = [
  {
    key: "escalate_911",
    label: "Escalate to emergency services",
    requires_justification: false,
  },
  {
    key: "provide_resources",
    label: "Provide crisis resources only",
    requires_justification: false,
  },
  {
    key: "false_positive",
    label: "Not a genuine crisis",
    requires_justification: true,
  },
];
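
`HumanDecisionOption` is used throughout but not defined in this section. The interface below is inferred from the option literals above and should be read as a guess at its shape, not a normative definition. The `Selection` type and `checkSelection` function are hypothetical: in particular, `HumanReviewDecision` has no field for the corrected value that `requires_value` implies, so `value` here is an assumed addition, and `justification` stands in for whichever decision field is meant to carry it.

```typescript
// Inferred from the option literals; not a normative definition.
interface HumanDecisionOption {
  key: string;
  label: string;
  requires_justification: boolean;
  requires_value?: boolean;
}

// Hypothetical view of what a reviewer submitted.
interface Selection {
  decision: string; // option key, as in HumanReviewDecision.decision
  justification?: string; // assumed carrier for the required justification
  value?: unknown; // assumed carrier for a corrected value
}

// Returns null when the selection satisfies the chosen option's requirements,
// otherwise a hypothetical error code.
function checkSelection(
  options: HumanDecisionOption[],
  selection: Selection,
): string | null {
  const option = options.find((o) => o.key === selection.decision);
  if (option === undefined) return "unknown_option";
  if (option.requires_justification && (selection.justification ?? "").trim() === "") {
    return "justification_required";
  }
  if (option.requires_value === true && selection.value === undefined) {
    return "value_required";
  }
  return null;
}
```

Rejecting keys that are not in the request's `options` list keeps a reviewer from submitting a decision that the request never offered.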

Timeout and Escalation

Human review has strict timing requirements:

interface TimeoutPolicy {
  // Base timeouts by risk tier
  timeouts_by_tier: {
    standard: number; // Default: 24 hours (86400s)
    elevated: number; // Default: 4 hours (14400s)
    critical: number; // Default: 1 hour (3600s)
    emergency: number; // Default: 5 minutes (300s)
  };

  // Escalation
  escalation_intervals: number[]; // Fractions of the timeout, e.g., [0.5, 0.75, 0.9]
  notification_channels: string[]; // e.g., ["email", "sms", "slack"]

  // Timeout behavior
  on_timeout: TimeoutBehavior;
}

type TimeoutBehavior =
  | "escalate" // Move to next reviewer in chain
  | "auto_conservative" // Apply most conservative option
  | "fail_closed" // BLOCKED - no authorization
  | "extend" // Extend deadline (max 1x)
  | "auto_system"; // Use system recommendation (if available)

Timeout Escalation Flow:

50% of timeout: First reminder to assigned reviewer
75% of timeout: Escalate to next in chain, notify manager
90% of timeout: Final warning, prepare fallback
100% of timeout: Execute on_timeout behavior
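
As a worked example of the flow above, the deadline and the reminder times can be derived from `created_at`, the tier's default timeout, and the escalation fractions. This is a sketch under assumptions: `buildSchedule` and `Schedule` are hypothetical names, and the defaults are the values quoted in the `TimeoutPolicy` comments.

```typescript
type RiskTier = "standard" | "elevated" | "critical" | "emergency";

// Default timeouts in seconds, taken from the TimeoutPolicy comments.
const DEFAULT_TIMEOUT_SECONDS: Record<RiskTier, number> = {
  standard: 86400,
  elevated: 14400,
  critical: 3600,
  emergency: 300,
};

// Fractions of the timeout at which the 50% / 75% / 90% steps fire.
const DEFAULT_ESCALATION_INTERVALS = [0.5, 0.75, 0.9];

interface Schedule {
  deadline: string; // RFC 3339, 100% of timeout
  checkpoints: string[]; // RFC 3339, one per escalation fraction
}

function buildSchedule(
  createdAt: string,
  tier: RiskTier,
  intervals: number[] = DEFAULT_ESCALATION_INTERVALS,
): Schedule {
  const start = Date.parse(createdAt);
  if (Number.isNaN(start)) {
    throw new Error(`Invalid timestamp: ${createdAt}`);
  }
  const timeoutMs = DEFAULT_TIMEOUT_SECONDS[tier] * 1000;
  // Round to whole milliseconds to absorb floating-point error in fractions.
  const at = (fraction: number): string =>
    new Date(start + Math.round(timeoutMs * fraction)).toISOString();

  return { deadline: at(1), checkpoints: intervals.map((f) => at(f)) };
}
```

For a `critical` request created at 00:00 UTC, this gives reminders at 00:30, 00:45, and 00:54, with the `on_timeout` behavior executing at 01:00.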

Default Timeout Behaviors by Domain:

Domain	Default `on_timeout`
`medicine`	`fail_closed`
`law`	`fail_closed`
`finance`	`auto_conservative`
`engineering`	`fail_closed`
`nutrition`	`auto_conservative`
`general`	`escalate`
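
The table above translates directly into a lookup. In the sketch below, `defaultOnTimeout` is a hypothetical helper, and falling back to `fail_closed` for a domain that is not listed is an assumption chosen to match the fail-closed intent of this RFC; the table itself does not say what an unlisted domain should do.

```typescript
type TimeoutBehavior =
  | "escalate"
  | "auto_conservative"
  | "fail_closed"
  | "extend"
  | "auto_system";

// Defaults copied from the table above.
const DEFAULT_ON_TIMEOUT = new Map<string, TimeoutBehavior>([
  ["medicine", "fail_closed"],
  ["law", "fail_closed"],
  ["finance", "auto_conservative"],
  ["engineering", "fail_closed"],
  ["nutrition", "auto_conservative"],
  ["general", "escalate"],
]);

function defaultOnTimeout(domain: string): TimeoutBehavior {
  // Assumption: an unlisted domain is treated as fail_closed.
  return DEFAULT_ON_TIMEOUT.get(domain) ?? "fail_closed";
}
```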

Reviewer Roles and Authority

Not all reviewers can make all decisions:

interface ReviewerRole {
  role_id: string;
  role_name: string;

  // Permissions
  can_review_domains: string[]; // Domains this role can review
  can_override: boolean; // Can override system recommendations
  can_approve_actions: boolean; // Can approve high-stakes actions
  max_risk_tier: "standard" | "elevated" | "critical" | "emergency";

  // Constraints
  requires_credentials: string[]; // Required certifications (e.g., "medical_license")
  requires_training: string[]; // Required training courses
  review_limit_per_day?: number; // Prevent reviewer fatigue
}

Standard Roles:

Role	Domains	Override	Max Tier	Notes
`general_reviewer`	`general`, `nutrition`	No	standard	Basic triage
`compliance_officer`	All	Yes	elevated	Policy enforcement
`medical_reviewer`	`medicine`	Yes	critical	Requires medical license
`legal_reviewer`	`law`	Yes	critical	Requires bar admission
`crisis_responder`	All	Yes	emergency	24/7 availability required
`super_admin`	All	Yes	emergency	Full authority, audit-logged
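
An authority check against the standard roles could be sketched as follows. The ordering of risk tiers (standard, elevated, critical, emergency) is assumed from the order in which they are declared, the `"*"` entry is a hypothetical encoding of "All" in the table, and `RoleAuthority`, `roleById`, and `canReview` are illustrative names covering only the fields the check needs.

```typescript
type RiskTier = "standard" | "elevated" | "critical" | "emergency";

// Assumed ordering, lowest to highest.
const TIER_ORDER: RiskTier[] = ["standard", "elevated", "critical", "emergency"];

// Subset of ReviewerRole used by the authority check.
interface RoleAuthority {
  role_id: string;
  can_review_domains: string[]; // "*" is a hypothetical wildcard for "All"
  can_override: boolean;
  max_risk_tier: RiskTier;
}

// Values copied from the Standard Roles table.
const STANDARD_ROLES: RoleAuthority[] = [
  { role_id: "general_reviewer", can_review_domains: ["general", "nutrition"], can_override: false, max_risk_tier: "standard" },
  { role_id: "compliance_officer", can_review_domains: ["*"], can_override: true, max_risk_tier: "elevated" },
  { role_id: "medical_reviewer", can_review_domains: ["medicine"], can_override: true, max_risk_tier: "critical" },
  { role_id: "legal_reviewer", can_review_domains: ["law"], can_override: true, max_risk_tier: "critical" },
  { role_id: "crisis_responder", can_review_domains: ["*"], can_override: true, max_risk_tier: "emergency" },
  { role_id: "super_admin", can_review_domains: ["*"], can_override: true, max_risk_tier: "emergency" },
];

function roleById(roleId: string): RoleAuthority {
  const role = STANDARD_ROLES.find((r) => r.role_id === roleId);
  if (role === undefined) throw new Error(`Unknown role: ${roleId}`);
  return role;
}

// True only when domain, tier, and (if needed) override authority all match.
function canReview(
  role: RoleAuthority,
  domain: string,
  tier: RiskTier,
  needsOverride = false,
): boolean {
  const domainOk =
    role.can_review_domains.includes("*") || role.can_review_domains.includes(domain);
  const tierOk = TIER_ORDER.indexOf(tier) <= TIER_ORDER.indexOf(role.max_risk_tier);
  const overrideOk = !needsOverride || role.can_override;
  return domainOk && tierOk && overrideOk;
}
```

When `canReview` returns false, the request is escalated to the next reviewer rather than approved.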

Audit Requirements

All human review activity MUST be logged:

interface HumanReviewAuditRecord {
  // Event
  event_type:
    | "request_created"
    | "assigned"
    | "viewed"
    | "decided"
    | "escalated"
    | "timeout";
  timestamp: string;

  // Links
  request_id: string;
  execution_id: string;
  reviewer_id?: string;

  // Content
  event_details: Record<string, unknown>;

  // Evidence integrity
  evidence_hash_at_event: string;

  // Compliance
  retention_required_until: string; // Based on domain retention policy
}

Audit Invariants:

Request creation is logged: Every HumanReviewRequest creates an audit record
Evidence access is logged: Every reviewer view of the evidence creates an audit record
Decision is immutable: Once submitted, decisions cannot be modified
Timeout is logged: Every timeout event is recorded together with the fallback action taken
Retention per domain: Medical 7 years, Finance 5 years, Legal 10 years, General 1 year
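
The `retention_required_until` field can be derived from the event timestamp and the retention period. The sketch below assumes that the Medical, Finance, Legal, and General periods correspond to the `medicine`, `finance`, `law`, and `general` domain identifiers used elsewhere in this RFC; that mapping and the function name `retentionRequiredUntil` are assumptions. No period is given for `engineering` or `nutrition`, so the sketch refuses to guess and throws for any unlisted domain.

```typescript
// Retention in years; the mapping to domain identifiers is an assumption.
const RETENTION_YEARS = new Map<string, number>([
  ["medicine", 7],
  ["finance", 5],
  ["law", 10],
  ["general", 1],
]);

function retentionRequiredUntil(timestamp: string, domain: string): string {
  const years = RETENTION_YEARS.get(domain);
  if (years === undefined) {
    throw new Error(`No retention period specified for domain: ${domain}`);
  }
  const date = new Date(timestamp);
  if (Number.isNaN(date.getTime())) {
    throw new Error(`Invalid timestamp: ${timestamp}`);
  }
  // Calendar-year arithmetic in UTC. A Feb 29 start rolls over to Mar 1
  // when the target year is not a leap year.
  date.setUTCFullYear(date.getUTCFullYear() + years);
  return date.toISOString();
}
```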

Integration with Other RFCs

RFC	Integration Point
RFC-0002	`require_human` conflict strategy triggers HumanReviewRequest
RFC-0005	`REQUIRES_CONFIRMATION` for inferred values can route to human review
RFC-0006	State negotiation may escalate to human when ambiguous
RFC-0009	EvaluationEnvelope can have `status: "pending_human_review"`
RFC-0010	`RefusalReason.code: "PENDING_HUMAN_REVIEW"` for in-progress reviews
RFC-0011	`EMERGENCY_ESCALATION` triggers `emergency` risk tier review
Appendix-Audit	`workflow.human_review.queued` and `workflow.human_review.completed` events

Acceptance Criteria

A system is compliant with RFC-0016 if:

Human review triggers are deterministic and documented
Evidence packages contain all required fields
Decisions include rationale and attestation
Timeout behavior is fail-closed for high-stakes domains
Reviewer roles have explicit authority boundaries
All human review activity is auditable
Override decisions are separately logged and justified

Fail-Closed Requirements

Human review MUST fail closed:

Scenario	Behavior
No reviewer available	BLOCKED (not auto-approve)
Timeout without decision	`on_timeout` per domain (fail_closed for high-stakes)
Reviewer lacks authority	Escalate, do not auto-approve
Evidence tampering detected	BLOCKED + security alert
Reviewer attestation missing	Decision rejected, re-review required

Prohibited: Any configuration that auto-approves on human review timeout in medicine, law, finance, or engineering domains.
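
One way to enforce this prohibition is to validate timeout configuration at load time. The sketch below rests on a stated assumption: of the `TimeoutBehavior` values, only `auto_system` is treated as able to auto-approve, because it applies the system recommendation without a human decision. `auto_conservative` is not treated as auto-approval here, since it is the documented default for `finance`. The function name `validateTimeoutConfig` and the message format are hypothetical.

```typescript
type TimeoutBehavior =
  | "escalate"
  | "auto_conservative"
  | "fail_closed"
  | "extend"
  | "auto_system";

// Domains named in the prohibition above.
const NO_AUTO_APPROVE_DOMAINS = new Set<string>([
  "medicine",
  "law",
  "finance",
  "engineering",
]);

// Assumption: only `auto_system` can result in approval without a human.
const MAY_AUTO_APPROVE = new Set<TimeoutBehavior>(["auto_system"]);

// Returns one message per prohibited entry; an empty list means the
// configuration passes this check.
function validateTimeoutConfig(
  onTimeoutByDomain: Record<string, TimeoutBehavior>,
): string[] {
  return Object.entries(onTimeoutByDomain)
    .filter(
      ([domain, behavior]) =>
        NO_AUTO_APPROVE_DOMAINS.has(domain) && MAY_AUTO_APPROVE.has(behavior),
    )
    .map(([domain, behavior]) => `${domain}: on_timeout "${behavior}" is prohibited`);
}
```

A deployment would refuse to start when the returned list is non-empty, which keeps a misconfiguration from silently weakening the fail-closed guarantee.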