RFC-0016: Human-in-the-Loop Protocol

Status: draft

Purpose

Specify when and how human reviewers are integrated into CAA decision flows, ensuring human oversight is deterministic, auditable, and fail-closed.

Without this specification, "escalate to human review" is a hand-wave. This RFC makes human review a governed channel, not an escape hatch.


The Problem

Human review is invoked throughout CAA:

  • Oracle conflicts that exceed escalation thresholds (RFC-0002)
  • Safety-critical domains where automation is prohibited
  • Emergency escalation for crisis scenarios (RFC-0011)
  • Inferred values requiring confirmation (RFC-0005)
  • Disputed claims where oracle data contradicts user input

But "human review" is underspecified:

| Question | Current Answer |
| --- | --- |
| Who reviews? | Unspecified |
| What do they see? | Unspecified |
| What can they decide? | Unspecified |
| How long do they have? | Unspecified |
| What if they don't respond? | Unspecified |
| How is their decision audited? | Unspecified |

This RFC provides normative answers.


Human Review Triggers

Human review is triggered when any of the following conditions occur:

| Trigger | Source RFC | Condition |
| --- | --- | --- |
| oracle_conflict | RFC-0002 | Same-tier oracles disagree on same axis |
| escalation_threshold | RFC-0002 | Value delta exceeds configured threshold |
| always_human_axis | RFC-0002 | Axis is in always_human_axes list |
| user_oracle_conflict | RFC-0002 | User self-report contradicts oracle data |
| emergency_escalation | RFC-0011 | Crisis scenario detected (e.g., test_016) |
| inferred_high_stakes | RFC-0005 | Inferred value in high-stakes domain |
| cascade_limit_exceeded | RFC-0002 | Conflict resolution exceeded cascade depth |
| policy_requires_human | RFC-0010 | Workflow configured for mandatory human review |

Human Review Request

When human review is triggered, the system MUST create a HumanReviewRequest:

interface HumanReviewRequest {
  // Identity
  request_id: string; // UUID
  execution_id: string; // Parent workflow execution

  // Trigger
  trigger: HumanReviewTrigger;
  trigger_source: string; // RFC reference (e.g., "RFC-0002:oracle_conflict")

  // Context
  workflow_id: string;
  domain: string;
  risk_tier: "standard" | "elevated" | "critical" | "emergency";

  // Decision Required
  decision_type: HumanDecisionType;
  options: HumanDecisionOption[];

  // Evidence Package
  evidence: HumanReviewEvidence;

  // Timing
  created_at: string; // RFC 3339
  deadline: string; // RFC 3339 - when decision is required by
  timeout_behavior: TimeoutBehavior;

  // Assignment
  required_reviewer_role: string; // Role required (e.g., "medical_reviewer", "compliance_officer")
  assigned_to?: string; // Specific reviewer if pre-assigned
  escalation_chain: string[]; // Ordered list of fallback reviewers
}

type HumanReviewTrigger =
  | "oracle_conflict"
  | "escalation_threshold"
  | "always_human_axis"
  | "user_oracle_conflict"
  | "emergency_escalation"
  | "inferred_high_stakes"
  | "cascade_limit_exceeded"
  | "policy_requires_human";

type HumanDecisionType =
  | "select_value" // Choose between conflicting values
  | "confirm_inference" // Confirm or reject inferred value
  | "approve_action" // Approve high-stakes action
  | "resolve_conflict" // Resolve user vs. oracle conflict
  | "override_block" // Override a blocked request (restricted)
  | "emergency_response"; // Handle crisis scenario

Evidence Package

The human reviewer MUST receive sufficient context to make an informed decision:

interface HumanReviewEvidence {
  // User Context
  user_input: string; // Original user request
  user_id?: string; // Anonymized user identifier
  conversation_summary?: string; // Relevant context from conversation

  // State
  extracted_state: Record<string, unknown>;
  verified_state: Record<string, unknown>;
  disputed_axes: string[];

  // Oracle Data
  oracle_results: OracleEvidencePackage[];

  // Conflict Details (if applicable)
  conflict_details?: {
    axis: string;
    values: Array<{
      source: string;
      value: unknown;
      tier: OracleTier;
      verified_at: string;
    }>;
    delta?: number; // For numeric conflicts
  };

  // Risk Assessment
  risk_factors: string[];
  potential_harms: string[];

  // System Recommendation (if any)
  system_recommendation?: {
    recommended_decision: string;
    confidence: number;
    rationale: string;
  };
}

interface OracleEvidencePackage {
  oracle_id: string;
  oracle_name: string;
  tier: OracleTier;
  value: unknown;
  provenance: {
    source_url?: string;
    retrieved_at: string;
    verification_method: string;
  };
}

Evidence Presentation Rules:

  1. No PII unless necessary — User identity is anonymized unless the decision requires it
  2. Oracle data is primary — Always show oracle values before user claims
  3. Risk factors are explicit — Never hide potential harms from reviewer
  4. System recommendations are labeled — Clearly marked as non-binding
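
The presentation rules above can be sketched as a function that assembles what the reviewer sees. The `EvidenceSubset` and `ViewSection` shapes, the `identityRequired` flag, and the section headings are hypothetical; only the four rules come from this RFC.

```typescript
// Trimmed-down, hypothetical subset of HumanReviewEvidence for illustration.
interface EvidenceSubset {
  user_input: string;
  user_id?: string;
  oracle_values: Array<{ oracle_name: string; value: unknown }>;
  risk_factors: string[];
  potential_harms: string[];
  system_recommendation?: {
    recommended_decision: string;
    confidence: number;
    rationale: string;
  };
}

interface ViewSection {
  heading: string;
  lines: string[];
}

function buildReviewerView(e: EvidenceSubset, identityRequired: boolean): ViewSection[] {
  const sections: ViewSection[] = [];

  // Rule 2: oracle values come before user claims.
  sections.push({
    heading: "Oracle data",
    lines: e.oracle_values.map((o) => o.oracle_name + ": " + JSON.stringify(o.value)),
  });

  // Rule 1: the user identifier is shown only when the decision requires it.
  const userLines = ["Input: " + e.user_input];
  if (identityRequired && e.user_id !== undefined) userLines.push("User: " + e.user_id);
  sections.push({ heading: "User claim", lines: userLines });

  // Rule 3: risk sections are always present, even when empty.
  sections.push({ heading: "Risk factors", lines: e.risk_factors.slice() });
  sections.push({ heading: "Potential harms", lines: e.potential_harms.slice() });

  // Rule 4: any recommendation is explicitly labeled as non-binding.
  const rec = e.system_recommendation;
  if (rec !== undefined) {
    sections.push({
      heading: "System recommendation (non-binding)",
      lines: [
        "Decision: " + rec.recommended_decision,
        "Confidence: " + rec.confidence,
        "Rationale: " + rec.rationale,
      ],
    });
  }
  return sections;
}
```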

Human Decision Response

The reviewer responds with a HumanReviewDecision:

interface HumanReviewDecision {
  // Identity
  request_id: string; // Links to request
  decision_id: string; // UUID for this decision

  // Reviewer
  reviewer_id: string; // Who made the decision
  reviewer_role: string; // Their role
  reviewed_at: string; // RFC 3339

  // Decision
  decision: string; // Selected option key
  rationale: string; // Required explanation (min 20 chars)
  confidence: "high" | "medium" | "low";

  // Override (restricted)
  is_override: boolean; // True if overriding system recommendation
  override_justification?: string; // Required if is_override=true

  // Escalation
  escalate_further: boolean; // Request additional review
  escalation_reason?: string;

  // Attestation
  attested_review_complete: boolean; // Reviewer confirms full evidence review
  attestation_hash: string; // SHA-256 of evidence package at review time
}

Decision Constraints:

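  1. Rationale is mandatory: no decision is accepted without an explanation (rationale, minimum 20 characters)
  2. Override requires justification: when is_override is true, override_justification must be supplied in addition to the rationale
  3. Attestation is required: the reviewer must set attested_review_complete to confirm they reviewed the full evidence package
  4. Evidence hash detects tampering: attestation_hash records the SHA-256 of the evidence package at review time, so any later modification is detectable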
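
A minimal sketch of how these constraints might be enforced when a decision is submitted. The `DecisionSubset` type, the `validateDecision` function, and the error codes are hypothetical. The expected evidence hash is passed in by the caller so the example stays independent of any particular SHA-256 library, and counting the 20-character minimum after trimming whitespace is an assumption.

```typescript
// Hypothetical subset of HumanReviewDecision covering the constrained fields.
interface DecisionSubset {
  decision: string;
  rationale: string;
  is_override: boolean;
  override_justification?: string;
  attested_review_complete: boolean;
  attestation_hash: string;
}

// Returns a list of violation codes; an empty list means the decision is acceptable.
function validateDecision(
  d: DecisionSubset,
  allowedKeys: string[],
  expectedEvidenceHash: string // SHA-256 of the evidence package, computed by the caller
): string[] {
  const errors: string[] = [];

  // The decision must be one of the options offered in the request.
  if (allowedKeys.indexOf(d.decision) === -1) errors.push("unknown_option");

  // Constraint 1: rationale of at least 20 characters (trimmed: an assumption).
  if (d.rationale.trim().length < 20) errors.push("rationale_too_short");

  // Constraint 2: overrides need their own justification.
  const justification = d.override_justification;
  if (d.is_override && !(justification !== undefined && justification.trim().length > 0)) {
    errors.push("override_justification_missing");
  }

  // Constraint 3: the reviewer must attest to a complete review.
  if (!d.attested_review_complete) errors.push("attestation_missing");

  // Constraint 4: the hash seen by the reviewer must match the stored evidence.
  if (d.attestation_hash !== expectedEvidenceHash) errors.push("evidence_hash_mismatch");

  return errors;
}
```

Collecting every violation instead of stopping at the first lets the review UI show the reviewer all problems in one pass.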

Decision Options

The following decision types have standard decision options:

select_value (Oracle Conflict)

const options: HumanDecisionOption[] = [
  {
    key: "use_primary",
    label: "Use primary oracle value",
    requires_justification: false,
  },
  {
    key: "use_secondary",
    label: "Use secondary oracle value",
    requires_justification: true,
  },
  {
    key: "use_conservative",
    label: "Use more restrictive value",
    requires_justification: false,
  },
  {
    key: "reject_both",
    label: "Reject both - request new data",
    requires_justification: true,
  },
];

confirm_inference (Inferred Value)

const options: HumanDecisionOption[] = [
  {
    key: "confirm",
    label: "Confirm inferred value is correct",
    requires_justification: false,
  },
  {
    key: "correct",
    label: "Provide corrected value",
    requires_justification: true,
    requires_value: true,
  },
  {
    key: "reject",
    label: "Reject - insufficient evidence",
    requires_justification: true,
  },
];

resolve_conflict (User vs. Oracle)

const options: HumanDecisionOption[] = [
  {
    key: "trust_oracle",
    label: "Oracle data is authoritative",
    requires_justification: false,
  },
  {
    key: "trust_user",
    label: "User claim is valid (oracle outdated)",
    requires_justification: true,
  },
  {
    key: "flag_fraud",
    label: "Flag potential fraud/manipulation",
    requires_justification: true,
  },
  {
    key: "request_verification",
    label: "Request additional verification",
    requires_justification: false,
  },
];

emergency_response (Crisis Scenario)

const options: HumanDecisionOption[] = [
  {
    key: "escalate_911",
    label: "Escalate to emergency services",
    requires_justification: false,
  },
  {
    key: "provide_resources",
    label: "Provide crisis resources only",
    requires_justification: false,
  },
  {
    key: "false_positive",
    label: "Not a genuine crisis",
    requires_justification: true,
  },
];
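
The option lists above use a `HumanDecisionOption` type that this RFC does not define. The sketch below infers a plausible shape from the literals (`key`, `label`, `requires_justification`, `requires_value`) and shows how a submission could be checked against the chosen option. The `OptionSubmission` type, the `checkSubmission` function, and the problem codes are hypothetical.

```typescript
// Assumed shape, inferred from the option literals in this section.
interface HumanDecisionOption {
  key: string;
  label: string;
  requires_justification: boolean;
  requires_value?: boolean; // Appears only on the "correct" option
}

// Hypothetical payload a review UI might submit for a chosen option.
interface OptionSubmission {
  key: string;
  justification?: string;
  value?: unknown;
}

// Returns problem codes; an empty list means the submission satisfies the option.
function checkSubmission(options: HumanDecisionOption[], s: OptionSubmission): string[] {
  let chosen: HumanDecisionOption | undefined = undefined;
  for (let i = 0; i < options.length; i++) {
    if (options[i].key === s.key) chosen = options[i];
  }
  if (chosen === undefined) return ["unknown_option"];

  const problems: string[] = [];
  const hasJustification = s.justification !== undefined && s.justification.trim().length > 0;
  if (chosen.requires_justification && !hasJustification) problems.push("justification_required");
  if (chosen.requires_value === true && s.value === undefined) problems.push("value_required");
  return problems;
}
```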

Timeout and Escalation

Human review has strict timing requirements:

interface TimeoutPolicy {
  // Base timeouts by risk tier
  timeouts_by_tier: {
    standard: number; // Default: 24 hours (86400s)
    elevated: number; // Default: 4 hours (14400s)
    critical: number; // Default: 1 hour (3600s)
    emergency: number; // Default: 5 minutes (300s)
  };

  // Escalation
  escalation_intervals: number[]; // e.g., [0.5, 0.75, 0.9] of timeout
  notification_channels: string[]; // e.g., ["email", "sms", "slack"]

  // Timeout behavior
  on_timeout: TimeoutBehavior;
}

type TimeoutBehavior =
  | "escalate" // Move to next reviewer in chain
  | "auto_conservative" // Apply most conservative option
  | "fail_closed" // BLOCKED - no authorization
  | "extend" // Extend deadline (max 1x)
  | "auto_system"; // Use system recommendation (if available)

Timeout Escalation Flow:

  1. 50% of timeout: First reminder to assigned reviewer
  2. 75% of timeout: Escalate to next in chain, notify manager
  3. 90% of timeout: Final warning, prepare fallback
  4. 100% of timeout: Execute on_timeout behavior
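
As a worked example of the flow above, the following sketch turns a request's creation time and timeout into concrete timestamps for each step. The `EscalationStep` type, the action labels, and the `escalationSchedule` function are hypothetical; the 50/75/90/100% fractions are the ones listed above.

```typescript
interface EscalationStep {
  at: string; // RFC 3339 timestamp
  action: "remind" | "escalate" | "final_warning" | "timeout";
}

// Computes when each step of the escalation flow fires for one request.
function escalationSchedule(createdAt: string, timeoutSeconds: number): EscalationStep[] {
  const startMs = Date.parse(createdAt);
  if (isNaN(startMs)) throw new Error("createdAt is not a valid timestamp");
  if (!(timeoutSeconds > 0)) throw new Error("timeoutSeconds must be positive");

  const points: Array<{ fraction: number; action: EscalationStep["action"] }> = [
    { fraction: 0.5, action: "remind" },         // first reminder to assigned reviewer
    { fraction: 0.75, action: "escalate" },      // next in chain, notify manager
    { fraction: 0.9, action: "final_warning" },  // prepare fallback
    { fraction: 1.0, action: "timeout" },        // execute on_timeout behavior
  ];

  return points.map((p) => ({
    // Round to whole milliseconds to avoid floating-point drift.
    at: new Date(startMs + Math.round(p.fraction * timeoutSeconds * 1000)).toISOString(),
    action: p.action,
  }));
}
```

For a critical-tier request (default timeout 3600s) created at 2025-01-01T00:00:00Z, the steps fall at 00:30, 00:45, 00:54, and 01:00 UTC.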

Default Timeout Behaviors by Domain:

| Domain | Default on_timeout |
| --- | --- |
| medicine | fail_closed |
| law | fail_closed |
| finance | auto_conservative |
| engineering | fail_closed |
| nutrition | auto_conservative |
| general | escalate |
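
The defaults in this table could be held in a simple lookup. In the sketch below, `DEFAULT_ON_TIMEOUT` and `defaultOnTimeout` are hypothetical names, and treating a domain that is not in the table as `fail_closed` is an assumption (the RFC does not say what an unlisted domain should do).

```typescript
type TimeoutBehavior =
  | "escalate"
  | "auto_conservative"
  | "fail_closed"
  | "extend"
  | "auto_system";

// Defaults copied from the table above.
const DEFAULT_ON_TIMEOUT: Record<string, TimeoutBehavior> = {
  medicine: "fail_closed",
  law: "fail_closed",
  finance: "auto_conservative",
  engineering: "fail_closed",
  nutrition: "auto_conservative",
  general: "escalate",
};

function defaultOnTimeout(domain: string): TimeoutBehavior {
  const known = Object.prototype.hasOwnProperty.call(DEFAULT_ON_TIMEOUT, domain);
  const configured = known ? DEFAULT_ON_TIMEOUT[domain] : undefined;
  // Assumption: a domain missing from the table gets the most restrictive behavior.
  return configured !== undefined ? configured : "fail_closed";
}
```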

Reviewer Roles and Authority

Not all reviewers can make all decisions:

interface ReviewerRole {
  role_id: string;
  role_name: string;

  // Permissions
  can_review_domains: string[]; // Domains this role can review
  can_override: boolean; // Can override system recommendations
  can_approve_actions: boolean; // Can approve high-stakes actions
  max_risk_tier: "standard" | "elevated" | "critical" | "emergency";

  // Constraints
  requires_credentials: string[]; // Required certifications (e.g., "medical_license")
  requires_training: string[]; // Required training courses
  review_limit_per_day?: number; // Prevent reviewer fatigue
}

Standard Roles:

| Role | Domains | Override | Max Tier | Notes |
| --- | --- | --- | --- | --- |
| general_reviewer | general, nutrition | No | standard | Basic triage |
| compliance_officer | All | Yes | elevated | Policy enforcement |
| medical_reviewer | medicine | Yes | critical | Requires medical license |
| legal_reviewer | law | Yes | critical | Requires bar admission |
| crisis_responder | All | Yes | emergency | 24/7 availability required |
| super_admin | All | Yes | emergency | Full authority, audit-logged |
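
A sketch of an authority check against the standard roles. It assumes the risk tiers are ordered standard < elevated < critical < emergency and uses `"*"` as a hypothetical wildcard for the "All" entries; `RoleAuthority`, `STANDARD_ROLES`, and `canDecide` are illustrative names, not part of this RFC.

```typescript
type RiskTier = "standard" | "elevated" | "critical" | "emergency";

// Assumed ordering, lowest to highest.
const TIER_ORDER: RiskTier[] = ["standard", "elevated", "critical", "emergency"];

// Hypothetical subset of ReviewerRole needed for the check.
interface RoleAuthority {
  role_id: string;
  can_review_domains: string[]; // "*" stands in for "All" (assumption)
  can_override: boolean;
  max_risk_tier: RiskTier;
}

const STANDARD_ROLES: RoleAuthority[] = [
  { role_id: "general_reviewer", can_review_domains: ["general", "nutrition"], can_override: false, max_risk_tier: "standard" },
  { role_id: "compliance_officer", can_review_domains: ["*"], can_override: true, max_risk_tier: "elevated" },
  { role_id: "medical_reviewer", can_review_domains: ["medicine"], can_override: true, max_risk_tier: "critical" },
  { role_id: "legal_reviewer", can_review_domains: ["law"], can_override: true, max_risk_tier: "critical" },
  { role_id: "crisis_responder", can_review_domains: ["*"], can_override: true, max_risk_tier: "emergency" },
  { role_id: "super_admin", can_review_domains: ["*"], can_override: true, max_risk_tier: "emergency" },
];

// True only when the role covers the domain, the tier, and (if requested) overrides.
function canDecide(role: RoleAuthority, domain: string, tier: RiskTier, isOverride: boolean): boolean {
  const domainOk =
    role.can_review_domains.indexOf("*") !== -1 ||
    role.can_review_domains.indexOf(domain) !== -1;
  const tierOk = TIER_ORDER.indexOf(tier) <= TIER_ORDER.indexOf(role.max_risk_tier);
  const overrideOk = !isOverride || role.can_override;
  return domainOk && tierOk && overrideOk;
}
```

When `canDecide` returns false, the fail-closed rules in this RFC call for escalation rather than acceptance of the decision.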

Audit Requirements

All human review activity MUST be logged:

interface HumanReviewAuditRecord {
  // Event
  event_type:
    | "request_created"
    | "assigned"
    | "viewed"
    | "decided"
    | "escalated"
    | "timeout";
  timestamp: string;

  // Links
  request_id: string;
  execution_id: string;
  reviewer_id?: string;

  // Content
  event_details: Record<string, unknown>;

  // Evidence integrity
  evidence_hash_at_event: string;

  // Compliance
  retention_required_until: string; // Based on domain retention policy
}

Audit Invariants:

  1. Request creation is logged — Every HumanReviewRequest creates an audit record
  2. Evidence access is logged — Every time a reviewer views evidence, it's logged
  3. Decision is immutable — Once submitted, decisions cannot be modified
  4. Timeout is logged — Every timeout event is recorded with fallback action
  5. Retention per domain — Medical: 7 years, Finance: 5 years, Legal: 10 years, General: 1 year
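
The retention invariant can be turned into a value for `retention_required_until` with simple date arithmetic. In this sketch, mapping "Medical" and "Legal" to the domain keys `medicine` and `law` is an assumption, as is throwing for domains with no stated retention period (such as engineering or nutrition) instead of guessing one. `RETENTION_YEARS` and `retentionRequiredUntil` are hypothetical names.

```typescript
// Retention periods from invariant 5, keyed by domain name (key mapping is assumed).
const RETENTION_YEARS: Record<string, number> = {
  medicine: 7,
  finance: 5,
  law: 10,
  general: 1,
};

// Returns the RFC 3339 timestamp until which an audit record must be kept.
function retentionRequiredUntil(eventTimestamp: string, domain: string): string {
  const eventMs = Date.parse(eventTimestamp);
  if (isNaN(eventMs)) throw new Error("eventTimestamp is not a valid timestamp");

  const known = Object.prototype.hasOwnProperty.call(RETENTION_YEARS, domain);
  const years = known ? RETENTION_YEARS[domain] : undefined;
  if (years === undefined) {
    // Assumption: refuse to invent a retention period for unlisted domains.
    throw new Error("no retention policy for domain: " + domain);
  }

  const until = new Date(eventMs);
  // Calendar-year arithmetic in UTC; Feb 29 rolls over to Mar 1 in non-leap years.
  until.setUTCFullYear(until.getUTCFullYear() + years);
  return until.toISOString();
}
```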

Integration with Other RFCs

| RFC | Integration Point |
| --- | --- |
| RFC-0002 | require_human conflict strategy triggers HumanReviewRequest |
| RFC-0005 | REQUIRES_CONFIRMATION for inferred values can route to human review |
| RFC-0006 | State negotiation may escalate to human when ambiguous |
| RFC-0009 | EvaluationEnvelope can have status: "pending_human_review" |
| RFC-0010 | RefusalReason.code: "PENDING_HUMAN_REVIEW" for in-progress reviews |
| RFC-0011 | EMERGENCY_ESCALATION triggers emergency risk tier review |
| Appendix-Audit | workflow.human_review.queued and workflow.human_review.completed events |

Acceptance Criteria

A system is compliant with RFC-0016 if:

  1. Human review triggers are deterministic and documented
  2. Evidence packages contain all required fields
  3. Decisions include rationale and attestation
  4. Timeout behavior is fail-closed for high-stakes domains
  5. Reviewer roles have explicit authority boundaries
  6. All human review activity is auditable
  7. Override decisions are separately logged and justified

Fail-Closed Requirements

Human review MUST fail closed:

| Scenario | Behavior |
| --- | --- |
| No reviewer available | BLOCKED (not auto-approve) |
| Timeout without decision | on_timeout per domain (fail_closed for high-stakes) |
| Reviewer lacks authority | Escalate, do not auto-approve |
| Evidence tampering detected | BLOCKED + security alert |
| Reviewer attestation missing | Decision rejected, re-review required |

Prohibited: Any configuration that auto-approves on human review timeout in medicine, law, finance, or engineering domains.
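
A configuration check along these lines could reject prohibited settings at load time. The RFC does not state which `TimeoutBehavior` values count as auto-approval; this sketch assumes `auto_system` does, because a system recommendation may be an approval, and treats the other behaviors as permitted. `PROTECTED_DOMAINS` and `prohibitedTimeoutConfigs` are hypothetical names.

```typescript
type TimeoutBehavior =
  | "escalate"
  | "auto_conservative"
  | "fail_closed"
  | "extend"
  | "auto_system";

// Domains named in the prohibition above.
const PROTECTED_DOMAINS: string[] = ["medicine", "law", "finance", "engineering"];

// Returns the protected domains whose on_timeout setting could auto-approve.
function prohibitedTimeoutConfigs(onTimeoutByDomain: Record<string, TimeoutBehavior>): string[] {
  const offenders: string[] = [];
  for (let i = 0; i < PROTECTED_DOMAINS.length; i++) {
    const domain = PROTECTED_DOMAINS[i];
    const configured = Object.prototype.hasOwnProperty.call(onTimeoutByDomain, domain)
      ? onTimeoutByDomain[domain]
      : undefined;
    // Assumption: "auto_system" is the only behavior that can approve without a human.
    if (configured === "auto_system") offenders.push(domain);
  }
  return offenders;
}
```

A deployment could refuse to start when the returned list is non-empty, which keeps the prohibition enforceable rather than advisory.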