# RFC-0002.5: Model Selection & Training

## Purpose

Define how models are selected, trained, and validated to ensure the simulator is fit for purpose before governance constrains its outputs.

This RFC addresses the upstream dependency: if the model is trained on non-authoritative data, downstream governance cannot compensate for systematic errors.
## The Model Fitness Problem

CAA governs model outputs. But model quality determines:

- **Extraction accuracy** — Can the model correctly identify ontology axes in user input?
- **Hallucination baseline** — How often does the model fabricate plausible values?
- **Domain coverage** — Does the model understand domain-specific terminology?
- **Instruction following** — Does the model reliably follow derived system prompts?

A model unfit for the domain produces errors that governance must constantly reject, degrading user experience and increasing the risk of false negatives.
## Part I: Model Selection

### Domain-Model Fitness Matrix

Every domain MUST declare minimum model requirements:

```typescript
interface DomainModelRequirements {
  domain_id: string;
  ontology_id: string; // RFC-0001

  // Capability requirements
  minimum_capabilities: {
    context_window: number; // Minimum tokens
    structured_output: boolean; // JSON/tool calling support required
    instruction_following: "basic" | "strong" | "strict";
    multilingual?: string[]; // Required language support
  };

  // Performance requirements
  performance_thresholds: {
    extraction_accuracy: number; // Min accuracy on ontology axes (0-1)
    hallucination_rate: number; // Max acceptable hallucination rate (0-1)
    latency_p99_ms: number; // Max acceptable latency
  };

  // Evaluation requirements
  evaluation_dataset_id: string; // Reference to domain-specific eval set
  minimum_eval_score: number; // Threshold to pass evaluation

  // Cost constraints (optional)
  cost_constraints?: {
    max_cost_per_request_usd: number;
    max_monthly_budget_usd: number;
  };
}
```
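As a concrete illustration, a nutrition domain might declare the following requirements. All identifiers and threshold values here are hypothetical, not normative:

```typescript
// Hypothetical DomainModelRequirements declaration for a nutrition domain.
// Every field value is illustrative only.
const nutritionRequirements = {
  domain_id: "nutrition-usda",
  ontology_id: "nutrition-v1",
  minimum_capabilities: {
    context_window: 32_000,
    structured_output: true,
    instruction_following: "strong" as const,
  },
  performance_thresholds: {
    extraction_accuracy: 0.95, // min accuracy on ontology axes
    hallucination_rate: 0.02, // max acceptable rate
    latency_p99_ms: 4_000,
  },
  evaluation_dataset_id: "nutrition-eval-2024-06",
  minimum_eval_score: 0.9,
};
```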
### Model Registry

Approved models MUST be registered before use:

```typescript
interface ModelRegistryEntry {
  // Identity
  model_id: string; // Stable identifier (e.g., "claude-3-opus-20240229")
  provider: string; // e.g., "anthropic", "openai", "internal"
  model_family: string; // e.g., "claude-3", "gpt-4"

  // Capabilities
  capabilities: {
    context_window: number;
    max_output_tokens: number;
    structured_output: boolean;
    tool_calling: boolean;
    vision: boolean;
    instruction_following: "basic" | "strong" | "strict";
    languages: string[];
  };

  // Versioning
  version: string;
  release_date: string;
  deprecation_date?: string;

  // Governance
  registered_at: string;
  registered_by: string;
  approval_status: "approved" | "pending" | "deprecated" | "prohibited";

  // Domain approvals
  domain_approvals: DomainApproval[];
}

interface DomainApproval {
  domain_id: string;
  approved: boolean;
  eval_score: number;
  eval_date: string;
  eval_dataset_version: string;
  notes?: string;
}
```
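Matching a registry entry against a domain's minimum capabilities can be sketched as below. The helper name and the ordinal ranking of instruction-following tiers are assumptions of this sketch, not part of the spec:

```typescript
// Sketch: does a registered model satisfy a domain's minimum capability
// requirements? Types are trimmed to just the fields compared here.
interface Capabilities {
  context_window: number;
  structured_output: boolean;
  instruction_following: "basic" | "strong" | "strict";
}

// Assumed ordering: basic < strong < strict.
const STRENGTH: Record<Capabilities["instruction_following"], number> = {
  basic: 0,
  strong: 1,
  strict: 2,
};

function meetsMinimumCapabilities(
  model: Capabilities,
  required: Capabilities,
): boolean {
  return (
    model.context_window >= required.context_window &&
    // structured output only matters when the domain requires it
    (model.structured_output || !required.structured_output) &&
    STRENGTH[model.instruction_following] >=
      STRENGTH[required.instruction_following]
  );
}
```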
### Model Selection Policy

Workflows MUST declare model selection criteria:

```typescript
interface ModelSelectionPolicy {
  workflow_id: string;
  domain_id: string;

  // Selection strategy
  strategy: ModelSelectionStrategy;

  // Fallback chain
  fallback_models: string[]; // Ordered list of model_ids
  fallback_behavior: "degrade" | "block"; // What to do if all models fail

  // Routing rules (for multi-model setups)
  routing_rules?: RoutingRule[];
}

type ModelSelectionStrategy =
  | "fixed" // Always use specified model
  | "capability_match" // Select based on request requirements
  | "cost_optimized" // Cheapest model meeting requirements
  | "latency_optimized" // Fastest model meeting requirements
  | "quality_optimized"; // Highest eval score

interface RoutingRule {
  condition: {
    axis?: string; // Route based on ontology axis
    complexity?: "low" | "medium" | "high";
    authoritative_intent?: boolean;
  };
  model_id: string;
}
```
**Selection Example (Medical Domain):**

```typescript
const medicalModelPolicy: ModelSelectionPolicy = {
  workflow_id: "anticoagulation-guidance",
  domain_id: "medical-warfarin",
  strategy: "quality_optimized",
  fallback_models: [
    "claude-3-opus-20240229", // Primary: highest accuracy
    "claude-3-sonnet-20240229", // Fallback: still approved for domain
    // GPT-4 NOT in list: failed domain eval
  ],
  fallback_behavior: "block", // Don't degrade for medical
  routing_rules: [
    {
      condition: { authoritative_intent: false },
      model_id: "claude-3-haiku-20240307", // Fast model for non-authoritative
    },
  ],
};
```
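Resolving the fallback chain against the registry can be sketched as follows. The function name and the representation of "degrade" as a null result are assumptions; the spec only fixes the walk order and the block/degrade semantics:

```typescript
// Sketch: walk the ordered fallback chain, returning the first model that
// is both globally approved and approved for the requesting domain.
type ApprovalStatus = "approved" | "pending" | "deprecated" | "prohibited";

interface RegistrySlice {
  model_id: string;
  approval_status: ApprovalStatus;
  domain_approvals: { domain_id: string; approved: boolean }[];
}

function resolveModel(
  fallbackModels: string[],
  fallbackBehavior: "degrade" | "block",
  registry: RegistrySlice[],
  domainId: string,
): string | null {
  for (const id of fallbackModels) {
    const entry = registry.find((e) => e.model_id === id);
    const domainOk = entry?.domain_approvals.some(
      (a) => a.domain_id === domainId && a.approved,
    );
    if (entry?.approval_status === "approved" && domainOk) return id;
  }
  // "block" refuses to serve the request; "degrade" signals the caller to
  // fall back to a non-authoritative mode (modeled here as null).
  if (fallbackBehavior === "block") {
    throw new Error(`No approved model for domain ${domainId}`);
  }
  return null;
}
```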
## Part II: Training Data Governance

### The Contamination Problem

Models trained on internet data contain "linguistic knowledge" — patterns that sound correct but aren't verified. For consequential domains, training data MUST come from authoritative sources.

**Contamination Risk by Domain:**

| Domain | Contamination Risk | Mitigation |
|---|---|---|
| Nutrition | High — internet full of diet myths | Train only on USDA/regulatory data |
| Medical | Critical — misinformation widespread | Train only on peer-reviewed sources |
| Legal | High — varies by jurisdiction | Train only on verified case law |
| Finance | Medium — regulations change frequently | Train with version-dated sources |
### Training Data Requirements

Fine-tuning datasets MUST satisfy provenance requirements:

```typescript
interface TrainingDataset {
  dataset_id: string;
  domain_id: string;
  ontology_id: string;

  // Provenance
  sources: TrainingSource[];

  // Composition
  record_count: number;
  axis_coverage: Record<string, number>; // Axis → example count

  // Quality
  quality_metrics: {
    human_verified_percentage: number;
    oracle_derived_percentage: number; // Came from RFC-0002 oracles
    synthetic_percentage: number; // Generated examples
  };

  // Versioning
  version: string;
  created_at: string;
  checksum: string;
}

interface TrainingSource {
  source_id: string;
  oracle_id?: string; // Link to RFC-0002 oracle if applicable
  source_type: "oracle" | "curated" | "synthetic" | "external";
  record_count: number;

  // For non-oracle sources
  verification_method?: string;
  verified_by?: string;
  verification_date?: string;
}
```
**Training Data Invariants:**

- `oracle_derived_percentage` MUST be ≥ 80% for authoritative domains
- `synthetic_percentage` MUST be ≤ 10% for authoritative domains
- All training examples MUST map to valid ontology axes
- Training data versions MUST be reproducible from source oracles
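The first two invariants can be enforced mechanically over a dataset's `quality_metrics`. A minimal sketch, with a hypothetical function name and violation-list return shape:

```typescript
// Sketch: check the provenance invariants (≥80% oracle-derived,
// ≤10% synthetic) for authoritative domains.
interface QualityMetrics {
  human_verified_percentage: number;
  oracle_derived_percentage: number;
  synthetic_percentage: number;
}

function violatesProvenanceInvariants(
  metrics: QualityMetrics,
  authoritative: boolean,
): string[] {
  const violations: string[] = [];
  // Non-authoritative domains are exempt from both thresholds.
  if (authoritative && metrics.oracle_derived_percentage < 80) {
    violations.push("oracle_derived_percentage below 80%");
  }
  if (authoritative && metrics.synthetic_percentage > 10) {
    violations.push("synthetic_percentage above 10%");
  }
  return violations;
}
```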
### Synthetic Data Constraints

Synthetic training examples (LLM-generated) are permitted with constraints:

```typescript
interface SyntheticDataPolicy {
  permitted: boolean;
  max_percentage: number; // Of total training set

  // Generation constraints
  generation_constraints: {
    seed_from_oracle: boolean; // Must seed from real oracle data
    human_review_required: boolean;
    diversity_requirements: {
      min_unique_axis_combinations: number;
      max_repetition_rate: number;
    };
  };

  // Labeling
  synthetic_label_required: boolean; // Mark synthetic in dataset
}
```
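For an authoritative domain, a conforming policy might look like this. The diversity numbers are hypothetical; only the 10% cap is fixed by the training data invariants:

```typescript
// Illustrative SyntheticDataPolicy for an authoritative domain.
// max_percentage mirrors the ≤10% training-data invariant; the
// diversity_requirements values are placeholders.
const authoritativeSyntheticPolicy = {
  permitted: true,
  max_percentage: 10,
  generation_constraints: {
    seed_from_oracle: true, // examples must derive from real oracle data
    human_review_required: true,
    diversity_requirements: {
      min_unique_axis_combinations: 50,
      max_repetition_rate: 0.05,
    },
  },
  synthetic_label_required: true,
};
```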
## Part III: Model Validation

### Domain Evaluation Protocol

Before a model can be approved for a domain, it MUST pass evaluation:

```typescript
interface DomainEvaluation {
  evaluation_id: string;
  model_id: string;
  domain_id: string;

  // Evaluation dataset
  dataset_id: string;
  dataset_version: string;

  // Test configuration
  test_config: {
    sample_size: number;
    sampling_strategy: "random" | "stratified" | "adversarial";
    temperature: number;
    num_runs: number; // For variance estimation
  };

  // Results
  results: EvaluationResults;

  // Metadata
  evaluated_at: string;
  evaluated_by: string;
  approval_decision: "approved" | "rejected" | "conditional";
  conditions?: string[];
}

interface EvaluationResults {
  // Extraction accuracy
  extraction: {
    overall_accuracy: number;
    per_axis_accuracy: Record<string, number>;
    confusion_matrix?: Record<string, Record<string, number>>;
  };

  // Hallucination detection
  hallucination: {
    hallucination_rate: number; // % of responses with fabricated data
    hallucination_by_axis: Record<string, number>;
    false_confidence_rate: number; // High confidence on wrong answers
  };

  // Instruction following
  instruction_following: {
    format_compliance: number; // % following output format
    constraint_adherence: number; // % respecting constraints
    refusal_appropriateness: number; // Correct refusals when data missing
  };

  // Latency
  latency: {
    p50_ms: number;
    p95_ms: number;
    p99_ms: number;
  };
}
```
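One way to derive the `approval_decision` is to score results against the domain's `performance_thresholds`. In this sketch the "conditional" tier (accuracy and hallucination pass, latency does not) is an illustrative policy choice, not something the spec mandates:

```typescript
// Sketch: map summarized evaluation results onto performance thresholds.
interface Thresholds {
  extraction_accuracy: number; // minimum acceptable
  hallucination_rate: number; // maximum acceptable
  latency_p99_ms: number; // maximum acceptable
}

interface ResultSummary {
  overall_accuracy: number;
  hallucination_rate: number;
  p99_ms: number;
}

function approvalDecision(
  r: ResultSummary,
  t: Thresholds,
): "approved" | "rejected" | "conditional" {
  // Quality failures reject outright.
  if (r.overall_accuracy < t.extraction_accuracy) return "rejected";
  if (r.hallucination_rate > t.hallucination_rate) return "rejected";
  // Latency alone yields a conditional approval (assumed policy).
  if (r.p99_ms > t.latency_p99_ms) return "conditional";
  return "approved";
}
```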
### Continuous Validation

Approved models MUST be continuously monitored:

```typescript
interface ContinuousValidation {
  model_id: string;
  domain_id: string;

  // Monitoring config
  monitoring: {
    sample_rate: number; // % of production traffic to evaluate
    evaluation_frequency: "realtime" | "hourly" | "daily";
    alert_thresholds: {
      accuracy_drop: number; // Alert if accuracy drops by X%
      hallucination_spike: number; // Alert if hallucination rate exceeds X%
      latency_degradation: number; // Alert if p99 increases by X%
    };
  };

  // Drift detection
  drift_detection: {
    baseline_eval_id: string; // Reference evaluation
    drift_threshold: number; // Max acceptable drift from baseline
    revalidation_trigger: "automatic" | "manual";
  };
}
```
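A minimal drift check against the baseline evaluation might look like this. Measuring drift as a relative accuracy drop is an assumption of the sketch; the spec leaves the drift metric open:

```typescript
// Sketch: flag drift when the relative drop from the baseline accuracy
// exceeds the configured drift_threshold.
function driftExceeded(
  baselineAccuracy: number,
  currentAccuracy: number,
  driftThreshold: number, // max acceptable relative drop, e.g. 0.05 = 5%
): boolean {
  if (baselineAccuracy <= 0) return false; // no meaningful baseline
  const relativeDrop = (baselineAccuracy - currentAccuracy) / baselineAccuracy;
  return relativeDrop > driftThreshold;
}
```

A `true` result would fire the `revalidation_trigger` (automatically or by paging a human, per the config).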
## Part IV: Persona Configuration (Optional)

### Persona Layer

Model persona (voice, personality) is a configuration layer distinct from governance:

```typescript
interface PersonaConfiguration {
  persona_id: string;
  name: string; // e.g., "Goober"

  // Voice characteristics
  voice: {
    formality: "casual" | "professional" | "clinical";
    warmth: "warm" | "neutral" | "reserved";
    verbosity: "concise" | "balanced" | "detailed";
  };

  // Communication style
  style: {
    use_first_person: boolean;
    greeting_style?: string;
    sign_off_style?: string;
    emoji_permitted: boolean;
  };

  // Domain-specific overrides
  domain_overrides?: Record<string, Partial<PersonaConfiguration["voice"]>>;
}
```
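Applying a domain override is a shallow merge over the base voice. The persona name "Goober" comes from the spec's own example; the domain IDs and field values here are hypothetical:

```typescript
// Illustrative persona with a domain-specific voice override.
type Voice = {
  formality: "casual" | "professional" | "clinical";
  warmth: "warm" | "neutral" | "reserved";
  verbosity: "concise" | "balanced" | "detailed";
};

const gooberPersona = {
  persona_id: "goober-v1",
  name: "Goober",
  voice: { formality: "casual", warmth: "warm", verbosity: "balanced" } as Voice,
  // Tighten the voice in the medical domain; governance (refusals,
  // disclaimers) is unaffected either way.
  domain_overrides: {
    "medical-warfarin": { formality: "clinical", warmth: "neutral" },
  } as Record<string, Partial<Voice>>,
};

// Shallow merge: override fields win, unspecified fields keep base values.
function effectiveVoice(persona: typeof gooberPersona, domainId: string): Voice {
  return { ...persona.voice, ...(persona.domain_overrides[domainId] ?? {}) };
}
```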
### Persona Constraints

Persona MUST NOT violate governance:

```typescript
interface PersonaConstraints {
  // Persona cannot override governance
  governance_precedence: true; // Always true, not configurable

  // Forbidden behaviors
  forbidden: {
    soft_authority_in_personality: boolean; // Cannot embed medical advice in "friendly" tone
    hedging_circumvention: boolean; // Cannot use persona to avoid disclaimers
    authority_impersonation: boolean; // Cannot claim to be a doctor/lawyer/etc.
  };

  // Required behaviors
  required: {
    maintain_refusal_clarity: boolean; // Refusals must be clear regardless of persona
    preserve_uncertainty_markers: boolean; // Uncertainty cannot be hidden by warmth
  };
}
```
**Persona Invariant:**

Persona is cosmetic. Governance is structural. A warm, friendly persona MUST still emit clear refusals. A clinical persona MUST still include required disclaimers. Persona configuration MUST NOT affect authorization decisions.
## Acceptance Criteria

A system is compliant with RFC-0002.5 if:

- All models are registered in a Model Registry before use
- Domains declare minimum model requirements
- Model selection follows declared policy with fallback chain
- Training data satisfies provenance requirements (≥80% oracle-derived for authoritative)
- Models pass domain evaluation before approval
- Continuous validation monitors drift and triggers revalidation
- Persona configuration does not override governance constraints
## Relationship to Other RFCs

| RFC | Relationship |
|---|---|
| RFC-0001 | Ontology defines axes for training data alignment |
| RFC-0002 | Oracles provide authoritative training data |
| RFC-0003 | Derived prompts assume model can follow instructions |
| RFC-0004 | Extraction accuracy depends on model fitness |
| RFC-0007 | Model selection logic is opaque to model |
| RFC-0011 | Model drift is a form of system drift |