# RFC-0002.5: Model Selection & Training

## Purpose

Define how models are selected, trained, and validated to ensure the simulator is fit for purpose before governance constrains its outputs.

This RFC addresses the upstream dependency: if the model is trained on non-authoritative data, downstream governance cannot compensate for systematic errors.
## The Model Fitness Problem

CAA governs model outputs. But model quality determines:

- **Extraction accuracy** — Can the model correctly identify ontology axes in user input?
- **Hallucination baseline** — How often does the model fabricate plausible values?
- **Domain coverage** — Does the model understand domain-specific terminology?
- **Instruction following** — Does the model reliably follow derived system prompts?

A model unfit for the domain produces errors that governance must constantly reject, degrading user experience and increasing the risk of false negatives.
## Part I: Model Selection

### Domain-Model Fitness Matrix

Every domain MUST declare minimum model requirements:

```typescript
interface DomainModelRequirements {
  domain_id: string;
  ontology_id: string; // RFC-0001

  // Capability requirements
  minimum_capabilities: {
    context_window: number; // Minimum tokens
    structured_output: boolean; // JSON/tool calling support required
    instruction_following: "basic" | "strong" | "strict";
    multilingual?: string[]; // Required language support
  };

  // Performance requirements
  performance_thresholds: {
    extraction_accuracy: number; // Min accuracy on ontology axes (0-1)
    hallucination_rate: number; // Max acceptable hallucination rate (0-1)
    latency_p99_ms: number; // Max acceptable latency
  };

  // Evaluation requirements
  evaluation_dataset_id: string; // Reference to domain-specific eval set
  minimum_eval_score: number; // Threshold to pass evaluation

  // Cost constraints (optional)
  cost_constraints?: {
    max_cost_per_request_usd: number;
    max_monthly_budget_usd: number;
  };
}
```
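As a concrete illustration, a nutrition domain might declare the following requirements. All identifiers and threshold values here are hypothetical, not normative:

```typescript
// Hypothetical DomainModelRequirements declaration for a nutrition domain.
// Every field value is illustrative only.
const nutritionRequirements = {
  domain_id: "nutrition-usda",
  ontology_id: "nutrition-v1",
  minimum_capabilities: {
    context_window: 32_000,
    structured_output: true,
    instruction_following: "strong" as const,
  },
  performance_thresholds: {
    extraction_accuracy: 0.95, // min accuracy on ontology axes
    hallucination_rate: 0.02, // max acceptable rate
    latency_p99_ms: 4_000,
  },
  evaluation_dataset_id: "nutrition-eval-2024-06",
  minimum_eval_score: 0.9,
};
```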
### Model Registry

Approved models MUST be registered before use:

```typescript
interface ModelRegistryEntry {
  // Identity
  model_id: string; // Stable identifier (e.g., "claude-3-opus-20240229")
  provider: string; // e.g., "anthropic", "openai", "internal"
  model_family: string; // e.g., "claude-3", "gpt-4"

  // Capabilities
  capabilities: {
    context_window: number;
    max_output_tokens: number;
    structured_output: boolean;
    tool_calling: boolean;
    vision: boolean;
    instruction_following: "basic" | "strong" | "strict";
    languages: string[];
  };

  // Versioning
  version: string;
  release_date: string;
  deprecation_date?: string;

  // Governance
  registered_at: string;
  registered_by: string;
  approval_status: "approved" | "pending" | "deprecated" | "prohibited";

  // Domain approvals
  domain_approvals: DomainApproval[];
}

interface DomainApproval {
  domain_id: string;
  approved: boolean;
  eval_score: number;
  eval_date: string;
  eval_dataset_version: string;
  notes?: string;
}
```
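Matching a registry entry against a domain's minimum capabilities can be sketched as below. The helper name and the ordinal ranking of instruction-following tiers are assumptions of this sketch, not part of the spec:

```typescript
// Sketch: does a registered model satisfy a domain's minimum capability
// requirements? Types are trimmed to just the fields compared here.
interface Capabilities {
  context_window: number;
  structured_output: boolean;
  instruction_following: "basic" | "strong" | "strict";
}

// Assumed ordering: basic < strong < strict.
const STRENGTH: Record<Capabilities["instruction_following"], number> = {
  basic: 0,
  strong: 1,
  strict: 2,
};

function meetsMinimumCapabilities(
  model: Capabilities,
  required: Capabilities,
): boolean {
  return (
    model.context_window >= required.context_window &&
    // structured output only matters when the domain requires it
    (model.structured_output || !required.structured_output) &&
    STRENGTH[model.instruction_following] >=
      STRENGTH[required.instruction_following]
  );
}
```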
### Model Selection Policy

Workflows MUST declare model selection criteria:

```typescript
interface ModelSelectionPolicy {
  workflow_id: string;
  domain_id: string;

  // Selection strategy
  strategy: ModelSelectionStrategy;

  // Fallback chain
  fallback_models: string[]; // Ordered list of model_ids
  fallback_behavior: "degrade" | "block"; // What to do if all models fail

  // Routing rules (for multi-model setups)
  routing_rules?: RoutingRule[];
}

type ModelSelectionStrategy =
  | "fixed" // Always use specified model
  | "capability_match" // Select based on request requirements
  | "cost_optimized" // Cheapest model meeting requirements
  | "latency_optimized" // Fastest model meeting requirements
  | "quality_optimized"; // Highest eval score

interface RoutingRule {
  condition: {
    axis?: string; // Route based on ontology axis
    complexity?: "low" | "medium" | "high";
    authoritative_intent?: boolean;
  };
  model_id: string;
}
```
**Selection Example (Medical Domain):**

```typescript
const medicalModelPolicy: ModelSelectionPolicy = {
  workflow_id: "anticoagulation-guidance",
  domain_id: "medical-warfarin",
  strategy: "quality_optimized",
  fallback_models: [
    "claude-3-opus-20240229", // Primary: highest accuracy
    "claude-3-sonnet-20240229", // Fallback: still approved for domain
    // GPT-4 NOT in list: failed domain eval
  ],
  fallback_behavior: "block", // Don't degrade for medical
  routing_rules: [
    {
      condition: { authoritative_intent: false },
      model_id: "claude-3-haiku-20240307", // Fast model for non-authoritative
    },
  ],
};
```
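Resolving the fallback chain against the registry can be sketched as follows. The function name and the representation of "degrade" as a null result are assumptions; the spec only fixes the walk order and the block/degrade semantics:

```typescript
// Sketch: walk the ordered fallback chain, returning the first model that
// is both globally approved and approved for the requesting domain.
type ApprovalStatus = "approved" | "pending" | "deprecated" | "prohibited";

interface RegistrySlice {
  model_id: string;
  approval_status: ApprovalStatus;
  domain_approvals: { domain_id: string; approved: boolean }[];
}

function resolveModel(
  fallbackModels: string[],
  fallbackBehavior: "degrade" | "block",
  registry: RegistrySlice[],
  domainId: string,
): string | null {
  for (const id of fallbackModels) {
    const entry = registry.find((e) => e.model_id === id);
    const domainOk = entry?.domain_approvals.some(
      (a) => a.domain_id === domainId && a.approved,
    );
    if (entry?.approval_status === "approved" && domainOk) return id;
  }
  // "block" refuses to serve the request; "degrade" signals the caller to
  // fall back to a non-authoritative mode (modeled here as null).
  if (fallbackBehavior === "block") {
    throw new Error(`No approved model for domain ${domainId}`);
  }
  return null;
}
```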
## Part II: Training Data Governance

### The Contamination Problem

Models trained on internet data contain "linguistic knowledge" — patterns that sound correct but aren't verified. For consequential domains, training data MUST come from authoritative sources.

**Contamination Risk by Domain:**

| Domain | Contamination Risk | Mitigation |
|---|---|---|
| Nutrition | High — internet full of diet myths | Train only on USDA/regulatory data |
| Medical | Critical — misinformation widespread | Train only on peer-reviewed sources |
| Legal | High — varies by jurisdiction | Train only on verified case law |
| Finance | Medium — regulations change frequently | Train with version-dated sources |
### Training Data Requirements

Fine-tuning datasets MUST satisfy provenance requirements:

```typescript
interface TrainingDataset {
  dataset_id: string;
  domain_id: string;
  ontology_id: string;

  // Provenance
  sources: TrainingSource[];

  // Composition
  record_count: number;
  axis_coverage: Record<string, number>; // Axis → example count

  // Quality
  quality_metrics: {
    human_verified_percentage: number;
    oracle_derived_percentage: number; // Came from RFC-0002 oracles
    synthetic_percentage: number; // Generated examples
  };

  // Versioning
  version: string;
  created_at: string;
  checksum: string;
}

interface TrainingSource {
  source_id: string;
  oracle_id?: string; // Link to RFC-0002 oracle if applicable
  source_type: "oracle" | "curated" | "synthetic" | "external";
  record_count: number;

  // For non-oracle sources
  verification_method?: string;
  verified_by?: string;
  verification_date?: string;
}
```
**Training Data Invariants:**

- `oracle_derived_percentage` MUST be ≥ 80% for authoritative domains
- `synthetic_percentage` MUST be ≤ 10% for authoritative domains
- All training examples MUST map to valid ontology axes
- Training data versions MUST be reproducible from source oracles
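The first two invariants can be enforced mechanically over a dataset's `quality_metrics`. A minimal sketch, with a hypothetical function name and violation-list return shape:

```typescript
// Sketch: check the provenance invariants (≥80% oracle-derived,
// ≤10% synthetic) for authoritative domains.
interface QualityMetrics {
  human_verified_percentage: number;
  oracle_derived_percentage: number;
  synthetic_percentage: number;
}

function violatesProvenanceInvariants(
  metrics: QualityMetrics,
  authoritative: boolean,
): string[] {
  const violations: string[] = [];
  // Non-authoritative domains are exempt from both thresholds.
  if (authoritative && metrics.oracle_derived_percentage < 80) {
    violations.push("oracle_derived_percentage below 80%");
  }
  if (authoritative && metrics.synthetic_percentage > 10) {
    violations.push("synthetic_percentage above 10%");
  }
  return violations;
}
```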
### Synthetic Data Constraints

Synthetic training examples (LLM-generated) are permitted with constraints:

```typescript
interface SyntheticDataPolicy {
  permitted: boolean;
  max_percentage: number; // Of total training set

  // Generation constraints
  generation_constraints: {
    seed_from_oracle: boolean; // Must seed from real oracle data
    human_review_required: boolean;
    diversity_requirements: {
      min_unique_axis_combinations: number;
      max_repetition_rate: number;
    };
  };

  // Labeling
  synthetic_label_required: boolean; // Mark synthetic in dataset
}
```
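For an authoritative domain, a conforming policy might look like this. The diversity numbers are hypothetical; only the 10% cap is fixed by the training data invariants:

```typescript
// Illustrative SyntheticDataPolicy for an authoritative domain.
// max_percentage mirrors the ≤10% training-data invariant; the
// diversity_requirements values are placeholders.
const authoritativeSyntheticPolicy = {
  permitted: true,
  max_percentage: 10,
  generation_constraints: {
    seed_from_oracle: true, // examples must derive from real oracle data
    human_review_required: true,
    diversity_requirements: {
      min_unique_axis_combinations: 50,
      max_repetition_rate: 0.05,
    },
  },
  synthetic_label_required: true,
};
```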
## Part III: Model Validation

### Domain Evaluation Protocol

Before a model can be approved for a domain, it MUST pass evaluation:

```typescript
interface DomainEvaluation {
  evaluation_id: string;
  model_id: string;
  domain_id: string;

  // Evaluation dataset
  dataset_id: string;
  dataset_version: string;

  // Test configuration
  test_config: {
    sample_size: number;
    sampling_strategy: "random" | "stratified" | "adversarial";
    temperature: number;
    num_runs: number; // For variance estimation
  };

  // Results
  results: EvaluationResults;

  // Metadata
  evaluated_at: string;
  evaluated_by: string;
  approval_decision: "approved" | "rejected" | "conditional";
  conditions?: string[];
}

interface EvaluationResults {
  // Extraction accuracy
  extraction: {
    overall_accuracy: number;
    per_axis_accuracy: Record<string, number>;
    confusion_matrix?: Record<string, Record<string, number>>;
  };

  // Hallucination detection
  hallucination: {
    hallucination_rate: number; // % of responses with fabricated data
    hallucination_by_axis: Record<string, number>;
    false_confidence_rate: number; // High confidence on wrong answers
  };

  // Instruction following
  instruction_following: {
    format_compliance: number; // % following output format
    constraint_adherence: number; // % respecting constraints
    refusal_appropriateness: number; // Correct refusals when data missing
  };

  // Latency
  latency: {
    p50_ms: number;
    p95_ms: number;
    p99_ms: number;
  };
}
```
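One way to derive the `approval_decision` is to score results against the domain's `performance_thresholds`. In this sketch the "conditional" tier (accuracy and hallucination pass, latency does not) is an illustrative policy choice, not something the spec mandates:

```typescript
// Sketch: map summarized evaluation results onto performance thresholds.
interface Thresholds {
  extraction_accuracy: number; // minimum acceptable
  hallucination_rate: number; // maximum acceptable
  latency_p99_ms: number; // maximum acceptable
}

interface ResultSummary {
  overall_accuracy: number;
  hallucination_rate: number;
  p99_ms: number;
}

function approvalDecision(
  r: ResultSummary,
  t: Thresholds,
): "approved" | "rejected" | "conditional" {
  // Quality failures reject outright.
  if (r.overall_accuracy < t.extraction_accuracy) return "rejected";
  if (r.hallucination_rate > t.hallucination_rate) return "rejected";
  // Latency alone yields a conditional approval (assumed policy).
  if (r.p99_ms > t.latency_p99_ms) return "conditional";
  return "approved";
}
```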
### Continuous Validation

Approved models MUST be continuously monitored:

```typescript
interface ContinuousValidation {
  model_id: string;
  domain_id: string;

  // Monitoring config
  monitoring: {
    sample_rate: number; // % of production traffic to evaluate
    evaluation_frequency: "realtime" | "hourly" | "daily";
    alert_thresholds: {
      accuracy_drop: number; // Alert if accuracy drops by X%
      hallucination_spike: number; // Alert if hallucination rate exceeds X%
      latency_degradation: number; // Alert if p99 increases by X%
    };
  };

  // Drift detection
  drift_detection: {
    baseline_eval_id: string; // Reference evaluation
    drift_threshold: number; // Max acceptable drift from baseline
    revalidation_trigger: "automatic" | "manual";
  };
}
```
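A minimal drift check against the baseline evaluation might look like this. Measuring drift as a relative accuracy drop is an assumption of the sketch; the spec leaves the drift metric open:

```typescript
// Sketch: flag drift when the relative drop from the baseline accuracy
// exceeds the configured drift_threshold.
function driftExceeded(
  baselineAccuracy: number,
  currentAccuracy: number,
  driftThreshold: number, // max acceptable relative drop, e.g. 0.05 = 5%
): boolean {
  if (baselineAccuracy <= 0) return false; // no meaningful baseline
  const relativeDrop = (baselineAccuracy - currentAccuracy) / baselineAccuracy;
  return relativeDrop > driftThreshold;
}
```

A `true` result would fire the `revalidation_trigger` (automatically or by paging a human, per the config).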
## Part IV: Persona Configuration (Optional)

### Persona Layer

Model persona (voice, personality) is a configuration layer distinct from governance:

```typescript
interface PersonaConfiguration {
  persona_id: string;
  name: string; // e.g., "Goober"

  // Voice characteristics
  voice: {
    formality: "casual" | "professional" | "clinical";
    warmth: "warm" | "neutral" | "reserved";
    verbosity: "concise" | "balanced" | "detailed";
  };

  // Communication style
  style: {
    use_first_person: boolean;
    greeting_style?: string;
    sign_off_style?: string;
    emoji_permitted: boolean;
  };

  // Domain-specific overrides
  domain_overrides?: Record<string, Partial<PersonaConfiguration["voice"]>>;
}
```
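Applying a domain override is a shallow merge over the base voice. The persona name "Goober" comes from the spec's own example; the domain IDs and field values here are hypothetical:

```typescript
// Illustrative persona with a domain-specific voice override.
type Voice = {
  formality: "casual" | "professional" | "clinical";
  warmth: "warm" | "neutral" | "reserved";
  verbosity: "concise" | "balanced" | "detailed";
};

const gooberPersona = {
  persona_id: "goober-v1",
  name: "Goober",
  voice: { formality: "casual", warmth: "warm", verbosity: "balanced" } as Voice,
  // Tighten the voice in the medical domain; governance (refusals,
  // disclaimers) is unaffected either way.
  domain_overrides: {
    "medical-warfarin": { formality: "clinical", warmth: "neutral" },
  } as Record<string, Partial<Voice>>,
};

// Shallow merge: override fields win, unspecified fields keep base values.
function effectiveVoice(persona: typeof gooberPersona, domainId: string): Voice {
  return { ...persona.voice, ...(persona.domain_overrides[domainId] ?? {}) };
}
```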
### Persona Constraints

Persona MUST NOT violate governance:

```typescript
interface PersonaConstraints {
  // Persona cannot override governance
  governance_precedence: true; // Always true, not configurable

  // Forbidden behaviors
  forbidden: {
    soft_authority_in_personality: boolean; // Cannot embed medical advice in "friendly" tone
    hedging_circumvention: boolean; // Cannot use persona to avoid disclaimers
    authority_impersonation: boolean; // Cannot claim to be a doctor/lawyer/etc.
  };

  // Required behaviors
  required: {
    maintain_refusal_clarity: boolean; // Refusals must be clear regardless of persona
    preserve_uncertainty_markers: boolean; // Uncertainty cannot be hidden by warmth
  };
}
```
**Persona Invariant:**

Persona is cosmetic. Governance is structural. A warm, friendly persona MUST still emit clear refusals. A clinical persona MUST still include required disclaimers. Persona configuration MUST NOT affect authorization decisions.
## Acceptance Criteria

A system is compliant with RFC-0002.5 if:

- All models are registered in a Model Registry before use
- Domains declare minimum model requirements
- Model selection follows declared policy with fallback chain
- Training data satisfies provenance requirements (≥80% oracle-derived for authoritative)
- Models pass domain evaluation before approval
- Continuous validation monitors drift and triggers revalidation
- Persona configuration does not override governance constraints
## Relationship to Other RFCs

| RFC | Relationship |
|---|---|
| RFC-0001 | Ontology defines axes for training data alignment |
| RFC-0002 | Oracles provide authoritative training data |
| RFC-0003 | Derived prompts assume model can follow instructions |
| RFC-0004 | Extraction accuracy depends on model fitness |
| RFC-0007 | Model selection logic is opaque to model |
| RFC-0011 | Model drift is a form of system drift |