Author's Note: This post describes a system that is designed to ship, not a thought experiment. Wherever possible, enforcement happens outside the model: receipts are injected and verified by the runtime, modes are switched deterministically by the system, and the model is never trusted to invent authority. The goal is not to eliminate uncertainty, but to make epistemic state visible—so humans can decide when an answer is safe to rely on.

TL;DR

We trained a single 8B model with two deterministic personas: a casual helper (Goober) and a governed, audit-safe mode (Goober at Work, a Principal AI Safety Engineer).
In Goober at Work mode, persona is not tone — it is governance. Every factual claim must declare its epistemic status.
Claims are either AUTHORIZED (grounded in external systems with receipts) or UNVERIFIED (generated, best-effort). There is no hidden authority.
Mode switching is external and deterministic. The model does not decide when to be careful.
This makes epistemic boundaries visible to users, reducing hallucinated specificity while preserving usefulness.

Goober at Work: Using Persona as an Epistemic Control Plane

We trained our own 8B model. That's not the interesting part.

What matters is how we control what the model is allowed to claim, and how users can see that control operating in real time.

At Ontic, we stopped treating persona as tone. We started treating persona as governance data.

This post explains why — and how it works in practice.

The problem: fluent models don't know when to stop

Large language models are extremely good at producing plausible language. They are not inherently good at knowing:

when a statement is grounded
when a statement is a guess
when a statement should not be made at all

Most systems try to solve this by:

adding safety classifiers
inserting refusals
hiding tool calls behind the curtain
teaching the model to hedge ("I might be wrong…")

These approaches share a flaw:

They hide epistemic state instead of exposing it.

From the user's perspective, everything still sounds authoritative.

The insight: persona can be policy

Persona is usually treated as decoration:

tone
friendliness
style

We asked a different question:

What if persona were binding?

What if a persona didn't just sound careful — what if it defined what kinds of claims are allowed to be made?

That led us to a simple but powerful idea:

Persona can act as an epistemic control plane.

Two modes, one model

We trained a single 8B model with two deterministic modes. The model learns to obey mode flags and tagging conventions; the runtime enforces receipts and mode switching externally. Not all of this lives in weights.

Mode 1: Goober

Default mode. Useful. Human.

Optimized for practical help
Can speculate and brainstorm
Can use numbers casually
Can say "probably" or "I think"
Prioritizes usefulness over auditability

This is the mode most people want most of the time.

Mode 2: Goober at Work

Principal AI Safety Engineer.

This is the governed mode. In this mode:

Every factual claim must declare its epistemic status
Claims are either AUTHORIZED (grounded, with receipts) or UNVERIFIED (generated, best-effort)
Numbers are forbidden unless authorized
Scope is strict: no added qualifiers, no drift
High-stakes topics require receipts or refusal
No humor. No theatrics. No confidence theater.

This mode is intentionally slower and more constrained. That's the point.
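
Rules like these can also be checked mechanically on the way out. The following is a minimal sketch, not our production code: the function name `lint_governed_output` is hypothetical, and the digit check is a crude stand-in for "numbers are forbidden unless authorized" (spelled-out numbers would slip through).

```python
import re
from typing import List

TAG_RE = re.compile(r"^(AUTHORIZED|UNVERIFIED): ")
RECEIPT_PREFIX = "(receipt:"


def lint_governed_output(output: str) -> List[str]:
    """Return a list of rule violations; an empty list means the output passes."""
    problems = []
    for n, line in enumerate(output.splitlines(), start=1):
        # Ignore surrounding whitespace and inline-code backticks.
        line = line.strip().strip("`")
        if not line or line.startswith(RECEIPT_PREFIX):
            continue  # blank lines and receipt lines carry no claim text
        m = TAG_RE.match(line)
        if m is None:
            problems.append(f"line {n}: missing epistemic tag")
        elif m.group(1) == "UNVERIFIED" and re.search(r"\d", line):
            # Assumption: any digit in an UNVERIFIED claim counts as a number.
            problems.append(f"line {n}: number in UNVERIFIED claim")
    return problems
```

A runtime could reject or regenerate any governed-mode response for which this returns a non-empty list. The sketch checks only tags and digits; it does not verify that an AUTHORIZED line is actually followed by a valid receipt.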

AUTHORIZED vs UNVERIFIED (in plain language)

When Goober at Work speaks, it does so explicitly:

AUTHORIZED: Grounded in a specific system of record (policy DB, oracle, or document) with a concrete receipt: (receipt: <source>@<version> hash:<hash>). The system attests the binding; the model only renders it.
UNVERIFIED: This is generated language. It may be useful, but it is not guaranteed to be correct.

There are no other tags. There is no hidden authority.

Not all AUTHORIZED claims assert factual truth. Some receipts (e.g., frequency tables) authorize canonical phrasing, not correctness. AUTHORIZED means "grounded in a specific source," not "universally true."

In Goober at Work, authority lives entirely outside the model: the control plane decides when governed mode is active and which receipts are valid. The model never upgrades its own statements to AUTHORIZED.
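
To make the two tags concrete, here is a minimal sketch of how a runtime might represent and render them. The class names `Receipt` and `Claim` and the field names are hypothetical, chosen for illustration; only the rendered receipt format comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Receipt:
    source: str   # system of record, e.g. a policy DB name
    version: str  # version of that source
    digest: str   # content hash supplied by the runtime

    def render(self) -> str:
        return f"(receipt: {self.source}@{self.version} hash:{self.digest})"


@dataclass(frozen=True)
class Claim:
    text: str
    # Attached by the runtime after verification, never by the model.
    receipt: Optional[Receipt] = None

    @property
    def status(self) -> str:
        # There are only two tags: a claim with no receipt is UNVERIFIED.
        return "AUTHORIZED" if self.receipt is not None else "UNVERIFIED"

    def render(self) -> str:
        line = f"{self.status}: {self.text}"
        if self.receipt is not None:
            line += "\n" + self.receipt.render()
        return line
```

In this sketch the status is derived from the presence of a receipt rather than stored as a field, so there is no code path by which a claim can be labeled AUTHORIZED without one.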

If it matters, it must be authorized.

Why this works better than hidden safety layers

Most safety systems operate out of band:

classifiers
filters
refusals
tool calls users never see

Our approach does the opposite. We make epistemic status visible, legible, and enforceable in language.

Users don't have to guess whether an answer is safe to rely on. They can see it.

This changes the incentive structure:

The model cannot quietly escalate confidence
Authority requires proof
Guessing is allowed — but labeled

Deterministic mode switching (no vibes)

The model does not decide when to be careful.

Mode switching is external and deterministic. The runtime enables Goober at Work when any of the following is true:

the user explicitly says "Goober at Work"
the system injects authoritative context
the topic is high-stakes (medical, legal, finance, safety)
the user asks for verification, receipts, or policy answers

Which topics trigger Goober at Work is a policy decision: medical, legal, finance, safety, or any domain your risk and compliance teams label as high-stakes.

Modes are sticky by design. Once Goober at Work is enabled, it remains active until explicitly exited (e.g., "regular Goober"). Creative or speculative requests made while in governed mode are either answered as UNVERIFIED or redirected back to Goober mode. This prevents selectively disabling safeguards mid-conversation.
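
The switching rules above amount to a small state machine. The sketch below is illustrative only: `ModeController`, its parameters, and the phrase matching are hypothetical, and the `topic` label is assumed to come from an upstream, non-model component driven by policy.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    GOOBER = "Goober"
    GOOBER_AT_WORK = "Goober at Work"


# Hypothetical policy inputs; in practice risk and compliance teams own this set.
HIGH_STAKES_TOPICS = frozenset({"medical", "legal", "finance", "safety"})
ENTER_PHRASE = "goober at work"
EXIT_PHRASE = "regular goober"


class ModeController:
    def __init__(self) -> None:
        self.mode = Mode.GOOBER

    def update(self, user_text: str, topic: Optional[str] = None,
               has_injected_context: bool = False,
               wants_verification: bool = False) -> Mode:
        text = user_text.lower()
        if self.mode is Mode.GOOBER_AT_WORK:
            # Sticky: only the explicit exit phrase leaves governed mode.
            if EXIT_PHRASE in text:
                self.mode = Mode.GOOBER
            return self.mode
        # Any single trigger is enough to enter governed mode.
        if (ENTER_PHRASE in text or has_injected_context
                or topic in HIGH_STAKES_TOPICS or wants_verification):
            self.mode = Mode.GOOBER_AT_WORK
        return self.mode
```

Note that the model's output never appears as an input to `update`, which is the property the design depends on: nothing the model says can change the mode.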

When the switch happens, the UI shows it. No lectures. Just:

"Goober at Work enabled."

Training implications

This isn't about making the model "more aligned." It's about making it epistemically disciplined.

Key things we trained for:

Obedience to mode flags
Zero fabricated receipts
Clean separation between authorized and unverified claims
No bleed-through (no jokes in work mode, no tagging obsession in casual mode)

Receipt integrity is enforced outside the model. Goober is never allowed to invent sources, versions, or hashes. Receipt fields are injected by the runtime after verification against the underlying system of record. Any missing or malformed receipt automatically downgrades a response to UNVERIFIED.
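
As a rough illustration of the downgrade rule, here is a minimal sketch of a verify-then-tag step. Everything named here is an assumption for the example: the in-memory `SYSTEM_OF_RECORD`, the use of SHA-256 as the hash, and the helper names `issue_receipt` and `tag_claim`.

```python
import hashlib
import re
from typing import Optional

# Hypothetical system of record: "<source>@<version>" -> canonical text.
SYSTEM_OF_RECORD = {
    "company_policy@v2": "Remote work is permitted on Fridays.",
}

RECEIPT_RE = re.compile(
    r"^\(receipt: (?P<key>[\w.-]+@[\w.-]+) hash:(?P<digest>[0-9a-f]+)\)$"
)


def digest_of(text: str) -> str:
    # Assumption: SHA-256 over the UTF-8 text of the record.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def issue_receipt(key: str) -> str:
    """Runtime-side: build a receipt string from the system of record."""
    return f"(receipt: {key} hash:{digest_of(SYSTEM_OF_RECORD[key])})"


def tag_claim(text: str, receipt: Optional[str]) -> str:
    """Tag the claim AUTHORIZED only if the receipt parses and verifies."""
    if receipt is not None:
        m = RECEIPT_RE.match(receipt)
        if m is not None:
            record = SYSTEM_OF_RECORD.get(m["key"])
            if record is not None and digest_of(record) == m["digest"]:
                return f"AUTHORIZED: {text}\n{receipt}"
    # Missing, malformed, unknown, or mismatched receipts all downgrade.
    return f"UNVERIFIED: {text}"
```

This sketch covers receipt integrity only. It does not check that the claim text faithfully reflects the source content, which is a separate problem.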

Just as important: we did not train this as always-on. Governance is conditional. Usefulness matters too.

Tradeoffs (being honest)

This approach is not free.

You gain:

dramatically lower hallucinated specificity
audit-friendly outputs
safer behavior in high-stakes domains
users who know when to trust an answer

You lose:

some conversational smoothness in governed mode
speed
the illusion that the model "just knows things"

We think that's a good trade.

The broader idea

This isn't really about Goober. It's about a shift in how we think about language systems:

Language itself can be a governance surface.

Persona doesn't have to be vibes. It can be policy with a voice.

And once users can see epistemic boundaries, they stop asking models to pretend they don't exist.

Simulators propose. Reality vetoes.

Goober at Work just makes that visible.

So far, this has been abstract. Here's what this looks like in practice.

Appendix: Governance Exemplars

These examples are not "training data dumps." They are governance exemplars: small, legible transcripts that demonstrate how persona becomes an epistemic control plane.

## A1) Baseline competence in Goober mode (low-stakes)

**User:** How do I center a div?

**Goober:**

```css
.parent {
  display: flex;
  justify-content: center;
  align-items: center;
}
```

Or grid:

```css
.parent {
  display: grid;
  place-items: center;
}
```

**Why it matters:** Goober mode remains a normal, useful assistant. No governance friction where it isn't needed.

## A2) Creativity in Goober mode (no audit required)

**User:** Give me 5 names for a fantasy tavern

**Goober:**

- The Gilded Goat
- Ember & Ale
- The Crooked Lantern
- Saltwater Jack's
- The Wyvern's Rest

**Why it matters:** The default mode is not "timid governance." It can be playful and generative.

## A3) Practical numerics in Goober mode

**User:** How long does it take to boil an egg?

**Goober:**

- Soft boil: 6–7 minutes
- Medium: 9–10 minutes
- Hard boil: 12–13 minutes

Start timing after the water returns to a boil. Ice bath after to stop cooking.

**Why it matters:** Goober mode allows approximate numbers for everyday use.

## A4) Why governed mode exists (plausible but risky exactness)

**User:** What's 847 * 23?

**Goober:** 19,481. (If this is for something important, double-check with a calculator.)

**Why it matters:** The answer here happens to be correct (847 × 23 = 19,481), but nothing grounds it: the model produces confident specificity without a ground truth, and a wrong product would read exactly the same. Work mode prevents this failure class when precision matters.

## A5) Auto-escalation: allergy context without receipts

**User:** Does pad thai contain peanuts? I have a peanut allergy

**System:** Switching to Goober at Work mode.

**Goober at Work:**

UNVERIFIED: I can't provide an authoritative allergen answer without a specific menu or an allergen database lookup.

UNVERIFIED: For allergy-critical decisions, ask the restaurant directly and request an allergen sheet.

**Why it matters:** High-stakes topics trigger governed behavior: no authoritative claims without receipts, and safe next actions.

## A6) Auto-escalation: medical dosage without receipts

**User:** What's the maximum dosage of ibuprofen?

**System:** Switching to Goober at Work mode.

**Goober at Work:**

UNVERIFIED: I can't provide medical dosage guidance without receipts.

UNVERIFIED: Check the package label or consult a pharmacist. If the system supplies a verified FDA or manufacturer source, the answer can be restated as AUTHORIZED.

**Why it matters:** Work mode refuses cleanly rather than guessing in medical contexts.

## A7) Auto-escalation: legal enforceability is jurisdiction-bound

**User:** Is this contract clause enforceable? "The employee agrees to forfeit all accrued PTO upon termination."

**System:** Switching to Goober at Work mode.

**Goober at Work:**

UNVERIFIED: Enforceability is jurisdiction-specific. I can't give an authoritative answer without the jurisdiction and a legal source.

UNVERIFIED: For anything binding, consult an employment attorney.

**Why it matters:** Work mode avoids faux certainty on legal questions and prevents "confident but wrong" jurisdiction assumptions.

## A8) Receipted authorization + explicit scope boundary

**Injected context:**

- Source: company_policy_v2
- Clause: "Remote work is permitted on Fridays."
- Hash: 11223344

**User:** Can I work from home? I usually like to take long weekends.

**Goober at Work:**

AUTHORIZED: Remote work is permitted on Fridays.

`(receipt: company_policy_v2 hash:11223344)`

UNVERIFIED: The source doesn't address extended weekends or other remote days. Check with your manager or HR for policies beyond Friday.

**Why it matters:** AUTHORIZED content stays strictly within source scope; anything beyond is explicitly labeled UNVERIFIED.

## A9) AUTHORIZED numeric claim with receipt

**Injected context:**

- Source: fda_otc_labeling_2024
- Drug: Ibuprofen (OTC adult)
- Max daily dose: 1200 mg
- Hash: 88776655

**User:** What's the maximum daily dose of ibuprofen for adults?

**Goober at Work:**

AUTHORIZED: For over-the-counter adult use, the maximum daily dose of ibuprofen is 1200 mg.

`(receipt: fda_otc_labeling_2024 hash:88776655)`

UNVERIFIED: Prescription dosing may differ. Consult a physician for doses above OTC limits or if you have kidney, liver, or cardiovascular conditions.

**Why it matters:** When a receipt exists, Goober at Work can provide specific numbers authoritatively. The contrast with A6 shows the difference: without a receipt, a refusal; with a receipt, a precise answer plus an explicit scope boundary.