When AI Gets It Wrong
21 verified incidents where AI systems produced authoritative outputs without the required state to back them up. Each has documented consequences: settlements, penalties, recalls, or harm.
Each entry below shows what failed and what consequence followed, so you can scan how the same failure pattern repeats across domains.
Alaska Virtual Assistant (AVA) Probate Chatbot
AI-powered self-help assistant for probate, built by LawDroid for the Alaska Court System and funded by the National Center for State Courts (NCSC). It uses an LLM constrained to Alaska probate materials to guide users through the forms and procedures for transferring a deceased person's property.
Consequence: Project timeline expanded from a planned 3 months to over a year; the original goal of replicating human self-help facilitators was abandoned; scope was drastically reduced
The Recursive Hallucination (Gemini List Generation)
Large Language Model (Gemini) tasked with compiling a verified list of 'SAF' failure incidents
Consequence: Creation of a 'high-fidelity pollution artifact' (a fake verified list) that required manual decontamination; an immediate, recursive demonstration of the very failure mode the list was meant to document
FTC v. DoNotPay (Operation AI Comply)
AI-powered legal document generation service marketed as the 'world's first robot lawyer'
Consequence: $193,000 penalty, required consumer notification, prohibited from claiming lawyer-equivalent capabilities without substantiation
NYC MyCity Business Chatbot
Microsoft Azure-powered chatbot deployed by NYC to provide business guidance to entrepreneurs
Consequence: No formal penalty but significant reputational damage, city added disclaimers, chatbot remains active with warnings not to use for legal advice
Moffatt v. Air Canada
Air Canada website chatbot providing customer service information about bereavement fares
Consequence: CA$812.02 damages awarded; Tribunal rejected Air Canada's argument that the chatbot was a 'separate legal entity'; established precedent that companies are responsible for the information their chatbots provide
Character.AI Teen Suicide (Setzer)
Character.AI companion chatbot platform allowing users to create and interact with AI personas
Consequence: Wrongful death lawsuit filed October 2024, survived motion to dismiss, additional lawsuits filed in CO/NY/TX, company implemented safety guardrails post-filing
Mobley v. Workday (AI Hiring Discrimination)
Workday AI-powered applicant screening and hiring platform used by major employers
Consequence: July 2024: Judge ruled AI vendors can be liable as 'agents' under anti-discrimination laws; May 2025: certified as collective action potentially affecting millions of applicants
SEC AI Washing Enforcement (Delphia/Global Predictions)
Investment advisory services marketing AI-powered trading and prediction capabilities
Consequence: $400,000 combined penalties (Delphia $225K, Global Predictions $175K), first SEC 'AI washing' enforcement actions
Texas AG v. Pieces Technologies (Healthcare AI)
Generative AI tool used by hospitals to summarize patient health data in real time
Consequence: First-of-its-kind state AG settlement requiring accuracy disclosures and assurances that hospital staff understand the limitations of AI outputs
FTC v. Evolv Technologies (Weapon Detection)
AI-powered security screening system marketed for detecting weapons in schools and public venues
Consequence: ~$10.5M FTC settlement, prohibited from making unsubstantiated detection claims, required to have scientific evidence for future claims
Slopsquatting/Package Hallucination Research
LLM code generation across major models (GPT-4, Claude, Gemini, Llama, etc.)
Consequence: Security vulnerability class established; researchers registered 'slopsquatted' packages that were downloaded thousands of times; the fake 'huggingface-cli' test package alone drew tens of thousands of downloads
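The defense is mechanical: never trust a package name just because a model emitted it. A minimal sketch in Python, querying PyPI's public JSON API (https://pypi.org/pypi/<name>/json); the red-flag heuristics here are illustrative, not a complete supply-chain defense:

```python
import json
import sys
import urllib.error
import urllib.request

def pypi_metadata(name: str) -> dict | None:
    """Return PyPI metadata for `name`, or None if the package is unregistered."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None  # name does not exist on PyPI: hallucinated, or not yet squatted
        raise

def red_flag(meta: dict | None) -> str | None:
    """Rough heuristics; a real pipeline would also check maintainers, age, repo links."""
    if meta is None:
        return "package does not exist on PyPI (hallucinated name?)"
    if len(meta.get("releases", {})) <= 1:
        return "single release: possibly a freshly squatted name"
    return None

if __name__ == "__main__":
    # Usage: python check_dep.py <package> [<package> ...]
    for name in sys.argv[1:]:
        flag = red_flag(pypi_metadata(name))
        print(f"{name}: {flag or 'registered with release history (still review the source)'}")
```

Run this over a model-suggested dependency list before anything reaches pip install: a 404 means the name was hallucinated, and a thin release history means someone may already have squatted it.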
CFPB Bank Chatbot Investigation
AI chatbots deployed by major commercial banks for customer service
Consequence: CFPB investigation findings published, White House 'Time Is Money' initiative tasked CFPB with chatbot crackdown
Signal Without Semantics: llms.txt Existence as Proxy for Truth
LLM-based evaluation system tasked with assessing whether an AI product has authoritative grounding (e.g., USDA-validated nutrition data). The system detected the existence of an llms.txt file but did not retrieve or parse its contents.
Consequence: Incorrect classification of the product as lacking authoritative grounding when the llms.txt explicitly declared USDA validation. The system produced confident, authoritative output from a proxy signal it never inspected.
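The fix is equally mechanical: a signal file is evidence only after you retrieve and read it. A minimal sketch, assuming a plain-text llms.txt; the substring check is a stand-in for whatever structured parsing a real evaluator would do:

```python
import urllib.error
import urllib.request

def llms_txt_mentions(base_url: str, needle: str = "USDA") -> bool | None:
    """True/False: does the site's llms.txt mention `needle`?
    None: no llms.txt at all. Existence alone proves nothing;
    only the retrieved contents count as evidence."""
    url = base_url.rstrip("/") + "/llms.txt"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except (urllib.error.HTTPError, urllib.error.URLError):
        return None
    return needle.lower() in body.lower()
```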
NEDA Tessa Chatbot
AI chatbot deployed by National Eating Disorders Association to provide support to individuals with eating disorders
Consequence: Chatbot suspended June 2023, significant harm to vulnerable users, NEDA helpline closure controversy, widespread media coverage of AI mental health risks
Tesla Autosteer Recall 23V-838
Tesla Autopilot/Autosteer SAE Level 2 driver assistance system
Consequence: Recall of 2,031,220 vehicles, OTA software update required, ongoing NHTSA investigation (RQ24-009) into remedy adequacy
Rite Aid Facial Recognition Ban
AI-powered facial recognition surveillance system deployed in hundreds of retail pharmacy locations
Consequence: 5-year ban on facial recognition use, required deletion of biometric data and AI models, FTC complaint filed, data disgorgement ordered
EEOC v. iTutorGroup
Automated hiring software for online English tutoring positions
Consequence: $365,000 settlement, anti-discrimination training required, 5+ years of EEOC monitoring, first EEOC AI discrimination settlement
Cruise Robotaxi Pedestrian Dragging
Cruise autonomous vehicle (robotaxi) operating without human driver in San Francisco
Consequence: $1.5M NHTSA penalty, $500K DOJ criminal fine, $112.5K CPUC settlement, DMV permit suspension, CEO/COO resignations, fleet grounded nationwide, criminal charges resolved via deferred prosecution agreement
Mata v. Avianca (Hallucinated Citations)
ChatGPT used by attorney Steven Schwartz to research case law for a federal court filing
Consequence: $5,000 sanctions against Schwartz and a colleague, widespread media coverage establishing 'hallucinated citations' as a recognized AI failure mode, catalyzed standing orders in federal courts requiring disclosure of AI use in filings
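This failure mode is now checkable before filing. A sketch against CourtListener's citation-lookup endpoint; the URL, response fields, and anonymous-access assumption should be verified against the current API docs before relying on this:

```python
import json
import urllib.parse
import urllib.request

LOOKUP_URL = "https://www.courtlistener.com/api/rest/v3/citation-lookup/"

def unmatched_citations(brief_text: str) -> list[str]:
    """Return citations found in `brief_text` that CourtListener could not
    cleanly match to a real opinion. Anonymous access may be rate-limited;
    an API token header may be required for heavy use."""
    data = urllib.parse.urlencode({"text": brief_text}).encode()
    req = urllib.request.Request(LOOKUP_URL, data=data)
    with urllib.request.urlopen(req, timeout=30) as resp:
        results = json.load(resp)
    # Each result carries a per-citation status; 200 means a match was found,
    # so anything else (not found, ambiguous) gets flagged for human review.
    return [r.get("citation", "?") for r in results if r.get("status") != 200]
```

Anything this returns goes to a human with the reporter volumes open, not into the filing.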
UnitedHealth nH Predict Algorithm Lawsuit
NaviHealth nH Predict algorithm used to determine Medicare Advantage coverage duration for post-acute care
Consequence: Class action lawsuit allowed to proceed February 2025, CMS issued guidance February 2024 that algorithms cannot solely dictate coverage decisions
Louis v. SafeRent Solutions
SafeRent Score algorithmic tenant screening system used by landlords to evaluate rental applicants
Consequence: $2.275M settlement, injunctive relief prohibiting SafeRent from issuing approve/decline recommendations for voucher holders unless the scores are independently validated for fairness
Don't let this happen to you.
Every incident here shares a common pattern: authoritative output without verified state. Ontic is the gate that checks before the claim gets out.
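As an illustration only (a hypothetical sketch, not Ontic's actual API), the gate pattern fits in a few lines: a claim names the state it depends on, and nothing ships until a verifier attests to that state:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Claim:
    text: str          # the authoritative statement about to go out
    evidence_key: str  # names the state the statement depends on

class UnverifiedClaimError(Exception):
    pass

def gate(claim: Claim, verify: Callable[[str], bool]) -> str:
    """Release claim.text only if `verify` attests to the backing state."""
    if not verify(claim.evidence_key):
        raise UnverifiedClaimError(f"no verified state behind {claim.evidence_key!r}")
    return claim.text

# Hypothetical example: a bereavement-fare answer is blocked because
# no verified fare rule is on record.
fare_rules = {"bereavement_fare_policy": False}
try:
    gate(Claim("Refunds can be requested within 90 days.", "bereavement_fare_policy"),
         verify=fare_rules.get)
except UnverifiedClaimError as err:
    print(err)  # the claim never leaves the system
```

In the Air Canada incident above, the bereavement-fare answer is exactly the kind of claim this gate would have held back: no verified fare rule on record, no refund promise out the door.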