Why hallucination is a bigger problem in therapy than primary care
Primary-care visits have anchors: vitals, labs, exam findings, structured chief complaint. The model has objective facts to align with. Therapy sessions are loose, language-heavy, and built on subjective experience. A fluent-sounding fabrication has very little to disconfirm it. That is why ambient scribes built for medicine often hallucinate more in therapy than therapy-first tools do, and why even the therapy-first tools still get it wrong sometimes.
The clinician remains accountable for the final record. AI removes typing; it does not remove judgment.
The five hallucination patterns specific to therapy
1. Plausible-but-wrong direct quotes
The model attributes to the client a quotation that captures the gist but uses words the client did not say. Often the wording is more articulate or more clinical than the client's actual speech.
❌ Client stated, "I feel a profound sense of inadequacy in my role as a parent."
✅ Client described feeling like a "bad mom" when her son refused dinner.
Rule: if you do not specifically recognize a quoted sentence, delete the quotes (keep the paraphrase) or rewrite it.
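If you keep a session transcript, part of this check can be automated. The sketch below is a minimal illustration, not a feature of any named tool: it assumes the note and transcript are available as plain text, and the function names (`normalize`, `unverified_quotes`) are hypothetical. It flags quoted spans in the note that do not appear verbatim in the transcript, so you know which quotes to read first.

```python
import re

# Straight and curly double quotes only; single quotes are skipped
# so that apostrophes are not mistaken for quotation marks.
QUOTE_PATTERN = re.compile(r'["\u201c]([^"\u201c\u201d]+)["\u201d]')


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace for lenient matching."""
    text = text.lower().replace("\u2019", "'")
    text = re.sub(r"[^a-z0-9' ]+", " ", text)
    return " ".join(text.split())


def unverified_quotes(note: str, transcript: str) -> list[str]:
    """Return quoted spans from the note that are not found in the transcript."""
    haystack = normalize(transcript)
    flagged = []
    for match in QUOTE_PATTERN.finditer(note):
        quote = match.group(1)
        if normalize(quote) not in haystack:
            flagged.append(quote)
    return flagged


# Illustrative data only, based on the example above.
transcript = "I just felt like a bad mom when he wouldn't eat his dinner."
note = (
    'Client stated, "I feel a profound sense of inadequacy in my role as a parent." '
    'Client described feeling like a "bad mom" when her son refused dinner.'
)
flagged = unverified_quotes(note, transcript)
print(flagged)
```

A match only shows that the words occur somewhere in the transcript, not that the client said them in that context, so this narrows the review rather than replacing it.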
2. Fabricated MSE elements
The model fills mental status exam (MSE) fields it should leave blank, most dangerously by recording that the client denied suicidal ideation (SI) when SI was never assessed.
❌ "Denies SI/HI. Mood euthymic. No psychotic symptoms reported." When the session never touched on risk or psychosis.
Rule: if you did not ask, the note cannot say the client denied. Configure your scribe to leave unaddressed MSE fields blank, not auto-populate them.
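Where a transcript is available, a rough cross-check can catch the most obvious cases. The following is a sketch under stated assumptions: the keyword patterns in `CHECKS` are hypothetical placeholders, not a clinically validated list, and `unsupported_denials` is an illustrative name. It flags a denial in the note when the transcript contains no language suggesting the topic was raised.

```python
import re

# Each entry maps a domain to (claim pattern for the note, evidence pattern
# for the transcript). Both keyword lists are illustrative assumptions.
CHECKS = {
    "suicidal ideation": (
        re.compile(r"\b(?i:denie[sd])\b[^.]*\b(?:SI|(?i:suicid\w*))\b"),
        re.compile(
            r"\b(?:suicid\w*|(?:kill|hurt|harm)(?:ing)? yourself|end(?:ing)? your life)",
            re.IGNORECASE,
        ),
    ),
    "homicidal ideation": (
        re.compile(r"\b(?i:denie[sd])\b[^.]*\b(?:HI|(?i:homicid\w*))\b"),
        re.compile(
            r"\b(?:homicid\w*|(?:kill|hurt|harm)(?:ing)? (?:someone|anyone|others))",
            re.IGNORECASE,
        ),
    ),
}


def unsupported_denials(note: str, transcript: str) -> list[str]:
    """Return domains the note says were denied but the transcript never mentions."""
    flagged = []
    for domain, (claim, evidence) in CHECKS.items():
        if claim.search(note) and not evidence.search(transcript):
            flagged.append(domain)
    return flagged


# Illustrative data only.
note = "Denies SI/HI. Mood euthymic. No psychotic symptoms reported."
transcript = (
    "Therapist: How was the week? "
    "Client: Work has been stressful and I am not sleeping well."
)
print(unsupported_denials(note, transcript))
```

Keyword matching can only show that a topic is absent from the transcript. It cannot confirm that an assessment was adequate, so a clean result still needs your own recollection behind it.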
3. Confabulated history
Names, dates, family members, jobs, or diagnoses the client did not mention this session, pulled either from no source at all or, worse, from the model's memory of a *different* client's note if your tool uses cross-session context.
Rule: anything specific (name, date, number) that you do not remember saying or hearing needs to be verified or removed.
4. Risk-assessment over-confidence
The most dangerous pattern. The model writes "no risk identified" or "low risk" in a session where risk was discussed but not resolved, or writes a richer risk paragraph than the actual conversation supports.
❌ "Risk assessment: low. Client denies SI, HI, plan or intent. Protective factors include family support and treatment engagement." When you and the client had a brief, ambiguous conversation about hopelessness.
Rule: read every risk sentence against your own memory. If the AI's risk paragraph is more confident than you are, rewrite it. Document ambiguity as ambiguity.
5. Treatment-plan invention
Interventions described as "applied" or "delivered" that were discussed but not done, or that the model assumed because the goal mentions them.
❌ "Therapist delivered EMDR Phase 4 desensitization on target memory." When you actually did Phase 3 assessment and resource installation.
Rule: verify any intervention named with a specific phase, protocol or technique. Generic process language ("explored," "validated," "reflected") is harder to falsify and lower-risk.
A 90-second review workflow that catches most of this
Before signing any note:
1. Skim the risk section first. Is every risk statement something you remember discussing and concluding? If not, rewrite.
2. Scan direct quotes. Anything you do not specifically recognize → unquote or delete.
3. Check MSE for auto-populated fields. Anything denied that was not asked → delete.
4. Check intervention names against what you actually did. Specific protocols / phases / techniques are the high-risk items.
5. Check the plan for invented homework or referrals.
This takes about 90 seconds on a normal note and 3 minutes on a complex one. It is the price of using ambient scribes responsibly.
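If you want the five-step review enforced rather than remembered, it can be modeled as a simple sign-off gate. This is a minimal sketch, assuming a workflow you control. `NoteReview` and `REVIEW_STEPS` are hypothetical names, not part of any scribe product.

```python
from dataclasses import dataclass, field

# The five review steps, in the order the clinician should work through them.
REVIEW_STEPS = (
    "risk section",
    "direct quotes",
    "MSE fields",
    "intervention names",
    "plan",
)


@dataclass
class NoteReview:
    """Tracks which review steps the clinician has confirmed for one note."""

    confirmed: set[str] = field(default_factory=set)

    def confirm(self, step: str) -> None:
        if step not in REVIEW_STEPS:
            raise ValueError(f"Unknown review step: {step!r}")
        self.confirmed.add(step)

    def remaining(self) -> list[str]:
        """Steps not yet confirmed, in review order."""
        return [step for step in REVIEW_STEPS if step not in self.confirmed]

    def ready_to_sign(self) -> bool:
        return not self.remaining()


review = NoteReview()
review.confirm("risk section")
review.confirm("direct quotes")
print(review.remaining())
print(review.ready_to_sign())
```

The gate records that you looked at each section, not that the content is right. The judgment at each step remains yours.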
What the better tools do
- Limit themselves to summary rather than direct quotation when uncertain (Upheal default behavior).
- Flag low-confidence segments for clinician review (Eleos, some Mentalyc templates).
- Leave unaddressed sections blank rather than confabulating (configurable in Upheal, Clinical Notes AI).
- Surface "added on regeneration" so you can see when re-running the note introduced new facts.
If your current tool does none of these, that is itself a signal.
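If your tool does not show what changed between regenerations, you can approximate it by diffing two versions of the note yourself. The sketch below uses Python's standard `difflib` at sentence level. It assumes you can export both versions as plain text, and the helper names (`split_sentences`, `added_on_regeneration`) are illustrative.

```python
import difflib
import re


def split_sentences(note: str) -> list[str]:
    """Naive sentence splitter: breaks after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]


def added_on_regeneration(previous: str, regenerated: str) -> list[str]:
    """Return sentences in the regenerated note that are new or changed."""
    old = split_sentences(previous)
    new = split_sentences(regenerated)
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            added.extend(new[j1:j2])
    return added


# Illustrative data only.
previous = "Client reported poor sleep. Therapist explored work stress."
regenerated = (
    "Client reported poor sleep. "
    "Client's sister Anna visited last week. "
    "Therapist explored work stress."
)
added = added_on_regeneration(previous, regenerated)
print(added)
```

Any sentence this returns is one the earlier draft did not contain, which makes it a natural place to look for newly introduced names, dates, or interventions.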
Documentation hygiene for the AI era
- Note in your client record (once, in the intake or treatment plan) that AI-assisted documentation is in use and that the clinician reviews and signs every note.
- Keep your audit logs. Most BAAs include them on request.
- If a hallucination ever reaches a chart, correct it through your standard amendment process; do not edit silently.
The goal is not zero AI involvement. The goal is a record where every signed sentence is one the clinician would have written.