Is Healthcare AI Actually Smart, or Just Good at Faking It?
Noah Vandal and Dr. Joseph Yoon discuss whether AI is actually smart, why it hallucinates, and how context changes healthcare AI results.
Is AI smart or dumb?
Why AI can look smart and dumb in the same week
If you've ever used a public chatbot to look up a symptom and received a wildly inaccurate answer, it is easy to write off artificial intelligence as overhyped pattern matching. In this episode of the AI and Healthcare Podcast, Dr. Joseph Yoon and Noah Vandal ask the blunt version of the question: is AI actually smart, or is it just dumb math that is good at sounding confident? The answer is not cleanly one or the other. Modern large language models are not human clinicians. They do not have bedside judgment, visual assessment, or lived understanding. But they do have access to statistical patterns across enormous bodies of text, and that can become useful when the model is given the right task, the right context, and the right safety boundary.
Why context changes the quality of healthcare AI
One of the most important points in the episode is that poor context produces poor output. If a patient says only, "I have a headache," both a doctor and an AI system need more information. Duration, severity, medications, medical history, recent injuries, fever, neurologic symptoms, and risk factors all change the interpretation. That is why healthcare AI should not be evaluated only by asking a generic chatbot isolated questions. The stronger use case is a constrained system that can work with relevant context: intake forms, prior history, medication lists, structured notes, clinical policies, and source-backed references. In that setting, AI is less like a magic oracle and more like a fast assistant that can organize information, surface patterns, and help a human decide what to review next.
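To make the "poor context produces poor output" point concrete, here is a minimal Python sketch of how an intake step might record what it knows and state its gaps before anything reaches a model. Everything in it is an assumption made for illustration: the `IntakeContext` class, its field names, and the `missing_fields` and `build_prompt` helpers are hypothetical and are not described in the episode.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class IntakeContext:
    """Hypothetical intake record; field names are illustrative only."""
    complaint: str
    duration: Optional[str] = None
    severity: Optional[str] = None
    medications: List[str] = field(default_factory=list)
    history: List[str] = field(default_factory=list)
    recent_injury: Optional[bool] = None
    fever: Optional[bool] = None


def missing_fields(ctx: IntakeContext) -> List[str]:
    """Return the names of fields that have not been collected yet."""
    missing = []
    for f in fields(ctx):
        value = getattr(ctx, f.name)
        # Simplification: None and an empty list are both treated as "not collected".
        # A real system would need to distinguish "no medications" from "not asked".
        if value is None or value == []:
            missing.append(f.name)
    return missing


def build_prompt(ctx: IntakeContext) -> str:
    """Render collected context as labeled lines and list the gaps explicitly."""
    lines = []
    for f in fields(ctx):
        value = getattr(ctx, f.name)
        if value is None or value == []:
            continue
        if isinstance(value, list):
            value = ", ".join(value)
        lines.append(f"{f.name}: {value}")
    gaps = missing_fields(ctx)
    if gaps:
        lines.append("not yet collected: " + ", ".join(gaps))
    return "\n".join(lines)


if __name__ == "__main__":
    bare = IntakeContext(complaint="I have a headache")
    print(build_prompt(bare))
```

With only "I have a headache" on file, the rendered context is one line of fact followed by a list of everything still unknown. The design choice worth noting is that the gaps are stated explicitly, so the model or the human reviewer is prompted to ask for them and not to guess.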
What ontology means for medical records
Noah and Dr. Yoon also discuss ontology: the structure that defines how pieces of information relate to one another. In healthcare, the meaning of a symptom changes when it is connected to the patient's history, medications, labs, diagnoses, and recent encounters. Without that structure, an AI system may treat each fact like loose text. With it, the system can understand that a complaint, medication, condition, and care plan belong to the same patient story. That does not make the model a physician. It makes the data more usable, which can make the AI output more relevant and easier for a clinician to evaluate.
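One way to picture the difference between loose text and a patient story is a small graph of typed relationships. The Python sketch below is a toy under stated assumptions: the `PatientGraph` class, the relation names (`reports`, `takes`, `prescribed_for`), and the node labels are all hypothetical, and real clinical ontologies are far richer than this.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


class PatientGraph:
    """Toy store of typed relationships; not a real clinical ontology."""

    def __init__(self) -> None:
        self._edges: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add(self, subject: str, relation: str, obj: str) -> None:
        """Record that `subject` is linked to `obj` by `relation`."""
        self._edges[subject].append((relation, obj))

    def related(self, subject: str, relation: Optional[str] = None) -> List[str]:
        """Objects linked directly to `subject`, optionally filtered by relation."""
        return [
            obj
            for rel, obj in self._edges.get(subject, [])
            if relation is None or rel == relation
        ]

    def story(self, patient: str) -> List[Triple]:
        """Collect every fact reachable from the patient node."""
        seen = {patient}
        stack = [patient]
        facts: List[Triple] = []
        while stack:
            node = stack.pop()
            for rel, obj in self._edges.get(node, []):
                facts.append((node, rel, obj))
                if obj not in seen:
                    seen.add(obj)
                    stack.append(obj)
        return facts


if __name__ == "__main__":
    g = PatientGraph()
    g.add("patient:1", "reports", "symptom:headache")
    g.add("patient:1", "takes", "medication:ibuprofen")
    g.add("medication:ibuprofen", "prescribed_for", "condition:migraine")
    g.add("patient:2", "reports", "symptom:cough")
    for fact in g.story("patient:1"):
        print(fact)
```

Calling `story("patient:1")` returns the complaint, the medication, and the condition the medication was prescribed for, while leaving the second patient's facts out. That is the practical meaning of structure here: related facts travel together and unrelated ones stay separate.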
A real example where AI helped catch a time-sensitive condition
The episode includes a personal example from Dr. Yoon involving Ramsay Hunt syndrome. A rare presentation was missed in urgent care, but a detailed symptom description entered into an AI system surfaced the correct differential quickly enough to help the family seek appropriate antiviral treatment. That story should not be read as "AI replaces urgent care." It is better understood as an example of second-opinion support. A human clinician still matters. Physical examination still matters. But an always-available tool that can rapidly compare symptoms against a wide medical knowledge base may help patients and clinicians notice possibilities that deserve follow-up.
Why hallucinations still matter
The strongest argument against careless healthcare AI is hallucination. A system that confidently invents a source, diagnosis, or treatment path can cause harm if users treat it as an authority. This is why the episode distinguishes general chatbot behavior from systems built for clinical support. Tools like OpenEvidence point toward a more grounded pattern: retrieval from medical literature, citations, and clinician-oriented workflows. Even then, the result should support professional judgment rather than replace it. In healthcare, the standard is not whether the AI sounds convincing. The standard is whether the workflow is safe, reviewable, and bounded.
How to think about AI as a second opinion
The practical framing is simple: AI is useful when it helps humans reason better, move faster, or notice something they might have missed. It is dangerous when it is treated as an independent authority without enough context or review. For SpeechSage, that distinction matters. Healthcare voice AI should stay within approved workflows, capture structured information, escalate when appropriate, and make the human handoff cleaner. The goal is not to make the AI seem smart. The goal is to make the healthcare workflow safer, clearer, and more responsive.
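The idea of a bounded workflow can be sketched in a few lines of Python. This is not SpeechSage's implementation, which the episode does not describe; the workflow names, the escalation phrases, and the `Handoff` and `route` names are all hypothetical placeholders, and the phrase list in particular is not clinical guidance.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical allow-list of workflows the assistant may complete on its own.
APPROVED_WORKFLOWS = {"schedule_appointment", "refill_request", "office_hours"}

# Placeholder phrases only; a real list would come from clinical policy.
ESCALATION_PHRASES = ("chest pain", "trouble breathing")


@dataclass
class Handoff:
    action: str               # "handle" or "escalate"
    workflow: Optional[str]   # set only when the request is handled
    reason: str               # why this decision was made, for human review
    captured: Dict[str, str]  # structured details passed along either way


def route(intent: str, transcript: str, captured: Dict[str, str]) -> Handoff:
    """Decide whether a request stays in an approved workflow or goes to a human."""
    text = transcript.lower()
    # Escalation phrases are checked first, so they override any intent.
    for phrase in ESCALATION_PHRASES:
        if phrase in text:
            return Handoff("escalate", None, f"matched phrase: {phrase}", captured)
    # Anything outside the allow-list is handed off instead of improvised.
    if intent not in APPROVED_WORKFLOWS:
        return Handoff("escalate", None, f"unapproved intent: {intent}", captured)
    return Handoff("handle", intent, "approved workflow", captured)


if __name__ == "__main__":
    print(route("refill_request", "I need a refill", {"medication": "ibuprofen"}))
    print(route("refill_request", "I need a refill and I have chest pain", {}))
    print(route("diagnose", "What is wrong with me?", {}))
```

Two choices carry the point of the section: the default for anything unrecognized is a human handoff, and every decision carries a reason and the captured details so the person picking it up does not start from zero.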
Related research
The Harvard emergency-room triage study discussed in the episode is a useful signal for where AI may help. According to a summary of that study, an advanced AI reasoning model performed strongly on text-based emergency diagnosis tasks, while the researchers still framed AI as a second-opinion tool rather than a replacement for bedside care. OpenEvidence is another relevant example because it is designed for clinicians and uses medical literature as grounding context. That matters because the more serious the domain, the less acceptable it is for a model to improvise unsupported answers. Yann LeCun's criticism of large language models also keeps the discussion honest. If LLMs lack a world model and physical understanding, then healthcare teams should be careful about what they delegate. AI may be powerful, but clinical deployment still needs structure, oversight, and humility.