Blog: Learn Machine Learning from a Google AI Engineer
Vector Embeddings vs. the DSM-5: The Mathematical Flaw in Mental Health AI
As an engineer, I look at the world through the lens of data architecture, relations, database schemas, and decision trees.
Lately, I’ve been reading Daniel Oberhaus’s book, The Silicon Shrink: How Artificial Intelligence Made the World an Asylum. In Chapter 4, Oberhaus critiques “digital phenotyping”: the idea that we can passively track smartphone keystrokes, typing cadences, and language choices to map human behavior back to diagnoses in the DSM-5 (the American Psychiatric Association’s Diagnostic and Statistical Manual of Mental Disorders). His argument is that this tracking is fundamentally flawed because its foundation, the DSM itself, is broken. If the diagnostic categories are arbitrary and unscientific, training AI to detect them is just automating subjectivity.
As I read, I wondered:
What if the linguistic boundaries of the DSM are so overlapping and redundant that the math of vector spaces exposes them as an artificial illusion?
So, as an applied AI engineer with too much time on my hands at an airport, armed with a Google Cloud project, Antigravity for some vibe coding, and a healthy dose of professional skepticism, I decided to build an engineering proof. I wanted to map the entire DSM-5 into high-dimensional vector space, project it down to a 2D canvas, and see what the geometry of clinical language actually looks like.
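To make the idea concrete, here is a minimal sketch of that kind of pipeline: turn each piece of text into a vector, measure how close the vectors are, and flatten them to 2D. Everything in it is an assumption for illustration. This post does not say which embedding model or projection method the real project uses, so the sketch substitutes a toy bag-of-words "embedding" and a plain PCA via SVD, and the three snippets are made-up placeholders, not DSM-5 text.

```python
import numpy as np

# Hypothetical placeholder snippets for illustration only; NOT actual DSM-5 text.
texts = {
    "category_a": "persistent low mood fatigue sleep disturbance poor concentration",
    "category_b": "excessive worry fatigue sleep disturbance poor concentration restlessness",
    "category_c": "recurrent hand washing intrusive thoughts repetitive checking",
}


def embed(documents):
    """Toy stand-in for an embedding model: L2-normalised bag-of-words counts."""
    vocab = sorted({word for doc in documents for word in doc.split()})
    index = {word: i for i, word in enumerate(vocab)}
    vectors = np.zeros((len(documents), len(vocab)))
    for row, doc in enumerate(documents):
        for word in doc.split():
            vectors[row, index[word]] += 1.0
    # Scale every row to unit length so dot products equal cosine similarity.
    return vectors / np.linalg.norm(vectors, axis=1, keepdims=True)


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity; assumes rows are already unit length."""
    return vectors @ vectors.T


def project_2d(vectors):
    """PCA via SVD: centre the data, then keep the top two principal axes."""
    centered = vectors - vectors.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


labels = list(texts)
vectors = embed(list(texts.values()))
similarity = cosine_similarity_matrix(vectors)
coords = project_2d(vectors)

for label, (x, y) in zip(labels, coords):
    print(f"{label}: ({x:.3f}, {y:.3f})")
print(np.round(similarity, 3))
```

In this toy setup, the two placeholder categories that share most of their vocabulary get a high cosine similarity, while the third shares nothing with the first and scores zero against it. A real embedding model captures meaning rather than exact word overlap, so treat this only as an illustration of the geometry, not as evidence for or against the argument.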
The results are mathematically clear, clinically chaotic, and raise serious questions about the future of Psychiatric AI.
You can explore the live, interactive proof and run the simulations yourself here: