AI can diagnose diseases, compose poetry, and even drive cars—but it still struggles with a simple word: "no." This blind spot could lead to serious consequences in real-world applications, especially in healthcare.
A new study led by MIT PhD student Kumail Alhamoud, in collaboration with OpenAI and the University of Oxford, reveals that the inability to comprehend "no" and "not" may have profound implications, particularly in medical settings.
Negation—such as "no fractures" or "not enlarged"—is a critical linguistic function, especially in high-stakes environments like healthcare, where misinterpretation can result in significant harm. The research shows that current AI models—like ChatGPT, Gemini, and Llama—often fail to process negations correctly, tending to default to positive associations.
The core issue is not just a lack of data; it's the way AI is trained. Most large language models are designed to recognize patterns rather than perform logical reasoning, so they may read "not good" as somewhat positive simply because they associate "good" with positivity. Experts believe that unless models are taught logic-based reasoning rather than merely mimicking language, they will keep making subtle but dangerous errors.
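To see why pure pattern matching trips over negation, consider a deliberately simplified sentiment scorer that, like a bag-of-words model, just averages the scores of the words it recognizes. Everything in the snippet below (the lexicon, the scores, the example sentences) is invented for illustration; it is not how ChatGPT, Gemini, or Llama actually work, only a toy sketch of the failure mode the experts describe.

```python
# Toy bag-of-words sentiment scorer: it averages per-word scores and has no
# rule for negation, so "not" contributes nothing and cannot flip "good".
# The lexicon and weights are made up purely for illustration.

SENTIMENT_LEXICON = {
    "good": 1.0,
    "great": 1.0,
    "bad": -1.0,
    "terrible": -1.0,
    # "not" is absent: a pure association model has no weight for it.
}

def score(sentence: str) -> float:
    """Average the sentiment of known words; ignore everything else."""
    words = sentence.lower().split()
    known = [SENTIMENT_LEXICON[w] for w in words if w in SENTIMENT_LEXICON]
    return sum(known) / len(known) if known else 0.0

if __name__ == "__main__":
    for text in ["the results are good", "the results are not good"]:
        print(f"{text!r} -> {score(text):+.1f}")
    # Both sentences score +1.0: the scorer "sees" only the positive word
    # "good" and misses the negation entirely.
```

Real language models are vastly more sophisticated than this, but the underlying risk is the same: the positive association with "good" dominates, and nothing forces the system to apply the logical flip that "not" demands.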
"AI is very good at generating responses similar to what it has seen during training. However, it performs poorly when asked to produce something truly novel or beyond its training data," Franklin Delehelle, chief research engineer at Lagrange Labs, a zero-knowledge infrastructure company, told Decrypt. "Thus, if the training data lacks strong examples of 'no' or expressions of negativity, the model may struggle to generate those kinds of responses."
In the study, researchers found that vision-language models designed to interpret images and text tend to favor affirmative statements and often fail to distinguish between positive and negative captions.
"By synthesizing negation data, we offer a promising path toward more reliable models," the researchers stated. "While our synthetic data approach improves negation understanding, challenges remain, particularly with subtle negation nuances."
Despite ongoing progress in reasoning, many AI systems still face difficulties when handling open-ended questions or tasks requiring deeper comprehension or "common sense."
"All large language models—what we commonly refer to as AI today—are partially influenced by their initial prompts. When interacting with ChatGPT or similar systems, the system doesn't solely rely on your input. There’s also an internal or 'system' prompt preset by the company—which users cannot control," Delehelle told Decrypt.
Delehelle emphasized a fundamental limitation of AI: its reliance on patterns within the training data, which can sometimes affect—or even distort—its responses.
Kian Katanforoosh, adjunct professor of deep learning at Stanford University and founder of the Skill Intelligence company Workera, said the challenge with negation stems from a fundamental flaw in how language models operate.
"Negation appears simple but is complex. Words like 'no' and 'not' change the meaning of a sentence, but most language models don’t work through logical reasoning—they predict what sounds plausible based on patterns." Katanforoosh explained, "This makes them prone to missing the point when dealing with negations."
Katanforoosh also pointed out, echoing Delehelle, that the way AI models are trained is at the heart of the problem.
"These models are trained to associate rather than reason. So when you say 'not good,' they still strongly link 'good' with positive emotions," he explained. "Unlike humans, they don’t always override these associations."
Katanforoosh warned that the inability to accurately interpret negation is not just a technical flaw—it could have serious real-world consequences.
"Understanding negation is foundational to comprehension," he said. "If a model can’t reliably grasp it, you risk subtle yet critical errors—especially in legal, medical, or human resources applications."
While expanding the training data might seem like an easy fix, he believes the solution lies elsewhere.
"The answer isn’t about more data but better reasoning. We need models capable of handling logic, not just language," he said. "That’s the frontier now: combining statistical learning with structured thinking."