Researchers from Samsung Electronics have created a compact AI model that demonstrates outstanding performance in specific reasoning tasks, challenging the long-standing industry belief that bigger models are always better.
The Tiny Recursive Model (TRM), introduced this week, contains only seven million parameters—dramatically fewer than most other AI systems. Despite its small size, it outperforms powerful large language models such as Google's Gemini 2.5 Pro when solving complex reasoning puzzles like Sudoku.
Alexia Jolicoeur-Martineau, a senior researcher at the Samsung Advanced Institute of Technology's Montreal AI Lab, published a paper on arXiv showing how smart design choices can outperform brute-force increases in parameter count. The model uses a unique “recursive reasoning” mechanism that lets it think in cycles, repeatedly revisiting the same problem to refine its answer.
The paper titled "Less is More: Recursive Reasoning with Tiny Networks" explains how TRM was specifically engineered to tackle logical puzzles and reasoning-intensive challenges. While it lacks the versatility to chat, write stories, or generate images like other models, its specialized design enables it to solve certain difficult problems with higher accuracy than larger models.
For example, TRM achieved an 87% accuracy rate on Sudoku-Extreme, a benchmark of extremely difficult Sudoku puzzles. It scored 85% on Maze-Hard, a task that requires finding paths through large, complex mazes. On the ARC-AGI-1 and ARC-AGI-2 tests, which use abstract reasoning puzzles designed to gauge general intelligence, it scored 45% and 8%, respectively.
In these tests, TRM surpassed larger models. Gemini 2.5 Pro scored only 4.9% on ARC-AGI-2, while OpenAI's o3-mini-high scored 3%, DeepSeek Ltd.'s R1 achieved 1.3%, and Anthropic PBC's Claude 3.7 managed only 0.7%. TRM accomplished this with less than 0.01% of the parameters found in the most powerful large language models.
Recursive Reasoning Cycle
Instead of constructing a massive neural network, Samsung's researchers utilized a recursive technique similar to human thought processes. The model evaluates its own answer by asking: “Is this a good solution? If not, can I find a better one?” It then attempts to solve the problem again, refines its response, and repeats this process until satisfied.
To achieve this, TRM maintains two forms of short-term memory—it remembers the current solution and creates a notepad to track intermediate steps taken to improve it. At each stage, the model reviews the task, current solution, and prior notes to update the notepad, then generates an improved output based on this information.
This cycle repeats multiple times, progressively refining the response and eliminating the need for massive parameter counts typically required to handle extended reasoning chains. Instead, a small network with just millions of parameters suffices.
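A rough Python sketch of that cycle might look something like the following. The class, layer sizes, and variable names here are illustrative assumptions rather than the researchers' released code, but they show the core idea: one small network is reused to update a latent “notepad” z several times, after which a second small head proposes an improved answer y.

```python
import torch
import torch.nn as nn

class TinyRecursiveSketch(nn.Module):
    """Illustrative sketch of recursive refinement with two running states:
    y, the current solution, and z, a latent "notepad" of intermediate work.
    Names and sizes are hypothetical, not the researchers' released code."""

    def __init__(self, hidden_dim: int = 128):
        super().__init__()
        # A single small network is reused at every refinement step.
        self.net = nn.Sequential(
            nn.Linear(hidden_dim * 3, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        # A second small head turns the notes into an improved answer.
        self.out_head = nn.Linear(hidden_dim * 2, hidden_dim)

    def forward(self, x, y, z, n_inner: int = 6):
        # Re-read the task (x), the current answer (y) and the old notes (z),
        # and update the notepad several times.
        for _ in range(n_inner):
            z = self.net(torch.cat([x, y, z], dim=-1))
        # Then propose a refined answer from the answer and the notes.
        y = self.out_head(torch.cat([y, z], dim=-1))
        return y, z

# Outer loop: repeat the whole read-think-revise cycle several times.
model = TinyRecursiveSketch()
x = torch.randn(1, 128)   # embedded puzzle
y = torch.zeros(1, 128)   # initial guess at the solution
z = torch.zeros(1, 128)   # empty notepad
for _ in range(16):
    y, z = model(x, y, z)
```

The point of the design is that reasoning depth comes from repetition rather than from stacking ever more layers, which is why the parameter count can stay so small.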
According to the research paper, TRM is designed to "recursively improve latent and output states without assuming convergence." In other words, the model doesn't settle on an answer prematurely but keeps looping until further improvements stop materializing.
It employs an “adaptive stopping” mechanism that decides when further refinement is no longer worthwhile, so the model doesn't loop forever. It also uses deep supervision, receiving feedback at multiple reasoning stages rather than only at the end, which the authors say makes learning more efficient.
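A hypothetical training step, continuing the sketch above, might combine those two ideas roughly as follows. The halting head, loss choices, and threshold are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical halting head: predicts whether another refinement pass is
# worth running. Its exact form here is an assumption for illustration.
halt_head = nn.Linear(128, 1)

def train_step(model, x, y, z, target, max_steps=16, halt_threshold=0.5):
    total_loss = torch.tensor(0.0)
    for _ in range(max_steps):
        y, z = model(x, y, z)
        # Deep supervision: grade the answer after every cycle,
        # not only after the final one.
        step_loss = F.mse_loss(y, target)
        # Adaptive stopping: train the halting head to fire once the
        # current answer is already good enough.
        p_halt = torch.sigmoid(halt_head(z)).mean()
        is_good = (step_loss.detach() < 0.01).float()
        total_loss = total_loss + step_loss + F.binary_cross_entropy(p_halt, is_good)
        # Detach the states so each cycle learns from its own feedback.
        y, z = y.detach(), z.detach()
        if p_halt.item() > halt_threshold:  # confident enough: stop looping
            break
    return total_loss
```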
The Power of Minimalism
Jolicoeur-Martineau emphasized in a blog post that this research is significant because it demonstrates that small, highly specialized models can excel in narrow, structured reasoning tasks—an advance that could have a major impact on the broader AI industry.
An obvious benefit is making powerful AI systems more accessible. The largest LLMs, which contain billions or even trillions of parameters, require massive, expensive GPU clusters to operate. These systems consume vast amounts of energy, limiting experimentation to only wealthy corporations and well-funded universities. In contrast, models like TRM, with just millions of parameters, can run on standard hardware with significantly lower power consumption.
This could open the door for more universities, startups, and independent developers to experiment with advanced AI models and accelerate innovation.
Nevertheless, Jolicoeur-Martineau's team acknowledges that the findings don't signal the end of LLMs. TRM is effective only for well-defined, grid-based tasks and is unsuitable for open-ended, text-based, or multimodal work. It nonetheless represents a promising direction, and the researchers plan further experiments to adapt recursive reasoning models to new domains.