Researchers from Samsung Electronics have created a compact AI model that demonstrates outstanding performance in specific reasoning tasks, challenging the long-standing industry belief that bigger models are always better.
The Tiny Recursive Model (TRM), introduced this week, contains only seven million parameters—dramatically fewer than most other AI systems. Despite its small size, it outperforms powerful large language models such as Google's Gemini 2.5 Pro when solving complex reasoning puzzles like Sudoku.
Alexia Jolicoeur-Martineau, a senior researcher at the Samsung Advanced Institute of Technology's Montreal AI Lab, published a paper on arXiv showing how smart design choices can outperform brute-force increases in parameter count. The model uses a unique “recursive reasoning” mechanism that lets it think in cycles, repeatedly revisiting the same problem to refine its answer.
The paper titled "Less is More: Recursive Reasoning with Tiny Networks" explains how TRM was specifically engineered to tackle logical puzzles and reasoning-intensive challenges. While it lacks the versatility to chat, write stories, or generate images like other models, its specialized design enables it to solve certain difficult problems with higher accuracy than larger models.
For example, TRM achieved an 87% accuracy rate on Sudoku-Extreme, a benchmark of extremely difficult Sudoku puzzles. It scored 85% on Maze-Hard, a task that requires finding paths through large, complex mazes. On the ARC-AGI-1 and ARC-AGI-2 tests, which use abstract reasoning puzzles designed to gauge general intelligence, it scored 45% and 8%, respectively.
In these tests, TRM surpassed larger models. Gemini 2.5 Pro scored only 4.9% on ARC-AGI-2, while OpenAI's o3-mini-high scored 3%, DeepSeek Ltd.'s R1 achieved 1.3%, and Anthropic PBC's Claude 3.7 managed only 0.7%. TRM accomplished this with less than 0.01% of the parameters found in the most powerful large language models.
Recursive Reasoning Cycle
Instead of constructing a massive neural network, Samsung's researchers utilized a recursive technique similar to human thought processes. The model evaluates its own answer by asking: “Is this a good solution? If not, can I find a better one?” It then attempts to solve the problem again, refines its response, and repeats this process until satisfied.
To achieve this, TRM maintains two forms of short-term memory—it remembers the current solution and creates a notepad to track intermediate steps taken to improve it. At each stage, the model reviews the task, current solution, and prior notes to update the notepad, then generates an improved output based on this information.
This cycle repeats multiple times, progressively refining the response and eliminating the need for massive parameter counts typically required to handle extended reasoning chains. Instead, a small network with just millions of parameters suffices.
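A rough Python sketch of that cycle might look something like the following. The class, layer sizes, and variable names here are illustrative assumptions rather than the researchers' released code, but they show the core idea: one small network is reused to update a latent “notepad” z several times, after which a second small head proposes an improved answer y.

```python
import torch
import torch.nn as nn

class TinyRecursiveSketch(nn.Module):
    """Illustrative sketch of recursive refinement with two running states:
    y, the current solution, and z, a latent "notepad" of intermediate work.
    Names and sizes are hypothetical, not the researchers' released code."""

    def __init__(self, hidden_dim: int = 128):
        super().__init__()
        # A single small network is reused at every refinement step.
        self.net = nn.Sequential(
            nn.Linear(hidden_dim * 3, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        # A second small head turns the notes into an improved answer.
        self.out_head = nn.Linear(hidden_dim * 2, hidden_dim)

    def forward(self, x, y, z, n_inner: int = 6):
        # Re-read the task (x), the current answer (y) and the old notes (z),
        # and update the notepad several times.
        for _ in range(n_inner):
            z = self.net(torch.cat([x, y, z], dim=-1))
        # Then propose a refined answer from the answer and the notes.
        y = self.out_head(torch.cat([y, z], dim=-1))
        return y, z

# Outer loop: repeat the whole read-think-revise cycle several times.
model = TinyRecursiveSketch()
x = torch.randn(1, 128)   # embedded puzzle
y = torch.zeros(1, 128)   # initial guess at the solution
z = torch.zeros(1, 128)   # empty notepad
for _ in range(16):
    y, z = model(x, y, z)
```

The point of the design is that reasoning depth comes from repetition rather than from stacking ever more layers, which is why the parameter count can stay so small.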
According to the research paper, TRM is designed to "recursively improve latent and output states without assuming convergence." In other words, the model doesn't settle on an answer prematurely but keeps looping until further improvements stop materializing.
It employs an “adaptive stopping” mechanism that decides when further refinement is no longer worthwhile, so the model doesn't loop forever. It also uses deep supervision, receiving feedback at multiple reasoning stages rather than only at the end, which the authors say makes learning more efficient.
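A hypothetical training step, continuing the sketch above, might combine those two ideas roughly as follows. The halting head, loss choices, and threshold are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical halting head: predicts whether another refinement pass is
# worth running. Its exact form here is an assumption for illustration.
halt_head = nn.Linear(128, 1)

def train_step(model, x, y, z, target, max_steps=16, halt_threshold=0.5):
    total_loss = torch.tensor(0.0)
    for _ in range(max_steps):
        y, z = model(x, y, z)
        # Deep supervision: grade the answer after every cycle,
        # not only after the final one.
        step_loss = F.mse_loss(y, target)
        # Adaptive stopping: train the halting head to fire once the
        # current answer is already good enough.
        p_halt = torch.sigmoid(halt_head(z)).mean()
        is_good = (step_loss.detach() < 0.01).float()
        total_loss = total_loss + step_loss + F.binary_cross_entropy(p_halt, is_good)
        # Detach the states so each cycle learns from its own feedback.
        y, z = y.detach(), z.detach()
        if p_halt.item() > halt_threshold:  # confident enough: stop looping
            break
    return total_loss
```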
The Power of Minimalism
Jolicoeur-Martineau emphasized in a blog post that this research is significant because it demonstrates that small, highly specialized models can excel in narrow, structured reasoning tasks—an advance that could have a major impact on the broader AI industry.
An obvious benefit is making powerful AI systems more accessible. The largest LLMs, which contain billions or even trillions of parameters, require massive, expensive GPU clusters to operate. These systems consume vast amounts of energy, limiting experimentation to only wealthy corporations and well-funded universities. In contrast, models like TRM, with just millions of parameters, can run on standard hardware with significantly lower power consumption.
This could open the door for more universities, startups, and independent developers to experiment with advanced AI models and accelerate innovation.
Nevertheless, Jolicoeur-Martineau's team acknowledges that the findings don't signal the end of LLMs. TRM is effective only for well-defined, grid-based tasks and is unsuitable for open-ended, text-based, or multimodal work. It nonetheless represents a promising direction, and the researchers plan further experiments to adapt recursive reasoning models to new domains.