Google DeepMind Launches AI Coding Agent AlphaEvolve

2025-05-28

Google DeepMind has published a paper detailing its AlphaEvolve coding agent. AlphaEvolve uses large language models (LLMs) to discover and optimize algorithms across multiple domains, including hardware design, data center operations, and AI training.

AlphaEvolve employs an ensemble of LLMs, including Gemini Flash and Gemini Pro, to generate and evolve programs that solve user-defined problems. Users must supply an evaluation function that scores each candidate program and returns one or more scalar metrics. Google has applied AlphaEvolve to a range of problems in mathematics, engineering, and computer science with promising results. One example is AlphaEvolve's discovery of a more efficient algorithm for multiplying 4x4 matrices. Google also tested it on over 50 open problems in mathematics; AlphaEvolve rediscovered the state-of-the-art solutions for 75% of them and found improved solutions for 20%. According to Google,

While AlphaEvolve is currently mainly used in mathematics and computation, its general nature means it can be applied to any problem that can be described as an algorithm and automatically verified. We believe AlphaEvolve could bring transformative changes in materials science, drug discovery, sustainability, and broader technology and business applications.
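The user-facing contract described earlier, a problem definition plus an evaluation function that returns scalar metrics, can be sketched as follows. This is a hypothetical illustration, not DeepMind's actual API: the function name `evaluate`, the toy `my_sqrt` task, and the metric name are all invented for the example.

```python
# Hypothetical sketch of the user-supplied evaluation contract: a function
# that takes a candidate program and returns a dict of scalar metrics.
# All names here are illustrative assumptions, not DeepMind's real interface.
import math

def evaluate(program_source: str) -> dict[str, float]:
    """Run the candidate program and score it on scalar metrics."""
    namespace: dict = {}
    exec(program_source, namespace)  # load the candidate's definitions
    candidate_sqrt = namespace["my_sqrt"]

    # Score accuracy against a trusted reference on a few test inputs.
    error = sum(abs(candidate_sqrt(x) - math.sqrt(x)) for x in (1.0, 2.0, 9.0))
    return {"neg_error": -error}  # convention: higher is better for every metric

candidate = "def my_sqrt(x):\n    return x ** 0.5\n"
scores = evaluate(candidate)
```

Because the evaluator is automatic, any proposed program can be verified without human judgment, which is what the quote above means by problems that "can be described as an algorithm and automatically verified".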

The core idea of AlphaEvolve is to use LLMs to generate and evolve code. The system maintains a database of the candidate programs it has generated and feeds them back to the LLMs as context, along with prompts describing how to evolve the programs further. Generated programs that score well under the evaluation function are stored in the database, and the cycle repeats, steering the search toward progressively better solutions.

Beyond mathematical problems, Google has also used AlphaEvolve to improve its own data center operations, developing a new heuristic function for Google's Borg task orchestrator. AlphaEvolve's solution outperformed one previously discovered with deep reinforcement learning, enabling Google to reclaim 0.7% of its global computing resources. AlphaEvolve also optimized a kernel tiling strategy used in AI training and a FlashAttention kernel, improving their speeds by 23% and 32%, respectively.

In Hacker News discussions, users generally expressed positive opinions about AlphaEvolve and mentioned Google's recent AI achievements:

People often forget that Google was behind MuZero, which I think is the most important AI paper of the decade, not the Transformer, because it effectively demonstrated how models can learn to search.

In a post on X, Oxford University AI researcher Simon Frieder criticized DeepMind for not fully open-sourcing its code:

While DeepMind's publications are always scientifically interesting, its track record on releasing complete public code is mixed. For instance, AlphaFold2 was released without its training scripts, and AlphaGeometry was found to contain errors. In both cases the community built open-source alternatives: OpenFold for the former and Newclid for the latter... Given this history, AlphaEvolve might contain hidden bugs, which makes me less confident in its results. In some cases it may be easy to verify the correctness of its output, but not in all. Note that this is different from LLM hallucination, since AlphaEvolve relies on an automatic evaluator here.

Although AlphaEvolve is not yet publicly available, academic researchers can apply for early access to the system.