Ai2 Releases Tülu 3 Model, Narrowing the Gap Between Open and Closed Source

2024-11-25

The Allen Institute for AI (Ai2) has unveiled Tülu 3, a new family of models and post-training recipes aimed at narrowing the gap between proprietary and open-source models in the post-training stage. The release underscores the significant potential of open-source models in the enterprise sector.

Tülu 3 achieves performance comparable to closed-source models such as OpenAI's GPT series, Anthropic's Claude, and Google's Gemini. It allows researchers, developers, and businesses to fine-tune open-source models while preserving their core data and capabilities, bringing their performance close to that of proprietary models.

At launch, Ai2 released Tülu 3's complete datasets, data-mixing methods, training recipes, code, infrastructure, and evaluation framework. To boost the model's performance, Ai2 developed new datasets and training methods, including a reinforcement-learning approach that trains the model on problems whose answers can be verified automatically.
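To illustrate the idea of rewarding verifiable answers, here is a minimal, hypothetical sketch: a completion earns a reward of 1 only if its final answer can be checked programmatically against a known-correct result. The function names and answer-extraction logic are illustrative assumptions, not Ai2's actual training code.

```python
# Illustrative sketch of reinforcement learning with verifiable rewards:
# the reward is 1.0 only when a generated answer can be checked programmatically
# against a known-correct result. Names and structure are hypothetical.

import re


def extract_final_answer(completion: str) -> str | None:
    """Pull the last number out of a model completion, e.g. 'The answer is 42.'"""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return numbers[-1] if numbers else None


def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Binary reward: 1.0 if the extracted answer matches the ground truth."""
    answer = extract_final_answer(completion)
    return 1.0 if answer is not None and answer == ground_truth else 0.0


# Toy usage: score two completions for a math prompt whose verified answer is "42".
completions = [
    "Let's compute 6 * 7 step by step. 6 * 7 = 42. The answer is 42.",
    "6 * 7 is 48.",  # wrong, receives zero reward
]
rewards = [verifiable_reward(c, ground_truth="42") for c in completions]
print(rewards)  # [1.0, 0.0] -- rewards like these would drive the policy update
```

The appeal of this setup is that the reward signal comes from an automatic check rather than a learned reward model, which keeps it cheap and hard to game on tasks with objectively correct answers.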

Ai2 stated that its best model results from a sophisticated training process that combines elements of proprietary methods with novel techniques and established academic research. It attributes this success to careful data curation, rigorous experimentation, innovative methodologies, and improved training infrastructure.

Tülu 3 will be available in multiple sizes to accommodate the varied needs of enterprises and researchers.

As for open-source models in enterprise applications: although enterprise adoption of open-source models has lagged behind proprietary ones, a growing number of companies are building projects on open-source large language models (LLMs). Ai2 believes that by improving the fine-tuning capabilities of open-source models like Tülu 3, it can draw more enterprises and researchers to open-source models, confident that they can deliver performance on par with closed-source models such as Claude or Gemini.

Ai2 pointed out that while some model developers, such as Meta, describe their models as open source, their training data and methods remain opaque to users. And although the Open Source Initiative (OSI) recently published the first version of its Open Source AI Definition, some organizations and model providers have not fully aligned their licenses with it.

When selecting models, enterprises value transparency, but in practice they choose whichever model best fits their use case rather than focusing solely on the openness of research or data. Tülu 3 gives businesses more options, allowing them to select among open-source models and fine-tune them.

Additionally, Ai2 has released other open-source models, such as OLMoE and Molmo, which it claims surpass leading models like GPT-4o and Claude in certain respects.

Tülu 3 also allows businesses to mix and match datasets during fine-tuning. The training recipes Ai2 provides help balance those datasets so a model can be built around specific capabilities as needed, such as coding, precise instruction following, and multilingual communication.
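As a rough illustration of what such a data-mixing recipe can look like, the sketch below assigns a weight to each source dataset and samples a fine-tuning mixture in proportion to those weights. The dataset names and weights are hypothetical, not the recipe Ai2 ships.

```python
# Hypothetical data-mixing recipe: each source dataset gets a weight, and the
# fine-tuning mixture is built by sampling examples in proportion to those weights.
# Dataset names and weights below are illustrative only.

import random

mixture_recipe = {
    "coding_instructions": 0.3,            # boost coding capability
    "precise_instruction_following": 0.5,  # emphasize instruction adherence
    "multilingual_chat": 0.2,              # add multilingual coverage
}


def build_mixture(datasets: dict[str, list[dict]], recipe: dict[str, float],
                  total_examples: int, seed: int = 0) -> list[dict]:
    """Sample a fine-tuning mixture according to per-dataset weights."""
    rng = random.Random(seed)
    mixture = []
    for name, weight in recipe.items():
        n = int(total_examples * weight)
        mixture.extend(rng.choices(datasets[name], k=n))
    rng.shuffle(mixture)
    return mixture


# Toy usage with placeholder examples:
toy = {name: [{"prompt": f"{name} example {i}"} for i in range(100)]
       for name in mixture_recipe}
mix = build_mixture(toy, mixture_recipe, total_examples=1000)
print(len(mix))  # ~1000 examples drawn in the 0.3 / 0.5 / 0.2 ratio
```

Shifting the weights toward one source is how a team would tilt the resulting model toward, say, coding over multilingual chat.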

Furthermore, the infrastructure code Ai2 has released lets enterprises build the corresponding training pipelines when scaling the model up or down. Ai2's evaluation framework, in turn, gives developers a standardized way to specify the settings under which model outputs are generated and evaluated.
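As a loose illustration of the kind of evaluation configuration a developer might write, the sketch below bundles generation settings with the benchmarks a fine-tuned model is scored on. The field names are assumptions for illustration and do not mirror Ai2's actual evaluation framework.

```python
# Illustrative evaluation configuration: generation settings plus the benchmark
# suites a fine-tuned model is scored against. Field names are hypothetical and
# do not reflect Ai2's real evaluation tooling.

from dataclasses import dataclass, field


@dataclass
class EvalConfig:
    model_path: str
    benchmarks: list[str] = field(default_factory=lambda: ["gsm8k", "ifeval", "mmlu"])
    max_new_tokens: int = 512   # cap on generated output length
    temperature: float = 0.0    # greedy decoding for reproducible scores
    num_fewshot: int = 0        # zero-shot by default


config = EvalConfig(model_path="./tulu3-finetuned")
print(config.benchmarks)  # suites the model's outputs will be checked against
```

Pinning down these settings in a shared configuration is what makes scores comparable across fine-tuned variants rather than artifacts of differing decoding parameters.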