AMD has officially launched AMD OLMo, its first fully open-source large language model (LLM) series, with 1 billion parameters. Designed for diverse application scenarios, the models were pre-trained on AMD Instinct MI250 GPUs. Reportedly, AMD OLMo offers robust reasoning, instruction-following, and conversational capabilities.
This initiative aims to enhance AMD's position in the artificial intelligence industry and offer customers the opportunity to deploy these open-source models on AMD hardware. By making data, weights, training procedures, and code publicly available, AMD encourages developers not only to replicate these models but also to further innovate upon them. Beyond data center applications, AMD has enabled the OLMo model to be locally deployed on AMD Ryzen AI personal computers equipped with Neural Processing Units (NPUs), allowing developers to leverage AI models on individual devices effectively.
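To make local experimentation concrete, here is a minimal sketch of preparing a chat-style prompt for a small instruction-tuned model. The prompt markup and the Hugging Face model id in the comments are illustrative assumptions, not AMD's documented interface:

```python
# Sketch: flattening a conversation into a single prompt string.
# The <|role|> markup below is an illustrative convention, not
# necessarily the template AMD OLMo was trained with.

def build_chat_prompt(messages):
    """Flatten a list of {role, content} turns into one prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}")
    parts.append("<|assistant|>\n")  # cue the model to respond
    return "\n".join(parts)

prompt = build_chat_prompt([
    {"role": "user", "content": "Explain what an NPU is in one sentence."},
])

# To actually run a checkpoint locally (requires the transformers library
# and a downloaded model; the id below is an assumption):
#
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tok = AutoTokenizer.from_pretrained("amd/AMD-OLMo-1B-SFT")
# model = AutoModelForCausalLM.from_pretrained("amd/AMD-OLMo-1B-SFT")
# out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=64)
# print(tok.decode(out[0], skip_special_tokens=True))
```

In practice the tokenizer's own chat template, if one is published with the checkpoint, should be preferred over a hand-rolled format like this.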
Multistage Training
The AMD OLMo model was trained on an extensive dataset containing 1.3 trillion tokens, utilizing 16 nodes, each equipped with four AMD Instinct MI250 GPUs (64 GPUs in total). The training process was divided into three phases:
- The initial AMD OLMo 1B model, a decoder-only transformer, was pre-trained on a subset of Dolma v1.7 with a next-token-prediction objective to capture language patterns and general knowledge.
- The second phase applied supervised fine-tuning (SFT): the pre-trained model was first fine-tuned on the Tulu V2 dataset, then further fine-tuned on the OpenHermes-2.5, WebInstructSub, and Code-Feedback datasets to strengthen its instruction-following capabilities and improve performance on scientific, coding, and mathematical tasks.
- After fine-tuning, the AMD OLMo 1B SFT model was aligned to human preferences using Direct Preference Optimization (DPO) on the UltraFeedback dataset, producing the final AMD OLMo 1B SFT DPO model, which prioritizes outputs consistent with typical human feedback.
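The preference-alignment step above can be made concrete. Below is a minimal sketch of the DPO loss for a single preference pair, assuming summed per-sequence log-probabilities are already computed; this is illustrative math, not AMD's training code:

```python
import math

def dpo_loss(pol_chosen, pol_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Inputs are the log-probabilities of the chosen and rejected responses
    under the policy being trained and under a frozen reference model.
    """
    # Implicit reward margin: how much more the policy favors the chosen
    # response over the rejected one, relative to the reference model.
    margin = (pol_chosen - ref_chosen) - (pol_rejected - ref_rejected)
    # -log(sigmoid(beta * margin)); shrinks as the margin grows.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# A policy that favors the chosen response more than the reference does
# gets a loss below log(2), the zero-margin value.
loss_good = dpo_loss(-10.0, -30.0, -20.0, -25.0)
loss_zero = dpo_loss(-20.0, -25.0, -20.0, -25.0)
```

Minimizing this loss pushes the policy to assign relatively more probability to preferred responses without drifting too far from the reference (SFT) model, which is controlled by `beta`.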
Performance Results
In AMD's own tests, the AMD OLMo models performed strongly on standard benchmarks for general reasoning ability and multitask understanding, remaining competitive with similar-sized open-source models such as TinyLlama-1.1B, MobiLlama-1B, and OpenELM-1_1B.
- The two-stage SFT yielded significant accuracy gains, with MMLU scores improving by 5.09% and GSM8k by 15.32%, demonstrating the effectiveness of AMD's training methodology.
- The final AMD OLMo 1B SFT DPO model outperformed other open-source conversational models by at least 2.60% on average across various benchmark tests.
On conversational instruction-tuning benchmarks, AMD's OLMo 1B SFT and OLMo 1B SFT DPO models outperformed the next-best instruction-tuned models by 3.41% on the AlpacaEval 2 Win Rate metric and by 2.29% on the AlpacaEval 2 LC Win Rate metric. Additionally, on MT-Bench, which measures multi-turn conversational ability, the SFT DPO model scored 0.97% higher than its nearest competitor.
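For intuition on how a win-rate metric like AlpacaEval's is computed, here is a simplified sketch that counts ties as half a win; the length-controlled (LC) variant additionally corrects for response-length bias and is not reproduced here:

```python
def win_rate(judgments):
    """Percentage of head-to-head comparisons the model wins.

    `judgments` is a list of 'win' / 'loss' / 'tie' outcomes from a judge
    comparing the model's responses against a baseline's responses.
    """
    if not judgments:
        raise ValueError("no judgments to score")
    score = sum(1.0 if j == "win" else 0.5 if j == "tie" else 0.0
                for j in judgments)
    return 100.0 * score / len(judgments)

# 6 wins, 2 ties, 2 losses out of 10 comparisons
rate = win_rate(["win"] * 6 + ["tie"] * 2 + ["loss"] * 2)
```

In the real benchmark the judgments come from a strong LLM judge comparing model outputs against a fixed baseline model's outputs on a standard prompt set.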
Furthermore, AMD evaluated the models on responsible-AI benchmarks, including ToxiGen (measuring toxic language, where lower scores are better), CrowS-Pairs (assessing bias), and TruthfulQA-mc2 (evaluating the truthfulness of responses). The results indicate that the AMD OLMo models perform comparably to similar models on ethical and responsible AI tasks.
The launch of AMD OLMo marks a significant advancement for AMD in the field of artificial intelligence, providing users and developers with more diverse and efficient AI solutions.