OpenAI has introduced two key fine-tuning options for API customers: Reinforcement Fine-Tuning (RFT) and Supervised Fine-Tuning (SFT). RFT optimizes outputs using chain-of-thought reasoning and user-provided graders, and is now available to verified organizations on the o4-mini model. In addition, all paid API tiers (Tier 1 and above) can now apply supervised fine-tuning to GPT-4.1 nano, the company’s fastest and most cost-effective model to date.
Key Highlights
- RFT on o4-mini: Reinforcement fine-tuning, previewed in December, is now officially available for verified organizations using the o4-mini reasoning model.
- Enhanced Reasoning: RFT leverages task-specific grading and chain-of-thought processes to push model performance further in complex domains.
- SFT for GPT-4.1 nano: Supervised fine-tuning is now accessible for the fastest and most affordable GPT-4.1 nano, enabling customizations for all paid tiers.
RFT was first previewed in December as part of OpenAI’s alpha research program, in which select partners tested its chain-of-thought optimization. Behind the scenes, developers provide both a dataset and a task-specific grading function; the model then iteratively refines its reasoning to maximize the reward assigned by the grader. This marks OpenAI’s first release of customizable reasoning training for its o-series models.
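For developers, the flow looks roughly like the minimal sketch below, written against the openai Python SDK: create a job whose method block names a grader, here a simple string check that compares the model’s output to a reference field in each training item. The field names follow OpenAI’s published fine-tuning and grader documentation as we understand it, but the file ID, model snapshot string, and the `correct_label` field are placeholders to verify against current docs, not a definitive recipe.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Start a reinforcement fine-tuning job on o4-mini. The grader is a simple
# string check: it compares the model's output text against a reference
# field in each training item. "correct_label" is an illustrative name.
job = client.fine_tuning.jobs.create(
    model="o4-mini-2025-04-16",   # snapshot name; confirm against current docs
    training_file="file-abc123",  # placeholder ID of an uploaded JSONL dataset
    method={
        "type": "reinforcement",
        "reinforcement": {
            "grader": {
                "type": "string_check",
                "name": "exact_match",
                "operation": "eq",
                "input": "{{sample.output_text}}",
                "reference": "{{item.correct_label}}",
            },
            "hyperparameters": {"reasoning_effort": "medium"},
        },
    },
)
print(job.id, job.status)
```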
Several startups already put RFT to work during its closed preview. Thomson Reuters adopted it to enhance legal document analysis, while Ambience Healthcare fine-tuned models for medical coding. Other notable mentions in OpenAI’s case studies include ChipStack, Runloop, Milo, Harvey, Accordance, and SafetyKit, showcasing RFT’s versatility across industries.
Traditional supervised fine-tuning trains a model to mimic example outputs, whereas RFT teaches it how to approach a problem, which yields more robust performance on complex tasks. Early benchmarks suggest RFT-trained models are more data-efficient, matching supervised methods with fewer samples, and recover better from errors when faced with novel prompts.
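To make that distinction concrete, here is a sketch of the two training-data shapes. An SFT record carries the gold completion the model learns to imitate; an RFT record carries only reference data for a grader to score against. The `correct_label` field is illustrative, not a fixed schema.

```python
import json

# SFT record: the assistant turn is the exact output the model learns to imitate.
sft_record = {
    "messages": [
        {"role": "user", "content": "Classify the sentiment: 'Great battery life.'"},
        {"role": "assistant", "content": "positive"},
    ]
}

# RFT record: no gold completion to copy. The prompt ships with reference data
# that a grader reads (e.g. as {{item.correct_label}}) to score whatever the
# model actually produces. "correct_label" is an illustrative field name.
rft_record = {
    "messages": [
        {"role": "user", "content": "Classify the sentiment: 'Great battery life.'"},
    ],
    "correct_label": "positive",
}

print(json.dumps(sft_record))
print(json.dumps(rft_record))
```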
In addition to RFT, OpenAI now supports classic supervised fine-tuning for GPT-4.1 nano, its smallest and fastest model, across all paid API tiers. With a context window of 1 million tokens, GPT-4.1 nano delivers strong results on benchmarks like MMLU (80.1%) and GPQA (50.3%), while significantly reducing latency and costs.
Developers can now upload labeled datasets to train nano variants tailored to their use cases, building custom classifiers, extractors, or domain-specific conversational agents at low latency and low cost. Microsoft’s Azure AI Foundry and GitHub integrations are slated to support this fine-tuning pathway soon, further widening access to model customization.
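A minimal sketch of that SFT flow with the openai Python SDK might look as follows; treat the file name, model snapshot string, and epoch count as placeholders rather than recommendations.

```python
from openai import OpenAI

client = OpenAI()

# Upload the labeled dataset: chat-formatted JSONL, one example per line.
training_file = client.files.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)

# Start a supervised fine-tuning job on GPT-4.1 nano.
job = client.fine_tuning.jobs.create(
    model="gpt-4.1-nano-2025-04-14",  # snapshot name; confirm against current docs
    training_file=training_file.id,
    method={
        "type": "supervised",
        "supervised": {"hyperparameters": {"n_epochs": 3}},
    },
)

# Poll until done; the finished job exposes the new model's name.
print(client.fine_tuning.jobs.retrieve(job.id).status)
```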
These fine-tuning options reflect OpenAI’s commitment to highly customizable AI, catering to deep-domain experts and cost-conscious teams alike. RFT opens the door to advanced reasoning applications in fields such as law, medicine, and science, while SFT on nano handles simpler classification or generation tasks without significant expense.
By offering RFT on o4-mini and SFT on nano, OpenAI spans the spectrum of model customization—from lightweight, fast nano models to intricate reasoning pipelines on o4-mini. This dual approach is poised to accelerate AI adoption across industries, benefiting startups and large enterprises alike.
OpenAI’s roadmap includes expanding RFT access beyond verified organizations and soon adding SFT support for GPT-4.1 mini and the full-sized GPT-4.1. As the ecosystem evolves, expect a growing array of expert models, each fine-tuned for a specific task, shifting the paradigm from general-purpose AI to tailored, domain-specific intelligence.