Meta Announces API and Safety Tools at First-Ever LlamaCon Event

2025-05-14

At Meta's first LlamaCon event, the company unveiled several new tools for building with its Llama AI models: a limited preview of the Llama API, which allows developers to experiment with different models, and new Llama protection tools designed to safeguard AI applications.

LlamaCon was a one-day virtual event featuring a keynote speech by Chief Product Officer Chris Cox, along with two one-on-one discussions between Meta CEO Mark Zuckerberg and other CEOs: Satya Nadella of Microsoft and Ali Ghodsi of Databricks. In addition to announcing the API, Meta revealed partnerships with Cerebras and Groq to bring fast inference capabilities to the API, and announced the integration of Llama Stack with NVIDIA NeMo microservices. Meta also introduced new open-source AI protection tools: Llama Guard 4, LlamaFirewall, and Llama Prompt Guard 2. According to Meta:

We are committed to being a long-term partner for businesses and developers, providing a seamless transition path from closed models to open models. Llama is affordable and easy to use, enabling more people to benefit from AI regardless of their technical expertise or hardware resources. We believe AI has the potential to transform industries and improve lives, which is why we are excited to continue supporting the growth and development of the Llama ecosystem for the benefit of everyone. We can't wait to see what you will build next.

These Llama protection tools are a set of safeguards that developers can use to make their AI applications safer. The LlamaCon release includes Llama Guard 4, a content moderation model; Prompt Guard 2, a tool to prevent jailbreaking and prompt injection; and LlamaFirewall, an orchestration component for integrating multiple protection tools into AI applications.

The Llama API has been released as a free preview. Meta claims it offers "simple one-click API key creation and an interactive playground." Available models include the recently released Llama 4 Scout and Maverick MoE models. The release also includes Python and TypeScript SDKs, and the API is compatible with OpenAI's SDK, "making it easy to convert existing applications."
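To illustrate what that OpenAI compatibility implies, the sketch below builds a chat-completion request in the OpenAI-compatible wire format. The base URL and model identifier are illustrative assumptions for this example, not confirmed values from Meta's documentation:

```python
import json

# Hypothetical endpoint for illustration; check Meta's Llama API docs
# for the actual base URL and model names.
BASE_URL = "https://api.llama.example/v1"

def build_chat_request(model: str, user_message: str) -> dict:
    """Build a chat-completion payload in the OpenAI-compatible schema."""
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": user_message},
        ],
    }

# This JSON body would be POSTed to f"{BASE_URL}/chat/completions".
payload = build_chat_request("llama-4-scout", "Summarize LlamaCon in one sentence.")
body = json.dumps(payload)
```

Because the payload follows OpenAI's request schema, an existing application could in principle switch providers by changing only the base URL, API key, and model name.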

The API includes resources for fine-tuning and evaluating custom models. Meta states that it will not use any uploaded prompts or model outputs to train its own models, and that developers can download their custom models and run them anywhere.

On Hacker News, some users discussing LlamaCon expressed disappointment with the Llama license restrictions, noting that Llama is not fully open source. Regarding the API announcement, another user commented:

It feels like Meta is entering the cloud services business but within the AI space. They resisted entering the cloud business for so long, but with the success of AWS/Azure/GCP, I think they realized that relying solely on social networks won’t keep them at the top unless they own a platform (hardware, cloud).

In Reddit posts about these announcements, users reacted positively to the news of fast inference:

I think the future definitely lies in speed. When you can output hundreds or even thousands of tokens per second, you can do some crazy things.