Google has unveiled two new open-source tools designed to streamline operational tasks in artificial intelligence environments.
The tools made their debut today at the KubeCon + CloudNativeCon North America conference in Atlanta, alongside GKE Pod Snapshots—a new feature for Google Cloud’s managed Kubernetes service, Google Kubernetes Engine (GKE).
AI agents often perform tasks by interacting with external applications such as web browsers and databases. However, this integration can introduce security risks—for instance, an AI agent might use a code editor to generate malware. To mitigate these threats, developers typically deploy agents and their associated applications within isolated containers or sandboxes, separated from sensitive systems.
The first of Google’s newly released open-source tools, Agent Sandbox, simplifies the creation of secure environments for AI agents. Built as an extension of Kubernetes’ core capabilities, it enables AI applications to spin up thousands of isolated agent environments on demand and automatically remove them once their tasks are complete, according to the tech giant.
Agent Sandbox is built on gVisor, an open-source sandboxing technology Google introduced in 2018. gVisor isolates containers from critical components of the host operating system, preventing potentially malicious code—such as AI-generated malware—from making harmful changes to the underlying infrastructure.
Google Cloud will integrate native support for Agent Sandbox into its GKE service, which allows developers to create cloud-based Kubernetes clusters while automating many infrastructure management tasks.
With the latest GKE enhancements, developers can now pre-warm Agent Sandbox environments (containers preloaded with the tools AI agents need to execute tasks) before work begins. This avoids the delay of initializing a sandbox on the fly, so agents can start on their tasks sooner.
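The general idea behind pre-warming can be illustrated with a small pool that pays the startup cost ahead of time. This is only a sketch of the pattern, not Agent Sandbox's or GKE's actual interface: the `Sandbox` and `WarmPool` classes, and the simulated startup delay, are hypothetical stand-ins.

```python
import queue
import time


class Sandbox:
    """Hypothetical stand-in for an isolated agent environment."""

    def __init__(self, startup_seconds):
        time.sleep(startup_seconds)  # simulate slow initialization
        self.ready = True


class WarmPool:
    """Keeps a fixed number of sandboxes initialized before they are needed."""

    def __init__(self, size, startup_seconds):
        self._startup_seconds = startup_seconds
        self._pool = queue.Queue()
        # Pay the startup cost up front, before any agent asks for a sandbox.
        for _ in range(size):
            self._pool.put(Sandbox(startup_seconds))

    def acquire(self):
        """Return (sandbox, was_warm). Falls back to a cold start if the pool is empty."""
        try:
            return self._pool.get_nowait(), True
        except queue.Empty:
            return Sandbox(self._startup_seconds), False


def demo():
    pool = WarmPool(size=2, startup_seconds=0.01)
    # Two requests are served from the warm pool; the third starts cold.
    return [pool.acquire()[1] for _ in range(3)]
```

In this toy version the pool is never refilled; a production system would presumably replenish it in the background as sandboxes are handed out.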
Google Cloud also introduced GKE Pod Snapshots today, a feature aimed at boosting performance for AI workloads. While some large language models (LLMs) can take over 10 minutes to start up, Pod Snapshots can reduce that time by up to 80% in certain scenarios, the company claims.
A major contributor to lengthy LLM startup times is the need to initialize containers from scratch—installing and configuring all required software components, typically via automated scripts.
Pod Snapshots streamlines this process by capturing a snapshot of a fully configured container, including both its software and settings, so applications can load a ready-to-run environment from memory instead of re-executing setup scripts.
“GKE Pod Snapshots support snapshotting and restoring both CPU and GPU workloads, reducing pod startup times from minutes to seconds,” wrote Brandon Royal, Senior Product Manager at Google, in a blog post. “With Pod Snapshots, any idle sandbox can be snapshotted and paused, saving significant compute cycles without impacting end users.”
In addition to Agent Sandbox and Pod Snapshots, Google is releasing its second open-source tool, Multi-Tier Checkpointing (MTC), which is designed to simplify large-scale AI training workflows.
During AI model training, errors occasionally occur that require intervention before the process can continue. While restarting training from scratch is the simplest fix, it’s highly inefficient. Instead, developers often save periodic checkpoints of the LLM and resume from the most recent version when failures happen—dramatically cutting down recovery time.
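The checkpoint-and-resume pattern described above can be sketched in a few lines of Python. This is a generic illustration and does not reflect MTC's API: the function names, the JSON file format, and the toy "weight" update are all assumptions made for the example.

```python
import json
import os
import tempfile


def save_checkpoint(path, step, state):
    """Write the training step and state to disk via an atomic rename."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"step": step, "state": state}, f)
    os.replace(tmp_path, path)  # a crash mid-write never leaves a partial checkpoint


def load_checkpoint(path):
    """Return (step, state) from the latest checkpoint, or a fresh start."""
    if not os.path.exists(path):
        return 0, {"weight": 0.0}
    with open(path) as f:
        data = json.load(f)
    return data["step"], data["state"]


def train(path, total_steps, checkpoint_every, fail_at=None):
    """Toy training loop that resumes from the latest checkpoint if one exists."""
    step, state = load_checkpoint(path)
    while step < total_steps:
        if fail_at is not None and step == fail_at:
            raise RuntimeError(f"simulated failure at step {step}")
        state["weight"] += 1.0  # stand-in for one optimizer update
        step += 1
        if step % checkpoint_every == 0:
            save_checkpoint(path, step, state)
    return step, state


def run_with_recovery(total_steps=100, checkpoint_every=10, fail_at=57):
    """Run once with a simulated failure, then resume and finish."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "ckpt.json")
        try:
            train(path, total_steps, checkpoint_every, fail_at=fail_at)
        except RuntimeError:
            pass  # the in-memory progress since the last checkpoint is lost
        resumed_from, _ = load_checkpoint(path)
        final_step, state = train(path, total_steps, checkpoint_every)
        return resumed_from, final_step, state["weight"]
```

With the default arguments, the simulated failure at step 57 costs only the seven steps taken since the checkpoint at step 50, rather than all 57. The checkpoint interval is the trade-off: saving more often loses less work on failure but spends more time writing state.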