One API Call Now Boots Linux: The Line Google Drew with Gemini Agents

Google Managed Agents extends the Gemini API from model calls into sandboxed execution, shifting where agent infrastructure begins.

AI Summary
  • What happened: Google introduced Gemini API Managed Agents in Public Preview.
    • A single call can run the Antigravity agent in a Google-managed Linux sandbox with reasoning, tool calls, code execution, file management, and web browsing.
  • Why it matters: The center of agent competition is moving from model quality alone to the execution environment API.
  • Watch: Preview status, unrestricted default outbound networking, interaction retention, and tasks that consume 100k-3M tokens all become product architecture assumptions.
    • Teams need to review permissions, cost limits, state deletion, trace verification, and provider lock-in before treating this as a production runtime.

Google has put Managed Agents inside the Gemini API. At first glance, the announcement can sound like another familiar promise to make agents easier to build. The more important shift is not a claim that the model itself suddenly became smarter. It is that the unit of an API call is changing. A developer is no longer only sending text to a model and waiting for a response. They can call a Google-managed Linux execution environment and an agent loop together.

Google's official announcement went live on May 19, 2026. In its I/O developer highlights the same day, Google grouped Gemini 3.5 Flash, Antigravity 2.0, the Antigravity CLI and SDK, AI Studio updates, and Managed Agents under one agentic developer story. The message is direct: Google wants developers to move from prompting a model toward consuming a work environment where the model can create files, execute code, and browse the web.

The default path described by Google is simple. Managed Agents in the Gemini API run the Antigravity agent. That agent is based on Gemini 3.5 Flash and operates inside an isolated Linux sandbox managed by Google. Within that environment it can reason, plan, call tools, execute code, manage files, and browse the web. Google's announcement emphasizes the phrase "single call." In practice, that means developers can start a worker through one API call instead of assembling sandbox infrastructure, an agent harness, file state, and browsing tools themselves.

That matters because the real bottleneck for agent products is increasingly outside the model call. A simple chatbot can live on model inputs and outputs. A working agent has to read files, run code, fetch sources from the web, store intermediate artifacts, restart after failures, produce logs, restrict privileges, and explain what happened. Until now, teams often built that layer themselves or assembled it from LangGraph, CrewAI, Temporal, custom workers, browser automation, and container sandboxes.

Managed Agents pulls part of that layer into the Gemini API product. Google's Agents Overview documentation describes managed agents as a configurable agent harness. In the documented flow, one API call provisions a Linux sandbox and lets the agent autonomously perform code execution, file management, and web browsing. Users can run the Antigravity agent as-is or create custom agents with their own instructions, skills, and data.

The notable details are AGENTS.md and SKILL.md. Google says developers can define an agent's instructions and skills through markdown files rather than writing complex orchestration code first. That resembles the repository-based instruction file culture already used by Claude Code, Codex, and local agent CLIs. The difference is that the file no longer stays inside a local development workflow. It becomes part of an API primitive running in a Google-managed sandbox.

[Image: Ramp testimonial included in Google's Managed Agents announcement]

Google's announcement includes testimonial imagery from early users such as Ramp, Resemble AI, Klipy, and Stitch. These examples do not all say the same thing, but they point in the same direction. When the execution loop and sandbox move into the platform, developers can spend less time asking how an agent will run and more time deciding which domain behavior they want to ship. In the best case, that speeds up iteration. In the worst case, it hands control of the execution layer to one platform.

Request flow, from developer input down to the sandbox:

  • Developer input: goal, files, instructions, AGENTS.md, SKILL.md
  • Gemini Interactions API: server-side state and typed execution steps
  • Antigravity agent harness: reasoning, planning, tool calling
  • Google-managed Linux sandbox: code, files, web, resumable environment

In this architecture, the Interactions API is not a small side feature. Google's Interactions API overview introduces it to developers coming from generateContent and centers the API around server-side history, background tasks, agentic workflows, and typed execution steps. One interaction is a full turn of a conversation or task. Inside it, model thinking, tool calls, tool results, and final output appear as ordered steps. The API is trying to expose the structured trace needed to rebuild agent execution in a user interface or observability tool.
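To make the idea of ordered, typed steps more concrete, here is a minimal local sketch of how a client might model one interaction and render it as a trace. It does not call the Gemini API. The step kinds (`thought`, `tool_call`, `tool_result`, `output`) and the `Step` and `render_trace` names are assumptions chosen for illustration, not the Interactions API's actual schema.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical step kinds; the real API's type names may differ.
STEP_KINDS = ("thought", "tool_call", "tool_result", "output")


@dataclass(frozen=True)
class Step:
    kind: str     # one of STEP_KINDS
    summary: str  # short human-readable description of the step


def render_trace(steps: List[Step]) -> List[str]:
    """Turn ordered steps into numbered lines a UI or log viewer could show."""
    lines = []
    for index, step in enumerate(steps, start=1):
        if step.kind not in STEP_KINDS:
            raise ValueError(f"unknown step kind: {step.kind!r}")
        lines.append(f"{index:02d} [{step.kind}] {step.summary}")
    return lines


# Illustrative data only; not captured from a real interaction.
example_steps = [
    Step("thought", "plan: load the CSV, then compute totals"),
    Step("tool_call", "run_code: python totals.py"),
    Step("tool_result", "exit code 0, wrote totals.json"),
    Step("output", "Totals written to totals.json"),
]

for line in render_trace(example_steps):
    print(line)
```

Keeping the order and the kind of every step is what lets a progress view or an audit log replay the work rather than only show the final answer.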

State management is another important change. According to Google's documentation, the Interactions API uses store=true by default, and later calls can continue the conversation or task context through previous_interaction_id. Paid Tier interaction retention is 55 days, while Free Tier retention is 1 day. Developers can set store=false, but doing so limits background execution and follow-up interaction chaining. For an agent product team, that is not a minor option. It touches privacy, customer data, internal code, and audit policy.
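One way to keep these constraints visible is a small pre-flight check that runs before a request is built. The sketch below is a hypothetical local helper, not part of any Google SDK. It only encodes the relationships described above, and the exact behavior of `store=false` with background tasks or chaining should be confirmed against the current documentation.

```python
from typing import List, Optional


def check_state_options(
    store: bool = True,
    previous_interaction_id: Optional[str] = None,
    background: bool = False,
) -> List[str]:
    """Return notes for option combinations that deserve a second look.

    Hypothetical helper: parameter names mirror the options mentioned in
    the article, but the rules are a simplified reading of it.
    """
    notes = []
    if store:
        notes.append("store=true: interaction is retained server-side; confirm retention policy")
    if not store and previous_interaction_id is not None:
        notes.append("store=false limits chaining through previous_interaction_id")
    if not store and background:
        notes.append("store=false limits background execution")
    return notes


# A privacy-sensitive workflow that also wants background execution.
for note in check_state_options(store=False, background=True):
    print(note)
```

Running a check like this in code review or CI turns a privacy decision into something explicit instead of an inherited default.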

The Managed Agents limits are just as important as the quick start. Environments are permanently deleted after 7 days of inactivity. VMs spin down when inactive and restore state on the next request. The default environment is Ubuntu-based and includes Python 3.12 and Node.js 22. The documented limit is 1,000 managed agents. The pricing note is even more practical: Google says a single interaction can commonly consume 100k to 3M tokens, and environment compute is not billed during the preview. That last phrase implies compute pricing can become part of product cost after preview.
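The 7-day inactivity rule is easy to track on the client side. The following sketch computes when an environment would become eligible for deletion, using only the figure quoted above. What counts as "activity" and whether the boundary is inclusive are assumptions here, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Figure quoted from Google's documentation in the paragraph above.
INACTIVITY_LIMIT = timedelta(days=7)


def deletion_deadline(last_activity: datetime) -> datetime:
    """Earliest time the environment could be deleted for inactivity."""
    return last_activity + INACTIVITY_LIMIT


def is_expired(last_activity: datetime, now: datetime) -> bool:
    """Assumes the deadline itself counts as expired (boundary is a guess)."""
    return now >= deletion_deadline(last_activity)


last_seen = datetime(2026, 5, 19, 12, 0, tzinfo=timezone.utc)
print(deletion_deadline(last_seen).isoformat())
```

A team that depends on files left in the sandbox can use a deadline like this to decide when to export artifacts to its own storage.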

The most sensitive area is networking. Google's docs say outbound network access in managed agent environments is unrestricted by default. Developers can use allowlists to restrict access to specific domains or wildcard patterns, but adopting the default for internal automation would be risky. The same applies to external API credentials. The docs recommend least-privilege service accounts, short-lived tokens, and credential rotation. The operational design has to start from a simple fact: credentials available to the agent are credentials the agent can actually use.
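To reason about what an allowlist with wildcard patterns would and would not permit, it can help to prototype the policy locally before configuring it. The sketch below uses Python's standard `fnmatch` matching as a stand-in. Google's actual pattern semantics are not specified here and may differ, and `is_allowed` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from typing import Iterable
from urllib.parse import urlsplit


def is_allowed(url: str, allowlist: Iterable[str]) -> bool:
    """Return True if the URL's host matches any allowlist pattern.

    Uses shell-style wildcards as an approximation; the platform's real
    matching rules should be checked in its documentation.
    """
    host = urlsplit(url).hostname  # lowercased by urlsplit; None if missing
    if host is None:
        return False  # deny anything without a parseable host
    return any(fnmatchcase(host, pattern.lower()) for pattern in allowlist)


allowlist = ["pypi.org", "*.example.com"]
for candidate in (
    "https://pypi.org/simple/",
    "https://api.example.com/v1",
    "https://evil-example.com/",
):
    print(candidate, is_allowed(candidate, allowlist))
```

Note that under this approximation `*.example.com` does not match the bare host `example.com`, so a policy that needs both has to list both.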

That gives Managed Agents two faces. On one side, it removes repetitive agent infrastructure work. Developers can write less code for sandbox startup, filesystem persistence, browser wiring, long-running task supervision, and execution tracing. On the other side, it reduces direct control over the execution layer. Sandbox image details, network boundaries, log retention, cost model, cold starts, and background task semantics become platform API contracts. That is a predictable tradeoff, not a hidden flaw.

VentureBeat's analysis reaches a similar conclusion. It frames Google's approach as an attempt to compress weeks of agent deployment work into one API call, at the price of some execution-layer control. It also sees this as part of a larger shift in which agent orchestration moves from framework code into model provider platforms. That reading is a useful counterweight to the official announcement.

Google is not alone in this direction. Anthropic has been connecting models to tools and work apps through Claude Code, MCP, Claude Managed Agents, and Microsoft 365 add-ins. AWS has presented Bedrock AgentCore as managed infrastructure for runtime, browser, code interpreter, identity, and observability. OpenAI has been pushing long-running coding work toward an operational surface through Codex, sandboxing, remote control, and enterprise access-token patterns. GitHub Copilot coding agent already uses an asynchronous worker model built around GitHub Actions.

Google's distinguishing move is vertical integration around Antigravity. In the I/O developer highlights, Antigravity 2.0 spans a standalone desktop app, CLI, SDK, and Enterprise Agent Platform connection. Managed Agents makes that Antigravity harness available through the Gemini API and AI Studio. Google is not merely selling an IDE, CLI, API, sandbox, and model as separate pieces. It is trying to bind them into one execution experience. The Gemini 3.5 Flash performance story supports that broader package.

For builders, the practical question is not "should we use this?" It is "which layer do we want to operate ourselves?" For internal tools and fast prototypes, a service like Managed Agents can be powerful. A product team can quickly test data analysis agents, research agents, code transformation agents, or QA automation agents. If the API exposes state and execution steps, it also becomes easier to show progress to users. Starting from custom templates in Google AI Studio lowers the entry barrier further.

In regulated environments, SaaS products handling sensitive customer data, or systems touching internal source code and credentials, the checklist gets longer. Teams need to confirm whether outbound allowlists have replaced the default network behavior, whether interaction storage can be disabled for a given workflow, what retention policy is created by background tasks and resumed state, how far each credential scope extends, and where failed code execution results appear in logs. The constraints among store=false, previous_interaction_id, and background=true can change the architecture, not just a configuration line.

Cost is not simple either. The phrase "single call" is useful for developer experience, but internally that call can contain many reasoning loops and tool invocations. That is why Google's docs mention a 100k-3M token range. Estimating agent work with ordinary chat completion instincts can miss by a lot. When an agent browses the web, creates files, runs code, fixes failures, and executes again, one user request can turn into a long work graph.
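A rough way to internalize that range is to turn it into a per-task cost band and a hard stop. The sketch below is a back-of-the-envelope illustration: the 100k and 3M figures come from the range quoted above, while the price per million tokens is a placeholder the reader must supply, and `cost_range` and `TokenBudget` are hypothetical names rather than SDK features.

```python
from typing import Tuple

LOW_TOKENS = 100_000     # lower end of the range quoted above
HIGH_TOKENS = 3_000_000  # upper end of the range quoted above


def cost_range(price_per_million_tokens: float, tasks: int = 1) -> Tuple[float, float]:
    """Return (low, high) spend for a number of tasks at a blended token price."""
    low = LOW_TOKENS * tasks * price_per_million_tokens / 1_000_000
    high = HIGH_TOKENS * tasks * price_per_million_tokens / 1_000_000
    return low, high


class TokenBudget:
    """Client-side stop condition: track usage and signal when to halt."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.used = 0

    def record(self, tokens: int) -> bool:
        """Add usage from one step; return True while the task may continue."""
        self.used += tokens
        return self.used <= self.max_tokens


# Placeholder price, not a published rate.
print(cost_range(price_per_million_tokens=1.0, tasks=500))

budget = TokenBudget(max_tokens=500_000)
for step_tokens in (120_000, 200_000, 250_000):  # illustrative usage per step
    if not budget.record(step_tokens):
        print(f"stopping: {budget.used} tokens used, cap is {budget.max_tokens}")
        break
```

The 30x spread between the low and high ends is the point: a budget set from the low end will be exceeded by ordinary tasks, so the stop condition matters as much as the estimate.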

| Review area | Signal in Google's docs | Product team question |
| --- | --- | --- |
| State retention | Paid 55-day and Free 1-day interaction retention | Will customer data or internal code be stored? |
| Networking | Outbound unrestricted by default, allowlists available | Which domains can the agent reach? |
| Cost | A single interaction can consume 100k-3M tokens | What budget and stop conditions exist per task? |
| Lifetime | Environment deleted after 7 days of inactivity | Where will long-running work and reproducibility live? |

One especially interesting point is that Google places this feature next to custom agents and coding-agent setup. Gemini Docs guide developers to MCP and skill installation so a coding agent can consult current documentation. Managed Agents also lets developers define behavior through AGENTS.md and SKILL.md. The boundary between human-readable README files, automation-readable workflows, and agent-readable instruction files is getting thinner. A document inside a repository can become the behavior contract for a remote execution environment.

This is relevant for teams far beyond Google's own ecosystem. Until recently, adding an AI agent to a product meant choosing a model API and then designing a worker queue, browser runtime, filesystem sandbox, secrets handling, log viewer, retry policy, and cancellation path. If managed-agent services mature, early products can borrow much of that through an API. The later migration back to a self-operated runtime will require separating agent instructions, tool schemas, state formats, execution traces, and cost models. The faster a team starts, the easier it is to postpone the exit plan.

So the core of this announcement is not simply that Google launched another agent. Google has expanded what developers buy through the Gemini API, from model tokens to an executable work environment. That draws a meaningful line in the AI infrastructure market. If model providers absorb the runtime, developers can ship faster, but switching providers becomes more expensive because the semantics of execution are part of the product.

The practical conclusion for now is conservative. Managed Agents looks best suited first for prototyping, internal automation, non-sensitive data analysis, and quick validation of tool-oriented agents. For production workloads, teams should check the Public Preview status, the Beta nature of the Interactions API, possible breaking changes, default network behavior, interaction retention, and cost ceilings before committing. Google's own guidance says sensitive workflows should review agent actions and outputs. Once an agent has an execution environment, the thing to inspect is no longer just answer quality. It is the actual work being performed.

The agent-era API is no longer just a function that answers questions. It is becoming closer to a small operating environment that creates files, runs shells, opens browsers, and retains state. Gemini Managed Agents is one of the clearest announcements in that direction. In a world where one call can boot Linux, the first design question changes. It is less "what can the agent do?" and more "how far should we let it go?"