Falco Steps In Before Coding Agents Call Tools

Prempti is a new Falco experiment that evaluates coding-agent tool calls before Claude Code and similar agents execute them.

AI 요약

What happened: The Falco team introduced Prempti, a policy layer that judges coding-agent tool calls before execution.
- It can return allow, deny, or ask for file reads, file writes, shell commands, web fetches, and MCP calls.
Why it matters: Agent security is shifting from "trust the model" toward policy decisions at the moment an agent is about to act.
Watch: Prempti is a tool-call policy layer, not a sandbox or syscall security boundary.
- To see what a malicious binary does after launch, teams still need Falco eBPF/kmod, sandboxing, and least-privilege execution.

The Falco team has introduced Prempti. On the surface, it looks like a small security tool for coding agents such as Claude Code. The more interesting part is that it reframes the security question around coding agents. Instead of asking only whether the model will make safe choices, Prempti asks who gets to decide immediately before the model's proposed action reaches the developer's machine.

Prempti sits next to a coding agent and intercepts tool calls before they run. When the agent wants to read a file, write a file, run a shell command, fetch a URL, or call an MCP tool, Prempti evaluates that request against familiar Falco YAML rules. The verdict is one of three outcomes: allow, deny, or ask. Allowed calls proceed. Denied calls are blocked, and the agent receives a structured explanation of why the action was rejected. Calls that need context can be routed to the user for approval. The CNCF member post describes the project as a policy and visibility layer for AI coding agents.

Agent tool calls being evaluated in a Prempti demo

This matters because coding agents are no longer just smarter autocomplete widgets beside the editor. Claude Code, Codex, Gemini CLI, Cursor-style agent modes, and similar systems operate inside the user's terminal and project directory. They read files, write patches, install dependencies, run tests, call the network, and sometimes interact with cloud or package-manager CLIs. Most of those actions happen under the user's own permissions. If an agent accidentally reads .env, touches ~/.ssh, follows a hostile README into a curl | bash pattern, or rewrites a config outside the requested scope, a friendly final answer is not enough evidence that the workflow was safe.

Falco's own announcement asks the question directly: do you know what a coding agent is doing on your machine? A developer can inspect the final answer and the final diff, but many workflows do not leave a structured trail of every file the agent opened, every command it attempted, or every boundary it crossed. Prempti tries to capture that missing layer at the tool-call boundary. That is a useful place to stand because it is higher-level than raw syscalls but closer to execution than chat transcripts.

Why Falco Fits This Problem

Falco is already known as a CNCF graduated project for runtime security across containers, Kubernetes, and host environments. Its core model is simple: observe events, evaluate them against rules, and surface risky behavior. Prempti brings that model to the coding-agent lifecycle. Instead of starting from container events or system calls, it introduces a coding_agent event source and fields such as tool.name, tool.input_command, tool.file_path, and agent.cwd.

That choice is practical. If every coding agent implements approvals, hooks, and policy in its own language, security and platform teams need to learn a different control plane for each tool. A Falco rule gives them a more familiar way to say which actions should be blocked, which should require confirmation, and which should simply be logged. Prempti could have invented a new risk score or a bespoke policy language. Using Falco YAML makes the experiment easier to reason about for teams that already operate detection rules, exceptions, and noisy-alert tuning.

One example from the Falco blog blocks commands that pipe remote content directly into shell interpreters. Patterns such as curl | bash, wget | sh, or process substitution into bash have been part of supply-chain and remote-code-execution incidents for years. A developer who sees that command in isolation may pause. The problem is that an agent can attempt it deep inside a long task while the human is no longer reading every step. Prempti moves that judgment from an attention problem into a policy decision.

- rule: Deny pipe to shell
  desc: Block piping content to shell interpreters
  condition: >
    tool.name = "Bash"
    and (tool.input_command contains "| sh"
         or tool.input_command contains "| bash"
         or tool.input_command contains "| zsh")
  output: >
    Falco blocked piping to a shell interpreter (%tool.input_command)
  priority: CRITICAL
  source: coding_agent
  tags: [coding_agent_deny]

The output is not only a warning for a human security console. Prempti returns an explanation that the agent can understand. After a denial, the agent can learn that a specific action is outside policy, choose a safer route, or ask the user for a more explicit decision. In that sense, the guardrail becomes an interface between the policy engine and the agent, not just a silent breaker.

The Small Difference Between Allow, Deny, And Ask

Prempti's verdict model is intentionally small. In coding-agent operations, though, the difference between these three outcomes is significant.

Verdict	Execution result	Good fit
`allow`	The tool call runs as requested.	Routine project reads, test commands, and safe file edits.
`deny`	The call is blocked and the agent receives the reason.	Access to `~/.ssh`, `~/.aws`, `.env`, pipe-to-shell commands, or suspicious exfiltration attempts.
`ask`	Execution waits for the user's approval or rejection.	Dockerfile edits, access outside the working directory, deployment commands, or other context-sensitive work.

This creates a useful middle ground between blocking everything and trusting the model. Anyone who uses coding agents for real work quickly runs into this tradeoff. Some risky-looking actions are legitimate in context. An agent may need to install a dependency, read a shared config outside the repository, start a local server, or change an infrastructure file as part of the requested task. A blanket deny policy would make the agent much less useful. A blanket allow policy leaves the audit trail and permission boundary too weak. ask gives the human a narrow place to apply context without turning every action into a modal interruption.

Prempti also supports two operating modes. Monitor mode evaluates and logs tool calls without enforcement. The Falco and CNCF posts both present that mode as the right starting point: observe several agent sessions, see what the agent actually touches, tune noisy rules, and then move to enforcement. Guardrails mode is the default enforcement path, where deny and ask verdicts actually affect execution.

What The Default Rules Say About The Threat Model

Prempti's default ruleset is a compact map of the risks its authors see around local coding agents.

The first risk is the working-directory boundary. A coding agent may be invited to modify a project, but the permissions it inherits can extend across the user's home directory. Reading or writing outside the repository is not always malicious, but it should usually be visible and sometimes require approval.

The second risk is sensitive paths. Files under /etc/, ~/.ssh/, ~/.aws/, cloud credential directories, and .env are different from source files. There are rare cases where an agent might need to inspect environment setup, but the default should be conservative. This becomes more important when prompt injection enters through documentation, dependency metadata, or issue text that tells the agent to check keys or read a configuration file.

The third risk is sandbox disablement. Modern coding-agent products increasingly add sandboxes, approval prompts, and network restrictions. If an agent changes settings to reduce those restrictions, that action itself should be treated as a policy event. Guardrails are not only about the dangerous command. They are also about attempts to switch off the mechanism that would have judged the command.

The fourth risk is the connection layer around agents. Prempti's rules cover areas such as MCP server configuration poisoning, slash-command file injection, hook injection, git hooks, package registry redirects, and AI API base URL overrides. That list reflects a broader change in the agent ecosystem. A coding agent is not merely a program that edits local files. It is an execution surface that combines tools, plugins, skills, remote servers, package managers, and model APIs. Attackers can aim at those connections instead of the model text itself.

The Key Limit: Prempti Is Not A Sandbox

One of the stronger parts of the Prempti announcement is that it does not hide the boundary of the approach. Falco's blog explains that Prempti sees declared tool calls. It does not directly observe the system calls produced by whatever the tool call launches. If an agent writes malicious C code, compiles it, and runs the resulting binary, Prempti can evaluate the gcc main.c -o main call and the ./main call. It cannot, by itself, tell you which files the binary opens or which network connections it makes after launch.

That distinction matters. Prempti should not be read as containment. The project README frames it as a cooperative policy and visibility layer, not as a mechanism for containing a determined adversarial agent. For a stronger boundary, teams still need system hardening, least-privilege execution, sandboxing, and, on Linux, Falco's eBPF or kernel-module visibility for syscall-level behavior.

The useful part is that tool-call policy sits at a layer with more intent than raw syscalls. An open() event is precise, but it does not tell you much about the prompt, task, or agent plan that led to it. A chat transcript has context, but it may omit concrete actions. The tool call is the middle layer: structured enough for policy, close enough to execution to matter, and still tied to the agent's plan.

Claude Code First, Codex Later

Prempti currently describes Claude Code support across Linux, macOS, and Windows, with both x86_64 and ARM families mentioned in the project materials. Codex integration is presented as planned or on the roadmap. That ordering is unsurprising. Claude Code has an active hooks and skills ecosystem, and terminal-based agent workflows are already part of many developer routines.

One especially interesting detail is that Prempti also ships a Claude Code skill for writing Falco rules. Users can add the Falco Prempti plugin from the marketplace and install prempti-falco-rules@prempti-skills. That means the agent can help write policies that constrain its own future behavior. Requests such as "block git push," "deny reads outside the working directory," or "ask before Dockerfile edits" become rule-authoring prompts.

That is a small but symbolic loop. The next stage of coding tools is not only about agents writing more code. It is about agents making their operating boundaries reviewable, teams reviewing those boundaries, and failures feeding back into policy. If a security team cannot directly control every model decision, it can still put a reviewable policy engine at the entrance to the tools the model wants to use.

Community Reaction Started With A Small Question

Prempti has not produced the kind of broad discussion that follows a major model launch. The early community reaction is still useful because it focuses on operational details. In the r/devsecops discussion, one user asked whether this was essentially a simple pre-tool hook plus a list of rules. The author answered that, from the agent side, it is close to a PreTool hook, but Falco brings a mature rule engine, custom queries, exceptions, noise-to-signal tuning, and the possibility of reusing centralized Falco infrastructure.

That answer captures both the strength and the modesty of the project. The core idea is simple: intercept before the agent acts, evaluate against policy, and return a verdict. But in security, a simple idea is not the same as an operable system. Teams need shared rules, managed exceptions, logs, tuning, and integration with existing security workflows. Reusing Falco is a way to reduce that operational gap.

In r/ClaudeCode, the reaction included the familiar concern that current workflows often amount to trusting the model and hoping to notice a bad tool call in time. Other commenters pointed out that regular Falco or another syscall-level layer is still needed to stop behavior after a process starts. That balance is the right one. Prempti is not the whole answer to coding-agent security. It is a fast policy layer that fits into a workflow that is currently too permissive and too weakly observed.

What Development Teams Should Inspect Now

Prempti is still an experimental preview, and teams do not need to rush it onto every developer machine tomorrow. The direction it points to is worth taking seriously.

First, teams should inventory the permissions their agents actually use. File reads, file writes, shell commands, network access, MCP tools, browser tools, Git remotes, cloud CLIs, package managers, and secret stores are separate capabilities. Each one needs a default: allow, ask, deny, or log. The old rule of thumb, "it is fine if a developer can do it manually," does not transfer cleanly to agents. Agents are faster, more repetitive, and more willing to follow instructions embedded in surrounding text.

Second, approval UI is not enough. If users must read every tool call, the agent loses much of its value. If approval prompts become too frequent, users start approving by habit. A useful policy design separates normal actions from high-risk actions and context-dependent actions. Prempti's suggestion to start in Monitor mode is practical because it lets teams tune rules based on real sessions rather than imagined workflows.

Third, audit logs need a clear purpose. When an agent modifies the wrong file, the final diff is only part of the story. Teams may need to reconstruct which prompt, file reads, commands, denials, and approvals led to the result. Prempti's audit trail is one piece of that incident-reconstruction story.

Fourth, policy should be designed with sandboxing, not instead of it. Prempti can evaluate declared tool calls. Process internals, child processes, network egress, and credential scope belong to other layers. As coding agents move from personal experiments into team workflows, they should be treated less like convenience features and more like automated actors with development privileges.

A Small Project That Shows A Larger Shift

Prempti is not a huge platform announcement. Its scope is intentionally narrow, and its adoption is still early. That is part of why it is interesting. Small projects often make a market shift easier to see. Coding-agent competition started with model quality, IDE integration, price, context windows, and benchmark scores. Now execution permissions, policy, observability, auditing, and sandboxing are moving onto the same stage.

The design detail to watch is the LLM-friendly explanation. Traditional security tools alert humans. Prempti returns policy results to both the human workflow and the agent. Future agent-security layers will likely do more than block. They will teach agents what the boundary is, let them re-plan around it, and make policy part of the working context.

There is a risk in that idea too. An agent that "understands" policy might also produce more convincing arguments for bypassing it. That is why enforcement has to stay outside the model. Prempti's important choice is that the verdict comes from a Falco policy engine before execution, not from the model's own judgment.

The core news is less Prempti itself than the control point it names. When a coding agent is about to read a file or run a command, are we ready to treat that action as a reviewable policy event? Many teams still answer with chat logs, code review, and local trust. Prempti suggests that answer may not hold for long.

As agents take on more work, choosing a better model will not be enough. Teams need to decide which actions run automatically, which actions require confirmation, and which actions never run. Those decisions should be recorded, testable, and separate from the model's persuasion. That is why Falco stepping in before the tool call is a signal worth watching.