Full repo scanning, the SAST gap AWS Security Agent is targeting
AWS Security Agent full repository code review targets trust boundaries and data flows that traditional SAST often misses.
- What happened: AWS previewed full repository code review for AWS Security Agent.
  - The agent reads the repository, models entry points, trust boundaries, data flows, and authorization invariants, then searches for vulnerabilities.
- Why it matters: Security review is widening from file-level pattern matching to repository-scale architecture reasoning.
- Builder impact: Teams connect a GitHub repository or S3 source, run a 30-60 minute review, then inspect findings and remediation PRs.
- Watch: This is still a preview, and public data on false positives, missed findings, and customer-scale benchmarks remains limited.
AWS announced a preview of full repository code review for AWS Security Agent on the AWS Security Blog on May 12, 2026. At first glance, the name sounds like a familiar code scanner with a wider scope switch. The claim is more specific than that. AWS says the feature does not simply scan one file or match a known vulnerability pattern. It first tries to understand the application structure, trust boundaries, data flows, and authorization invariants across the whole repository, then searches for and validates likely vulnerabilities.
That is why this update is interesting. The story is not just that AI can read more code. It is about where security review still gets stuck. Traditional SAST is fast, consistent, and useful against known patterns. It is good at hardcoded secrets, obvious SQL injection sinks, missing escaping, dangerous crypto calls, and banned APIs. But many real security failures are not a single suspicious line. They are inconsistencies across a system: one of five paths skips validation, the same field is encoded in one rendering context and raw in another, or one endpoint quietly loses the authorization annotation that the rest of the service relies on.
AWS is positioning full repository code review directly in that gap. The announcement contrasts the feature with "traditional static analysis tools" and says it reasons about application architecture, trust boundaries, and data flows. It also uses a bolder phrase: AWS Security Agent can find vulnerabilities across a code base and build a working exploit. That sentence is attractive, but security teams should read it carefully. What matters is not whether an agent can produce an impressive finding. What matters is what evidence it leaves behind, what it could not verify, and whether a reviewer can trust the boundary between code evidence and environment assumptions.
What SAST misses because it moves fast
SAST is still necessary. It is cheap to run, easy to wire into CI, and strong at repeatable policy enforcement. For every pull request, rules-based tools may be the better fit for catching secrets, dependency issues, known injection patterns, and forbidden APIs. The unresolved space sits between what SAST does well and what human security reviewers often find manually.
AWS's examples make that gap concrete. In one SQL injection scenario involving a stored procedure, a conventional scanner might identify a specific EXECUTE IMMEDIATE call. AWS says the full repository review connected a broader set of facts: a central validation function failed to reject single quotes across all five regex profiles, single quotes mattered for the target database engine, and another stored procedure skipped the validation function entirely. In that case, the fix is not just adding escaping at one call site. The fix is strengthening the validation design.
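To make the shape of that failure concrete, here is a minimal Python sketch. The profile names, the regexes, and the `validate` helper are all hypothetical and are not taken from AWS's example, which involved five profiles and a stored procedure. The point is only that probing every profile with the same input exposes a design-level gap that a single call-site finding would not.

```python
import re

# Hypothetical validation profiles; names and patterns are illustrative only.
PROFILES = {
    "name": re.compile(r"[^;<>]{1,64}"),
    "comment": re.compile(r"[^<>]{0,256}"),
    "city": re.compile(r"[A-Za-z .'-]{1,64}"),
}

def validate(profile: str, value: str) -> bool:
    """Return True when the whole value matches the chosen profile."""
    return PROFILES[profile].fullmatch(value) is not None

def profiles_accepting(value: str) -> list[str]:
    """List every profile that lets the value through."""
    return sorted(name for name in PROFILES if validate(name, value))

# A single quote is harmless in a name, but it matters if the value later
# reaches dynamically built SQL on an engine that treats it as a delimiter.
probe = "O'Brien"
accepting = profiles_accepting(probe)
print(f"{len(accepting)} of {len(PROFILES)} profiles accept a single quote")
```

If every profile accepts the probe, escaping one call site leaves the other callers exposed, which is why the validation design itself is the thing to fix.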
The XSS example has the same shape. A value is processed with Encode.forHtml() in one context, but the same file has another path that inserts it into a field without HTML encoding. A pattern matcher may see an encoding function and lower the alarm. A human reviewer asks the more useful question: why is the same value protected in one route and unprotected in another? AWS Security Agent's full repository review is trying to automate that question.
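The inconsistency can be sketched in a few lines of Python. Here `html.escape` from the standard library stands in for the `Encode.forHtml()` call in AWS's example, and both render functions are hypothetical. This is an illustration of the pattern, not a reconstruction of the code AWS reviewed.

```python
import html

def render_comment(display_name: str) -> str:
    # Path 1: the value is HTML-encoded before it reaches the markup.
    return f'<p class="author">{html.escape(display_name)}</p>'

def render_tooltip(display_name: str) -> str:
    # Path 2: the same value is inserted raw. This is the inconsistency.
    return f'<span class="tip">{display_name}</span>'

payload = "<script>alert(1)</script>"
print(render_comment(payload))
print(render_tooltip(payload))
```

A rule that only asks "is an encoder called in this file?" is satisfied by the first function. Comparing how each path treats the same value is what exposes the second.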
| Review mode | Strong at | Weak at | Signal for teams |
|---|---|---|---|
| Traditional SAST | Known vulnerable patterns, secrets, banned APIs, repeated checks | Business flow, trust boundaries, cross-file inconsistency | Keep it as the baseline CI gate |
| PR security comments | Change-focused feedback, developer workflow integration | Existing code and architecture context | Useful for fast blocking of small changes |
| Full repo agent review | Entry points, data flow, authorization invariant reasoning | Runtime, cost, false-positive explanation, source access boundaries | Best fit as pre-screening before security review |
| Manual security review | Design judgment, threat models, organization context | Cost, queue time, limited review surface | Reserve it for high-risk design and final judgment |
The four stages show what agentic security review is becoming
AWS describes the workflow in four stages. The first is profiling. The scanner reads the whole repository and models entry points, trust boundaries, data flows, authorization invariants, and existing defenses. This matters because coverage is no longer just "how many files did the scanner read?" The better question becomes "what attack surface did the system infer?"
The second stage is search. An orchestrator reads the security profile and sends specialized agents into high-risk components. Each agent gets a specific module, threat context, and adversarial question. The important detail is that agents can follow imports or callers outside the initial scope. Vulnerabilities rarely stay inside one file. Validation may live in middleware, authorization may sit in decorators, and the dangerous sink may be inside the service layer. A repository-scale review has to follow those paths.
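AWS has not published how its agents widen their scope, so the following is only a sketch of the general idea: a bounded breadth-first walk over a dependency graph, starting from the module an agent was assigned. The file names, the `EDGES` map, and the `max_depth` parameter are assumptions made for illustration.

```python
from collections import deque

# Hypothetical edges: module -> modules it imports or calls into.
EDGES = {
    "api/orders.py": ["middleware/validate.py", "services/orders.py"],
    "services/orders.py": ["db/queries.py"],
    "middleware/validate.py": [],
    "db/queries.py": [],
    "api/health.py": [],
}

def expand_scope(start: str, edges: dict[str, list[str]], max_depth: int = 3) -> list[str]:
    """Breadth-first walk from the assigned module, bounded by max_depth."""
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        module, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not look past the depth budget
        for neighbour in edges.get(module, []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append((neighbour, depth + 1))
    return order

print(expand_scope("api/orders.py", EDGES))
```

Starting from the controller, the walk reaches the validation middleware, the service layer, and the query module, while the unrelated health endpoint stays out of scope. The depth bound is one simple way to keep the cost of following imports under control.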
The third stage is triage and deduplication. Candidate findings that share the same sink or root cause are merged, and low-confidence noise is filtered out. This is especially important for AI security tools. Models are good at generating plausible hypotheses, but a plausible hypothesis is not yet a useful security result. For a security team, two validated findings can be more valuable than ten speculative alerts. A noisy tool eventually gets ignored.
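A rough sketch of that triage step follows. The `Candidate` fields, the 0.5 threshold, and the sample data are hypothetical. As a simplification, this version merges on root cause only; also merging candidates that share a sink would need an extra grouping step such as union-find.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    sink: str          # location of the dangerous call, e.g. "file:line"
    root_cause: str    # short label for the underlying flaw
    confidence: float  # 0.0 to 1.0, assigned by the searching agent

def triage(candidates: list[Candidate], min_confidence: float = 0.5) -> list[Candidate]:
    """Drop low-confidence candidates, then keep the strongest one per root cause."""
    best: dict[str, Candidate] = {}
    for c in candidates:
        if c.confidence < min_confidence:
            continue  # filter speculative noise
        current = best.get(c.root_cause)
        if current is None or c.confidence > current.confidence:
            best[c.root_cause] = c
    return sorted(best.values(), key=lambda c: c.confidence, reverse=True)

candidates = [
    Candidate("db/queries.py:42", "weak-quote-validation", 0.9),
    Candidate("db/reports.py:17", "weak-quote-validation", 0.7),
    Candidate("web/view.py:8", "missing-html-encoding", 0.8),
    Candidate("util/log.py:3", "verbose-logging", 0.2),
]
for c in triage(candidates):
    print(c.root_cause, c.sink, c.confidence)
```

Four candidates collapse to two: the two validation findings share a root cause, and the low-confidence logging candidate is dropped.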
The fourth stage is independent validation. A separate validator reads the source again and traces the full attack chain. It considers both why the finding might be exploitable and why it might not be. AWS says findings include Verified and Could not verify sections. That split may be the most practical part of the announcement. If a security agent cannot fully know the deployment environment, network segmentation, and runtime behavior, it needs to separate code-backed evidence from environment-dependent assumptions.
This structure resembles Microsoft's MDASH work. MDASH emphasized discovery, validation, and proof across Windows networking and authentication stacks. AWS Security Agent points at a more general product path for development teams. If Microsoft's version is closer to a massive proprietary operating-system code base and the Patch Tuesday loop, AWS's version is closer to customer repositories, GitHub workflows, S3 sources, and remediation PRs. Both point in the same direction. The security AI race is less about the model name and more about turning a finding into a verifiable unit of work.
The developer workflow is familiar on purpose
The official quickstart shows a surface area that most teams will recognize. First, a team creates an Agent Space in the AWS Console. An Agent Space is scoped to an application under test and can be shared by multiple users. Access can start with IAM-only permissions. If the team wants users to access the Security Agent web application without the AWS Management Console, it can enable IAM Identity Center integration.
Then the team connects a GitHub integration. Repositories can be selected broadly or narrowly. Code review comments and automatic remediation can be enabled or disabled per repository. If the source lives in an S3 bucket, AWS also supports an S3 source. The code review configuration analyzes both general vulnerability findings and organization-specific security requirements.
Execution happens in the Security Agent web application. A user creates a code review, chooses a GitHub repository or S3 source, and selects a service role. If automatic code remediation is enabled, the agent can request a fix PR for a finding. AWS documentation says a code review usually takes 30-60 minutes depending on the size of the code base. After completion, the Findings tab shows the description, code locations, and risk reasoning. The Remediate code action can then create a fix PR.
This changes where security review can sit in the workflow. Traditionally, a security team schedules a penetration test or architecture review at a specific time, while developers wait or self-check. AWS is proposing a flow where a full repo agent review runs before the deeper human review so obvious and semi-obvious issues can be cleared earlier. The announcement frames this as a complement to existing security tooling, not a replacement.
Finding format will decide whether teams trust it
The most dangerous failure mode for an AI security tool is not merely a wrong finding. It is a wrong finding delivered with confident language and weak evidence. Developers already live with alert fatigue. If a finding does not separate actual exploitability, deployment conditions, required assumptions, and sufficient remediation, the tool becomes another noisy report.
AWS appears to understand that risk, which is why the finding structure gets attention. Problem explains what the code does wrong and points to files and lines. Impact describes what an attacker could gain, including deployment context where possible. Verified and Could not verify separate what the code confirms from what depends on the environment. Remediation is meant to suggest concrete code changes rather than generic guidance. Severity and confidence are also separate. Severity describes the damage if exploitable. Confidence describes how much of the attack chain the agent confirmed from source.
That distinction matters. Imagine an internal admin endpoint that appears to lack an authorization check. From code alone, it looks risky. In production, it might sit behind a private network, mTLS, and an identity-aware proxy. Severity could still be high, but confidence depends on environment evidence. If the same missing check appears on an internet-exposed endpoint and route configuration confirms it, confidence rises. A security agent that cannot express this difference will struggle to save human reviewer time.
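One way to picture the difference is a small record type. The field names below mirror the sections AWS describes, but the schema, the string values, and the `needs_environment_review` rule are assumptions for illustration, not AWS's output format.

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    problem: str
    impact: str
    severity: str    # damage if exploitable: "low", "medium", or "high"
    confidence: str  # how much of the attack chain was confirmed from source
    verified: list[str] = field(default_factory=list)
    could_not_verify: list[str] = field(default_factory=list)
    remediation: str = ""

def needs_environment_review(f: Finding) -> bool:
    """High severity with open environment questions goes to a human reviewer."""
    return f.severity == "high" and bool(f.could_not_verify)

internal = Finding(
    problem="Admin endpoint has no authorization check",
    impact="Any caller that can reach the route can change user roles",
    severity="high",
    confidence="low",
    verified=["Handler performs no role check before the update"],
    could_not_verify=["Whether the route is reachable outside the private network"],
)
public = Finding(
    problem="Admin endpoint has no authorization check",
    impact="Any internet caller can change user roles",
    severity="high",
    confidence="high",
    verified=[
        "Handler performs no role check before the update",
        "Route configuration exposes the path on the public listener",
    ],
)
print(needs_environment_review(internal), needs_environment_review(public))
```

Both findings carry the same severity. What differs is the list of things the agent could not confirm from source, and that list is what tells a reviewer where to spend time.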
Why the whole repository matters
Full repository review is expensive. It reads more files, traces more call graphs and data flows, and has to deduplicate more candidate findings. But vulnerabilities do not follow repository boundaries politely. User input enters at a controller, moves through DTO conversion, passes authorization or validation in a service layer, and becomes a query or template render elsewhere. If encoding, validation, and authorization are each in different files, a scanner looking at one file at a time will miss structural failures.
The bigger issue is organization knowledge. Security teams know things like "this library is our approved authorization layer," "this logging field contains personal data and must be masked," or "this S3 bucket prefix is a tenant boundary." AWS Security Agent documentation says teams can define organization-specific security requirements in the console and apply them to design review and code review. That is more practical than a generic vulnerability checklist. In many teams, "did this violate our authentication, logging, or data access standard?" matters more than the generic label on the bug.
Repository-scale review also matters when teams inherit code or bring in an open source component. Before accepting a code base, a human reviewer has to understand its structure. Which endpoints exist? Which data is sensitive? Which validations are centralized? Which routes are exceptions? AWS presents the scanner's ability to build a security model without institutional knowledge as an advantage. That claim still needs customer evidence and benchmarks, but the problem statement is real.
Why the community response is still measured
The update makes strong claims, but public community reaction is not yet broad. I did not find a direct Hacker News discussion. A Reddit post in r/Cloudvisor briefly characterized full repository review as a meaningful improvement over single-file or snippet-level checks. Earlier feedback in r/aws about the AWS Security Agent preview described an ERR_ACCESS_DENIED issue when target URL and authentication URL configuration did not line up. That case is less about this specific code review feature and more about an operational boundary that security agents always hit.
Security agents have to satisfy two demands at once: act freely enough to simulate an attacker, and stay strictly inside the customer's allowed environment. If authentication URLs, redirect domains, staging hosts, callback paths, API gateways, or private network access are misconfigured, an agent may fail to reproduce the attack chain. Worse, it could probe the wrong surface if guardrails are loose. As AWS Security Agent expands code review to the repository level, configuration quality becomes more important, not less.
Data boundaries are another question. Full repository review reads a lot of code and documentation. GitHub repositories, S3 sources, service roles, and CloudWatch log groups can all be involved. Teams need to understand which code moves where, which model or runtime can access it, and what sensitive information may appear in findings or remediation PRs. Giving a security tool access to the whole source tree is not just a productivity choice. It is a supply chain, permission, and audit decision.
This is not the end of SAST
Reading this as "SAST is dead" would be too simple. The more realistic conclusion is that security review is becoming layered. The first layer is fast rules-based scanning for secrets, dependencies, known sinks, licenses, and infrastructure misconfiguration. The second is PR-centered developer feedback that catches new risk when a change is still fresh. The third is repository-scale agent review before a release, acquisition, high-risk service change, or architecture review. The final layer is human security review, where threat models, business context, risk acceptance, and product policy still belong.
AWS Security Agent's full repository code review is an attempt to productize that third layer. The announcement suggests running it before a penetration test or security review so obvious and semi-obvious issues surface earlier. That is a sensible target. If security specialists spend less time on validation gaps that a machine can trace and more time judging whether a permission model actually fits the product, the tool has value.
The preview label still matters. Public information is not enough to judge language and framework coverage, false positive rates, false negative patterns, model selection, source retention, cost and runtime in large monorepos, or the quality of automatic remediation PRs. AWS's 30-60 minute estimate also varies with code base size. Security teams considering this workflow should start with a lower-risk repository, build a baseline, and compare the results against existing SAST and penetration test findings.
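A baseline comparison can start as plain set arithmetic. The sketch below assumes each finding has already been reduced to a `(file, weakness)` pair; that normalization, the labels, and the sample data are hypothetical, and real tools will need more careful matching than exact equality.

```python
FindingKey = tuple[str, str]  # (file path, weakness label)

def compare(agent: set[FindingKey], baseline: set[FindingKey]) -> dict[str, set[FindingKey]]:
    """Split findings into overlap and the two one-sided differences."""
    return {
        "both": agent & baseline,
        "agent_only": agent - baseline,
        "baseline_only": baseline - agent,
    }

agent_findings = {
    ("db/queries.py", "sql-injection"),
    ("web/view.py", "xss"),
}
baseline_findings = {
    ("db/queries.py", "sql-injection"),
    ("config/settings.py", "hardcoded-secret"),
}
report = compare(agent_findings, baseline_findings)
for bucket, items in report.items():
    print(bucket, sorted(items))
```

The `agent_only` bucket is where the repository-scale review has to prove its value, and the `baseline_only` bucket shows what it would miss if it replaced existing tools instead of complementing them.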
Questions teams should ask before adopting it
The first question is where the tool belongs. Running a full repository review on every PR may not make sense for cost or latency. Running it on a release candidate, a major authentication or payment change, an acquisition code review, or a new public API may be much more useful. Teams should decide whether the agent review is a synchronous CI gate or an asynchronous report before security review.
The second question is finding ownership. If automatic remediation opens a PR, who reviews it? Can the development team merge it directly, or does security need to approve it? Do low-confidence findings become backlog items? How should the team handle a high-severity but low-confidence result? The moment an AI security tool starts producing findings, the operating policy matters.
The third question is whether organization-specific requirements are precise enough. AWS Security Agent's advantage is not just a generic checklist. It can apply local security requirements. But if those requirements are stale or vague, the agent's output will be vague too. "Use proper authorization" is hard to verify. "External user data access must pass the approved middleware and tenant scope check" is closer to a testable requirement.
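The gap between those two phrasings shows up as soon as the requirement is written as a check. In this sketch the `Route` type, the middleware names, and the route table are hypothetical. It does not show how AWS Security Agent evaluates requirements, only why the second phrasing is closer to something a tool or a reviewer can test.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    path: str
    middleware: tuple[str, ...]
    reads_external_user_data: bool

# The precise requirement, expressed as data: both steps must be present.
REQUIRED = ("approved_authz", "tenant_scope_check")

def violations(routes: list[Route]) -> list[str]:
    """Return paths that read external user data without every required step."""
    return [
        r.path
        for r in routes
        if r.reads_external_user_data
        and not all(step in r.middleware for step in REQUIRED)
    ]

routes = [
    Route("/orders", ("approved_authz", "tenant_scope_check"), True),
    Route("/orders/export", ("approved_authz",), True),
    Route("/health", (), False),
]
print(violations(routes))
```

"Use proper authorization" offers nothing to put in `REQUIRED`. The precise version names the middleware and the scope check, so a missing step on a single route becomes a concrete, reviewable result.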
The fourth question is how it fits with existing tools. A team may already use Semgrep, Snyk, CodeQL, GitHub code scanning, Endor AURI, or internal linters. Adding AWS Security Agent means mapping duplicate findings, severity, suppression, and ticket routing. Even a strong security tool fails if it collides with the workflow developers already use.
The next benchmark for security agents
Over the last few months, AI security competition has moved quickly. Anthropic has emphasized vulnerability discovery through Claude Code Security and related model workflows. Microsoft showed MDASH as a multi-agent scanning harness tied to Patch Tuesday. Endor Labs, through AURI, is watching the package and dependency risks that AI coding agents introduce. AWS is now moving closer to the customer repository and developer workflow: read the whole source tree, structure the finding, and connect the result to remediation PRs.
The shared direction is clear. The security AI question is shifting from "can the model find a vulnerability?" to "can the system turn that finding into evidence, permissions, workflow, and a fix?" Full repository review is an important piece of that shift because application security is no longer just a pattern in one file. It is a system problem across data, authorization, and deployment boundaries.
AWS Security Agent's new capability is still in preview, and it needs independent validation plus real-world usage data. But the direction is practical. There has long been a gap between what SAST catches quickly and what humans inspect deeply. AWS is trying to put an agent into that gap. Whether it works will depend less on model spectacle and more on finding quality, permission boundaries, developer workflow integration, and the discipline to state clearly what the agent could not verify.