Devlery

Devlery - AI news for builders

Devlery blog

2.03x Token Throughput, EAGLE 3.1 Fixes Speculation Drift

vLLM EAGLE 3.1 targets attention drift in speculative decoding, with early gains for long-context and coding-agent serving workloads.

AI Broke an Erdos Conjecture, and the Real Story Is the Verification Loop

OpenAI’s unit-distance counterexample shows that AI research automation depends less on answer generation than on proofs experts can inspect.

The 90-day review stalled, and AI model launches changed

Trump’s delayed AI executive order shows frontier model launches being reshaped around speed, security evaluation, and critical infrastructure readiness.

Datasette Agent shows why narrow AI agents matter

Datasette Agent connects SQLite exploration with LLMs, plugin tools, permissions, and sandbox execution in a narrow but practical agent experiment.

The 1,000-session wall, and why agent products need analytics

Voker’s Launch HN shows how agent operations are moving beyond trace debugging toward product analytics for intents, corrections, and resolutions.

The 3-second approval device trying to hold agent authority

Foundation Passport Prime is an experiment in moving final approval for AI agents out of the browser and into dedicated hardware.

One API call, Google opens the serverless agent runtime

Gemini API Managed Agents hides sandboxing, state, and tool loops behind an API, moving agent competition into runtime infrastructure.

AWS MCP Packs 15,000 APIs Into a New Boundary for Cloud Agents

AWS Agent Toolkit and the AWS MCP Server GA show how coding agents can reach cloud accounts through IAM, CloudWatch, and CloudTrail.

Docusign MCP Beta Turns Agreements Into Agent Tools

Docusign Iris Agents and its MCP beta show how agreement data can become a callable work surface for Claude, Gemini, and ChatGPT.

The 15x token bill and the return of the AI-native cloud

DigitalOcean AI-Native Cloud shows why agent costs are shifting from GPU rental to inference routing, data, state, and operations.

The 24-Hour Agent Permission Problem in Front of 900M Gemini Users

Google Gemini Spark brings background agents, MCP connections, and approval boundaries into a mass-market consumer AI surface.

99.82% Cache Hits, the New Variable in Coding Agent Costs

The Reasonix debate shows that coding agent costs depend not only on model pricing, but on harness design that keeps prefix cache intact.