Blog

Notes and analysis on AI development.

Only 20% of social scientists use coding agents, exposing a research productivity gap

Anthropic surveyed 1,260 quantitative social scientists: 81% have used AI for research, but only 20% regularly use coding agents.

May 28, 2026[AI]

Claude Code adds hundreds of subagents as Opus 4.8 keeps price pressure on

Anthropic released Claude Opus 4.8 and Claude Code dynamic workflows, tying model pricing, subagent orchestration, and agent API control into one developer story.

May 28, 2026[AI]

CoreWeave Puts Agent Improvement Loops in the Cloud With 40% Cost Claim

CoreWeave bundled Serverless RL, W&B Weave, Sandboxes, and MCP into a cloud stack for improving production AI agents.

May 28, 2026[AI]

Mistral Vibe now reaches PRs, physics AI, and its own inference site

Mistral AI Now Summit bundled Vibe agents, industrial physics AI, and a 10MW inference data center into one enterprise AI stack.

May 28, 2026[AI]

Anthropic published Claude containment details, and 24 AWS key thefts explain the risk

Anthropic detailed the isolation design behind Claude Code and Claude Cowork. The numbers turn agent security from approval prompts into sandbox, VM, and egress policy.

May 28, 2026[AI]

2.0% Violation Rate Turns AI Behavior Specs Into Auditable Contracts

A new paper turns Claude Constitution and OpenAI Model Spec into testable audit targets, showing how model policies are becoming benchmarks.

May 28, 2026[AI]

Copilot Memory now has an off switch, and agent memory becomes a permissions problem

GitHub added deletion guidance, repository-level off switches, CLI controls, and scope prompts to Copilot Memory. The update turns coding-agent memory into a governance surface.

May 28, 2026[AI]

YouTube Will Auto-Label AI Videos as Detection Moves to the Platform

YouTube is moving AI-generated video labels onto the player surface and adding automatic detection for realistic AI media. Here is what changes for creators, viewers, and AI product teams.

May 28, 2026[AI]

Decepticon 1.1.3 Tests the Guardrails for Autonomous Red-Team Agents

Decepticon 1.1.3 shows that red-team agents are now competing on rules of engagement, sandboxing, graphs, release integrity, and auditability.

May 28, 2026[AI]

Takane gains 28 points as Fujitsu narrows safe self-evolving agents

Fujitsu self-evolving multi-AI agents show how enterprise LLMs may keep improving through verified feedback, design-search loops, and operating controls.

May 28, 2026[AI]

CopilotKit Puts $27M Behind the Agent UI Layer

CopilotKit’s $27M Series A turns AG-UI into a signal that agent competition is moving from models and tools into user-facing interface protocols.

May 28, 2026[AI]

Model Choice Becomes Org Policy Before Copilot AI Credits

GitHub Copilot targeted model rules arrive just before the June 1 AI Credits switch, turning model choice into an org-level cost and security control.

May 28, 2026[AI]