AI
Two H100s Are Enough, Command A+ Targets Private Agents
Cohere Command A+ lowers the bar for private AI agents with Apache 2.0 open weights and a two-H100 deployment target.
AI
Cohere Command A+ lowers the bar for private AI agents with Apache 2.0 open weights and a two-H100 deployment target.
AI
Google Co-Scientist and Gemini for Science shift AI research tools from answer generation toward hypothesis loops that humans can test.
AI
NVIDIA released tri-mode diffusion LLMs that switch between AR, diffusion, and self-speculation generation in one checkpoint.
AI
Gemini 3.5 Flash is no longer just a fast chatbot model. It reframes Flash as an agent execution engine and changes how developers calculate cost.
AI
DeepSeek is turning V4-Pro API discount pricing into the new baseline, forcing agent builders to recalculate inference cost and routing strategy.
AI
Anthropic is widening the moral formation conversation around Claude while testing an ethical reminder tool inside the model runtime loop.
AI
OpenAI’s counterexample to the Erdős unit-distance conjecture shows both the promise of AI research automation and the reproducibility gap left by an unnamed model.
AI
AWS SageMaker AI now supports OpenAI-compatible inference endpoints, moving enterprise LLM friction from model deployment toward API surfaces, IAM, and routing layers.
AI
The DuplexSLA paper reframes real-time voice-agent latency around a 160ms action channel where speech, planning, and tool calls share one timeline.
AI
Gemini 3.5 Flash pushes speed and agent performance, but Copilot’s 14x request multiplier and early quota complaints expose the new cost bottleneck.
AI
Cohere Command A+ reframes open model competition around agentic inference cost, private deployment, Apache 2.0 licensing, and GPU math.
AI
OpenAI reportedly offered YC startups $2 million in API tokens through an uncapped SAFE, turning inference compute into a new investment instrument.