LLM

135 posts

JetBrains Opens Mellum2 to Cut Coding Agent Call Costs

JetBrains Mellum2 is a 12B MoE model that activates 2.5B parameters per token. It adds a private coding model option for IDE and agent workflows.

June 2, 2026

NVIDIA Bundles a 550B Open Model With an Agent Runtime Stack

NVIDIA introduced Nemotron 3 Ultra alongside NemoClaw, OpenShell, and CUDA-X agent skills, pushing open agent competition into the runtime layer.

June 2, 2026

OpenAI models and Codex are now generally available on Amazon Bedrock

OpenAI GPT-5.5, GPT-5.4, and Codex are generally available on Amazon Bedrock, shifting authentication, billing, and feature boundaries into AWS.

June 1, 2026

NVIDIA opens a 32B robotaxi model and closed-loop training stack

NVIDIA Alpamayo 2 Super ties a 32B VLA teacher model to AlpaGym, OmniDreams, CoC auto-labeling, and agent skills for L4 robotaxi development.

June 1, 2026

QVAC 0.12.0 brings TurboQuant to local KV cache pressure

Tether announced QVAC SDK 0.12.0 with TurboQuant support. The useful question is how KV cache compression changes local long-context AI.

June 1, 2026

AgentCore A/B testing turns prompt edits into release experiments

AWS AgentCore Optimization preview productizes agent quality loops with trace-based recommendations, batch evals, and A/B testing.

June 1, 2026

Cohere Command A+ ships as an open-weight MoE for two H100s

Cohere Command A+ combines Apache 2.0 open weights, a 218B MoE design, 25B active parameters, 128K context, and enterprise deployment options.

June 1, 2026

SageMaker Adds OpenAI API Support for AWS-Hosted Models

AWS SageMaker now supports /openai/v1 endpoints, lowering the migration cost for OpenAI SDK, LangChain, Strands Agents, and AI gateways.

June 1, 2026

MiniMax M3 brings 1M context to open-weight coding models

MiniMax M3 combines 1M context, multimodality, and coding-agent benchmarks, but its weights and technical report are still pending verification.

June 1, 2026

Mistral Search Toolkit separates RAG failures from model failures

Mistral Search Toolkit public preview treats RAG and agent-search failures as retrieval, pipeline, and evaluation problems.

June 1, 2026

RTX Spark Debuts as a Local AI PC for 120B LLMs

NVIDIA and Microsoft introduced RTX Spark, a Windows PC category for local agents with 120B LLMs, 128GB unified memory, and OpenShell.

June 1, 2026

CoreWeave Agentic AI Turns Inference Logs Into Training Signals

CoreWeave introduced agentic AI integrations that connect inference, W&B Weave observability, serverless RL, and coding-agent tooling into one improvement loop.

June 1, 2026