OpenJarvis 1.0 brings personal AI onto an Ollama PC

Ollama now supports OpenJarvis v1.0. The release shows how local personal AI changes cost, latency, and data boundaries.

AI Summary

What happened: Ollama announced support for OpenJarvis v1.0 on May 28, 2026. OpenJarvis is a local-first personal AI framework from Stanford Hazy Research and Scaling Intelligence researchers.
The numbers: The OpenJarvis paper reports roughly 800x lower marginal API cost and 4x lower latency for on-device spec execution.
Why builders should care: Personal AI design moves beyond a model picker into Engine, tools, memory, and evaluation specs.
Watch the caveat: A naive local model swap can reduce benchmark accuracy by 25-39 percentage points, so the whole stack has to be tuned.

Ollama announced on May 28, 2026 that OpenJarvis v1.0 can now run with Ollama. OpenJarvis is an open source framework for personal AI agents that run first on the user's own hardware and call the cloud only when needed. The Ollama post says the macOS and Linux installer detects Ollama automatically, while Windows users should use WSL2 or the desktop app path.


This is larger than one more local app appearing in Ollama. The OpenJarvis paper starts from a personal AI setting where an assistant keeps working across email, calendars, files, browsers, and code. If that context is sent to a cloud model on every turn, cost, latency, data exposure, and offline behavior all become product constraints. Ollama support turns a research-oriented local-first agent stack into something ordinary local model users can install and try.

Ollama's announcement connects OpenJarvis to the "Intelligence Per Watt" work from Stanford Hazy Research and Scaling Intelligence. The GitHub README uses the same framing. It says local language models can now handle 88.7% of single-turn chat and reasoning queries, and that intelligence efficiency improved 5.3x from 2023 to 2025. OpenJarvis does not start from "small models are now good enough" as a generic claim. Its narrower question is how improved local models become a practical personal AI product stack.

The paper first rejects the simplest migration path: swapping the model and leaving the rest of the personal AI stack intact. The researchers write that replacing Claude Opus 4.6 with a local model such as Qwen3.5-9B inside an existing personal AI stack drops accuracy by 25-39 percentage points on personal AI tasks such as PinchBench and GAIA. Their explanation is that prompts, tool descriptions, memory settings, and runtime choices are often tuned around a specific cloud model. A prompt optimizer alone closes only about 5 percentage points of the local-cloud gap, according to the paper.

OpenJarvis decomposes a personal AI system into five primitives: Intelligence, Engine, Agents, Tools & Memory, and Learning. Each primitive becomes a field in a typed spec that can be edited independently. The practical difference is that builders are no longer choosing only a model ID. They are choosing an inference engine, an agent loop, tool and memory configuration, and what the system should learn from usage logs, then measuring all of those choices against accuracy, cost, latency, and energy constraints.
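One way to picture a typed spec with one field per primitive is the minimal Python sketch below. The class name, field names, and example values are illustrative assumptions, not the actual OpenJarvis schema; only the five primitive names come from the paper.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PersonalAISpec:
    """Hypothetical typed spec: one field per primitive named in the paper."""

    intelligence: str                        # which model answers requests
    engine: str                              # which inference engine serves it
    agent: str                               # which agent loop drives the task
    tools_and_memory: tuple[str, ...] = ()   # enabled tools and memory backends
    learning: str = "none"                   # what is learned from usage logs


# Illustrative values; "single-turn" and the tool names are made up.
base = PersonalAISpec(
    intelligence="qwen3.5:35b",
    engine="ollama",
    agent="single-turn",
)

# Editing one primitive produces a new spec and leaves the other four untouched.
with_tools = replace(base, tools_and_memory=("web_search", "local_docs"))

print(base)
print(with_tools)
```

Keeping the spec immutable means every candidate configuration is a separate value that can be evaluated and compared on accuracy, cost, latency, and energy without side effects on the others.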

Area	Naive local model swap	OpenJarvis spec approach
Tuning unit	Mostly model name and prompt	Intelligence, Engine, Agents, Tools & Memory, Learning
Paper risk	25-39 percentage-point accuracy drop on personal AI tasks	Search that accepts only non-regressing spec edits
Evaluation target	Usually accuracy and response quality	Accuracy, cost, latency, and energy constraints

The Ollama integration connects this design to an execution environment many local model developers already use. Ollama's example flow pulls a model with jarvis model pull qwen3.5:35b, then asks a question with jarvis ask -m qwen3.5:35b "Your prompt". The default model lives in the [intelligence] section of ~/.openjarvis/config.toml, and preferred_engine = "ollama" selects Ollama as the engine. A personal AI stack that previously revolved around cloud API keys can now be configured around a local inference daemon.
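The two-step flow can be scripted. The sketch below only builds the argument lists for the two commands quoted from Ollama's post and prints them; the helper name jarvis_commands is an assumption for illustration, and nothing is executed.

```python
import shlex


def jarvis_commands(model: str, prompt: str) -> list[list[str]]:
    """Return argv lists for the pull-then-ask flow described in the post."""
    return [
        ["jarvis", "model", "pull", model],
        ["jarvis", "ask", "-m", model, prompt],
    ]


commands = jarvis_commands("qwen3.5:35b", "Your prompt")
for argv in commands:
    # shlex.join quotes arguments so the line is safe to paste into a shell.
    print(shlex.join(argv))
```

Passing each list to subprocess.run(argv, check=True) would run the commands, assuming the jarvis CLI is installed and on PATH. Building argv lists instead of shell strings avoids quoting bugs when the prompt contains spaces or quotes.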

The OpenJarvis GitHub README showed about 5,048 stars, 1,142 forks, and an Apache-2.0 license when the Korean article was checked on May 29, 2026. The latest desktop release referenced in the research note was desktop-v1.0.2, published on May 25, 2026. The README documents macOS, Linux, and Windows WSL2 installation paths, with a desktop binary option for Windows users. The absence of a native Windows CLI path still matters for enterprise rollout and non-developer adoption, where WSL2 can be an operational hurdle.

The built-in presets make clear that OpenJarvis is aiming at agent workflows rather than a chat box. The morning-digest-mac preset uses mail, calendar, and news inputs to produce a morning brief. The deep-research preset indexes both web pages and local documents and returns sourced answers. The code-assistant preset writes and executes Python code on the local machine. The scheduled-monitor preset handles long-running scheduled monitoring, while chat-simple provides lightweight conversation without tools.

For developers, the preset list immediately raises a permissions question. Local personal AI reduces cloud transmission, but it also moves file I/O, code execution, OAuth connections, and memory indexing onto the user's own PC. "The data does not leave the machine" is a real advantage, yet it shifts responsibility toward local permission design. Approval UI, execution logs, recovery from failed actions, and secret scanning become part of the product checklist for teams building on top of OpenJarvis.

The most aggressive numbers in the paper come from LLM-guided spec search. The researchers describe a search-time process where a frontier cloud model proposes edits to the spec, and the system accepts only non-regressing edits. The final spec then runs on-device at inference time. With that approach, on-device specs matched or exceeded cloud accuracy on 4 of 8 benchmarks and came within 3.2 percentage points of the best cloud baseline on average. The paper also reports roughly 800x lower marginal API cost and 4x lower end-to-end latency.
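The accept-only-non-regressing rule can be sketched as a greedy loop. This is a simplified illustration, not the paper's algorithm: the proposals are a fixed list standing in for edits a frontier cloud model would suggest, the scorer is a toy function standing in for benchmark evaluation, and all names are hypothetical.

```python
from typing import Callable, Iterable

Spec = dict[str, str]


def spec_search(
    start: Spec,
    proposals: Iterable[Spec],
    score: Callable[[Spec], float],
) -> Spec:
    """Greedy search: keep a proposed edit only if the score does not drop."""
    best, best_score = dict(start), score(start)
    for edit in proposals:
        candidate = {**best, **edit}          # apply the edit to the current best
        candidate_score = score(candidate)
        if candidate_score >= best_score:     # accept non-regressing edits only
            best, best_score = candidate, candidate_score
    return best


# Toy scorer: counts how many fields match a made-up "good" configuration.
TARGET = {"agent": "tool-loop", "memory": "on"}


def toy_score(spec: Spec) -> float:
    return float(sum(spec.get(key) == value for key, value in TARGET.items()))


start = {"agent": "single-turn", "memory": "off"}
proposals = [
    {"agent": "tool-loop"},     # improves the score, accepted
    {"agent": "single-turn"},   # would regress, rejected
    {"memory": "on"},           # improves the score, accepted
]
result = spec_search(start, proposals, toy_score)
print(result)
```

A real search would score each candidate against several constraints at once (accuracy, cost, latency, energy) rather than a single number, but the acceptance rule has the same shape.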

4/8 benchmarks matching or beating cloud accuracy

~800x reported marginal API cost reduction

4x reported end-to-end latency reduction

Those numbers do not mean OpenJarvis rejects cloud models entirely. The spec search itself uses a frontier cloud model during development or optimization. What changes is the execution boundary: the personal AI tasks a user runs every day can use a local spec, while a cloud model helps search for better local specs in the background or during an evaluation phase. Cost and data boundaries move from per-request inference toward periodic optimization.

That is also why Ollama belongs in the announcement. Throughout 2026, Ollama has kept adding integrations that connect developer tools such as OpenAI Codex CLI, Claude Code, and OpenClaw to local or cloud models. OpenJarvis v1.0 extends that pattern into personal AI. The same habit developers built around running local models for coding agents can now apply to mail briefs, document research, personal monitors, and code-executing assistants.

The competitive map splits into two broad groups. One group extends personal assistant experiences around cloud frontier models: Claude, ChatGPT, Gemini, and similar products. The other group ties tools and memory to the user's own device or self-hosted infrastructure: OpenClaw, Hermes Agent, Open WebUI, AnythingLLM, and adjacent local-first projects. OpenJarvis belongs closer to the second group, but its paper is more nuanced than "trust only small local models." It uses cloud models for spec search, then fixes routine execution on the local side.

Community validation is still early. The Korean research note did not find a large Hacker News discussion around Ollama's May 28 announcement. Earlier discussion in March, on Reddit and in secondary summaries of the OpenJarvis repository and paper, focused on local-first agents, MCP, privacy, latency, and data sovereignty. Around r/LocalLLaMA, local execution is generally welcomed, but model size, tool-calling quality, Ollama configuration, GPU memory, and hardware constraints remain recurring concerns. OpenJarvis v1.0 simplifies installation, but long-term quality will have to show up in user logs, issues, and release notes.

For an engineering team, the first experiment should be small and measurable. Personal document search, repeated reports, and local code analysis fit the strongest part of OpenJarvis because moving private data to a cloud model is expensive or hard to approve. Tasks that require fresh web research, very large context reasoning, or complex multimodal interpretation may still need a cloud fallback. The useful design question is not "local or cloud." It is which tasks can be locked into a local spec, and which tasks should still be routed to a cloud model with explicit evaluation.
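That routing question can be written down as an explicit policy. The sketch below is a hypothetical example: the Task fields, the route labels, and the rule itself are assumptions for illustration, and a real team would replace them with criteria backed by its own evaluation results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    touches_private_data: bool
    needs_fresh_web: bool = False
    needs_large_context: bool = False
    needs_multimodal: bool = False


def route(task: Task) -> str:
    """Pick an execution target for a task (illustrative policy only)."""
    needs_cloud = (
        task.needs_fresh_web or task.needs_large_context or task.needs_multimodal
    )
    if not needs_cloud:
        return "local"
    # Private data headed to a cloud model gets an explicit review step.
    return "cloud-with-review" if task.touches_private_data else "cloud"


tasks = [
    Task("personal document search", touches_private_data=True),
    Task("news roundup", touches_private_data=False, needs_fresh_web=True),
    Task("contract analysis", touches_private_data=True, needs_large_context=True),
]
for task in tasks:
    print(f"{task.name}: {route(task)}")
```

Making the policy a plain function keeps the local-versus-cloud decision testable and auditable instead of burying it in prompt text.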

Security teams should not approve a local agent just because the data path avoids an external API. A local agent sits closer to the user's keychain, file system, browser session, and development environment. Presets such as code-assistant touch code execution and file I/O, so sandboxing, allowlists, audit logs, secret redaction, and network egress policy still matter. Local execution changes the threat boundary; it does not remove the threat model.
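Two of those controls, a tool allowlist and secret redaction in audit logs, can be sketched in a few lines. The tool names and patterns below are hypothetical examples and are far from complete coverage; this is not a feature of OpenJarvis.

```python
import re

ALLOWED_TOOLS = {"read_file", "search_docs"}  # hypothetical tool names
AUDIT_LOG: list[str] = []


def redact(text: str) -> str:
    """Mask two common secret shapes before text is logged."""
    # Strings shaped like an AWS access key ID.
    text = re.sub(r"AKIA[0-9A-Z]{16}", "[REDACTED]", text)
    # key=value or key: value pairs whose key looks like "api_key".
    return re.sub(r"(?i)\b(api[_-]?key\s*[=:]\s*)\S+", r"\1[REDACTED]", text)


def call_tool(tool: str, argument: str) -> None:
    """Record a redacted audit entry, then refuse tools not on the allowlist."""
    allowed = tool in ALLOWED_TOOLS
    status = "allowed" if allowed else "denied"
    AUDIT_LOG.append(f"{status} {tool} {redact(argument)}")
    if not allowed:
        raise PermissionError(f"tool not on allowlist: {tool}")


call_tool("read_file", "config.env api_key=abc123")
try:
    call_tool("run_shell", "curl https://example.com")
except PermissionError as error:
    print(error)
print(AUDIT_LOG)
```

Logging the denied call before raising keeps a record of what the agent attempted, which is often the more useful signal for a security review.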

Product teams can also revisit the cost model. In a cloud-first personal AI product, heavier usage turns API cost and latency budgets into product limits. The paper's roughly 800x marginal API cost reduction and 4x latency reduction are research results, not numbers that can be copied directly into pricing. They do show the shape of a different product equation. When enough user tasks can run on-device, a subscription product can shift some cost from provider inference to user hardware, power consumption, and installation complexity.
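To see the shape of that equation, the sketch below applies the paper's roughly 800x factor to a made-up workload. The request volume and per-request price are hypothetical placeholders, and the calculation covers marginal API spend only; it ignores user hardware, power, and installation cost.

```python
def marginal_api_cost(
    requests: int, cost_per_request: float, reduction: float = 1.0
) -> float:
    """Marginal API spend for a request volume, divided by a reduction factor."""
    if reduction <= 0:
        raise ValueError("reduction must be positive")
    return requests * cost_per_request / reduction


REQUESTS = 10_000        # hypothetical monthly requests per user base slice
CLOUD_COST = 0.02        # hypothetical dollars per cloud request
PAPER_REDUCTION = 800.0  # "roughly 800x" reported in the paper

cloud = marginal_api_cost(REQUESTS, CLOUD_COST)
on_device = marginal_api_cost(REQUESTS, CLOUD_COST, PAPER_REDUCTION)
print(f"cloud: ${cloud:.2f}, on-device: ${on_device:.2f}")
```

The point of the exercise is the ratio, not the dollar figures: once marginal API spend shrinks this far, the dominant costs become the ones this function leaves out.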

For global AI builders, the release matters because the location of agent execution is back on the architecture table. Much of the 2025 and 2026 coding-agent cycle has moved toward cloud sandboxes, browser control, remote workspaces, and enterprise governance. OpenJarvis plus Ollama asks the opposite question: if a personal AI really needs long-lived access to private context, should every request leave the user's machine by default? The v1.0 support places that question between an install command and a research paper rather than leaving it as a privacy slogan.

There are three follow-up signals to watch. First, OpenJarvis issues and release notes should show whether tool-calling failures, memory indexing cost, and desktop app stability improve quickly after broader use. Second, Ollama's local model ecosystem needs stronger function calling and long-context behavior for personal AI presets. Third, OpenJarvis spec search has to prove that its benchmark cost and accuracy balance survives contact with messy personal workflows outside the paper.

OpenJarvis 1.0 is not a product launch that flips the personal AI market in one release. A more precise reading is that it gives builders a working counterexample to cloud-by-default personal AI. Ollama supplies a familiar local runtime path, and the paper explains why a simple model swap fails. The comparison target is no longer a single model name. It is the execution location, the spec structure, the permission boundary, energy and cost, and whether the agent can keep operating on a user's own machine over time.