Stanford introduced OpenJarvis — a local AI agent stack with memory and learning

Q: What is the source?

Originally published on MarkTechPost. Hamidun News processes and adapts the material with AI.

Q: When was it published?

Apr 30, 2026. Reading time: 3 min.

Stanford released OpenJarvis — an open-source framework for personal AI agents that run locally on a laptop or PC. The project includes not only model…

Hamidun News Editorial

AI monitoring · MarkTechPost

Apr 30, 2026· 3 min

AI-processed from MarkTechPost; edited by Hamidun News

Stanford introduced OpenJarvis — a local AI agent stack with memory and learning — Source: MarkTechPost. Collage: Hamidun News.

◐ Listen to article

Researchers from Stanford have released OpenJarvis — an open-source framework for personal AI agents that operate entirely on the user's device. The project is conceived as a ready-made stack for local AI: from running models and orchestrating agents to memory, tools, benchmarks, and subsequent training on local data.

Why This Matters

Most personal AI systems to date look local only on the surface: the interface runs on a laptop, but the core reasoning goes to cloud APIs. For tasks involving files, notes, emails, and persistent user context, this means latency, recurring costs, and unnecessary transfer of sensitive data. OpenJarvis proposes a different model: local execution by default, with the cloud as an option only when it's truly necessary.

At Stanford, the release is connected to their own work on Intelligence Per Watt. According to the lab, local language models and local accelerators are already capable of correctly serving 88.7% of single-turn chat and reasoning requests at interactive response speeds, and the efficiency by the "intelligence per watt" metric has increased 5.3 times from 2023 to 2025. The idea behind OpenJarvis is that the hardware and models are nearly ready, but the market was lacking a unified software layer for such systems.

How the Stack Works

OpenJarvis is built around five primitives that can be replaced, tested, and optimized independently of each other. This approach is meant to eliminate the typical confusion in local AI setups, where inference, agent logic, tool handling, memory, and learning are intertwined in one difficult-to-reproduce project. As a result, developers can compare not the entire system as a whole, but a specific layer — the model, engine, memory, or agent behavior. This makes experiments and production deployment considerably simpler.

Intelligence — a models layer with a unified catalog of local LLMs and abstraction over their selection.
Engine — a runtime for execution via Ollama, vLLM, SGLang, llama.cpp, and other engines.
Agents — agent roles, including Orchestrator for task decomposition and Operative for recurring scenarios.
Tools & Memory — access to tools, local memory, semantic search, MCP, and agent-to-agent communication through A2A.
Learning — an improvement loop that uses local traces for fine-tuning and optimization.

Special emphasis is placed on the system not being limited to chat. OpenJarvis can work with local search across notes and documents, connect tools like web search, calculator, file input/output and code interpretation, as well as communicate with external MCP servers. Because of this, the framework is positioned not as a wrapper around a single model, but as infrastructure for a personal agent with long-term memory and access to the user's real environment.

What's Already Available

From a practical standpoint, the project looks quite grounded. OpenJarvis has a CLI, Python SDK, browser interface, and desktop applications for macOS, Windows, and Linux. The `jarvis init` command determines available hardware and recommends an appropriate engine and model combination, `jarvis doctor` helps diagnose configuration, and `jarvis serve` raises an OpenAI-compatible API server on FastAPI so developers can connect existing clients and frontends with minimal changes. Basic scenarios, according to the documentation, can work without network at all.

Another strong point is measuring efficiency, not just response quality. The framework collects telemetry on energy, latency, FLOPs, and monetary cost of a request, supports profiling on NVIDIA, AMD, and Apple Silicon, and standardizes benchmarks through `jarvis bench`. At the same time, OpenJarvis preserves local traces of interactions: from prompt-completion pairs to sequences of agent actions and tool calls. On this basis, one can optimize not only model weights, but also prompts, agent logic, and the inference engine itself — for example, through quantization, DSPy, GEPA, SFT, DPO, or GRPO.

What This Means

OpenJarvis shows that local AI is shifting from experimental setups toward a full-fledged engineering stack. If Stanford's approach catches on, developers will get a standard foundation for personal agents that store data with the user, are cheaper to operate, and become more useful over time through training on their own local scenarios. For the market, this is another signal: part of everyday AI tasks will soon begin to migrate from the cloud to personal devices.

Hamidun News

AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.

Telegram channel RSS hamidun.com

Want to stop reading about AI and start using it?

AI News is a curated feed of AI/tech news. Hamidun Academy teaches you to use AI systematically in your work.

🎓 Academy — 7 days free Free consultation