Osaurus unifies local and cloud AI models on your Mac

▼ Summary
– Osaurus is an open-source, Apple-only LLM server that lets users switch between local and cloud AI models while keeping files and tools on their own hardware.
– It evolved from a desktop AI companion called Dinoki, after users questioned paying for tokens, leading co-founder Terence Pae to focus on running AI locally.
– Osaurus acts as a “harness” control layer connecting different AI models through a single interface, with a user-friendly design and hardware-isolated sandbox for security.
– Running local models requires at least 64 GB of RAM, but Pae notes local AI’s intelligence per wattage is improving rapidly.
– The tool supports models like MiniMax M2.5 and DeepSeek V4, includes over 20 native plugins, and has been downloaded over 112,000 times since launch.
As AI models become increasingly commoditized, a new wave of startups is racing to build the software infrastructure that sits on top of them. One standout in this emerging space is Osaurus, an open-source LLM server designed exclusively for Apple hardware. It lets users seamlessly switch between local and cloud-based AI models while keeping their files, tools, and personal data stored directly on their own machine.
Osaurus grew out of an earlier project called Dinoki, which co-founder Terence Pae described as a kind of “AI-powered Clippy.” Users of Dinoki raised a pointed question: why pay for the app if they still had to cover token costs (the usage fees charged by AI companies for processing prompts and generating responses)? That question sparked a deeper exploration into running AI locally.
“That’s how Osaurus started,” said Pae, a former software engineer at Tesla and Netflix. “The idea was to try to run an AI assistant locally. You can do pretty much everything on your Mac locally, like browsing your files, accessing your browser, accessing your system configurations. I figured this would be a great way to position Osaurus as a personal AI for individuals.”
Pae began building the tool in public as an open-source project, steadily adding features and fixing bugs along the way.
Today, Osaurus can flexibly connect with locally hosted AI models or cloud providers like OpenAI and Anthropic. Users are free to choose which AI models they use while keeping other aspects of the AI experience, such as the model’s memory, files, and tools, on their own hardware. Since different models have different strengths, this setup allows users to switch to the one that best fits their needs.
This architecture makes Osaurus what’s known as a “harness”: a control layer that connects various AI models, tools, and workflows through a single interface. It’s similar to tools like OpenClaw or Hermes, but those are often geared toward developers comfortable with a terminal. Some, like OpenClaw, also come with security vulnerabilities. Osaurus, by contrast, offers an easy-to-use interface for consumers and addresses security concerns by running everything inside a hardware-isolated, virtual sandbox. This limits the AI’s scope and keeps your computer and data safe.
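The harness idea can be sketched in a few lines. Many local LLM servers expose an OpenAI-compatible chat API; assuming Osaurus follows that convention (the model identifiers below are purely illustrative), switching between a local and a cloud model amounts to changing a single `model` field, while the rest of the request, and the user's files and tools on disk, stay the same:

```python
import json

def chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat-completions payload.

    The payload shape is the de facto standard many local LLM servers
    speak; whether Osaurus uses these exact routes is an assumption.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Swapping models is just a different "model" string; the harness
# routes the same request to a local or a cloud backend.
local = chat_request("deepseek-v4", "Summarize my notes.")  # hypothetical local model id
cloud = chat_request("gpt-4o", "Summarize my notes.")       # hypothetical cloud model id

print(json.dumps(local, indent=2))
```

The point of the sketch is that the conversation history, memory, and tool definitions live in the payload (and on the user's machine), not with any one provider, which is what makes providers interchangeable.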
Of course, running AI models on your own machine is still in its early days. It’s resource-intensive and hardware-dependent. To run local models, your system needs at least 64 GB of RAM. For larger models like DeepSeek V4, Pae recommends around 128 GB. But he believes the hardware requirements will come down over time.
“I can see the potential of it, because the intelligence per wattage, which is like the metric for local AI, has been going up significantly. It’s on its own curve of innovation,” Pae said. “Last year, local AI could barely finish sentences, but today it can actually run tools, write code, access your browser, and order stuff from Amazon. It’s just getting better and better.”
Currently, Osaurus supports a wide range of models including MiniMax M2.5, Gemma 4, Qwen3.6, GPT-OSS, Llama, DeepSeek V4, and others. It also works with Apple’s on-device foundation models and Liquid AI’s LFM family. On the cloud side, it connects to OpenAI, Anthropic, Gemini, xAI/Grok, Venice AI, OpenRouter, Ollama, and LM Studio.
As a full MCP (Model Context Protocol) server, Osaurus gives any MCP-compatible client access to your tools. It also ships with over 20 native plugins for Mail, Calendar, Vision, macOS Use, XLSX, PPTX, Browser, Music, Git, Filesystem, Search, Fetch, and more. Recently, voice capabilities were added as well.
Since the project launched nearly a year ago, it has been downloaded more than 112,000 times, according to its website.
Osaurus’ founders, including co-founder Sam Yoo, are currently participating in the New York-based startup accelerator Alliance. They’re also exploring next steps, which could include offering Osaurus to businesses in fields like law or healthcare, where running local LLMs could address privacy concerns.
As local AI models grow more powerful, the team believes they could reduce demand for AI data centers.
“We’re seeing this explosive growth in the AI space where cloud AI providers have to scale up using data centers and infrastructure, but we feel like people haven’t really seen the value of the local AI yet,” Pae said. “Instead of relying on the cloud, they can actually deploy a Mac Studio on-prem, and it should use substantially less power. You still have the capabilities of the cloud, but you will not be dependent on a data center to be able to run that AI.”
(Source: TechCrunch)



