Testing a Free, Local Rival to Claude Code

▼ Summary
– The open-source tools Goose (an agent framework) and Qwen3-coder (a coding-focused LLM) are explored as a potential free, local alternative to paid services like Claude Code.
– Setting up this local AI coding stack requires installing Ollama as an LLM server, downloading the large (17GB) Qwen3-coder model, and then configuring Goose to connect to it.
– A significant requirement for this setup is a powerful local machine with substantial storage and RAM, as the entire AI runs locally without using cloud services.
– Initial testing on a high-spec machine showed the integrated tools could complete a coding task, though multiple retries and corrections were needed before a working result emerged.
– While the local performance was comparable to cloud-based alternatives in this early test, the author notes that a full assessment of its viability as a replacement requires testing on a larger project.
Exploring a free, local alternative to premium AI coding assistants like Claude Code can be an exciting venture for developers looking to cut costs without sacrificing functionality. The combination of Goose, an open-source agent framework from Block, and Qwen3-coder, a coding-focused large language model, promises a powerful setup that runs entirely on your own machine. This approach not only keeps your data private but also eliminates recurring subscription fees, making it an attractive option for those with capable hardware.
The journey begins with a cryptic social media post from Jack Dorsey, hinting at the potential of pairing Goose with Qwen3-coder. This sparked curiosity about whether these tools could genuinely compete with established, paid services. To find out, I embarked on a hands-on test, documenting the process from installation to initial coding trials.
Setting up the environment requires downloading two main components: Goose and Ollama. Ollama acts as a local server for running large language models, while Goose provides the agent framework that orchestrates coding tasks. A common pitfall is installing Goose before Ollama, which leads to communication issues. The smoother path is to start with Ollama.
Installing Ollama is straightforward. After downloading the application, launching it presents a chat interface. By default, it may show a generic model, so selecting Qwen3-coder from the model list is the crucial first step. The specific version used here is qwen3-coder:30b, a 30-billion-parameter model optimized for coding tasks. Importantly, the model doesn’t download until you send your first prompt, at which point Ollama pulls down a substantial 17GB file. This highlights a key consideration: sufficient local storage and memory are essential.
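For readers who prefer the terminal, the same model can be fetched ahead of time with Ollama’s command-line tools, avoiding the surprise download on first prompt. This is a minimal sketch using Ollama’s standard commands; the model tag matches the one selected in the GUI:

```sh
# Pull the ~17GB qwen3-coder model up front rather than on first prompt
ollama pull qwen3-coder:30b

# Confirm the model is available locally
ollama list

# Optional sanity check: chat with the model directly in the terminal
ollama run qwen3-coder:30b "Write a Bash one-liner that counts files in the current directory."
```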
Once Qwen3-coder is ready, the next step is configuring Ollama to expose its API to other applications, which is what allows Goose to connect to it. Settings such as context length can be tuned to your system’s capabilities; on a machine with ample RAM, a 32K context is a reasonable starting point. Throughout this process, avoiding cloud sign-ins keeps the setup purely local and free.
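For those running Ollama from the terminal rather than the desktop app, the equivalent settings can be applied through environment variables before starting the server. OLLAMA_HOST and OLLAMA_CONTEXT_LENGTH are standard Ollama variables, but defaults vary by version, so treat this as a sketch to adapt:

```sh
# Listen on all interfaces so other applications (and machines) can reach
# the API; use 127.0.0.1 instead to keep the server strictly local
export OLLAMA_HOST=0.0.0.0

# Raise the context window to 32K tokens (requires ample RAM)
export OLLAMA_CONTEXT_LENGTH=32768

# Start the server; it listens on port 11434 by default
ollama serve
```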
With Ollama running, attention turns to Goose. Installation involves choosing the build for your operating system. On first launch, the welcome screen guides you to provider settings: select Ollama from the provider list, then choose the qwen3-coder:30b model. This links Goose to your local LLM and completes the foundational setup.
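The desktop app writes this configuration for you, but it helps to know where it lives. The sketch below reflects my understanding of Goose’s config file layout; the path and key names may differ across versions, so verify against your install:

```sh
# Goose stores provider settings in a YAML file; these keys point it
# at the local Ollama server and the qwen3-coder model
mkdir -p ~/.config/goose
cat > ~/.config/goose/config.yaml <<'EOF'
GOOSE_PROVIDER: ollama
GOOSE_MODEL: qwen3-coder:30b
OLLAMA_HOST: localhost
EOF
```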
Taking the integrated system for a test drive means pointing Goose at a working directory and entering a coding prompt. For an initial assessment, a standard challenge like creating a simple WordPress plugin serves as a good benchmark. The first attempt yielded a non-functional plugin. Subsequent tries, with feedback passed back to the agent, also stumbled. It wasn’t until the fifth iteration that Goose, powered by Qwen3-coder, produced a working plugin. The agent expressed notable satisfaction at finally getting it right, though the string of failed attempts was initially disappointing.
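For anyone who would rather run the same experiment from a terminal, Goose also ships a CLI with an interactive session mode. The working directory and prompt wording below are illustrative, not the exact ones used in this test:

```sh
# Start an interactive Goose session from inside the project directory
cd ~/projects/wp-plugin-test    # hypothetical project path
goose session

# At the session prompt, describe the task in plain language, e.g.:
#   "Create a simple WordPress plugin that adds a shortcode
#    displaying the three most recent post titles."
```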
This experience underscores a fundamental aspect of agentic coding tools: they interact directly with the source code, allowing iterative improvements that refine the final product. While some free cloud-based chatbots might solve simple tests on the first try, the local agent’s ability to persistently revise code is a distinct advantage.
Performance on robust hardware, such as an M4 Max Mac Studio with 128GB of RAM, proved quite responsive. Even with several demanding applications running concurrently, the local setup’s turnaround time felt comparable to hybrid cloud solutions like Claude Code. However, it’s worth noting that on less powerful machines, such as an M1 Mac with 16GB of RAM, performance can be significantly slower, bordering on unusable for intensive tasks.
These early impressions suggest promise, but the true test lies in applying this free stack to a large, complex project. While it may handle straightforward coding challenges, its ability to fully replace premium plans costing hundreds per month remains to be thoroughly validated. The initial experiment indicates potential, especially for developers with strong local resources, but also reveals that accuracy and efficiency can require patience and multiple iterations.
(Source: ZDNET)