I wanted to know whether Gemma 4 could replace a cloud model for my day-to-day agentic coding. Not in theory, in practice. I use Codex CLI every day, running GPT-5.4 as my default model. It works well, but every token costs money and every prompt sends my code to someone else's server. I also have friends thinking seriously about spending real money on local setups, and so far I hadn't been convinced such a setup would be useful for this kind of work. I was open to being wrong. Gemma 4 promised local tool calling that actually works. I spent a day finding out whether that held up once Codex CLI started reading files, writing patches, and running tests.
I set up two machines. A 24 GB M4 Pro MacBook Pro, the laptop I carry everywhere, running the 26B MoE variant via llama.cpp at Q4_K_M, the largest quantization that practically fits in that memory. And a Dell Pro Max GB10, 128 GB of unified memory on an NVIDIA Blackwell chip, running the 31B Dense variant.
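For anyone who wants to reproduce the wiring, here is a minimal sketch of what the local side looks like, assuming llama.cpp's llama-server for the OpenAI-compatible endpoint and Codex CLI's custom-provider config. The GGUF filename and model id are placeholders for whichever quant you actually download, not the exact artifacts I used:

```sh
# Serve the quantized model with an OpenAI-compatible API via llama.cpp's
# llama-server. The GGUF filename is a placeholder; -c sets the context
# window and -ngl offloads all layers to the GPU (Metal on the Mac).
llama-server -m gemma-4-26b-moe-Q4_K_M.gguf \
  --host 127.0.0.1 --port 8080 \
  -c 32768 -ngl 99

# Point Codex CLI at the local server through a custom provider.
# This overwrites ~/.codex/config.toml; merge by hand if you already have one.
cat > ~/.codex/config.toml <<'EOF'
model = "gemma-4-26b-moe"            # assumed id; must match what the server reports
model_provider = "llamacpp"

[model_providers.llamacpp]
name = "llama.cpp (local)"
base_url = "http://127.0.0.1:8080/v1"
wire_api = "chat"
EOF
```

Once the server is up, Codex talks to it the same way it would talk to a cloud endpoint, which is what makes a like-for-like comparison possible in the first place.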