Running a 26B parameter model with only 6GB RAM using mmap
This guide covers running Gemma 4 26B MoE (Mixture of Experts) locally on an Apple Silicon Mac using llama.cpp with memory-mapped model files. The MoE architecture activates only 8 of its 128 experts per token, so only a small fraction of the weights has to be read for each forward pass.
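The low RAM footprint comes from demand paging: the kernel maps the GGUF file into the process's address space without loading it, and only the pages that are actually touched get faulted into memory. Below is a minimal C sketch of that mechanism, the same one llama.cpp relies on when it maps model weights; the file name `model.gguf` is just a placeholder for any large file.

```c
/*
 * Minimal sketch: map a file far larger than physical RAM and touch
 * only a few pages. Resident memory stays close to the pages actually
 * read, because mmap faults pages in on demand.
 * "model.gguf" is a placeholder path, not a real download.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    const char *path = "model.gguf";            /* placeholder file name */
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) != 0) { perror("fstat"); return 1; }

    /* Map the whole file: this reserves address space, not RAM. */
    const unsigned char *data =
        mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    /* Touch one byte every 256 MiB: only those pages are faulted in. */
    unsigned long sum = 0;
    for (off_t off = 0; off < st.st_size; off += 256L * 1024 * 1024)
        sum += data[off];

    printf("mapped %lld bytes, sampled checksum %lu\n",
           (long long)st.st_size, sum);

    munmap((void *)data, (size_t)st.st_size);
    close(fd);
    return 0;
}
```

llama.cpp uses mmap by default when the platform supports it; passing `--no-mmap` disables it and loads the full model into RAM instead.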
| Spec | Value |
|---|---|