This guide shows how to connect a local model served with MLX to OpenCode for local coding.
1. Install OpenCode
curl -fsSL https://opencode.ai/install | bash
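2. Serve a model with MLX
OpenCode talks to the model over an OpenAI-compatible HTTP API, which mlx-lm can provide. A minimal sketch (the model name and port here are illustrative placeholders; substitute any MLX-format model you have):
pip install mlx-lm
mlx_lm.server --model mlx-community/Qwen2.5-Coder-7B-Instruct-4bit --port 8080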
# train_grpo.py
#
# See https://github.com/willccbb/verifiers for ongoing developments
#
"""
citation:
@misc{brown2025grpodemo,
    title={Granular Format Rewards for Eliciting Mathematical Reasoning Capabilities in Small Language Models},
    author={Brown, William},
This doc serves as a quick reference for the _scaled_mm API and how it has changed over time across major versions of PyTorch.
NOTE: The leading underscore is intentional, and we make no current FC/BC guarantees on this API. That said, it is currently the only op with native support for FP8 matmuls in the PyTorch library. We are planning to add an official public API for this; until then, the op is subject to change, but you can use this doc as a reference.
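For orientation, here is a minimal FP8 matmul call (a hedged sketch against the signature as of roughly PyTorch 2.5; since the op is private, check it against your installed version, and note it needs a GPU with FP8 support):

import torch

# Sketch of torch._scaled_mm usage (~PyTorch 2.5 signature; this private API
# has changed across releases, so verify against your installed version).
# Requires a GPU with FP8 support (e.g. sm89/sm90).
device = "cuda"
a = torch.randn(16, 32, device=device).to(torch.float8_e4m3fn)
# The second operand must be column-major: build (N, K) row-major, then transpose.
b = torch.randn(16, 32, device=device).to(torch.float8_e4m3fn).t()
scale_a = torch.tensor(1.0, device=device)  # per-tensor scales
scale_b = torch.tensor(1.0, device=device)
out = torch._scaled_mm(a, b, scale_a=scale_a, scale_b=scale_b, out_dtype=torch.bfloat16)
print(out.shape)  # torch.Size([16, 16])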
import math
import torch
from torch.optim.lr_scheduler import _LRScheduler
from dataclasses import dataclass
from typing import List

@dataclass
class SchedulePhase:
    """Defines a phase in the learning rate schedule."""
    percent: float  # Percentage of total steps this phase covers
from huggingface_hub import snapshot_download
import mlx.core as mx
import mlx.nn as nn
import time

class Block(nn.Module):
    def __init__(self, in_dims, dims, stride=1):
        super().__init__()
# coding=utf-8
# Copyright 2023 Mixtral AI and the HuggingFace Inc. team. All rights reserved.
#
# This code is based on EleutherAI's GPT-NeoX library and the GPT-NeoX
# and OPT implementations in this library. It has been modified from its
# original forms to accommodate minor architectural differences compared
# to GPT-NeoX and OPT used by the Meta AI team that trained the model.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
import torch

def verify_ddp_weights_equal(model: torch.nn.Module, atol: float = 1e-5) -> None:
    # `gather` and `get_world_size` are distributed helpers from the
    # surrounding codebase (a sketch follows this function).
    if hasattr(model, "module"):
        model = model.module  # unwrap DistributedDataParallel
    world_size = get_world_size()
    for name, param in model.named_parameters():
        # One row per rank's copy of the parameter.
        gathered_param = gather(param).reshape((world_size, -1))
        # Compare every rank's copy against rank 0's copy.
        absolute_diffs = (gathered_param[None, 0, :] - gathered_param).abs()
        rank_params_eq = (absolute_diffs < atol).all()
        assert rank_params_eq, f"❌ param [{name}] not equal - got max_absolute_diff={absolute_diffs.max()}"
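The snippet assumes gather and get_world_size helpers that are not shown. A minimal sketch of what they might look like on top of torch.distributed (an assumption, not the original definitions):

import torch
import torch.distributed as dist

def get_world_size() -> int:
    # Hypothetical helper: number of ranks, or 1 outside a distributed run.
    return dist.get_world_size() if dist.is_initialized() else 1

def gather(t: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper: all-gather one flattened copy of `t` from each rank.
    t = t.detach().contiguous()
    out = [torch.empty_like(t) for _ in range(get_world_size())]
    dist.all_gather(out, t)
    return torch.stack([x.reshape(-1) for x in out])  # shape: (world_size, numel)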
from transformers import (
    AutoConfig,
    AutoTokenizer,
    BitsAndBytesConfig,
    GenerationConfig,
    AutoModelForCausalLM,
    LlamaTokenizerFast,
    PreTrainedModel,
    TextIteratorStreamer,
    StoppingCriteria,
)
# Four lines intentionally left blank
# SPDX-FileCopyrightText: 2025 geisserml <[email protected]>
# SPDX-License-Identifier: Apache-2.0 OR MPL-2.0
# See also https://github.com/extremeheat/JSPyBridge/blob/master/examples/python/pdfjs.py