Install MLX LM:
```sh
pip install mlx-lm
```
And run a quick generation to check that it works.
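A minimal sketch using the `mlx_lm` Python API; the 4-bit Mistral checkpoint below is just an example, and any MLX-converted model from the Hugging Face Hub should work:

```python
from mlx_lm import load, generate

# Example checkpoint; swap in any MLX-converted model from the Hub.
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.2-4bit")
response = generate(model, tokenizer, prompt="Hello, how are you?", verbose=True)
```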
Stable Diffusion's VAE is a neural network that encodes images into a compressed "latent" format and decodes them back. The encoder performs 48x lossy compression, and the decoder generates new detail to fill in the gaps.
(Calling this model a "VAE" is sort of a misnomer - it's an encoder with some very slight KL regularization, and a conditional GAN decoder)
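To make the round trip concrete, here is a minimal sketch using the `diffusers` library's `AutoencoderKL`; the `sd-vae-ft-mse` checkpoint and the 512x512 input are assumptions for illustration, not something this document prescribes:

```python
import torch
from diffusers import AutoencoderKL

# A commonly used SD v1.x VAE checkpoint (assumption; any SD VAE works).
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
image = torch.randn(1, 3, 512, 512)  # stand-in for a real image scaled to [-1, 1]

with torch.no_grad():
    latents = vae.encode(image).latent_dist.sample()  # shape (1, 4, 64, 64)
    recon = vae.decode(latents).sample                # shape (1, 3, 512, 512)

# 512*512*3 values in vs. 64*64*4 in latent space: the 48x compression above.
```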
This document is a big pile of various links with more info.
This document outlines the changes needed to modify Loom's OpenAI API implementation to support local large language models (LLMs) that follow the OpenAI API specification. This modification will let users interact with local LLMs through the same interface as the official OpenAI API.
It also outlines the process of adding the MLX LLM provider to Loom and demonstrates how to extend this approach to add other providers in the future. The goal is a flexible, extensible system that stays consistent with existing provider implementations.
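Loom itself is written in TypeScript, but the underlying idea is provider-agnostic: an OpenAI-compatible local server differs from the official API mainly in its base URL. As a hedged sketch of that idea using the official `openai` Python client, where the localhost URL and model name are placeholders for whatever the local server exposes:

```python
from openai import OpenAI

# Placeholders: point the client at whatever local OpenAI-compatible
# server you run (llama.cpp, vLLM, an MLX server, etc.).
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

completion = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user", "content": "Hello from a local model!"}],
)
print(completion.choices[0].message.content)
```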
Loom is built as an Obsidian plugin, following the plugin architecture defined by the Obsidian API. It's written in TypeScript and compiled to JavaScript for use in Obsidian.
- `main.ts`: Core plugin logic and Obsidian integration
- `views.ts`: UI components and rendering logic

[00:00:00 - 00:00:08] SPEAKER_02: thanks for tuning in to the world xp podcast if you're enjoying the content please drop us up
[00:00:08 - 00:00:12] SPEAKER_02: drop a like and let us know your thoughts below in the comments also please consider supporting
[00:00:12 - 00:00:18] SPEAKER_02: our podcast via the link below it really helps us out bill welcome to the world xp podcast how are
[00:00:18 - 00:00:28] SPEAKER_03: you man i'm doing all right how you doing today not too bad not too bad this delay is going to be crazy
[00:00:28 - 00:00:33] SPEAKER_01: we're just talking offline there's gonna be five second i know it's killing me right now
[00:00:33 - 00:00:38] SPEAKER_03: that's all right so let's do a quick intro i'm gonna try to preempt what you're saying
[00:00:38 - 00:00:43] SPEAKER_01: and answer ahead of time this this might be the most like telepathic one or the worst one ever
[00:00:43 - 00:00:46] SPEAKER_03: no in between
## WIP: synthetic question/answer pair generation using a local LLM, based on LlamaIndex
### Trying to get the AutoGPTQ + LlamaIndex + Transformers wrapper working (fix for the broken tokenizer)
### Next: add Axolotl integration for prompting strategies and finetuning
Edit: I think I just got it working with AutoGPTQ. I had to manually set stop tokens and edit `transformers.util.py` (https://github.com/jerryjliu/llama_index/issues/3501).
For future reference, if anyone needs the code pattern for using AutoGPTQ with llama_index, this is confirmed working on my side.
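The exact snippet did not survive in this copy, so what follows is a hedged reconstruction of the general pattern rather than the confirmed code: load the GPTQ checkpoint with AutoGPTQ, then hand the in-memory model and tokenizer to llama_index's `HuggingFaceLLM` wrapper, setting stop tokens manually as noted in the edit above. The repo id, context window, and stop ids are placeholders, and the transformers hack in step 1 below is still required.

```python
from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer
from llama_index.llms import HuggingFaceLLM  # llama_index 0.8.x import path

model_dir = "TheBloke/Llama-2-13B-GPTQ"  # placeholder GPTQ repo id

tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(model_dir, device="cuda:0")

# HuggingFaceLLM accepts a preloaded model/tokenizer pair; stopping_ids
# is where the stop tokens get set manually.
llm = HuggingFaceLLM(
    context_window=4096,
    max_new_tokens=256,
    model=model,
    tokenizer=tokenizer,
    stopping_ids=[tokenizer.eos_token_id],
)
```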
Step 1. Hack transformers (this sucks, but I couldn't find any other way - if anyone else does, let me know):
https://github.com/jerryjliu/llama_index/issues/3501
Quote from issue: