kalomaze

Kalomaze's Local LLM Glossary

Not super comprehensive (yet), but I think having up-to-date documentation like this should be quite helpful for those out of the loop. Things change all the time in local AI circles, and it can be dizzying to catch up as an outsider, especially if you are new to the more technical aspects of language models in general (and not just locally hosted LLMs).

Available Models

Llama

A language model series created by Meta. Llama 1 was announced in February 2023 and its weights leaked shortly afterward; Llama 2 was then officially released later that year with openly available model weights & a permissive license. The series kicked off the initial wave of open source language modeling developments. The Llama series comes in four distinct sizes: 7b, 13b, 34b (for Llama 2, only Code Llama was released at 34b), and 70b. As of writing, the hotly anticipated Llama 3 has yet to arrive.

Mistral

Mistral AI is a French company that also distributes open weight

Simple Llama + SillyTavern Setup Guide

This guide is meant for Windows users who wish to run Meta's Llama AI language model locally on their own PC. Our focus will be on character chats, reminiscent of platforms like character.ai / c.ai, using Llama architecture models. Most recently, in late 2023 and early 2024, Mistral AI has released high-quality models that are based on the Llama architecture, and they will work in the same way if you choose to use them.

Requirements
- Windows operating system (may make Mac version of the guide later)
- GPU with at least a few gigabytes of VRAM (NVIDIA graphics cards recommended)
- Sufficient regular RAM (system memory) for a model

LLM Samplers Explained

Every time a large language model makes a prediction, each of the thousands of tokens in the vocabulary is assigned some degree of probability, from almost 0% to almost 100%. There are different ways you can choose from those predictions. This process is known as "sampling", and there are various strategies you can use, which I will cover here.
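To make the idea concrete, here is a minimal sketch of plain sampling with no extra strategy applied. The four-token vocabulary and its scores are made up for illustration, and the helper names (softmax, sample_token) are my own, not from any particular library. A real model does the same thing over tens of thousands of tokens.

```python
import math
import random

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(vocab, logits, rng):
    """Pick one token at random, weighted by its probability."""
    probs = softmax(logits)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical four-token vocabulary with made-up scores.
vocab = ["the", "a", "cat", "zebra"]
logits = [3.0, 2.0, 0.5, -2.0]

rng = random.Random(0)  # fixed seed so repeated runs match
probs = softmax(logits)
picks = [sample_token(vocab, logits, rng) for _ in range(1000)]

for token, p in zip(vocab, probs):
    print(f"{token!r}: p={p:.3f}, picked {picks.count(token)} times out of 1000")
```

Tokens with higher scores get picked more often, but even unlikely tokens still show up occasionally. The samplers are different ways of reshaping or trimming this distribution before the pick happens.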

OpenAI Samplers

Temperature

Temperature is a way to control the overall confidence of the model's scores (the logits). If you use a value lower than 1.0, the relative distance between the token scores becomes larger (more deterministic); if you use a value larger than 1.0, the relative distance becomes smaller (less deterministic).
A temperature of 1.0 leaves the scores unchanged, so it gives the original distribution that the model was trained to optimize for.
Graph demonstration with voiceover: https://files.catbox.moe/6ht56x.mp4
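
As a rough sketch of the effect, the snippet below assumes the common convention of dividing each logit by the temperature before the softmax. The scores are made up, and the function names (softmax, apply_temperature) are just for illustration rather than taken from a specific backend.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def apply_temperature(logits, temperature):
    """Scale the logits; assumes the usual 'divide by temperature' convention."""
    if temperature <= 0:
        raise ValueError("temperature must be greater than 0")
    return [x / temperature for x in logits]

# Made-up scores for four candidate tokens.
logits = [3.0, 2.0, 0.5, -2.0]

for temperature in (0.5, 1.0, 2.0):
    probs = softmax(apply_temperature(logits, temperature))
    formatted = ", ".join(f"{p:.3f}" for p in probs)
    print(f"temperature={temperature}: [{formatted}]")
```

At 0.5 the top token takes a larger share of the probability, at 2.0 the distribution flattens out, and at 1.0 the probabilities are identical to the unscaled ones.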

Original Text ("Reading a Sign 43 Times Heals Your Axe Durability" by Hunter R.)

We all know that the axe in Animal Crossing will usually break after using it too much. Of course, the axe is intentionally designed to break like this in order to make the unbreakable Golden Axe an appealing item to unlock. And yet what if I told you that by simply reading a sign over and over you can actually prevent your standard axe from ever breaking? And no, I'm not joking—you can actually sit here and read this sign over and over to heal the durability on your axe, making it theoretically invincible. I'm sure a lot of you are wondering how or why this even works, so let's take a closer look.

Creating an unbreakable axe is a really funny glitch that was recently discovered by Animal Crossing spreadsheet owner Phil. To understand how interacting with a sign heals your axe, let's discuss how axe durability works.

Normally an axe can withstand 72 hits on normal trees before breaking. Since trees take three hits to cut

	# coding=utf-8
	# Copyright 2023 Mixtral AI and the HuggingFace Inc. team. All rights reserved.
	#
	# This code is based on EleutherAI's GPT-NeoX library and the GPT-NeoX
	# and OPT implementations in this library. It has been modified from its
	# original forms to accommodate minor architectural differences compared
	# to GPT-NeoX and OPT used by the Meta AI team that trained the model.
	#
	# Licensed under the Apache License, Version 2.0 (the "License");
	# you may not use this file except in compliance with the License.

	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
	import random
	import os
	import shutil

	# Set a seed for reproducibility
	random.seed(42)

	# Load the model, tokenizer, and configuration

	== Results torch.int8 meta-llama/Llama-2-7b-hf-TP1 ====
	[--------------------------------------- scaled-torch.int8-gemm --------------------------------------]
	                    | pytorch_bf16_bf16_bf16_matmul-no-scales | cutlass_i8_i8_bf16_scaled_mm
	1 threads: --------------------------------------------------------------------------------------------
	MKN=(1x4096x12288)  | 195.3 | 142.4
	MKN=(1x4096x4096)   | 64.5  | 47.5
	MKN=(1x4096x22016)  | 322.9 | 235.6
	MKN=(1x11008x4096)  | 162.6 | 112.9
	MKN=(16x4096x12288) | 187.5 | 142.6
	MKN=(16x4096x4096)  | 66.2  | 47.6

	datasets:
	  - path: anthracite-core/c2_logs_8k_llama3_v1.2
	    # contents of this dataset were filtered for quality, but not safety or safe for work-ness. be advised
	    type: sharegpt
	    conversation: llama3
	  - path: anthracite-org/kalo-opus-instruct-22k-no-refusal
	    type: sharegpt
	    conversation: llama3
	  - path: lodrick-the-lafted/kalo-opus-instruct-3k-filtered
	    type: sharegpt

	import React, { useState } from 'react';
	import { Settings, Bookmark, Download, Library, HelpCircle, RefreshCw, ArrowLeft } from 'lucide-react';

	const STORY_BRANCHES = {
	  root: {
	    text: `The darkness grew absolute, not that the hyperstitioner could see in the first place. His ears pricked up, however; he could hear the skittering, the mechanical hum as the machine followed him invisibly...`,
	    continuations: [
	      {
	        id: 'a1',
	        text: " The mechanical tendrils wrapped tighter around his shoulder, its grip a cold reminder of their symbiosis...",

	import sys
	import random
	import numpy as np
	import string
	from datetime import datetime
	from PIL import Image, ImageEnhance, ImageOps
	from PyQt5.QtWidgets import (QApplication, QMainWindow, QWidget, QVBoxLayout,
	                             QHBoxLayout, QTextEdit, QPushButton, QCheckBox,
	                             QLabel, QSpinBox, QComboBox, QSlider, QFileDialog,
	                             QFrame)