- Transformers
- Ollama
- llama.cpp
- ExLlamaV2
- AutoGPTQ
- AutoAWQ
- TensorRT-LLM
docs about inference backends: https://www.bentoml.com/blog/benchmarking-llm-inference-backends
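A minimal sketch of driving one of these backends (Transformers); "gpt2" is just a small placeholder model id, any causal LM from the Hub works the same way:

```python
# Minimal text-generation example with the Transformers pipeline API.
# "gpt2" is a small placeholder model; swap in any causal LM.
from transformers import pipeline

pipe = pipeline("text-generation", model="gpt2")
print(pipe("Local LLMs are", max_new_tokens=20)[0]["generated_text"])
```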
- oobabooga
- Stable Diffusion web UI
- SillyTavern
- LM Studio
- Axolotl
- GPT4All
- Open WebUI
- I've used this one
- Enchanted
- Mac native
- Langchain (TS & Python)
- LlamaIndex (TS & Python)
- ModelFusion (TS)
- Haystack (Python)
- Used by AWS, Nvidia, IBM, Intel
- CrewAI (Python)
- Transformers (Python)
- Made by HuggingFace
- PyTorch
- TensorFlow
- JAX
- vokturz/can-it-run-llm
- nyxkrage/gguf-vram-calculator
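For a rough sense of what these calculators compute, a back-of-envelope sketch: weight memory is parameter count times bits per weight, plus headroom for KV cache and activations (the 20% factor here is an assumption, not a fixed rule):

```python
# Rough VRAM estimate: weights at the quantized bit width
# plus ~20% overhead for KV cache and activations (assumed factor).
def vram_gb(params_billion: float, bits_per_weight: float) -> float:
    weights_gb = params_billion * bits_per_weight / 8  # bits -> bytes, in GB
    return weights_gb * 1.2

print(f"{vram_gb(7, 4.5):.1f} GB")   # 7B at ~4.5 bpw (Q4-ish): ~4.7 GB
print(f"{vram_gb(13, 16):.1f} GB")   # 13B at fp16: ~31 GB
```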
- QLoRA
- For fine-tuning models
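A minimal QLoRA-style sketch using the usual transformers + peft + bitsandbytes stack (the model id, target modules, and hyperparameters are placeholders; exact APIs vary by library version):

```python
# QLoRA = 4-bit quantized base model + trainable LoRA adapters on top.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize base weights to 4-bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4, the QLoRA dtype
    bnb_4bit_compute_dtype=torch.bfloat16,  # do the math in bf16
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",             # placeholder model id
    quantization_config=bnb_config,
)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # placeholder adapter targets
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # only the adapters train
model.print_trainable_parameters()
```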
- bycloud
- HuggingFace
- Fireship
- Not exclusively about LLMs/AI
- David Ondrej
Models are usually saved in one of these formats:
- GGUF
- It's the successor to GGML
- Tech doc about GGUF (from HuggingFace)
- GGML
- Safetensors
- EXL2
- AWQ
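As a concrete example, a GGUF file can be run locally with llama-cpp-python (the file path is a placeholder):

```python
# Load a quantized GGUF model and run a short completion.
from llama_cpp import Llama

llm = Llama(model_path="./models/mistral-7b-q4_k_m.gguf", n_ctx=4096)
out = llm("Q: What is GGUF? A:", max_tokens=64)
print(out["choices"][0]["text"])
```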
These files contain the context used by the LLMs
1 token ~= 0.75 words
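That rule of thumb is enough for quick context-budget math:

```python
# Back-of-envelope token count from the ~0.75 words/token rule of thumb.
def estimate_tokens(text: str) -> int:
    return round(len(text.split()) / 0.75)

print(estimate_tokens("The quick brown fox jumps over the lazy dog"))  # 12
```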
Common llama.cpp quantization types:
- Q4_0
- Q4_1
- Q5_0
- Q5_1
- Q8_0
- Q3_K_S
- Q3_K_M
- Q3_K_L
- Q4_K_S
- Q4_K_M
- Q5_K_S
- Q5_K_M
- Q6_K
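These suffixes trade file size against quality: fewer bits per weight means a smaller, lossier file. A rough size estimate per quant type (the bits-per-weight figures below are approximate community numbers, not exact):

```python
# Approximate bits-per-weight for some llama.cpp quant types (ballpark).
BPW = {
    "Q3_K_M": 3.9, "Q4_0": 4.5, "Q4_K_M": 4.8,
    "Q5_K_M": 5.7, "Q6_K": 6.6, "Q8_0": 8.5,
}

def quant_size_gb(params_billion: float, quant: str) -> float:
    return params_billion * BPW[quant] / 8  # bits -> bytes, in GB

print(f"7B @ Q4_K_M ≈ {quant_size_gb(7, 'Q4_K_M'):.1f} GB")  # ~4.2 GB
```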