Typical Configuration Files for OSS LLMs on Hugging Face

Looking at the Qwen2.5-VL-32B-Instruct example, we can see these standard configuration files:

Core Configuration Files

config.json - The primary model configuration file that defines the model architecture, including parameters like hidden size, number of layers, attention heads, and other architectural details. For Qwen2.5 models, it also contains settings for rotary position embeddings and context length.
tokenizer_config.json - Contains settings for the tokenizer, including vocabulary size, special tokens, and tokenization parameters.
generation_config.json - Defines default parameters for text generation such as temperature, top_p, top_k, max length, and repetition penalties.

Tokenizer-related Files

added_tokens.json - Lists additional tokens beyond the base vocabulary.
vocab.json - Contains the mapping of tokens to token IDs.
merges.txt - Used for byte-pair encoding (BPE) tokenization, containing merge rules.
special_tokens_map.json - Maps special token types (like pad, eos, bos) to their actual token values.

Model Template Files

chat_template.json - Defines the format for chat conversations, including how to structure multi-turn dialogues.

Model Weight Files

model-00001-of-00018.safetensors - The actual model weights split into chunks (safetensors format is preferred over older PyTorch .bin files for security and performance).
model.safetensors.index.json - Index file that maps parameter names to their locations in the chunked model files.

Additional Configuration

preprocessor_config.json - For multimodal models like Qwen2.5-VL, this contains settings for processing visual inputs.
configuration.json - Sometimes contains additional configuration metadata.

These files work together to define both the model architecture and how to interact with it. The specific configurations for multimodal models like Qwen2.5-VL include additional settings for handling images and videos, such as resolution parameters and vision encoder specifications.