GGUF

Module: tool mastery

What it is

GGUF is a binary file format for storing quantised language models, introduced by the llama.cpp project as the successor to the older GGML format. It's widely used for local model distribution, particularly with llama.cpp-based tools. A single model is typically published at several quantisation levels (e.g. Q4, Q5, Q8, where the number is roughly the bits per weight), each offering a different size/quality trade-off.
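
To make the format concrete, here is a minimal Python sketch that reads the fixed fields at the start of a GGUF file. It assumes the v2/v3 header layout (little-endian "GGUF" magic, uint32 version, uint64 tensor count, uint64 metadata key/value count) and is an illustration, not a full parser:

    import struct
    import sys

    def read_gguf_header(path):
        # GGUF v2/v3 header (little-endian): 4-byte magic, uint32 version,
        # uint64 tensor count, uint64 metadata key/value count.
        with open(path, "rb") as f:
            magic = f.read(4)
            if magic != b"GGUF":
                raise ValueError(f"not a GGUF file (magic={magic!r})")
            version, = struct.unpack("<I", f.read(4))
            n_tensors, n_kv = struct.unpack("<QQ", f.read(16))
        return {"version": version, "tensors": n_tensors, "metadata_keys": n_kv}

    if __name__ == "__main__":
        print(read_gguf_header(sys.argv[1]))

Running it against a downloaded .gguf file prints the format version and how many tensors and metadata entries the file declares, which is a quick sanity check that a download completed intact.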

Why it matters

When downloading models for local use, you'll often see GGUF files. Understanding the naming convention (model-name-Q4_K_M.gguf) helps you choose the right version: Q4 means roughly 4 bits per weight, K marks the k-quant scheme, and S/M/L pick a small, medium, or large variant of that scheme. A higher bit-width (e.g. Q8) preserves more of the original model's quality but needs more RAM/VRAM; a lower bit-width (e.g. Q4) is smaller and lighter to run at some cost in accuracy. GGUF is the de facto standard format for distributing models to llama.cpp-based local runtimes.
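
As a rough illustration of the naming convention, the sketch below pulls the quant tag out of a filename and estimates the download size. The bits-per-weight table is an approximation for illustration, not an official mapping:

    import re

    # Approximate bits per weight for common quant suffixes (illustrative values).
    BITS_PER_WEIGHT = {
        "Q8_0": 8.5, "Q6_K": 6.6, "Q5_K_M": 5.7,
        "Q4_K_M": 4.8, "Q4_0": 4.5, "Q3_K_M": 3.9, "Q2_K": 2.6,
    }

    def quant_of(filename):
        # Pull the quant tag out of names like "llama-3-8b-instruct-Q4_K_M.gguf".
        m = re.search(r"(Q\d+(?:_K)?(?:_[SML0-9])?)\.gguf$", filename, re.IGNORECASE)
        return m.group(1).upper() if m else None

    def rough_size_gb(n_params_billion, quant):
        # Very rough file-size estimate: parameters x bits per weight / 8, in GB.
        return n_params_billion * BITS_PER_WEIGHT[quant] / 8

    print(quant_of("mistral-7b-instruct-v0.2-Q4_K_M.gguf"))   # Q4_K_M
    print(round(rough_size_gb(7, "Q4_K_M"), 1))                # ~4.2 GB

The size estimate is a quick way to check whether a given quantisation will fit in your available memory before you start a multi-gigabyte download.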