# Getting Started
This guide will help you get up and running with KTransformers in just a few minutes.
## Prerequisites
Before you begin, make sure you have:
- Python 3.9 or higher
- CUDA 11.8+ (for GPU acceleration)
- At least 16GB RAM (256GB+ recommended for large models)
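A quick sanity check of the Python requirement can be scripted (a minimal sketch; the CUDA and RAM checks are platform-specific, so only the interpreter version is verified here):

```python
import sys

def python_ok(min_version=(3, 9)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= min_version

if __name__ == "__main__":
    print("Python version OK" if python_ok() else "Python 3.9+ required")
```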
## Installation

```bash
pip install ktransformers
```
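To confirm the package installed correctly, you can query its version with the standard library (a sketch using `importlib.metadata`; the helper returns `None` if the distribution is not found):

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(dist_name):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

print(installed_version("ktransformers") or "ktransformers is not installed")
```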
## Your First Model
Let's run a simple inference:
```python
from ktransformers import AutoModel

# Load a model
model = AutoModel.from_pretrained(
    "deepseek-ai/DeepSeek-R1-671B",
    device_map="auto",
    ktransformers_config="./config.yaml",
)

# Generate text
response = model.generate(
    "Explain quantum computing in simple terms",
    max_new_tokens=512,
)
print(response)
```
## Configuration

Create a `config.yaml` file to customize inference:
```yaml
backend: torch
quantization: Q4_K_M
offload:
  enabled: true
  ratio: 0.8
```
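The same settings, expressed as a Python dict with a small validation helper (a sketch only; the accepted `backend` value and the field names checked here are assumptions based on the example above, not the library's full schema):

```python
def validate_config(cfg):
    """Minimal sanity checks for the config shown above (assumed value sets)."""
    errors = []
    if cfg.get("backend") not in {"torch"}:  # assumed supported backend
        errors.append("unknown backend")
    if not isinstance(cfg.get("quantization"), str):
        errors.append("quantization must be a string, e.g. 'Q4_K_M'")
    ratio = cfg.get("offload", {}).get("ratio", 0.0)
    if not 0.0 <= ratio <= 1.0:
        errors.append("offload.ratio must be between 0 and 1")
    return errors

config = {
    "backend": "torch",
    "quantization": "Q4_K_M",
    "offload": {"enabled": True, "ratio": 0.8},
}
print(validate_config(config))  # → []
```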
## Next Steps
- Installation Guide - Advanced installation options
- Configuration - Full configuration reference
- API Reference - Detailed API documentation