# API Reference

## AutoModel

The main entry point for loading and using models.

### from_pretrained
```python
AutoModel.from_pretrained(
    model_name: str,
    device_map: str = "auto",
    ktransformers_config: str | None = None,
    **kwargs
) -> Model
```
Parameters:

- `model_name`: HuggingFace model name or local path
- `device_map`: Device placement strategy (`"auto"`, `"cuda:0"`, etc.)
- `ktransformers_config`: Path to a YAML configuration file

Returns: a `Model` instance
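A minimal usage sketch. The model name and config path below are illustrative placeholders, not recommendations from this reference; the call is wrapped in a function so nothing is downloaded at import time.

```python
# Usage sketch (assumes the ktransformers package is installed).
# The model name and YAML path are illustrative, not part of this reference.
def load_model():
    from ktransformers import AutoModel

    return AutoModel.from_pretrained(
        "deepseek-ai/DeepSeek-V2-Lite",    # any HF model name or local path
        device_map="auto",                  # let the library place layers
        ktransformers_config="optimize.yaml",  # optional YAML config
    )
```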
### generate
```python
model.generate(
    prompt: str,
    max_new_tokens: int = 512,
    temperature: float = 0.7,
    top_p: float = 0.9,
    stream: bool = False
) -> str | Generator
```
Parameters:

- `prompt`: Input text
- `max_new_tokens`: Maximum number of new tokens to generate
- `temperature`: Sampling temperature
- `top_p`: Nucleus sampling parameter
- `stream`: Enable streaming output

Returns: the generated text as a `str`, or a `Generator` yielding text chunks when `stream=True`
## Configuration API

### load_config

```python
from ktransformers import load_config

config = load_config("config.yaml")
```
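For orientation, a `config.yaml` passed to `load_config` might look like the fragment below. The keys shown are purely illustrative, not the library's documented schema:

```yaml
# Illustrative fragment only — hypothetical keys, not a documented schema
model:
  dtype: float16
generation:
  max_new_tokens: 512
  temperature: 0.7
```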
### merge_configs

```python
from ktransformers import merge_configs

config = merge_configs(base_config, override_config)
```
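A plausible semantics for merging two configs is a recursive dictionary merge in which override values win and nested mappings are merged key by key. The sketch below shows that behavior; it is an assumption about `merge_configs`, not the library's actual code:

```python
# Sketch of recursive config merging (assumed semantics, not ktransformers code):
# values from `override` win; nested dicts are merged key by key.
def deep_merge(base: dict, override: dict) -> dict:
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # recurse into sub-config
        else:
            merged[key] = value  # override scalar or replace wholesale
    return merged

base = {"model": {"dtype": "float16", "layers": 32}, "seed": 0}
override = {"model": {"dtype": "bfloat16"}}
deep_merge(base, override)
# {"model": {"dtype": "bfloat16", "layers": 32}, "seed": 0}
```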
## Utilities

### get_device_info

```python
from ktransformers.utils import get_device_info

info = get_device_info()
# {'gpu_count': 1, 'gpu_memory': 24576, 'cpu_memory': 131072}
```
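The returned dictionary can drive placement decisions. The helper below is a sketch (the function name is ours, not part of the library) that assumes only the key names shown in the sample output above:

```python
# Sketch: pick a device_map value from the info dict returned by
# get_device_info(). choose_device_map is a hypothetical helper, and the
# key names are taken from the sample output above.
def choose_device_map(info: dict) -> str:
    return "auto" if info.get("gpu_count", 0) > 0 else "cpu"

choose_device_map({"gpu_count": 1, "gpu_memory": 24576, "cpu_memory": 131072})
# -> "auto"
```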