Tools
Explore the tools developed by our researchers.
CURPress in NVIDIA KVPress Library
Prunes keys and values based on the CUR decomposition using approximate leverage scores. Integrated into NVIDIA's KVPress library for efficient LLM inference.
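As a rough illustration of the idea, the sketch below scores each cached token by the leverage scores of its key and value rows and keeps the highest-scoring ones. It is not the KVPress implementation or its API. The function names, the exact (QR-based) scores used in place of approximate ones, and the choice to sum key and value scores are all assumptions made for the example.

```python
import numpy as np

def leverage_scores(matrix):
    """Exact row leverage scores of a (n_tokens, head_dim) matrix.

    Assumes the matrix has full column rank; an approximate scheme would
    replace this step with a cheaper estimate.
    """
    q, _ = np.linalg.qr(matrix)      # reduced QR: q has orthonormal columns
    return np.sum(q * q, axis=1)     # squared row norms of q

def prune_kv(keys, values, budget):
    """Keep the `budget` tokens with the largest combined leverage score."""
    # Hypothetical scoring rule: sum of key and value leverage scores.
    scores = leverage_scores(keys) + leverage_scores(values)
    keep = np.sort(np.argsort(scores)[-budget:])  # preserve token order
    return keys[keep], values[keep], keep

rng = np.random.default_rng(0)
keys = rng.standard_normal((16, 4))    # 16 cached tokens, head dimension 4
values = rng.standard_normal((16, 4))
pruned_keys, pruned_values, kept = prune_kv(keys, values, budget=8)
```

Here `kept` holds the indices of the retained tokens, so the same selection can be applied to any other per-token state.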
Selective PEFT Toolkit
This toolkit provides a flexible framework for fine-tuning large language models with a range of selective Parameter-Efficient Fine-Tuning (PEFT) methods.
Collaborative Model Distillation (MPDistil)
This toolkit provides a collaborative framework for distilling knowledge from large language models into smaller, more efficient models.