Tools
Explore the tools developed by our researchers.
CURPress in NVIDIA KVPress Library
Prunes cached keys and values using approximate leverage scores derived from a CUR decomposition. Integrated into NVIDIA's KVPress library for efficient LLM inference.
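For intuition, here is a minimal sketch of leverage-score pruning (not the KVPress or CURPress API; the function name and tensor shapes are hypothetical): the leverage score of each cached key is the squared row norm of the truncated left singular vectors of the key matrix, and only the top-scoring key/value pairs are kept.

```python
import torch

def leverage_score_prune(keys, values, keep_ratio=0.5, rank=8):
    """Keep the KV pairs whose keys have the largest approximate leverage scores.

    keys, values: (seq_len, head_dim) tensors for a single attention head.
    """
    U, _, _ = torch.linalg.svd(keys, full_matrices=False)
    # Row leverage scores of the rank-`rank` approximation: squared row norms of U_k.
    scores = (U[:, :rank] ** 2).sum(dim=-1)
    n_keep = max(1, int(keep_ratio * keys.shape[0]))
    idx = scores.topk(n_keep).indices.sort().values  # keep original sequence order
    return keys[idx], values[idx]

# Illustrative shapes only
k, v = torch.randn(128, 64), torch.randn(128, 64)
k_small, v_small = leverage_score_prune(k, v)
```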
Selective PEFT Toolkit
This toolkit provides a flexible framework for fine-tuning large language models with a range of selective Parameter-Efficient Fine-Tuning (PEFT) methods, which update only a chosen subset of model parameters.
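As a sketch of the general idea (not the toolkit's API), the snippet below trains only the largest-magnitude weights by masking gradients with backward hooks; the function name and the magnitude criterion are illustrative assumptions, since selective PEFT methods differ mainly in how they pick the trainable subset.

```python
import torch

def freeze_all_but_top_weights(model, keep_fraction=0.01):
    # One common selective-PEFT baseline: train only the largest-magnitude
    # weights and zero out gradients for everything else via backward hooks.
    for p in model.parameters():
        if not p.requires_grad:
            continue
        flat = p.detach().abs().flatten()
        k = max(1, int(keep_fraction * flat.numel()))
        threshold = flat.topk(k).values.min()
        mask = (p.detach().abs() >= threshold).to(p.dtype)
        p.register_hook(lambda grad, m=mask: grad * m)
```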
Collaborative Model Distillation (MPDistil)
This toolkit provides a framework for distilling knowledge from large language models into smaller, more efficient models through collaborative teacher-student training.
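The collaborative objective is specific to MPDistil, but it builds on the standard distillation loss sketched below, which blends a temperature-scaled KL term on the teacher's logits with the usual hard-label loss; the function and argument names here are hypothetical.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft-target KL term (scaled by T^2, as is standard) plus hard-label loss.
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce
```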
Manifold-Preserving Transformers (TransJect)
This package contains implementations of manifold-preserving transformer architectures that preserve the geometry of token representations as they pass through the network.
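One standard way to obtain such distance-preserving maps, shown below as an illustration rather than TransJect's exact mechanism, is to constrain weight matrices to be orthogonal, which makes each linear map an isometry.

```python
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import orthogonal

# An orthogonal weight matrix is an isometry: ||Wx - Wy|| = ||x - y||,
# so pairwise distances between representations are preserved exactly.
layer = orthogonal(nn.Linear(512, 512, bias=False))
x, y = torch.randn(4, 512), torch.randn(4, 512)
print(torch.dist(layer(x), layer(y)).item(), torch.dist(x, y).item())  # ~equal
```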
Efficient LLM Compression Suite
A comprehensive suite of tools for compressing large language models using various techniques.
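As one representative example of the kind of technique such a suite typically covers (an assumption for illustration, not a description of this suite's internals), here is a minimal sketch of symmetric per-tensor int8 weight quantization.

```python
import torch

def quantize_int8(weight: torch.Tensor):
    # Symmetric per-tensor int8 quantization: w ≈ scale * q, with q in [-127, 127].
    scale = weight.abs().max() / 127.0
    q = torch.clamp((weight / scale).round(), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor):
    return q.to(torch.float32) * scale

w = torch.randn(256, 256)
q, s = quantize_int8(w)
print((w - dequantize(q, s)).abs().max())  # maximum quantization error
```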
LLM Distillation Toolkit
A toolkit for distilling large language models into smaller, more efficient models.
Robust LoRA Fine-Tuning
A fork of the LoRA implementation that incorporates robustness techniques to improve model performance under adversarial conditions.
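For reference, the sketch below shows the core LoRA mechanism that such a fork builds on: a frozen base layer plus a trainable low-rank update. The robustness additions themselves (for example, adversarial training of the adapters) are not shown, and the class name is hypothetical.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base layer plus a trainable low-rank update: Wx + (alpha/r) * B A x."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # only the low-rank adapters are trained
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no-op at start
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * (x @ self.A.T @ self.B.T)
```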