Efficient AI Computing,
Transforming the Future.

Guangxuan Xiao



His research focuses on efficient algorithms and systems for deep learning, specifically large foundation models. His work has received over 8,000 stars on GitHub and has had real-world impact. SmoothQuant has been integrated into NVIDIA's TensorRT-LLM and FasterTransformer and into Intel's Neural Compressor, and it is used in the LLMs of companies such as Amazon, Meta, and Hugging Face. StreamingLLM has been integrated into NVIDIA's TensorRT-LLM, Hugging Face's Transformers, and Intel's Extension for Transformers.
