Efficient AI Computing,
Transforming the Future.

Projects


VILA: On Pre-training for Visual Language Models

CVPR 2024

VILA is a visual language model (VLM) pre-trained on interleaved image-text data at scale, enabling multi-image VLM capabilities. VILA is deployable on the edge.

Condition-Aware Neural Network for Controlled Image Generation

CVPR 2024

A new conditional control method for diffusion models that dynamically adapts their weights.

Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

ICML 2024

Quest is an efficient long-context LLM inference framework that leverages query-aware sparsity in the KV cache to reduce memory movement during attention and thus boost throughput.
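The summary above does not say how query-aware sparsity is realized. As a rough illustration only, the Python sketch below assumes the KV cache is split into fixed-size pages, that each page keeps per-channel minimum and maximum key values, and that the current query uses those values to bound each page's attention score so only the highest-scoring pages are loaded. The function names, the `page_size` and `top_k` parameters, and the plain-list representation are hypothetical choices for this sketch, not the project's API.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def build_page_metadata(
    keys: Sequence[Vector], page_size: int
) -> List[Tuple[List[float], List[float]]]:
    """Split cached keys into pages and record per-channel (min, max) for each page."""
    metadata = []
    for start in range(0, len(keys), page_size):
        page = keys[start:start + page_size]
        page_min = [min(channel) for channel in zip(*page)]
        page_max = [max(channel) for channel in zip(*page)]
        metadata.append((page_min, page_max))
    return metadata


def page_upper_bound(query: Vector, page_min: Vector, page_max: Vector) -> float:
    """Upper bound on the dot product of the query with any key in the page.

    For each channel the product q_i * k_i is at most
    max(q_i * min_i, q_i * max_i), so the sum bounds the full dot product.
    """
    return sum(max(q * lo, q * hi) for q, lo, hi in zip(query, page_min, page_max))


def select_pages(
    query: Vector,
    metadata: Sequence[Tuple[Vector, Vector]],
    top_k: int,
) -> List[int]:
    """Return indices of the top_k pages ranked by their score upper bound."""
    scored = [
        (page_upper_bound(query, page_min, page_max), index)
        for index, (page_min, page_max) in enumerate(metadata)
    ]
    scored.sort(reverse=True)
    return sorted(index for _, index in scored[:top_k])


# Hypothetical toy cache: six 2-dimensional keys, two keys per page.
toy_keys = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [-0.8, 0.2], [0.0, 1.0], [0.1, 0.9]]
toy_metadata = build_page_metadata(toy_keys, page_size=2)
selected = select_pages([1.0, 0.0], toy_metadata, top_k=2)
```

In this sketch, attention would then be computed only over keys and values in the selected pages. The saving comes from reading the small per-page metadata rather than every cached key when deciding what to load.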

Efficient Streaming Language Models with Attention Sinks

ICLR 2024

We enable LLMs to work on infinite-length texts without compromising efficiency or performance.
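The title mentions attention sinks, but the summary gives no mechanism. One plausible reading, sketched below in Python as an assumption rather than the project's implementation, is a KV cache that always keeps the first few tokens (the "sinks") plus a sliding window of the most recent tokens, so memory stays bounded however long the stream runs. The names `evict_kv_cache`, `num_sinks`, and `window` are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def evict_kv_cache(cache: Sequence[T], num_sinks: int, window: int) -> List[T]:
    """Keep the first num_sinks entries and the last window entries of the cache.

    Entries in between are dropped, so the result never holds more than
    num_sinks + window items.
    """
    if len(cache) <= num_sinks + window:
        return list(cache)
    # Slice from an explicit start index so that window == 0 keeps no recent entries.
    recent_start = len(cache) - window
    return list(cache[:num_sinks]) + list(cache[recent_start:])


# Simulate streaming 10 token positions through a cache with 4 sinks and a window of 3.
cache: List[int] = []
for position in range(10):
    cache.append(position)
    cache = evict_kv_cache(cache, num_sinks=4, window=3)
```

After the loop the cache holds positions 0 to 3 and 7 to 9. A real system would store key and value tensors rather than integer positions and would also need to handle positional encodings for the retained entries, which this sketch leaves out.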