Autonomous Driving

Projects

Sparse Refinement for Efficient High-Resolution Semantic Segmentation

ECCV 2024

SparseRefine enhances dense low-resolution predictions with sparse high-resolution refinements. It achieves a 1.5x to 3.7x speedup when applied to HRNet-W48, SegFormer-B5, Mask2Former-T/L, and SegNeXt-L on Cityscapes, with negligible to no loss of accuracy.
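As a rough illustration of the idea of refining a dense low-resolution prediction only at a sparse set of high-resolution pixels, here is a minimal pure-Python sketch. It is not the SparseRefine implementation: the entropy-based pixel selection, the nearest-neighbour upsampling, the threshold value, and the names `sparse_refine` and `refine_fn` are all assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one pixel's class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def upsample_nearest(grid, factor):
    """Nearest-neighbour upsampling of a 2D grid of per-pixel values."""
    rows, cols = len(grid), len(grid[0])
    return [[grid[r // factor][c // factor] for c in range(cols * factor)]
            for r in range(rows * factor)]


def sparse_refine(low_res_probs, refine_fn, factor=2, threshold=0.5):
    """Upsample dense low-res predictions, then overwrite only uncertain pixels.

    low_res_probs: 2D grid where each cell is a list of class probabilities.
    refine_fn:     hypothetical high-resolution refiner, called as
                   refine_fn(row, col) -> class index, only for selected pixels.
    Returns (high_res_labels, number_of_refined_pixels).
    """
    up = upsample_nearest(low_res_probs, factor)
    # Dense baseline: argmax of the upsampled low-resolution probabilities.
    labels = [[max(range(len(p)), key=p.__getitem__) for p in row] for row in up]
    refined = 0
    for r, row in enumerate(up):
        for c, p in enumerate(row):
            # Assumed selection rule: refine only high-entropy pixels.
            if entropy(p) > threshold:
                labels[r][c] = refine_fn(r, c)
                refined += 1
    return labels, refined


# One confident low-res pixel and one ambiguous pixel, two classes.
low_res = [[[0.9, 0.1], [0.5, 0.5]]]
labels, refined = sparse_refine(low_res, refine_fn=lambda r, c: 1)
```

In this toy input only the four high-resolution pixels under the ambiguous low-resolution pixel are passed to the refiner; the rest keep the cheap low-resolution label, which is where the savings would come from.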

EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction

ICCV 2023

EfficientViT is a new family of vision models for high-resolution dense prediction. It achieves a global receptive field and multi-scale learning using only hardware-efficient operations. EfficientViT delivers substantial performance gains over previous models while running faster on diverse hardware platforms, including mobile CPUs, edge GPUs, and cloud GPUs.
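To make the notion of linear attention concrete, the following pure-Python sketch computes attention with a ReLU feature map in time linear in the number of tokens. It is a generic illustration, not EfficientViT's code: the ReLU kernel, the `eps` stabilizer, and the function name `linear_attention` are assumptions, and the multi-scale aggregation is omitted.

```python
def relu(vec):
    """Element-wise ReLU on a list of floats."""
    return [max(x, 0.0) for x in vec]


def linear_attention(Q, K, V, eps=1e-6):
    """Linear attention with an (assumed) ReLU kernel.

    Q, K: lists of N vectors of dimension d; V: list of N vectors of dimension dv.
    Instead of forming the N x N similarity matrix, the d x dv summary
    sum_j relu(K_j)^T V_j is computed once and reused for every query.
    """
    d, dv = len(K[0]), len(V[0])
    Kr = [relu(k) for k in K]
    # kv[a][b] = sum_j relu(K_j)[a] * V_j[b]
    kv = [[sum(k[a] * v[b] for k, v in zip(Kr, V)) for b in range(dv)]
          for a in range(d)]
    # k_sum[a] = sum_j relu(K_j)[a], used for normalization
    k_sum = [sum(k[a] for k in Kr) for a in range(d)]
    out = []
    for q in Q:
        qr = relu(q)
        denom = sum(qr[a] * k_sum[a] for a in range(d)) + eps
        out.append([sum(qr[a] * kv[a][b] for a in range(d)) / denom
                    for b in range(dv)])
    return out


# Each query matches exactly one key, so each output is close to that key's value.
result = linear_attention(
    Q=[[1.0, 0.0], [0.0, 1.0]],
    K=[[1.0, 0.0], [0.0, 1.0]],
    V=[[1.0, 2.0], [3.0, 4.0]],
)
```

Because the key-value summary has a fixed size, the cost grows linearly with the number of tokens rather than quadratically, which is what makes this style of attention attractive at high resolution.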

FlatFormer: Flattened Window Attention for Efficient Point Cloud Transformer

CVPR 2023

We present FlatFormer, an efficient transformer architecture for large-scale point cloud analysis.

BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation

ICRA 2023

BEVFusion unifies multi-modal features in a shared bird’s-eye view (BEV) representation space, which preserves both geometric and semantic information. It establishes a new state of the art on nuScenes, achieving 1.3% higher mAP and NDS on 3D object detection and 13.6% higher mIoU on BEV map segmentation, with 1.9x lower computation cost.

TorchSparse: Efficient Point Cloud Inference Engine

MLSys 2022

TorchSparse is a high-performance computing library for efficient 3D sparse convolution. It aims to accelerate sparse computation in 3D, in particular the sparse convolution operation.

PointAcc: Efficient Point Cloud Accelerator

MICRO 2021

PointAcc is a novel point cloud deep learning accelerator. It introduces a configurable sorting-based mapping unit that efficiently supports diverse operations in point cloud networks. PointAcc further exploits simplified caching and layer fusion specialized for point cloud models, effectively reducing DRAM access.

Blog Posts

Efficiently Understanding Videos, Point Cloud and Natural Language on NVIDIA Jetson Xavier NX

May 22, 2020

Thanks to NVIDIA’s amazing deep learning ecosystem, we were able to deploy three applications on Jetson Xavier NX soon after receiving the kit: efficient video understanding with Temporal Shift Module (TSM, ICCV’19), efficient 3D deep learning with Point-Voxel CNN (PVCNN, NeurIPS’19), and efficient machine translation with Hardware-Aware Transformer (HAT, ACL’20).