Yzichen/PolarBEVDet

PolarBEVDet: Exploring Polar Representation for Multi-View 3D Object Detection in Bird's-Eye-View

News

  • [2025/06/23] We added support for the occupancy prediction task.
  • [2025/06/23] We added support for the Waymo dataset.
  • [2025/01/23] We released the code and pretrained weights for the nuScenes dataset.
  • [2024/12/04] PolarBEVDet is on arXiv.

Getting Started

Abstract

Multi-view 3D object detection built upon the Lift-Splat-Shoot (LSS) mechanism provides an economical and deployment-friendly solution for autonomous driving. However, all existing LSS-based methods transform multi-view image features into a Cartesian Bird’s-Eye-View (BEV) representation without considering the non-uniform distribution of image information. Consequently, this leads to information loss at near range and computational redundancy at far range. Furthermore, these methods struggle to exploit view symmetry, increasing the difficulty of representation learning. In this paper, to fundamentally remove these limitations, we propose to replace the Cartesian BEV representation with the polar BEV representation, which naturally adapts to the image information distribution and effortlessly preserves view symmetry by regular convolution. To achieve this, we tailor three modules: a polar view transformer to generate the polar BEV representation, a polar temporal fusion module for fusing historical polar BEV features, and a polar detection head to predict the polar-parameterized representation of the object. In addition, we design a 2D auxiliary detection head and a spatial attention enhancement module to improve the quality of feature extraction in perspective view and BEV, respectively. Finally, we integrate the above improvements into a novel multi-view 3D object detector, PolarBEVDet. Experiments on nuScenes and Waymo show that PolarBEVDet achieves superior performance, and the polar BEV representation can be seamlessly substituted into different LSS-based detectors with consistent performance improvement.

[Figure: PolarBEVDet architecture]
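
To make the two claims in the abstract concrete (finer resolution near the ego vehicle, and view symmetry under regular convolution), here is a minimal sketch of a polar BEV grid in plain Python. It is not taken from this repository: the function names `polar_cell`, `cell_area`, and `rotate`, as well as the grid settings (`r_max=50.0`, `n_r=64`, `n_theta=256`), are illustrative assumptions.

```python
import math

def polar_cell(x, y, r_max=50.0, n_r=64, n_theta=256):
    """Return the (radial, azimuth) index of the polar BEV cell containing (x, y).

    Grid sizes are hypothetical defaults, not the repository's configuration.
    Points beyond r_max are clamped into the outermost ring.
    """
    r = math.hypot(x, y)
    theta = math.atan2(y, x) % (2.0 * math.pi)  # azimuth wrapped to [0, 2*pi)
    r_idx = min(int(r / r_max * n_r), n_r - 1)
    t_idx = int(theta / (2.0 * math.pi) * n_theta) % n_theta
    return r_idx, t_idx

def cell_area(r_idx, r_max=50.0, n_r=64, n_theta=256):
    """Ground-plane area of one cell in radial ring r_idx (an annular sector)."""
    dr = r_max / n_r
    r_in, r_out = r_idx * dr, (r_idx + 1) * dr
    return 0.5 * (2.0 * math.pi / n_theta) * (r_out ** 2 - r_in ** 2)

def rotate(x, y, angle):
    """Rotate a ground-plane point about the ego origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return c * x - s * y, s * x + c * y
```

Under these assumptions, `cell_area` grows with the ring index, so cells near the ego vehicle cover less ground than distant ones. Rotating a point by a whole number of azimuth steps changes only its azimuth index (modulo `n_theta`), which is the kind of shift along one grid axis that a regular convolution handles uniformly.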

Benchmark Results

Results on the nuScenes validation set:

| Setting | Pretrain | NDS | mAP | Weights |
|---|---|---|---|---|
| r50_704x256_24e | ImageNet | 53.0 | 43.2 | gdrive |
| r50_704x256_60e | ImageNet | 55.3 | 45.0 | gdrive |
| r50_704x256_nuImg_60e | nuImg | 56.7 | 46.9 | gdrive |

Results on the Waymo validation set:

| Setting | Pretrain | mAPL | mAPH | mAP |
|---|---|---|---|---|
| r101_832x1920_24e | nuImg | 43.8 | 54.9 | 59.6 |

Regrettably, we are unable to provide the model weights due to the Waymo Dataset License Agreement.

Results on the Occ3D-nuScenes validation set:

| Setting | mIoU | RayIoU | Weights |
|---|---|---|---|
| r50_704x256_24e | 33.7 | 39.4 | gdrive |

Visualization

[Figure: visualization results]

Acknowledgement

Many thanks to these excellent open-source projects:

Bibtex

If this work is helpful for your research, please consider citing the following BibTeX entry.

@article{yu2024polarbevdet,
  title={PolarBEVDet: Exploring Polar Representation for Multi-View 3D Object Detection in Bird's-Eye-View},
  author={Yu, Zichen and Liu, Quanli and Wang, Wei and Zhang, Liyong and Zhao, Xiaoguang},
  journal={arXiv preprint arXiv:2408.16200},
  year={2024}
}
