Jiajun Deng 邓家俊

Research Fellow

National University of Singapore
Singapore

Email: djiajun1206 [at] gmail [dot] com

Biography

I am a Research Fellow at the National University of Singapore, working with Prof. Lin Shao. Previously, I was a Research Fellow at The University of Adelaide with Prof. Ian Reid and, before that, a Postdoctoral Researcher at The University of Sydney with Prof. Wanli Ouyang. I received my Ph.D. (2021) and Bachelor's degree (2016) in Electronic Engineering and Information Science from the University of Science and Technology of China (USTC), advised by Prof. Houqiang Li and Prof. Wengang Zhou. I have also collaborated closely with Prof. Yanyong Zhang, whom I regard as an advisor.

My research interests include computer vision and robotics. My current research focuses on 3D understanding & reconstruction, (2D & 3D) vision and language, and autonomous intelligent systems (e.g., self-driving vehicles & embodied AI).

News

[2025/06] Four papers accepted by ICCV 2025 (Topics include driving-scene 3D reconstruction, feedforward 3D reconstruction, and VLM+object grasping).
[2025/05] Two papers accepted by IJCAI 2025 (One for 3D MLLM and the other for weakly-supervised object detection).
[2025/05] One paper accepted by ICML 2025 (This paper explores autoregressive models for image generation).
[2025/04] One paper accepted by IJCV (This paper targets document image rectification).
[2025/03] Our ACM MM workshop proposal, "Multimodal Foundation Models for Spatial Intelligence", has been accepted. Please check here for more information, and do not hesitate to submit your papers.
[2025/02] I will serve as an Area Chair for ACM MM 2025.
[2025/02] Two papers accepted by CVPR 2025 (One for 3D MLLM and the other for camera-radar 3D object detection).
[2025/02] One paper accepted by IJCV (This paper explores 3D action recognition).

Preprint

PoIFusion: Multi-Modal 3D Object Detection via Fusion at Points of Interest
Jiajun Deng, Sha Zhang, Feras Dayoub, Wanli Ouyang, Yanyong Zhang, Ian Reid.

[Paper] [Bib]

Selected Publications

SpatialSplat: Efficient Semantic 3D from Sparse Unposed Images
Yu Sheng, Jiajun Deng†, Xinran Zhang, Yu Zhang, Bei Hua, Yanyong Zhang, Jianmin Ji†.
IEEE/CVF International Conference on Computer Vision (ICCV), 2025.

[Paper] [Bib]
GraspCoT: Integrating Physical Property Reasoning for 6-DoF Grasping under Flexible Language Instructions
Xiaomeng Chu, Jiajun Deng†, Guoliang You, Wei Liu, Xingchen Li, Jianmin Ji, Yanyong Zhang†.
IEEE/CVF International Conference on Computer Vision (ICCV), 2025.

[Paper] [Bib]
S3R-GS: Streamlining the Pipeline for Large-Scale Street Scene Reconstruction
Guangting Zheng*, Jiajun Deng*, Xiaomeng Chu, Yu Yuan, Houqiang Li, Yanyong Zhang.
IEEE/CVF International Conference on Computer Vision (ICCV), 2025.

[Paper] [Bib]
OA-BEV: Bringing Object Awareness to Bird's-Eye-View Representation for Multi-Camera 3D Object Detection
Xiaomeng Chu, Jiajun Deng†, Yuan Zhao, Jianmin Ji, Yu Zhang, Houqiang Li, Yanyong Zhang†.
International Journal of Computer Vision (IJCV), 2025.

[Paper] [Bib]
3D-LLaVA: Towards Generalist 3D LMMs with Omni Superpoint Transformer
Jiajun Deng, Tianyu He, Li Jiang, Tianyu Wang, Feras Dayoub, Ian Reid.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

[Paper] [Bib]
RaCFormer: Towards High-Quality 3D Object Detection via Query-based Radar-Camera Fusion
Xiaomeng Chu, Jiajun Deng†, Guoliang You, Yifan Duan, Houqiang Li, Yanyong Zhang†.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

[Paper] [Bib]
I2MD: 3D Action Representation Learning with Inter- and Intra-Modal Mutual Distillation
Yunyao Mao*, Jiajun Deng*, Wengang Zhou, Zhenbo Lu, Wanli Ouyang, Houqiang Li.
International Journal of Computer Vision (IJCV), 2025.

[Paper] [Bib]
Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion
Bohan Li, Jiajun Deng, Wenyao Zhang, Zhujin Liang, Dalong Du, Xin Jin, Wenjun Zeng.
European Conference on Computer Vision (ECCV), 2024.

[Paper] [Bib]
Agent3D-Zero: An Agent for Zero-shot 3D Understanding
Sha Zhang, Di Huang, Jiajun Deng†, Shixiang Tang, Wanli Ouyang, Tong He†, Yanyong Zhang†.
European Conference on Computer Vision (ECCV), 2024.

[Paper] [Bib]
HVDistill: Transferring Knowledge from Images to Point Clouds via Unsupervised Hybrid-View Distillation
Sha Zhang, Jiajun Deng†, Lei Bai, Houqiang Li, Wanli Ouyang, Yanyong Zhang†.
International Journal of Computer Vision (IJCV), 2024.

[Paper] [Bib]
RayFormer: Improving Query-Based Multi-Camera 3D Object Detection via Ray-Centric Strategies
Xiaomeng Chu, Jiajun Deng†, Guoliang You, Yifan Duan, Yao Li, Yanyong Zhang†.
ACM Multimedia (ACM MM), 2024.

[Paper] [Bib]
FARFusion V2: A Geometry-based Radar-Camera Fusion Method on the Ground for Roadside Far-Range 3D Object Detection
Yao Li, Jiajun Deng, Yuxuan Xiao, Yingjie Wang, Xiaomeng Chu, Jianmin Ji, Yanyong Zhang.
ACM Multimedia (ACM MM), 2024.

[Paper] [Bib]
Cycle-Consistency Learning for Captioning and Grounding
Ning Wang, Jiajun Deng, Mingbo Jia.
AAAI Conference on Artificial Intelligence (AAAI), 2024.

[Paper] [Bib]
TransVG++: End-to-end Visual Grounding with Language Conditioned Vision Transformer
Jiajun Deng, Zhengyuan Yang, Daqing Liu, Tianlang Chen, Wengang Zhou, Yanyong Zhang, Houqiang Li, Wanli Ouyang.
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023.

[Paper] [Bib]
CluB: Cluster Meets BEV for LiDAR-based 3D Object Detection
Yingjie Wang, Jiajun Deng†, Yuenan Hou, Yao Li, Yu Zhang, Jianmin Ji, Wanli Ouyang, Yanyong Zhang†.
Advances in Neural Information Processing Systems (NeurIPS), 2023.

[Paper] [Bib]
CLIP4HOI: Towards Adapting CLIP for Practical Zero-Shot HOI Detection
Yunyao Mao, Jiajun Deng, Wengang Zhou, Li Li, Yao Fang, Houqiang Li.
Advances in Neural Information Processing Systems (NeurIPS), 2023.

[Paper] [Bib]
3DPPE: 3D Point Positional Encoding for Transformer-based Multi-Camera 3D Object Detection
Changyong Shu*, Jiajun Deng*, Yifan Liu.
IEEE/CVF International Conference on Computer Vision (ICCV), 2023.

[Paper] [Bib]
Invariant Training 2D-3D Joint Hard Samples for Few-Shot Point Cloud Recognition
Xuanyu Yi, Jiajun Deng, Qianru Sun, Xian-Sheng Hua, Joo-Hwee Lim, Hanwang Zhang.
IEEE/CVF International Conference on Computer Vision (ICCV), 2023.

[Paper] [Bib]
Masked Motion Predictors are Strong 3D Action Representation Learners
Yunyao Mao, Jiajun Deng, Wengang Zhou, Yao Fang, Wanli Ouyang, Houqiang Li.
IEEE/CVF International Conference on Computer Vision (ICCV), 2023.

[Paper] [Bib]
Cyclic-Bootstrap Labeling for Weakly Supervised Object Detection
Yufei Yin, Jiajun Deng, Wengang Zhou, Li Li, Houqiang Li.
IEEE/CVF International Conference on Computer Vision (ICCV), 2023.

[Paper] [Bib]
Bi-LRFusion: Bi-Directional LiDAR-Radar Fusion for 3D Dynamic Object Detection
Yingjie Wang, Jiajun Deng†, Yao Li, Jinshui Hu, Cong Liu, Yu Zhang, Jianmin Ji, Wanli Ouyang, Yanyong Zhang†.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

[Paper] [Bib]
PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection
Shaoshuai Shi, Li Jiang, Jiajun Deng, Zhe Wang, Chaoxu Guo, Jianping Shi, Xiaogang Wang, Hongsheng Li.
International Journal of Computer Vision (IJCV), 2023.

[Paper] [Bib]
EZFusion: A Close Look at the Integration of LiDAR, Millimeter-Wave Radar, and Camera for Accurate 3D Object Detection and Tracking
Yao Li, Jiajun Deng, Yu Zhang, Jianmin Ji, Houqiang Li, Yanyong Zhang.
IEEE Robotics and Automation Letters (RA-L), 2022.

[Paper] [Bib]
VPFNet: Improving 3D Object Detection with Virtual Point based LiDAR and Stereo Data Fusion
Hanqi Zhu, Jiajun Deng, Yu Zhang, Jianmin Ji, Qiuyu Mao, Houqiang Li, Yanyong Zhang.
IEEE Transactions on Multimedia (TMM), 2022.

[Paper] [Bib]
TransVG: End-to-end Visual Grounding with Transformers
Jiajun Deng, Zhengyuan Yang, Tianlang Chen, Wengang Zhou, Houqiang Li.
IEEE/CVF International Conference on Computer Vision (ICCV), 2021.

[Paper] [Bib]
Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection
Jiajun Deng, Shaoshuai Shi, Peiwei Li, Wengang Zhou, Yanyong Zhang, Houqiang Li.
AAAI Conference on Artificial Intelligence (AAAI), 2021.

[Paper] [Bib]
MINet: Meta-Learning Instance Identifiers for Video Object Detection
Jiajun Deng, Yingwei Pan, Ting Yao, Wengang Zhou, Houqiang Li, Tao Mei.
IEEE Transactions on Image Processing (TIP), 2021.

[Paper] [Bib]
From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection
Jiajun Deng, Wengang Zhou, Yanyong Zhang, Houqiang Li.
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2021.

[Paper] [Bib]
Weakly Supervised Temporal Adjacent Network for Language Grounding
Yuechen Wang, Jiajun Deng, Wengang Zhou, Houqiang Li.
IEEE Transactions on Multimedia (TMM), 2021.

[Paper] [Bib]
Adaptive Offline Quintuplet Loss for Image-Text Matching
Tianlang Chen, Jiajun Deng, Jiebo Luo.
European Conference on Computer Vision (ECCV), 2020.

[Paper] [Bib]
Single Shot Video Object Detector
Jiajun Deng, Yingwei Pan, Ting Yao, Wengang Zhou, Houqiang Li, Tao Mei.
IEEE Transactions on Multimedia (TMM), 2020.

[Paper] [Bib]
Relation Distillation Networks for Video Object Detection
Jiajun Deng, Yingwei Pan, Ting Yao, Wengang Zhou, Houqiang Li, Tao Mei.
IEEE/CVF International Conference on Computer Vision (ICCV), 2019.

[Paper] [Bib]

Professional Activities

Guest Editor:
The Special Issue on Pre-trained Models for Multi-modality Understanding, IEEE Transactions on Multimedia (TMM).
Area Chair:
ACM Multimedia (ACM MM) 2024, 2025.
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2025.
Workshop Organizer:
The 1st International Workshop on Multimodal Foundation Models for Spatial Intelligence, ACM MM 2025.
Conference Reviewer:
International Conference on Machine Learning (ICML).
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
IEEE/CVF International Conference on Computer Vision (ICCV).
European Conference on Computer Vision (ECCV).
ACM Multimedia (ACM MM).
AAAI Conference on Artificial Intelligence (AAAI).
IEEE International Conference on Robotics and Automation (ICRA).
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
Journal Reviewer:
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).
International Journal of Computer Vision (IJCV).
IEEE Transactions on Image Processing (TIP).
IEEE Transactions on Multimedia (TMM).
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT).
IEEE Robotics and Automation Letters (RA-L).

Honors & Awards

World’s Top 2% Scientist (Single Year), Stanford University and Elsevier, 2024.