Single-camera and inter-camera vehicle tracking and 3D speed estimation based on fusion of visual and semantic features

Published in CVPR Workshop on the NVIDIA AI City Challenge, 2018

Recommended citation: Zheng Tang, Gaoang Wang, Hao Xiao, Aotian Zheng, and Jenq-Neng Hwang, "Single-camera and inter-camera vehicle tracking and 3D speed estimation based on fusion of visual and semantic features," in CVPR Workshop (CVPRW) on the AI City Challenge, 2018, pp. 108-115.

[Paper], [Slides], [Poster], [Code], [2018 NVIDIA AI City Challenge]


Tracking vehicles across multiple cameras with non-overlapping views is a challenging task for intelligent transportation systems (ITS), mainly because of the high similarity among vehicle models, frequent occlusion, large variation in viewing perspective, and low video resolution. In this work, we propose a fusion of visual and semantic features for both single-camera tracking (SCT) and inter-camera tracking (ICT). Specifically, a histogram-based adaptive appearance model is introduced to learn the long-term history of visual features for each vehicle target. In addition, semantic features, including trajectory smoothness, velocity change, and temporal information, are incorporated into a bottom-up clustering strategy for data association within each single camera view. Across camera views, we also exploit other information, such as deep learning features, detected license plate features, and detected car types, for vehicle re-identification. Finally, evolutionary optimization is applied to camera calibration for reliable 3D speed estimation. Our algorithm achieves the top performance in both 3D speed estimation and vehicle re-identification at the NVIDIA AI City Challenge 2018.
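As a rough illustration of why camera calibration matters for 3D speed estimation, the sketch below back-projects tracked image points onto the ground plane and divides the travelled distance by the elapsed time. It is not the paper's implementation: it assumes a 3x3 image-to-ground homography `H` (in metres) is already available, whereas the paper obtains its calibration through evolutionary optimization. The names `to_ground` and `speed_mps` and the example values are hypothetical.

```python
import math


def to_ground(H, u, v):
    """Map pixel (u, v) to ground-plane coordinates (metres).

    H is an assumed 3x3 image-to-ground homography given as nested lists.
    """
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return x / w, y / w


def speed_mps(H, track, fps):
    """Average speed in m/s over a track of (frame_index, u, v) observations."""
    if len(track) < 2:
        raise ValueError("need at least two observations")
    pts = [(f, *to_ground(H, u, v)) for f, u, v in track]
    elapsed = (pts[-1][0] - pts[0][0]) / fps
    if elapsed <= 0:
        raise ValueError("observations must span a positive time interval")
    # Sum the ground-plane distance between consecutive observations.
    dist = sum(
        math.hypot(x1 - x0, y1 - y0)
        for (_, x0, y0), (_, x1, y1) in zip(pts, pts[1:])
    )
    return dist / elapsed


# Hypothetical example: a pure scaling homography of 0.05 m per pixel.
H = [[0.05, 0.0, 0.0], [0.0, 0.05, 0.0], [0.0, 0.0, 1.0]]
track = [(0, 100, 200), (30, 100, 500)]  # (frame, u, v) of a vehicle foot point
speed = speed_mps(H, track, fps=30.0)
```

Under these assumptions, errors in `H` translate directly into errors in the estimated speed, which is why the quality of the calibration step is decisive.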

Demo Video


Please cite this paper in your publications if it helps your research:

@inproceedings{Tang18AIC,
  author = {Zheng Tang and Gaoang Wang and Hao Xiao and Aotian Zheng and Jenq-Neng Hwang},
  booktitle = {CVPR Workshop (CVPRW) on the AI City Challenge},
  title = {Single-camera and Inter-camera Vehicle Tracking and 3D Speed Estimation Based on Fusion of Visual and Semantic Features},
  year = {2018},
  pages = {108--115}
}