Abstract:
Pedestrian crossing intention prediction is crucial to traffic safety, but it remains a challenging task in real traffic scenarios. Traditional methods infer a pedestrian's intention to cross by predicting future movements from previously observed trajectories. The performance of these methods is limited by insufficient features and sources of information. To address these limitations, we propose a ViT-based model that incorporates multi-modal data to predict pedestrian crossing intention. Specifically, the proposed model takes into account visual information, poses, bounding box coordinates, and action annotations, and gradually fuses these features for the final prediction. In addition, data processing methods are tailored to the characteristics of each modality to make full use of each type of data. Extensive ablation studies are conducted to evaluate the effect of temporal modelling and feature fusion.
Published in: IEEE Signal Processing Letters (Volume: 29)
- IEEE Keywords
- Index Terms
- Predictor Of Intention
- Pedestrian Intention
- Real Scenarios
- Bounding Box
- Feature Fusion
- Temporal Model
- Traffic Safety
- Bounding Box Coordinates
- Contralateral
- Model Performance
- Convolutional Neural Network
- Image Features
- Observation Time
- Attention Mechanism
- Local Image
- Large Image
- Area Under Curve
- Source Images
- Traffic Light
- Binary Classification Problem
- Vision Transformer
- Pedestrian Behavior
- Pedestrian Trajectory
- Feature Fusion Strategy