Abstract:
This paper studies the contribution of the visual road scene to estimating driver-reported stress levels. Our research builds on previous work showing that environmental factors, such as traffic congestion, weather conditions, and driving context, affect driver stress. Each model we evaluate is trained and tested on the publicly available AffectiveROAD dataset to estimate three categories of driver-reported stress level. We test three types of modelling approaches: (i) single-frame baselines (Random Forest, SVM, and Convolutional Neural Networks); (ii) Temporal Segment Networks (TSN) and two variants, which use learned weights (TSN-w) and an LSTM (TSN-LSTM) as consensus functions; and (iii) video classification Transformers. Our experiments reveal that the TSN-w, TSN-LSTM, and Transformer models achieve statistically equivalent performance, and that all three significantly outperform the other models. TSN-w attains the highest observed performance, with an average accuracy of 0.77. We further provide an explainability analysis using Class Activation Mapping and image semantic segmentation to identify the elements of the road scene that contribute the most to high levels of stress. Our results demonstrate that the visible road scene offers significant contextual information for estimating driver-reported stress levels, with potential implications for the design of safer urban road environments.
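To illustrate what a consensus function does in a segment-based model, the sketch below compares plain averaging of per-segment class scores with a weighted combination. It is only an assumption about how a learned-weight consensus such as TSN-w might look: the abstract does not specify the formulation, and the names `average_consensus`, `weighted_consensus`, and `weight_logits` are hypothetical. In a real model the logits would be trained jointly with the network; here they are fixed by hand.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def average_consensus(segment_scores):
    """Uniform average of per-segment class scores, shape (K, C) -> (C,)."""
    return segment_scores.mean(axis=0)

def weighted_consensus(segment_scores, weight_logits):
    """Weighted combination of per-segment scores.

    weight_logits has shape (K,). Softmax keeps the weights positive and
    summing to one. Hypothetical formulation, not taken from the paper.
    """
    weights = softmax(weight_logits)
    return weights @ segment_scores

# Toy input: K = 3 video segments, C = 3 stress classes (illustrative values).
segment_scores = np.array([
    [2.0, 0.5, 0.1],
    [0.3, 1.5, 0.2],
    [0.1, 0.4, 2.5],
])

uniform = weighted_consensus(segment_scores, np.zeros(3))
peaked = weighted_consensus(segment_scores, np.array([0.0, 0.0, 50.0]))
```

With all logits equal, the weighted consensus reduces to the plain average; as one logit grows, the video-level scores approach those of the corresponding segment.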
Published in: IEEE Transactions on Affective Computing (Early Access)
- IEEE Keywords
- Index Terms
- Visual Scene
- Street Scenes
- Stress Levels
- Convolutional Neural Network
- Random Forest
- Contextual Information
- Urban Environments
- Average Accuracy
- Long Short-term Memory
- Traffic Congestion
- Semantic Segmentation
- Video Analysis
- Learned Weights
- Road Environment
- Low Stress
- Image Area
- Physiological Data
- Video Frames
- Objective Presentation
- Visual Context
- Frames Per Second
- Video Sequences
- Driver Behavior
- Affective Computing
- Visual Elements
- Video Modeling
- Traffic Light
- Optical Flow
- Presence Of Vegetation