
Hybrid Vision Transformer and Convolutional Neural Network for Sports Video Classification



Abstract:

Engaging in sports is essential for maintaining our mental and physical well-being. Sports video libraries are expanding quickly, and as a result, automated classification is becoming necessary for many purposes, such as content-based recommendations, contextual advertising, and easy access and retrieval. Using Vision Transformers (ViTs) contributes to research on their effectiveness in video classification. ViTs are known for their ability to capture long-range spatial dependencies in data, and they efficiently extract high-level features, reducing the need for manual feature engineering. We can use a ViT and a CNN in parallel to create a hybrid model that adapts to the specific requirements of sports video classification: the ViT captures global context, while the CNN excels at detailed local features, improving feature extraction. Combining predictions from both models can enhance classification accuracy, and the diverse viewpoints and learning strategies of the two architectures improve classification in complex tasks. Running a ViT and a CNN in parallel is a form of ensemble learning, and ensemble methods are known to produce more reliable and accurate predictions by combining multiple models, making them a suitable choice for video classification. Using a ViT and VGG16 in parallel allows us to explore the applicability of state-of-the-art vision transformers in video classification tasks and contributes to ongoing research on their effectiveness and their integration with other architectures. We leverage the combined power of the Vision Transformer and the CNN to make the final classification decision.
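
The following is a minimal sketch of the parallel ViT and CNN hybrid described above, assuming PyTorch with torchvision's vgg16 and vit_b_16 backbones. The late-fusion rule (averaging the two branches' softmax outputs), the frame-level averaging over a video, and the class count are illustrative assumptions, not the paper's exact configuration.

# Sketch of a parallel ViT + VGG16 hybrid for sports video classification.
# Assumptions (not from the paper): PyTorch/torchvision backbones, frame-level
# inference with per-video averaging, and late fusion of softmax outputs.
import torch
import torch.nn as nn
from torchvision.models import vgg16, vit_b_16


class HybridViTVGG(nn.Module):
    def __init__(self, num_classes: int):
        super().__init__()
        # CNN branch: VGG16 backbone with its final classifier layer replaced.
        self.cnn = vgg16(weights=None)
        self.cnn.classifier[6] = nn.Linear(4096, num_classes)
        # Transformer branch: ViT-B/16 with its classification head replaced.
        self.vit = vit_b_16(weights=None)
        self.vit.heads.head = nn.Linear(self.vit.heads.head.in_features, num_classes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, 3, 224, 224) sampled video frames.
        cnn_logits = self.cnn(frames)
        vit_logits = self.vit(frames)
        # Late fusion: average the two branches' class probabilities.
        return 0.5 * (cnn_logits.softmax(dim=-1) + vit_logits.softmax(dim=-1))


if __name__ == "__main__":
    model = HybridViTVGG(num_classes=5)      # e.g., 5 sports categories (assumed)
    frames = torch.randn(4, 3, 224, 224)     # 4 frames sampled from one video
    frame_probs = model(frames)
    video_pred = frame_probs.mean(dim=0).argmax().item()  # average over frames
    print("Predicted class:", video_pred)

In this sketch each frame is scored independently and the per-frame probabilities are averaged to produce a single video-level decision; in practice the two branches could instead be trained jointly or fused with learned weights.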
Date of Conference: 23-25 November 2024
Date Added to IEEE Xplore: 16 January 2025
Conference Location: Guntur, India

