Abstract:
Paintings possess profound cultural and historical backgrounds. Unlike real-life images, they convey complex semantics beyond simple visual features. This diversity and c...Show MoreMetadata
Abstract:
Paintings possess profound cultural and historical backgrounds. Unlike real-life images, they convey complex semantics beyond simple visual features. This diversity and complexity make painting style classification highly challenging, and many popular visual models struggle with it. To address this issue, we propose an art knowledge-driven framework(AKDF) to improve models’ comprehension of art knowledge. AKDF utilizes multimodal models and prompts to extract style-related textual descriptions from images. Text and image features will be fused in enhanced bilinear pooling module, thus integrating art knowledge into the output features. Additionally, we design a contrastive learning auxiliary task based on label embeddings to introduce further art knowledge about style labels. Besides, AKDF incorporates texture features and adds another genre classification auxiliary task to provide more painting information. We constructs two datasets based on WikiArt due to its comprehensive and challenging nature. The extensive experiment results demonstrate the superiority of the AKDF, which effectively develops the performance of various models by over three percentage points across two datasets.
Published in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025
ISBN Information: