Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition

IEEE Journals & Magazine | IEEE Xplore
Abstract:

Recognizing multiple labels of an image is a practical yet challenging task, and remarkable progress has been achieved by searching for semantic regions and exploiting label dependencies. However, current works use RNNs/LSTMs to implicitly capture sequential region/label dependencies; they cannot fully explore the mutual interactions among semantic regions/labels and do not explicitly integrate label co-occurrences. In addition, these works require large numbers of training samples for each category and cannot generalize to novel categories with limited samples. To address these issues, we propose a knowledge-guided graph routing (KGGR) framework that unifies prior knowledge of statistical label correlations with deep neural networks. The framework exploits this prior knowledge to guide adaptive information propagation among different categories, facilitating multi-label analysis and reducing the dependency on training samples. Specifically, it first builds a structured knowledge graph that correlates different labels based on statistical label co-occurrence. It then introduces label semantics to guide the learning of semantic-specific features that initialize the graph, and it exploits a graph propagation network to explore interactions among graph nodes, enabling the learning of contextualized image feature representations. Moreover, it initializes each graph node with the classifier weights of the corresponding label and applies another propagation network to transfer node messages through the graph. In this way, the framework exploits the information of correlated labels to help train better classifiers, especially for labels with limited training samples. We conduct extensive experiments on the traditional multi-label image recognition (MLR) and multi-label few-shot learning (ML-FSL) tasks and show that KGGR outperforms current state-of-the-art methods by sizable margins on public benchmarks.
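The graph construction and propagation described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the conditional-probability adjacency, and the single linear-plus-ReLU propagation step are all assumptions standing in for the full KGGR pipeline.

```python
import numpy as np

def cooccurrence_adjacency(labels):
    """Estimate a label-correlation adjacency matrix from training annotations.

    labels: (num_images, num_labels) binary matrix of ground-truth labels.
    Returns A where A[i, j] estimates P(label j present | label i present).
    """
    counts = labels.T @ labels                     # counts[i, j]: images containing both i and j
    occur = np.diag(counts).astype(float)          # occur[i]: images containing label i
    occur[occur == 0] = 1.0                        # guard against labels never seen in training
    return counts / occur[:, None]                 # row-normalize by label frequency

def propagate(node_feats, adj, weight):
    """One message-passing step over the label graph: neighbor features are
    aggregated with co-occurrence weights, then linearly mapped with ReLU."""
    return np.maximum((adj @ node_feats) @ weight, 0.0)

# Toy example: 4 images, 3 hypothetical labels ("person", "dog", "car").
labels = np.array([[1, 1, 0],
                   [1, 0, 1],
                   [1, 1, 0],
                   [0, 0, 1]])
adj = cooccurrence_adjacency(labels)
print(adj[0, 1])                                   # P(dog | person) = 2/3

# Propagate one feature vector per label node through the graph.
rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 8))                    # node features (e.g., from label semantics)
w = rng.normal(size=(8, 8))                        # learnable linear map
out = propagate(feats, adj, w)
```

In KGGR this propagation is applied twice, with graph nodes initialized once from semantic-specific image features and once from per-label classifier weights; the sketch shows only the shared message-passing mechanics.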
Page(s): 1371 - 1384
Date of Publication: 28 September 2020

PubMed ID: 32986543
