
Linear Complexity Self-Attention With 3rd Order Polynomials


Abstract:

Self-attention mechanisms and non-local blocks have become crucial building blocks for state-of-the-art neural architectures thanks to their unparalleled ability to capture long-range dependencies in the input. However, their cost is quadratic in the number of spatial positions, making their use impractical in many real-world applications. In this work, we analyze these methods through a polynomial lens and show that self-attention can be seen as a special case of a 3rd order polynomial. Within this polynomial framework, we are able to design polynomial operators capable of accessing the same data pattern as non-local and self-attention blocks while reducing the complexity from quadratic to linear. As a result, we propose two modules (Poly-NL and Poly-SA) that can be used as "drop-in" replacements for more complex non-local and self-attention layers in state-of-the-art CNN and ViT architectures. Our modules can achieve comparable, if not better, performance across a wide range of computer vision tasks while keeping a complexity equivalent to a standard linear layer.
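To make the idea concrete, below is a minimal, hypothetical sketch of a 3rd-order polynomial block with linear cost in the number of spatial positions. It is not the authors' exact Poly-NL/Poly-SA formulation; the module name, the specific factorization (replacing the N x N affinity with a globally pooled factor), and the residual connection are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LinearPolyAttention(nn.Module):
    """Illustrative 3rd-order polynomial block with linear complexity.

    Standard self-attention forms an N x N affinity matrix over the N spatial
    positions (quadratic cost). Here, one of the three interacting factors is
    reduced to a single spatially averaged descriptor, so the per-layer cost
    stays O(N * C^2), i.e. on par with a standard linear layer.
    This is an assumed sketch, not the published Poly-NL/Poly-SA operator.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.w1 = nn.Linear(channels, channels, bias=False)
        self.w2 = nn.Linear(channels, channels, bias=False)
        self.w3 = nn.Linear(channels, channels, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, N, C), N = number of spatial positions
        context = self.w2(x).mean(dim=1, keepdim=True)   # (batch, 1, C): global factor
        third_order = self.w1(x) * context                # element-wise 3rd-order interaction
        return x + self.w3(third_order)                   # residual output, same shape as x

# Usage sketch: 196 tokens (14x14 feature map) with 64 channels.
# y = LinearPolyAttention(64)(torch.randn(2, 196, 64))
```

The key design point is that every operation is either a per-position linear map or a global average, so no N x N matrix is ever materialized, while the output still mixes three linear transformations of the input multiplicatively (a 3rd-order term).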
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence ( Volume: 45, Issue: 11, 01 November 2023)
Page(s): 12726 - 12737
Date of Publication: 20 March 2023

PubMed ID: 37030770
