SDViT: Towards Efficient Visual Foundation Model via Unifying Sparse and Dense Representation Learning

IEEE Conference Publication (IEEE Xplore)