A 46 TOPS/W In-/Near-Memory Computing Processor for Large Language Model with Extended Sparse Attention

IEEE Conference Publication | IEEE Xplore