AttenPIM: Accelerating LLM Attention with Dual-mode GEMV in Processing-in-Memory | IEEE Conference Publication | IEEE Xplore