CLAT: A Clustering-Based Attention Transformer Accelerator for Low-Latency Text Generation in LLMs | IEEE Journals & Magazine | IEEE Xplore