End-to-End Acceleration of Generative Models With Runtime Regularized KV Cache Management | IEEE Journals & Magazine | IEEE Xplore