Generative Inference of Large Language Models in Edge Computing: An Energy Efficient Approach | IEEE Conference Publication | IEEE Xplore