Large Language Models (LLMs) Inference Offloading and Resource Allocation in Cloud-Edge Computing: An Active Inference Approach (IEEE)