Large Language Models (LLMs) Inference Offloading and Resource Allocation in Cloud-Edge Networks: An Active Inference Approach