Edge vs Cloud: How Do We Balance Cost, Latency, and Quality for Large Language Models Over 5G Networks? | IEEE Conference Publication