
LLMaaS: Serving Large-Language Models on Trusted Serverless Computing Platforms


Impact Statement:
The contribution of this study lies in proposing and successfully implementing an innovative solution for the partitioning, secure deployment, and distributed inference of LLMs. We achieve secure isolation and encrypted transmission of the models at the hardware level. This approach holds significant importance in practical application scenarios: it not only provides valuable insights and guidance for the secure deployment and efficient inference of LLMs on cloud platforms and within clusters, but also offers practical solutions for research and applications in related fields.

Abstract:

In recent years, the emergence of large-language models (LLMs) has profoundly transformed the way we work and live. These models have shown tremendous potential in fields such as natural language processing, speech recognition, and recommendation systems, and are increasingly playing crucial roles in applications such as human–computer interaction and intelligent customer service. Efficient inference solutions for LLMs in data centers have been extensively researched, with a focus on meeting users' quality of service (QoS) requirements. In this article, we focus on two additional requirements that responsible LLM inference should meet under QoS conditions: security throughout the model execution process and low maintenance requirements for the inference system. Therefore, we propose LLMaaS, a trusted model inference platform based on a serverless computing platform, aimed at providing inference as a service for LLMs. First, we design a trusted serverless computing platform based on softw...
Published in: IEEE Transactions on Artificial Intelligence ( Volume: 6, Issue: 2, February 2025)
Page(s): 405 - 415
Date of Publication: 17 July 2024
Electronic ISSN: 2691-4581

