
SMSS: Stateful Model Serving in Metaverse With Serverless Computing and GPU Sharing



Abstract:

With the rapid development of information technology, the concept of the Metaverse has swept the world and set off a new wave of industrial transformation. Constructing living and manufacturing scenes in the Metaverse requires the joint participation of scientists and engineers from many fields, with the "human" at the core. In the Metaverse, predicting human behavior and responses with deep learning models is valuable because the predictions enable more satisfactory services for participants. Deploying multi-stage machine learning inference models has therefore become a bottleneck in advancing the Metaverse. Thanks to its scalability and pay-as-you-go billing model, emerging serverless computing can effectively handle machine learning inference workloads. However, the statelessness of serverless computing and its lack of good GPU resource-sharing support make it difficult to deploy machine learning models directly on serverless platforms and realize these advantages. We therefore propose SMSS, a stateful model inference service deployed on a serverless computing platform that supports GPU sharing. Since serverless computing platforms do not support stateful workflow execution, SMSS adopts log-based workflow runtime support. We also design a two-layer GPU-sharing mechanism to fully exploit the potential of inter-model and intra-model GPU sharing. We evaluate the effectiveness of SMSS with real workloads. Our experimental results show that log-based stateful workflow runtime support ensures stateful task execution with low overhead while facilitating error localization and recovery, and that two-layer GPU sharing reduces the cold-start time of inference tasks by up to two orders of magnitude.
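The abstract's log-based workflow idea can be illustrated with a minimal sketch: each step's output is appended to a durable log before the next step runs, so a stateless (serverless) worker can replay the log after a crash instead of recomputing completed steps. The class and field names below are hypothetical illustrations, not SMSS's actual API.

```python
import json
import os

class LogBackedWorkflow:
    """Minimal sketch of a log-based stateful workflow runtime.

    Each step's output is persisted to an append-only log before the
    next step executes; on restart, the log is replayed to recover
    state, which also aids error localization (the last log entry
    pinpoints the step that failed). Illustrative only.
    """

    def __init__(self, log_path):
        self.log_path = log_path

    def _read_log(self):
        # Replay previously completed steps, if any.
        if not os.path.exists(self.log_path):
            return []
        with open(self.log_path) as f:
            return [json.loads(line) for line in f]

    def _append(self, entry):
        # Make the step's result durable before moving on.
        with open(self.log_path, "a") as f:
            f.write(json.dumps(entry) + "\n")

    def run(self, steps, initial_input):
        completed = self._read_log()
        state = completed[-1]["output"] if completed else initial_input
        for i, step in enumerate(steps):
            if i < len(completed):
                continue  # already durable in the log; skip on replay
            state = step(state)
            self._append({"step": i, "output": state})
        return state
```

A rerun of the same workflow replays the log and returns the stored result without re-executing any step, which is how a stateless function can resume a multi-stage inference pipeline mid-flight.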
Published in: IEEE Journal on Selected Areas in Communications ( Volume: 42, Issue: 3, March 2024)
Page(s): 799 - 811
Date of Publication: 21 December 2023


