Abstract:
One of the core functions of Natural Language Processing (NLP) is machine translation, which is essential for bridging linguistic and cultural communication gaps. However, computational expense grows with the size of large language models. Libraries such as Archer, a high-performance inference engine, work to minimize the resources required to exploit these models' predictive power by optimizing their deployment. The main goal of this study is therefore to create a machine translation benchmarking suite to support such technologies. The suite is intended to help identify and alleviate performance bottlenecks in both output quality and model execution efficiency. Its design, which can be extended to accommodate additional metrics, exemplifies a methodical, continually evolving approach to assessing machine translation models. Moreover, this work offers a building block for designing inference libraries that support serving Machine Learning (ML) models, as well as for benchmarking the inference runtime of trained models. Lastly, sparse, larger models such as Switch Transformer and the No Language Left Behind (NLLB) Mixture of Experts (MoE) model deployed with Archer are examined using the framework, and findings are also reported for dense models such as T5 and NLLB. This study advances the fields of NLP and machine translation by enhancing both the efficiency of model execution and translation quality.
Date of Conference: 24-25 October 2024
Date Added to IEEE Xplore: 15 January 2025