Abstract:
Gradient-boosted trees and random forests are among the most popular machine learning algorithms. They are applied to a broad spectrum of applications as diverse as searc...Show MoreMetadata
Abstract:
Gradient-boosted trees and random forests are among the most popular machine learning algorithms. They are applied to a broad spectrum of applications as diverse as search engine ranking, credit card fraud detection, customer relationship management, fault diagnosis in machinery, and geological studies, to name a few. Inference of these models is supported by many widely-used libraries and frameworks including Scikit-Learn, XGBoost, and ONNX Runtime. Many of the inference algorithms integrated in these libraries are optimized for performing fast predictions for large input batches, often targeted at ensemble models containing relatively shallow decision trees. This does not necessarily match well with the requirements of new emerging applications that depend on real-time predictions for individual samples or small input batches. Also, cloud-based inference services put more emphasis on small memory footprints. This paper aims to fill this gap by proposing a novel inference scheme for decision-tree ensembles that efficiently handles a wider range of application requirements and model characteristics. Performance is maximized for an extensive number of parameter combinations through a new concept in which a predict function is selected dynamically at runtime. This is done in a way that the processing resources, including the use of SIMD vector processing units and multithreading, are optimally exploited. Experiments have shown substantial performance improvements over state-of-the-art algorithms for common models.
Published in: 2023 32nd International Conference on Parallel Architectures and Compilation Techniques (PACT)
Date of Conference: 21-25 October 2023
Date Added to IEEE Xplore: 27 December 2023
ISBN Information: