Loading [MathJax]/extensions/MathMenu.js
Accelerating Decision-Tree-Based Inference Through Adaptive Parallelization | IEEE Conference Publication | IEEE Xplore
Scheduled Maintenance: On Monday, 30 June, IEEE Xplore will undergo scheduled maintenance from 1:00-2:00 PM ET (1800-1900 UTC).
On Tuesday, 1 July, IEEE Xplore will undergo scheduled maintenance from 1:00-5:00 PM ET (1800-2200 UTC).
During these times, there may be intermittent impact on performance. We apologize for any inconvenience.

Accelerating Decision-Tree-Based Inference Through Adaptive Parallelization


Abstract:

Gradient-boosted trees and random forests are among the most popular machine learning algorithms. They are applied to a broad spectrum of applications as diverse as searc...Show More

Abstract:

Gradient-boosted trees and random forests are among the most popular machine learning algorithms. They are applied to a broad spectrum of applications as diverse as search engine ranking, credit card fraud detection, customer relationship management, fault diagnosis in machinery, and geological studies, to name a few. Inference of these models is supported by many widely-used libraries and frameworks including Scikit-Learn, XGBoost, and ONNX Runtime. Many of the inference algorithms integrated in these libraries are optimized for performing fast predictions for large input batches, often targeted at ensemble models containing relatively shallow decision trees. This does not necessarily match well with the requirements of new emerging applications that depend on real-time predictions for individual samples or small input batches. Also, cloud-based inference services put more emphasis on small memory footprints. This paper aims to fill this gap by proposing a novel inference scheme for decision-tree ensembles that efficiently handles a wider range of application requirements and model characteristics. Performance is maximized for an extensive number of parameter combinations through a new concept in which a predict function is selected dynamically at runtime. This is done in a way that the processing resources, including the use of SIMD vector processing units and multithreading, are optimally exploited. Experiments have shown substantial performance improvements over state-of-the-art algorithms for common models.
Date of Conference: 21-25 October 2023
Date Added to IEEE Xplore: 27 December 2023
ISBN Information:
Conference Location: Vienna, Austria

Contact IEEE to Subscribe

References

References is not available for this document.