Abstract:
The HPCC Systems Production Machine Learning bundles provide a diverse set of features that allow the parallelized creation and training of machine learning models, along with a large set of evaluation metrics that can be used to test a trained model and ascertain its performance. To help monitor models more closely, however, a new set of evaluation methods that incorporate the analysis of clusters and the selection of features, as well as other commonly used tests, has been proposed, implemented, and tested. The implementations are written entirely in Enterprise Control Language and support the various features provided by the Machine Learning bundles, such as the Myriad Interface. This paper provides a comprehensive summary of the evaluation metrics currently available in the library before presenting the details of the design and implementation of the new evaluation methods. It then goes on to present the results of testing these implementations against those in the Python scikit-learn library, along with a few data visualisations demonstrating some uses of the implemented evaluation metrics.
Published in: 2019 4th International Conference on Computational Systems and Information Technology for Sustainable Solution (CSITSS)
Date of Conference: 20-21 December 2019
Date Added to IEEE Xplore: 12 March 2020
- IEEE Keywords
- Index Terms
- Machine Learning
- Learning Evaluation Metrics
- Machine Learning Evaluation Metrics
- Evaluation Method
- Data Visualization
- Testing Results
- Machine Learning Models
- Scikit-learn
- Before Presenting
- Set Of Metrics
- Scikit-learn Python Library
- Receiver Operating Characteristic Curve
- Receiver Operating Characteristic
- Performance Measures
- Big Data
- False Positive Rate
- K-means
- Data Clustering
- Precision And Recall
- Clustering Results
- Adjusted Rand Index
- Rand Index
- Random Labeling
- Silhouette Score
- Silhouette Coefficient
- Neighbor Clustering