Abstract:
For healthcare applications, voluminous patient data contain rich and meaningful insights that can be revealed using advanced machine learning algorithms. However, the vo...Show MoreMetadata
Abstract:
For healthcare applications, voluminous patient data contain rich and meaningful insights that can be revealed using advanced machine learning algorithms. However, the volume and velocity of such high dimensional data requires new big data analytics framework where traditional machine learning tools cannot be applied directly. In this paper, we introduce our proof-of-concept big data analytics framework for developing risk adjustment model of patient expenditures, which uses the “divide and conquer” strategy to exploit the big-yet-rich data to improve the model accuracy. We leverage the distributed computing platform, e.g., MapReduce, to implement advanced machine learning algorithms on our data set. In specific, random forest regression algorithm, which is suitable for high dimensional healthcare data, is applied to improve the accuracy of our predictive model. Our proof-of-concept framework demonstrates the effectiveness of predictive analytics using random forest algorithm as well as the efficiency of the distributed computing platform.
Published in: 2013 IEEE International Conference on Big Data
Date of Conference: 06-09 October 2013
Date Added to IEEE Xplore: 23 December 2013
Electronic ISBN:978-1-4799-1293-3