The Secrets of Data Science Deployments


Abstract:

Much attention is paid to data science and machine learning as an effective means for getting value out of data and for dealing with the large amounts of data we are accumulating at companies and organizations. This has gained importance with the major waves of digitization we have seen, especially with the COVID-19 pandemic accelerating digital everything. However, in reality, most machine learning models, despite achieving good technical solutions to predictive problems, wind up not being deployed. The reasons for this are many and have their origin in data scientists and machine learning practitioners not paying enough attention to issues of deployment in production. The issues range from failing to establish trust with business stakeholders and users, to failing to explain why models work and when they do not, to failing to appreciate the importance of a robust, high-quality data pipeline, to ignoring the many constraints that apply to deployed models, and finally to a lack of understanding of the true cost of production deployment and the associated ROI. We discuss many of these problems and provide what we believe is a pragmatic approach to getting data science models successfully deployed in working environments.
Published in: IEEE Intelligent Systems (Volume: 37, Issue: 4, July-Aug. 2022)
Page(s): 30 - 34
Date of Publication: 20 September 2022

There is much talk about the use of artificial intelligence (AI), machine learning (ML), and data science in organizations and enterprises. The reasons for this are obvious: organizations are facing huge amounts of data generated from interactions with customers and from business operations. The volume and variety of these data have increased dramatically with digitization, a trend accelerated by the COVID-19 pandemic and the new necessities it dictated: remote work and digital customer interactions. Adopting algorithms to analyze, understand, and utilize the data is a must, as human abilities cannot scale to the size and complexity of the data.
