Optimizing the Electronic Health Records Through Big Data Analytics: A Knowledge-Based View

Many hospitals struggle to use big data analytics with electronic health records (EHRs) effectively enough to generate high-quality insights for their clinical practices. Organizational learning plays a key role in improving the use of big data analytics with EHRs. Drawing on the knowledge-based view and the big data lifecycle, we investigate how three modes of knowledge can achieve meaningful use of big data analytics with EHRs. To test the associations in the proposed research model, we surveyed 580 nurses at a large hospital in China in 2019. Structural equation modelling was used to examine the relationships between the knowledge modes of EHRs and meaningful use of EHRs. The results reveal that know-what about EHRs utilization, know-how about EHRs storage and utilization, and know-why about EHRs storage and utilization can improve nurses’ meaningful use of big data analytics with EHRs. This study contributes to the existing digital health and big data literature by exploring the proper adaptation of analytical tools to EHRs from the perspective of different knowledge modes in order to shape meaningful use of big data analytics with EHRs.


I. INTRODUCTION
With the aim of improving quality of care through the meaningful use of electronic health records (EHRs), the Chinese government promulgated the Electronic Health Record Architecture and Data Standard in 2009 as a guide for hospitals. In this guide, EHRs are defined as ''a complete collection of digital clinical information documenting the clinical care rendered to an individual in the Chinese EHR Standard'' [1]. For over two decades, EHRs have been suggested to enhance the efficiency and effectiveness of healthcare services, but simply adopting an EHRs system does not guarantee those benefits. Healthcare providers need to make the EHR a routine part of the daily work system in order to realize the payback. Thus, the Health Information Technology for Economic and Clinical Health (HITECH) Act introduces the ''meaningful use'' of EHR as the goal of adoption. The main objective of the Act is to create meaningful and useful digital medical records, including the entry and storage of EHRs, and to optimize the utilization of EHRs.
The associate editor coordinating the review of this manuscript and approving it for publication was Huimin Lu.
As of 2011, clinical data had reached 150 exabytes (1 EB = 10^18 bytes) worldwide, mainly in the form of EHRs [2]. Yet considerable uncertainty remains about the use of big data analytics within EHRs and its impact on clinical performance [3]. Such struggles are due not only to insufficient funding and biased resource allocation at the national level but also to a lack of planning and governance for the use of big data analytics within EHRs at the hospital level [3], [4].
Although many hospitals in China have invested a great deal of money, time, and resources in learning to implement and utilize EHRs, they still suffer from ineffective use of big data analytics within EHRs to generate high-quality information for decision making and to reduce health disparities [3], [9]. One of the key reasons for this difficulty is insufficient consideration of how well EHRs fit the specific situation of the particular organization [9]. It is important for healthcare practitioners to pay greater attention to understanding how to absorb the diverse knowledge of EHRs. Yet little attention has been paid to understanding the role of knowledge modes in improving the use of big data analytics within EHRs. In this study, we therefore examine the relationship between knowledge about big data analytics within EHRs and the outcome of EHRs adoption (i.e., meaningful use of EHRs). The remainder of this paper is structured as follows. The next section presents our theoretical background, which leads to the development of the research model and associated hypotheses. We then describe our research method, followed by the findings and discussion, contributions to research, and implications for practice. Limitations and future research directions are discussed in the conclusion.

II. OPTIMIZING THE ELECTRONIC HEALTH RECORDS THROUGH BIG DATA ANALYTICS
The meaningful use of EHRs is crucial for improving clinical operations and healthcare service [5]. Big data analytics is a tool that enables healthcare organizations to reach this goal by optimizing EHRs through analytical algorithms. For example, Texas Health Harris Methodist Hospital Alliance utilizes medical sensor data to analyze patients' movements and monitor their actions throughout their hospital stay. In this way, the hospital can provide healthcare services more efficiently and accurately, optimize existing operations, and prevent some medical risks [6]. The use of big data analytics within EHRs is rooted in the data life cycle framework, which consists of three components: data collection, data storage, and data utilization, as shown in Figure 1. These logical components, each performing specific functions, enable healthcare practitioners to understand how to transform EHRs into meaningful clinical insights through big data analytics.
Data collection. This component contains all the data sources and content types of EHRs. In general, EHRs are divided into structured data (e.g., patient demographics, medication history, health status, and lab results) and unstructured data (e.g., diagnosis notes, clinical graphics, and medical images). These data are collected from various clinical units inside the hospital or from external units.
Data storage. The EHRs are stored in appropriate databases depending on the source of the data and the content format. This component handles data from the various data sources in two steps: transformation and storage. The transformation engine is capable of moving, cleaning, splitting, translating, merging, sorting, and validating EHRs. For instance, structured EHRs data are extracted from healthcare information systems, converted into a specific standard data format, sorted by a criterion (e.g., patient identity, health status, or medication history), and then placed in the appropriate location. In the next step, the EHRs are loaded into the target databases (e.g., a database management system (DBMS), the Hadoop Distributed File System (HDFS), or a cloud store) for further analysis.
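As a rough illustration of the two steps described above, the following Python sketch cleans, validates, and sorts a few structured records and then loads them into a target store. The field names, the `transform` and `load` helpers, and the in-memory list standing in for a DBMS or HDFS are all assumptions made for illustration; they are not taken from any particular hospital system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    patient_id: str
    health_status: str
    medication: str


def transform(raw_rows):
    """Clean, validate, and sort raw rows (hypothetical field names)."""
    cleaned = []
    for row in raw_rows:
        # Cleaning: strip surrounding whitespace from every field.
        fields = {key: str(value).strip() for key, value in row.items()}
        # Validation: skip rows that lack a patient identifier.
        if not fields.get("patient_id"):
            continue
        # Translation into one standard format: lower-case status values.
        cleaned.append(Record(
            patient_id=fields["patient_id"],
            health_status=fields.get("health_status", "").lower(),
            medication=fields.get("medication", ""),
        ))
    # Sorting: order by patient identity, one of the criteria named in the text.
    return sorted(cleaned, key=lambda record: record.patient_id)


def load(records, target):
    """Append transformed records to the target store (a list stands in for a database)."""
    target.extend(records)
    return len(records)


# Made-up rows for illustration only.
raw = [
    {"patient_id": " P002 ", "health_status": "Stable", "medication": "aspirin"},
    {"patient_id": "", "health_status": "Unknown", "medication": ""},
    {"patient_id": "P001", "health_status": " Critical", "medication": "insulin "},
]
store = []
loaded = load(transform(raw), store)
print(loaded, [record.patient_id for record in store])
```

Keeping transformation separate from loading means the same cleaning rules could be reused whichever target database is chosen.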
Data utilization. This component is used to process all kinds of EHRs and report the summarized results for clinical decision making. The analysis of EHRs includes Map/Reduce, stream computing, and in-database analytics, depending on the type of data and the purpose of the analysis. Map/Reduce can process massive volumes of unstructured and structured EHRs in batch form in a massively parallel processing environment. Stream computing can support real-time or near-real-time analysis of EHRs. Through stream computing, medical staff can track EHRs in motion in order to respond to unexpected events and determine next-best actions. In-database analytics is a commonly used data mining approach that allows EHRs to be analyzed within the database. It can provide high-speed parallel processing and offer a safe environment in which to process confidential patient information. This component also generates various visual reports and real-time, meaningful business insights derived from the analysis. The reporting system is a critical big data analytics feature that allows EHRs to be visualized in a meaningful way to support medical staff's day-to-day operations and clinical decisions.
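To make the Map/Reduce idea more concrete, the sketch below counts records per department using separate map, shuffle, and reduce phases. It runs in a single process, whereas a real Map/Reduce framework would distribute these phases across many machines; the `department` field and the three function names are hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (department, 1) pair for each record; 'department' is a hypothetical field."""
    yield record["department"], 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts collected for one key."""
    return key, sum(values)


# Made-up records for illustration only.
records = [
    {"patient_id": "P001", "department": "internal medicine"},
    {"patient_id": "P002", "department": "surgery"},
    {"patient_id": "P003", "department": "internal medicine"},
]

pairs = [pair for record in records for pair in map_phase(record)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(counts)
```

Because each map call looks at one record and each reduce call looks at one key, both phases can in principle be run in parallel, which is what makes the pattern suitable for batch processing of large record sets.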

III. RESEARCH MODEL AND HYPOTHESIS DEVELOPMENT
Prior research has acknowledged that organizational learning is an important enabler for improving the use of big data analytics within EHRs [7]-[9]. From the perspective of information technology (IT) adoption, the learning process plays a key role in the outcomes of IT adoption. When a new IT is introduced to an organization, a large amount of knowledge is brought in with it [10], [11]. Organizations need to adopt a series of learning processes to close the gap between what needs to be known and what is already known in order to understand how to use this knowledge effectively and efficiently [10]. From the knowledge-based view (KBV), knowledge plays a pivotal role in increasing an organization's competitive advantage and financial performance [12], [13]. Effective knowledge activities in healthcare not only improve the existing operational capabilities of healthcare services but also reduce care delivery costs and prevent potential medical errors [14], [15].
Drawing on the KBV, we develop our research model and associated hypotheses, as shown in Figure 2. The KBV posits that organizational knowledge is a strategic resource of an organization. It also emphasizes that creating knowledge for the production of goods and services can yield competitive advantage and organizational performance [12], [15]. In the context of EHRs implementation, effective knowledge creation from EHRs is likely to be achieved when all medical staff know what EHRs are, how they can be used properly, and why.
To understand the creation of knowledge, it is essential to explore the modes of knowledge. In general, knowledge activities can be classified into three modes according to the level of material involvement with the knowledge: knowing-what, knowing-how, and knowing-why [18]. Knowing-what refers to declarative knowledge that contains information about activities and relationships [18]. This knowledge allows organizations to understand digital health technologies in some detail, such as the principles and characteristics of the technology, and to generate certain tangible products or outcomes. In the context of EHRs, hospitals need to understand what EHRs are, what features they have, and what problems arise when they are applied in practice. As hospitals learn about EHRs, they form an attitude towards them and a basic idea of how to adopt them effectively. Thus, we propose the following hypotheses (H1a, H1b, and H1c).
Knowing-how is procedural knowledge that includes the step-by-step procedures executable in a specific system [16]. Data analysts within healthcare organizations need to gain this type of knowledge in order to process EHRs effectively and meaningfully. For example, tracking EHRs can generate real-time patient monitoring information such as alerts and proactive notifications. Data analysts need to know what the most important outputs are and how to display them, send them to interested users, or make them available in the form of dashboards in real time. Knowing-how about processing EHRs can reveal patterns of care and provide exceptional support for evidence-based medical practices. Using knowing-how, healthcare organizations can also address data quality issues by following well-defined procedures and rules in an EHRs system. Thus, we propose the following hypotheses.
Hypothesis 2a (H2a): Knowing-how about the data collection of EHRs will facilitate meaningful use of EHRs.
Hypothesis 2b (H2b): Knowing-how about the data storage of EHRs will facilitate meaningful use of EHRs.
Hypothesis 2c (H2c): Knowing-how about the data utilization of EHRs will facilitate meaningful use of EHRs.
Knowing-why is contextual knowledge that enables users to solve problems based on an understanding of contextual reasons and axiomatic principles [16], [17]. This knowledge provides explanations for the rationale behind a technology. In the context of EHRs, hospitals need to understand why EHRs should be used to generate better clinical performance. This includes examining the specific situation of their organizations and comparing alternative solutions. Organizations should also be aware of the impacts and consequences of utilizing EHRs. Besides the financial and organizational impact of EHRs, hospitals also have to address the possible challenges that arise when they use the EHRs system. In hospitals, a high level of knowing-why about EHRs can be accumulated by understanding the knowing-what and knowing-how involved in data collection, storage, and utilization of EHRs in the clinical system. Thus, we propose the following hypotheses.
Hypothesis 3a (H3a): Knowing-why about the data collection of EHRs will facilitate meaningful use of EHRs.
Hypothesis 3b (H3b): Knowing-why about the data storage of EHRs will facilitate meaningful use of EHRs.
Hypothesis 3c (H3c): Knowing-why about the data utilization of EHRs will facilitate meaningful use of EHRs.

IV. RESEARCH METHOD
A. SAMPLE AND DATA COLLECTION
We investigate the relationship between the knowledge modes of EHRs and meaningful use of EHRs among healthcare workers in China, primarily by surveying nurses after receiving ethics approval. An initial population of 1,000 nurses was obtained from a large hospital in Henan province, China. Of the first round of 1,000 questionnaires, 351 invitations were rejected owing to availability constraints. Of the 649 invitations that were seen by potential respondents, 580 responses were returned complete and usable for the data analysis, giving a response rate of 89.37%.
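The response rate reported above can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the text; the variable names are ours, and the second rate (relative to all questionnaires distributed) is included purely as a point of comparison.

```python
distributed = 1000  # questionnaires sent in the first round
rejected = 351      # invitations rejected
usable = 580        # complete, usable responses

# Invitations actually seen by potential respondents.
seen = distributed - rejected

# Response rate relative to the invitations that were seen, in percent.
response_rate = usable / seen * 100

# For comparison: rate relative to all questionnaires distributed, in percent.
overall_rate = usable / distributed * 100

print(seen, round(response_rate, 2), round(overall_rate, 2))
```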

TABLE 1. Demographic characteristics of the final sample with information of the participants (n = 580).
Non-response bias was assessed by comparing the first 25 percent with the last 25 percent of the responses for each variable using paired sample t-tests [18]. The results showed no statistically significant difference (p > 0.05) between these two groups, indicating that non-response bias did not present a problem for this study.
The demographic characteristics of the respondents are shown in Table 1. Among the 580 respondents, 86.20% were female. Most nurses (92.20%) were 40 years of age or younger: 23.30% were younger than 25 years, 40.30% were 25-30 years of age, 21.70% were 31-35 years of age, and 6.90% were 36-40 years of age. Most respondents had a bachelor's degree (91.40%). Respondent seniority (years of employment) was evenly distributed, and the largest group had a seniority of 6-10 years (31.60%). A plurality of respondents (33.28%) worked in the internal medicine department.
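A minimal sketch of such an early-versus-late comparison for a single variable is given below. It follows the paired-sample description in the text by pairing the i-th early response with the i-th late response, which is our assumption about how the pairing was done. The p-value uses a normal approximation to the t distribution, which is reasonable only for large groups (each quartile of 580 responses would hold 145), and the scores shown are made up.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def early_late_paired_test(scores):
    """Compare the first and last quartile of responses for one variable.

    `scores` must be ordered by response time. Returns the paired t statistic
    and an approximate two-sided p-value (normal approximation).
    """
    quarter = len(scores) // 4
    if quarter < 2:
        raise ValueError("need at least 8 responses")
    early = scores[:quarter]
    late = scores[-quarter:]
    # Pair the i-th early response with the i-th late response (assumption).
    diffs = [a - b for a, b in zip(early, late)]
    spread = stdev(diffs)
    if spread == 0:
        raise ValueError("differences have zero variance")
    t_stat = mean(diffs) / (spread / sqrt(quarter))
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value


# Made-up seven-point Likert scores, ordered by response time (illustration only).
scores = [5, 6, 4, 5, 7, 3, 5, 6, 4, 5, 6, 5, 4, 6, 5, 5]
t_stat, p_value = early_late_paired_test(scores)
print(round(t_stat, 3), round(p_value, 3))
```

With only four responses per group, as in this toy input, the normal approximation is too coarse for real inference; the example is meant to show the mechanics of the comparison rather than a usable result.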

B. VARIABLES AND INSTRUMENTS
The instrument used in this study was adapted from previously validated instruments (presented in Appendix 5). All independent and dependent variables were collected using an online survey completed by each participant. The scale of knowing-what, knowing-how, and knowing-why about EHRs was adapted from the study by Lee and Strong [16], who proposed the three modes of knowledge underlying data collection, storage, and utilization and examined how knowledge held by different work roles affects data quality. This scale was used to rate the level of knowledge about EHRs that each participant has acquired. A seven-point Likert-type scale was used to capture the responses, ranging from 1 = very small extent, through 4 = average, to 7 = very large extent.
The measurement of meaningful use of the EHR was developed from the regulation published by the Department of Health and Human Services (DHHS) for the year 2011-2012 [19]. Led by the Centers for Medicare and Medicaid Services, DHHS developed a list of criteria for meaningful use requirements on January 16, 2010 in response to the Health Information Technology for Economic and Clinical Health (HITECH) Act. Five items were developed according to those regulations to measure the performance of the adopted EHRs in the hospital. A seven-point Likert-type scale was used to capture the responses, ranging from 1 = strongly disagree to 7 = strongly agree.

C. MEASUREMENT VALIDITY AND RELIABILITY
The validity and reliability of the measurements were assessed using the sample data set (n = 580) collected for this study. As shown in Table 2, the loadings are all within acceptable ranges, and all but three items (for knowing-what about EHRs storage, knowing-what about EHRs utilization, and knowing-how about EHRs utilization) have loadings above the threshold of 0.5. All of the reliability coefficients (Cronbach's alphas) are above 0.80 (Table 2), confirming that the measurements are reliable. The correlations for each construct are presented in Table 3.
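For readers unfamiliar with the reliability coefficient mentioned above, the sketch below computes Cronbach's alpha from raw item scores using the usual formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the summed scale). The function name and the small response matrix are made up for illustration and are unrelated to the values in Table 2.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for one construct.

    `item_scores` is a list of respondents, each a list of k item scores.
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    items = list(zip(*item_scores))             # one tuple of scores per item
    totals = [sum(row) for row in item_scores]  # summed scale score per respondent
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Made-up responses: five respondents answering three items on a 1-7 scale.
responses = [
    [5, 6, 5],
    [3, 3, 4],
    [6, 7, 6],
    [4, 4, 4],
    [2, 3, 2],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```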
Convergent validity was assessed by three criteria: (1) item loading; (2) composite reliability; and (3) average variance extracted (AVE) [20]. The composite reliability scores range from 0.579 to 0.881. Each AVE is above 0.4 except that of KHEU (Table 2), which is acceptable. We assessed discriminant validity by checking whether each item loads more highly on its assigned construct than on other constructs, as suggested by Gefen, Straub and Boudreau [21]. Each item loading in the cross-loading table is markedly higher on its assigned construct than on the other variables. Thus, our measurements demonstrate acceptable discriminant and convergent validity.
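The two summary statistics named above can be derived from standardized item loadings. The sketch below assumes the conventional formulas, in which composite reliability is (sum of loadings)^2 divided by that quantity plus the summed error variances (1 - loading^2), and AVE is the mean of the squared loadings. The loadings used here are made up and do not correspond to any construct in Table 2.

```python
def composite_reliability(loadings):
    """Composite reliability from standardized loadings (conventional formula)."""
    total = sum(loadings)
    error = sum(1 - loading ** 2 for loading in loadings)
    return total ** 2 / (total ** 2 + error)


def average_variance_extracted(loadings):
    """AVE: the mean of the squared standardized loadings."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)


# Made-up standardized loadings for a three-item construct.
loadings = [0.7, 0.8, 0.6]
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
print(round(cr, 3), round(ave, 3))
```

Because AVE averages squared loadings, a single weak item pulls it down quickly, which is one reason a construct can show adequate composite reliability while its AVE falls below a chosen cut-off.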
In addition, we assessed the potential effect of common method bias statistically by conducting Harman's one-factor test [22], which generated ten principal constructs. The unrotated factor solution shows that the first construct explains only 11.11% of the variance. Consequently, this test suggests that common method bias is not a major concern for this study.
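One common way to approximate this check is to compute the share of total variance captured by the first unrotated principal component of the item correlation matrix. The sketch below does this in plain Python with power iteration; it is a simplified stand-in for a full factor analysis, not the procedure the authors ran, and the function names and toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows` (respondents x items)."""
    columns = list(zip(*rows))
    n = len(rows)
    standardized = []
    for column in columns:
        center, spread = mean(column), pstdev(column)
        if spread == 0:
            raise ValueError("an item has zero variance")
        standardized.append([(value - center) / spread for value in column])
    return [[sum(a * b for a, b in zip(x, y)) / n for y in standardized]
            for x in standardized]


def first_component_share(corr, iterations=500):
    """Share of total variance explained by the largest eigenvalue of `corr`."""
    p = len(corr)
    # Non-uniform start vector, to reduce the chance of starting orthogonal
    # to the dominant eigenvector.
    vector = [1.0 + i / p for i in range(p)]
    for _ in range(iterations):
        product = [sum(corr[i][j] * vector[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(value * value for value in product))
        vector = [value / norm for value in product]
    # Rayleigh quotient of the normalized vector gives the eigenvalue estimate.
    eigenvalue = sum(vector[i] * sum(corr[i][j] * vector[j] for j in range(p))
                     for i in range(p))
    # The eigenvalues of a correlation matrix sum to the number of items.
    return eigenvalue / p


# Toy data: two uncorrelated items, so the first component explains half the variance.
rows = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
share = first_component_share(correlation_matrix(rows))
print(round(share, 4))
```

A share well below one half is usually read as a sign that no single factor dominates; the 11.11% figure in the text comes from the authors' own analysis, not from this sketch.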

V. RESULTS
The results from the regression analysis are shown in Table 4. The hypotheses were assessed by checking the direction and significance of the path coefficients (β) between the dependent and independent variables. Our proposed research model is a good predictor of meaningful use of EHRs in the nursing context, explaining 60.70% of the variance (R^2 = 0.607). The results show that different modes of knowledge can be used to improve nurses' effective use of EHRs. For example, know-what, know-how, and know-why about EHRs utilization lead to improved meaningful use of EHRs; thus, H1c, H2c, and H3c are supported. This implies that EHRs utilization plays an important role in developing meaningful use of EHRs practice. In addition to EHRs utilization, we also found that nurses who know how and why EHRs are stored are more likely to use EHRs effectively. Thus, H2b and H3b are supported. Surprisingly, knowing what, how, and why EHRs are collected does not improve meaningful use of EHRs; thus, H1a, H2a, and H3a are not supported.

VI. THEORETICAL AND PRACTICAL CONTRIBUTIONS
To achieve meaningful use of EHRs strategically, prior work has developed many analytical approaches to process EHRs effectively. However, what kind of knowledge about the use of big data analytics within EHRs should be created remains unknown. By addressing this research gap, this study makes three theoretical and practical contributions. Firstly, our findings partially confirm that knowledge about the use of big data analytics within EHRs matters for meaningful use of EHRs. This is among the first studies to investigate the use of big data analytics within EHRs from a knowledge-based view. Three modes of knowledge about the use of big data analytics for EHRs are identified, and their impact on improving meaningful use of EHRs practices is tested. Based on our findings, healthcare organizations can make a strategic decision as to which types of knowledge and which big data analytics components need to be enhanced to improve meaningful use of EHRs. For example, improving meaningful use of EHRs does not require nurses to understand what EHRs are collected within a hospital, or how and why they are collected.
Secondly, we found that meaningful use of EHRs is highly influenced by knowing-what, knowing-how, and knowing-why about data utilization of EHRs, as common sense would suggest. It is particularly important to gain knowledge regarding why various analytics, such as descriptive analytics and predictive analytics, can be used for EHRs. This result is consistent with the finding of Lee and Strong [16], who recognize the critical role that knowing-why plays in producing high data quality. Indeed, the constantly increasing volume of EHRs is challenging healthcare organizations' data management capabilities [23]-[26]. The need for knowing-why about data utilization of EHRs is not unique to healthcare, but it is more important there because the results extracted from the analysis of EHRs concern patients' quality of care and wellbeing. Poor data utilization of EHRs may lead to issues such as billing errors, intentional fraud, or medical mistakes.
Thirdly, our findings show that knowledge about data collection of EHRs does not matter for improving meaningful use of EHRs. A potential explanation is that, in practice, nurses are themselves data collectors who already know how to collect accurate and complete healthcare records. Thus, knowledge about how to collect EHRs would not play an important role in improving meaningful use of EHRs. Instead, nurses are interested in knowing more about making data relevant to their daily clinical tasks.

VII. CONCLUSION
This study has some limitations that may create interesting opportunities for future research. First, this study collects data from only one large hospital as the research sample. Although the sufficient number of data points and the high response rate may represent a large portion of the population in one region, there is still a need to collect data from different hospitals to better generalize our research findings. Future research may assess potential differences among age groups, working experience groups, and clinical department groups with a more representative sample. Second, future research could apply qualitative methods to complement the methods used here, which do not support the comprehensive view required to capture the non-linear interaction among these knowledge modes [6]. Future research could consider using fuzzy-set Qualitative Comparative Analysis as a data analysis approach to better explain how different knowledge modes of EHRs simultaneously combine to achieve meaningful use of EHRs.
Our study contributes to the existing digital health, big data, and nursing literature in three ways. First, this research explores the proper adaptation of analytical tools to EHRs from the perspective of different knowledge modes in order to improve meaningful use of big data analytics within EHRs [29], [30]. Second, we identified the important knowledge modes of EHRs (e.g., know-how, know-what, and know-why about EHRs utilization), which provides evidence regarding how training programs and courses on EHRs can be designed [29]. This also extends and deepens understanding of how meaningful use of EHRs practices can be improved [30]. It could serve as useful guidance for hospital practitioners, outlining a variety of knowledge modes of EHRs on which they can focus [23], [31], [32]. Third, this research proposes a conceptual model with a knowledge-based view to explicate the different knowledge modes of EHRs in the meaningful use of EHRs practice for nursing professionals. To the best of our knowledge, no previous studies have considered the knowledge modes of EHRs as drivers of meaningful use of EHRs in the nursing context.

FIGURE 2. Proposed model of how three modes of knowledge about the use of big data analytics within EHRs achieve meaningful use of EHRs.
Hypothesis 1a (H1a): Knowing-what about the data collection of EHRs will facilitate meaningful use of EHRs.
Hypothesis 1b (H1b): Knowing-what about the data storage of EHRs will facilitate meaningful use of EHRs.
Hypothesis 1c (H1c): Knowing-what about the data utilization of EHRs will facilitate meaningful use of EHRs.
CAIFENG ZHANG was born in Henan, China, in 1962. She received the master's degree in regional economics from Henan University, China. She serves as the Associate Dean of the Kaifeng Hospital of Traditional Chinese Medicine and is also with the Hospital of Henan University of Traditional Chinese Medicine. Her research interests include big data analytics in healthcare, health communication, medical tourism, and nursing management. She has published 16 articles in her research field.
RUI MA is currently a Doctoral Researcher with the Sheffield University Management School, The University of Sheffield, U.K. Her current research interests include big data analytics in healthcare, medical tourism, and sustainable development. She received the Excellent Presentation Award at the 10th International Conference on Systematic Innovation (ICSI).
SHIWEI SUN received the Ph.D. degree in MIS from the Harbert College of Business, Auburn University, USA, in 2017. He is currently an Assistant Professor and an Associate Research Fellow with the School of Economics and Management, Beijing Institute of Technology, China. His research interests include information technology diffusion, innovation management, social media, and social networks. His research has appeared in several journals, such as the Journal of Computer Information Systems, Expert Systems with Applications, and the International Journal of Information Management, and in leading conference proceedings including AMCIS, PACIS, and DSI.

TABLE 2. Reliability and validity measures of the research model.

TABLE 5. (Continued.) The items in the questionnaire and the results of EFA.