Exploring Drivers of Staff Engagement in Healthcare Organizations Using Tree-Based Machine Learning Algorithms

Staff engagement in the work environment is vital to organizational success. Engaged staff are motivated and immersed in their work, with a sense of belonging, commitment, and loyalty toward their employer, which eventually leads to better performance and outcomes. While various organizational factors are related to staff engagement, limited research is available on what drives staff engagement in healthcare and on the relative importance of those drivers. Leveraging data-driven approaches, in this article, we employ three machine learning algorithms, random forests, gradient boosting, and extra trees, to identify the relative importance of organizational factors affecting staff engagement. We use hospital-level aggregate survey data from hospitals in the U.K. While staff engagement is the outcome variable, the following organizational factors are used as predictors in the prediction model and feature importance analysis: equality, diversity, and inclusion; safety culture; health and wellbeing; immediate managers; quality of appraisals; quality of care; bullying and harassment; violence; and team working. All three algorithms provide comparable prediction accuracy and similar feature importance rankings. The results suggest that safety culture is the most influential factor related to staff engagement, followed by team working. Healthcare managers and decision makers can benefit from this data-driven application to make informed decisions in resource allocation and prioritization efforts to improve staff engagement.

Engagement catalyzes commitment and keenness to come to work, fostering a sense of loyalty in the workplace [2]. An earlier study showed that staff perform better when they are more engaged and satisfied with their work environment [3]. Engagement may also help prevent burnout in work environments [4].
Due to the increasing importance of staff engagement in work environments, extensive research has investigated this phenomenon [5], [6]. While some researchers identified the importance of staff engagement for the success and welfare of businesses [7], [8], some also pointed out specific factors, such as communication [9], leadership [10], [11], team formation [12], knowledge management [13], and technology embeddedness [14], that impacted the efficacy of engagement-related initiatives. A recent study also showed that engagement is among the most critical aspects of technology management in healthcare [15]. Furthermore, studies discussed the impact of individuals' behaviors and emotional intelligence on improving employee engagement [16], [17]. Earlier studies also showed that establishing psychological contracts between employees and their employer leads to better staff engagement that reflects staff's trust in their employers [7], [18].
Staff engagement is an important measure in most industries, and healthcare is not an exception. The recent COVID-19 pandemic also underscored the importance of staff engagement and its possible implications, such as burnout, as healthcare systems worldwide were not adequately designed and operationalized to support human performance under such a health crisis. Recent studies also showed that efforts so far have not systematically taken human behavior and cognition into account in pandemic preparation [13] and general engineering management [19]. Furthermore, limited empirical studies are available on staff engagement [20] and resilience [10], [21] in engineering management research, despite their importance in the operations and supply chain management context. Therefore, an empirical study is essential to understand the human aspect and staff engagement in healthcare organizations. In this article, we aim to address this gap and answer the following question: What is the relative importance of organizational factors driving staff engagement in healthcare organizations?
To measure staff engagement, healthcare organizations apply various tools and methods. Surveys are the most commonly used approaches, mainly limited to exploring the association between individual factors, e.g., absenteeism [22], and staff engagement.
While such approaches are helpful in correlation analysis, advanced analytical approaches and emerging technologies can capture complex relationships between various factors and staff engagement, potentially leading to better decision making in healthcare management. Such technologies are imperative for healthcare organizations to enhance their operations [16]. In the era of big data, healthcare executives need to support data-driven healthcare operations [17].
The contribution of this article is to leverage data and innovative approaches, e.g., machine learning (ML) algorithms, to identify and rank organizational factors that affect staff engagement. Such a methodological contribution is expected to help decision makers and healthcare managers to better understand the drivers of staff engagement and prepare resources adequately for further improvement in healthcare organizations.
The rest of this article is organized as follows. Section II discusses the literature review on staff engagement and tree-based ensemble algorithms. Section III presents the methodology used in this research. Section IV explains the analysis and results achieved by applying ML algorithms. Finally, Section V presents the discussion and conclusions incorporating the study contributions, theoretical and managerial implications, and limitations and future research directions.

A. Staff Engagement
Staff engagement is a psychological state involving employees' complex feelings and emotions about the work environment [25]. Recent studies show that engaged employees are more enthusiastic about coming to work and performing assigned tasks on time and at high standards [1]. Achieving high staff productivity while maintaining mental and physical health is thus imperative to organizations, irrespective of the work nature. A recent study noted that staff involvement and engagement are also associated with service innovation quality in healthcare [26].
Staff engagement can be affected by a range of factors. For instance, increasing the resources allocated to employees positively impacts their work engagement [27]. Another study from Finland also shows a positive relationship between resources allocated to tasks and work engagement [28]. Staff engagement positively influences employees' effectiveness as they are eager to come to work and feel focused during work hours [1]. Earlier studies also showed that staff engagement leads to lower rates of intentional absenteeism [22]. This relationship was also underpinned by a recent study [29], which argued that staff engagement is an influential variable in employees' job performance, as work engagement mediates the relationship between job performance and leadership.
In the healthcare context, the effect of engagement is more evident. A recent study shows that healthcare providers suffer from high personal and work-related fatigue rates [30]. Furthermore, a recent study [6] argues that staff engagement encourages creativity, as engaged employees bring new ideas to work and efficiently utilize job resources. In National Health Service (NHS) acute trusts in England, higher quality ratings are achieved in organizations where employees are more engaged and are more likely to recommend their organizations to others [31].
Staff engagement is imperative in the work environment [14]. A recent study [22] showed the moderating effect of social support from managers and coworkers on the relationship between work engagement and job satisfaction. The study concluded that positive support, help from coworkers, and communication with supervisors increase work engagement and job satisfaction. Moreover, another survey was conducted to understand the effect of managers on employee engagement [33], which substantiated a significant association between managerial support and employee engagement [34]. Another study [5] surveyed four NHS acute trusts and concluded that effective management and teamwork lead to higher staff engagement. A recent study also showed the importance of responsible Artificial Intelligence (AI) embeddedness in staff engagement in healthcare [14].
While the studies mentioned above show that staff engagement is associated with a range of factors, the relative importance of such factors has not been discussed in the literature in detail. Such an analysis might be critical, as decision makers and healthcare managers may need to prioritize their resources based on their importance to staff engagement. To this end, ML algorithms can provide more advanced statistical capability and more reliable predictions in relative importance analysis. In this context, the effect of multiple factors on staff engagement can be studied simultaneously through ML.

B. ML Applications and Algorithms
As a subset of AI, ML has gained increasing recognition in various domains and industries [35], [36], including healthcare [37]. ML algorithms have become powerful prediction tools because they adapt to complex linear and nonlinear interrelations between predictors and outcomes. Healthcare organizations collect medical data on which ML algorithms could improve clinical and operational outcomes, such as patient safety and quality of care. For instance, ML algorithms were used in the prediction of skin cancer [38] and lung cancer [39], [40], in identifying patients that are likely to endure postoperative complications [41], [42], and in predicting whether a patient is susceptible to infections and complications after surgery [43], [44], [45]. Besides clinical risk predictions, ML algorithms were also used to analyze data extracted from wearable devices, such as wearable fitness trackers [46], [47].
As mentioned above, surveys are common approaches to assessing staff engagement. Since survey data are labeled by nature, supervised ML algorithms can be helpful. In particular, tree-based ensemble algorithms proved to be a valuable and effective tool for learning from survey data with easy-to-interpret predictive results [48], [49]. Tree-based algorithms can produce interpretable results, incorporate various predictors, and treat missing values in prediction models and feature importance analysis. These ensemble algorithms combine multiple simple decision trees to achieve a robust and accurate model [49]. Ensemble algorithms comprise two main techniques: bagging and boosting [50]. Random forest (RF) and extra trees (ET) are representative algorithms for the bagging technique, while gradient boosting (GB) is an example of the boosting approach. In this study, we use these three algorithms because they represent both boosting and bagging approaches and have demonstrated reliable predictions and feature importance analyses in the literature [51], [52].
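The two techniques can be caricatured with a toy numeric sketch in plain Python (data values invented for illustration): bagging averages many models fitted on bootstrap resamples, while boosting adds small corrections fitted sequentially to the residuals of the current model.

```python
import random

# Toy contrast of bagging vs. boosting. "Model" here is just the mean of a
# sample -- a stand-in for a decision tree; the data are invented.
random.seed(42)
y = [4.2, 5.1, 6.3, 5.8, 7.0, 6.1, 5.5, 6.8]

# Bagging (RF/ET style): fit B models on bootstrap resamples, then average.
B = 200
bagged = [sum(random.choices(y, k=len(y))) / len(y) for _ in range(B)]
bagged_pred = sum(bagged) / B

# Boosting (GB style): fit models sequentially to residuals and add them up,
# scaled by a learning rate alpha.
alpha, pred = 0.1, 0.0
for _ in range(100):
    residual_mean = sum(yi - pred for yi in y) / len(y)
    pred += alpha * residual_mean  # each "tree" predicts the mean residual

target = sum(y) / len(y)
print(round(bagged_pred, 2), round(pred, 2), round(target, 2))
```

Both routes converge toward the same target here; their real differences (variance reduction via averaging vs. bias reduction via sequential fitting) show up once the base learners are actual trees.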
1) Gradient Boosting: GB is a tree-based ensemble learning algorithm introduced by Breiman and further developed by Friedman for classification and regression models [53]. It has been used in various domains and disciplines, such as environmental science (predicting emissions of air pollutants) [54], biology [55], and healthcare [56], [57]. Here, we give a short overview of GB algorithms.
Regression and classification trees partition the space of the joint predictor variables into regions $R_j$, $j = 1, \ldots, J$, represented by the terminal nodes of the tree. A constant $\gamma_j$ is associated with each region $j$, so a tree is represented by the model
$$T(x; \Theta) = \sum_{j=1}^{J} \gamma_j I(x \in R_j)$$
where the indicator $I(x \in R_j)$ is 1 if $x \in R_j$ and 0 otherwise. As the regions may have different shapes, trees adapt to complex nonlinearity between the predictors and the response variable. The parameters $\Theta = \{R_j, \gamma_j\}_{j=1}^{J}$ are found by solving the optimization problem
$$\hat{\Theta} = \arg\min_{\Theta} \sum_{j=1}^{J} \sum_{x_i \in R_j} L(y_i, \gamma_j)$$
where $y_i$ is the response for input $x_i$ and $L(\cdot, \cdot)$ is a loss function (e.g., mean squared error). GB starts with an initial prediction that minimizes the loss function over the outputs; for the least squares loss, this is the average of the outputs (Step 1 in Algorithm GB).
Subsequently, generalized residuals $r_{im}$ are calculated (Step 2a in Algorithm GB), and a regression tree is created to find the regions $\{R_{jm}\}_{j=1}^{J_m}$ that fit these residuals (Step 2b). New fitted values $\gamma_{jm}$ are calculated for these regions in Step 2c. The fitted values are scaled by a "learning rate" factor and added to the current prediction to obtain the new predicted output in Step 2d. Repeating this procedure $M$ times improves the accuracy of the model. The pseudoalgorithm for GB is as follows [58].
1) Initialize the model with the constant value that minimizes the loss function:
$$F_0(x) = \arg\min_{\gamma} \sum_{i=1}^{n} L(y_i, \gamma).$$
2) Iterate the following steps for $m = 1$ to $M$, where $M$ is the number of trees.
a) For $i = 1$ to $n$, calculate the generalized residuals
$$r_{im} = -\left[ \frac{\partial L(y_i, F(x_i))}{\partial F(x_i)} \right]_{F = F_{m-1}}$$
where $F(x)$ is the predicted value and $i$ indexes the instances.
b) Fit a regression tree to the residuals $r_{im}$, yielding terminal regions $R_{jm}$, $j = 1, \ldots, J_m$, with $J_m$ being the size of the tree.
c) For each region, find the new value $\gamma_{jm}$ that minimizes the loss function:
$$\gamma_{jm} = \arg\min_{\gamma} \sum_{x_i \in R_{jm}} L(y_i, F_{m-1}(x_i) + \gamma).$$
d) Update the model:
$$F_m(x) = F_{m-1}(x) + \alpha \sum_{j=1}^{J_m} \gamma_{jm} I(x \in R_{jm})$$
where $\alpha$ is the learning rate.
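The pseudoalgorithm above can be sketched as a minimal from-scratch least-squares GB, with a one-split "stump" standing in for the regression tree of Step 2b; the data and parameter values are illustrative only, not the NHS data or the study's settings.

```python
def fit_stump(x, r):
    """Find the single split that minimizes squared error on residuals r."""
    best = (float("inf"), None, 0.0, 0.0)
    for t in sorted(set(x)):
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        if not left or not right:
            continue
        gl = sum(left) / len(left)    # Step 2c: for squared loss, the region
        gr = sum(right) / len(right)  # value is the mean residual
        sse = sum((ri - gl) ** 2 for ri in left) + sum((ri - gr) ** 2 for ri in right)
        if sse < best[0]:
            best = (sse, t, gl, gr)
    return best[1], best[2], best[3]

def gradient_boost(x, y, M=50, alpha=0.1):
    f0 = sum(y) / len(y)                             # Step 1: constant model
    pred = [f0] * len(y)
    stumps = []
    for _ in range(M):                               # Step 2
        r = [yi - pi for yi, pi in zip(y, pred)]     # 2a: residuals (squared loss)
        t, gl, gr = fit_stump(x, r)                  # 2b/2c: regions + fitted values
        stumps.append((t, gl, gr))
        pred = [pi + alpha * (gl if xi <= t else gr) # 2d: shrunken update
                for xi, pi in zip(x, pred)]
    return f0, stumps

# Illustrative 1-D data with two clear groups.
x = [1, 2, 3, 4, 5, 6]
y = [1.1, 1.0, 0.9, 3.2, 3.0, 3.1]
f0, stumps = gradient_boost(x, y)

def predict(xi):
    out = f0
    for t, gl, gr in stumps:
        out += 0.1 * (gl if xi <= t else gr)
    return out

print(round(predict(1), 2), round(predict(6), 2))
```

After 50 shrunken updates the model has learned the two-group structure: predictions at the low end sit near 1 and at the high end near 3, mirroring how Steps 2a-2d drive the ensemble toward the data.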
2) Random Forest: RFs are models that average over regression/classification trees to reduce variance [48], [58]. Similar to GB, they were first introduced by Breiman [59] and have been successfully applied to numerous application areas in recent years, such as computer security [60], energy [61], flood control and identifying effective reservoirs [62], and healthcare [63]. An illustration of an RF is given in Fig. 1.
An RF model proceeds as follows. First, a bootstrap sample is drawn from the training data. A tree with minimum node size $n_{\min}$ is grown on the bootstrapped data by randomly selecting a subset of $m$ features at each node and splitting on the best variable among the $m$. This process is repeated $B$ times, yielding an ensemble of trees $\{T_b\}_{b=1}^{B}$. The prediction at a new point $x$ is the average
$$\hat{f}(x) = \frac{1}{B} \sum_{b=1}^{B} T_b(x).$$
The variance of the tree ensemble is
$$\rho \sigma^2 + \frac{1 - \rho}{B} \sigma^2$$
where $\sigma^2$ is the variance of each tree, $\rho$ is the correlation between pairs of trees, and $B$ is the number of trees. Note that growing many trees decreases the second term of the variance but may increase the pairwise correlation because some rows are reused across trees. Therefore, when using RF, one has to trade off the number of trees $B$ against the pairwise correlation $\rho$.
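The variance expression above can be checked numerically; the values of $\sigma^2$ and $\rho$ below are illustrative, not estimated from any real forest.

```python
# Numeric illustration of the ensemble variance rho*sigma^2 + (1-rho)/B*sigma^2:
# as the number of trees B grows, only the correlated part rho*sigma^2 remains,
# which is why decorrelating trees (feature subsampling) matters.
sigma2, rho = 1.0, 0.3  # illustrative per-tree variance and pairwise correlation

def ensemble_variance(B):
    return rho * sigma2 + (1 - rho) / B * sigma2

for B in (1, 10, 100, 1000):
    print(B, round(ensemble_variance(B), 4))
```

With a single tree the ensemble variance equals the tree variance; adding trees drives it down toward the irreducible floor $\rho \sigma^2$.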
3) Extra Trees: ET is a further randomized form of RF. ET is similar to RF in considering a random subset of features when splitting. Additionally, in ET, random thresholds are drawn for each feature when splitting a node, instead of searching for the optimal thresholds [64]. Moreover, ET uses the entire set of training instances to train the base learners (decision trees), unlike RF, which uses row sampling with replacement. Randomizing thresholds makes ET faster than RF [50]. At the same time, ET offers a robust model with low variance [65].
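The split-selection difference described above can be sketched for a single node (plain Python, invented data): the RF-style search evaluates every candidate threshold, while the ET-style draw picks one at random in the feature's range.

```python
import random

random.seed(0)
x = [0.2, 0.4, 1.1, 1.3, 1.8]   # one feature at a node (illustrative)
y = [1.0, 1.1, 3.0, 3.1, 3.2]   # responses

def sse_of_split(t):
    """Total squared error of the two children produced by threshold t."""
    left = [yi for xi, yi in zip(x, y) if xi <= t]
    right = [yi for xi, yi in zip(x, y) if xi > t]
    if not left or not right:
        return float("inf")
    def sse(g):
        m = sum(g) / len(g)
        return sum((v - m) ** 2 for v in g)
    return sse(left) + sse(right)

# RF-style: exhaustive search over midpoints (more work, optimal split).
candidates = [(a + b) / 2 for a, b in zip(x, x[1:])]
rf_threshold = min(candidates, key=sse_of_split)

# ET-style: one uniform draw in [min(x), max(x)] (cheap, extra randomness).
et_threshold = random.uniform(min(x), max(x))
print(rf_threshold, round(et_threshold, 2))
```

The exhaustive search lands exactly between the two response groups; the random draw usually does not, but averaged over many such trees the extra randomness lowers the ensemble variance, as noted above.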
In this article, we evaluate these three algorithms to identify their prediction capability in a new application area of staff engagement. Furthermore, a comparison is made to check the robustness of feature importance analysis, where we identify the relative importance of various organizational factors influencing staff engagement.

A. Data Source
The data used in this article are extracted from the British NHS staff survey conducted by the Survey Coordination Center [66]. The data were collected and merged over a five-year period from 2015 to 2019. The survey aimed to assess staff experience in organizations under the NHS. Moreover, the survey assesses organizational changes, compares results over time, and identifies differences between employees, staff groups, and organizations.
The NHS survey includes several questions, where each group of questions represents a specific theme. In this article, we studied nine themes that may impact staff engagement. The themes and corresponding survey items are summarized in Table I.
It can be noted in Table I that themes are scored on a 0-10 point scale, with 10 corresponding to the most favorable outcome. Furthermore, to enable a fairer comparison between organizations with different characteristics, e.g., staff numbers and organization size, the data were weighted according to three types of weights: occupational group weight, trust size weight, and combined weight. Further details on the calculation of theme scores can be found in the "Technical Guide to The Staff Survey Data" [67].
In this study, the relationship between the nine themes and the output (staff engagement) is analyzed by training three tree-based ensemble algorithms: RF, GB, and ET. A prediction model was built to determine the relative importance of these factors by applying the feature importance functions of each algorithm.

B. Procedure
The dataset was analyzed using Python with various libraries, including NumPy, Matplotlib, Scikit-Learn, and Pandas. First, the dataset was cleaned for missing values by removing rows with at least one missing value (e.g., a NaN cell). In total, 12 rows were removed. The next step involved randomly splitting the dataset into a training set (80%) and a test set (20%). The three algorithms were trained, and the models were developed and optimized on the training set. The algorithms were then evaluated on the test set to measure their generalizability by calculating three error metrics.
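The cleaning and splitting steps above can be sketched in plain Python; the row contents and column names below are invented stand-ins for the NHS themes, and stand in for the Pandas/Scikit-Learn calls (dropna, train_test_split) used in the study.

```python
import random

random.seed(1)
# Hypothetical rows mimicking hospital-level theme scores (invented values).
rows = [
    {"safety_culture": 6.7, "team_working": 6.6, "engagement": 7.0},
    {"safety_culture": None, "team_working": 6.4, "engagement": 6.9},  # has a missing cell
    {"safety_culture": 6.9, "team_working": 6.8, "engagement": 7.1},
    {"safety_culture": 6.5, "team_working": 6.3, "engagement": 6.8},
    {"safety_culture": 7.0, "team_working": 6.9, "engagement": 7.2},
]

# Remove any row containing a missing value (the NaN-cell rule above).
clean = [r for r in rows if all(v is not None for v in r.values())]

# Randomly split 80% / 20% into training and test sets.
random.shuffle(clean)
cut = int(0.8 * len(clean))
train, test = clean[:cut], clean[cut:]
print(len(clean), len(train), len(test))
```

The same two-step recipe (drop incomplete rows, then a seeded random split) keeps the test set untouched during model development, which is what makes the later error metrics a generalizability check.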
The error metrics considered in this study are the mean relative error (MRE), mean absolute error (MAE), and root mean squared error (RMSE):
$$\text{MRE} = \frac{1}{N} \sum_{i=1}^{N} \frac{|y_i - p_i|}{y_i}, \qquad \text{MAE} = \frac{1}{N} \sum_{i=1}^{N} |y_i - p_i|, \qquad \text{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - p_i)^2}$$
where $y_i$ is the true value of the $i$th observation, $p_i$ is the predicted value, and $N$ is the number of observations. MAE and RMSE are expressed in the unit of the response, and each metric captures a different aspect of performance, such as the algorithms' sensitivity to outliers: MAE averages absolute errors and therefore does not emphasize outliers, whereas RMSE squares the errors, giving outliers more weight. After defining the error metrics, parameter tuning was performed on each algorithm to select the combination of hyperparameters yielding the best performance. Randomized grid search (RandomizedSearchCV) was applied to choose the best combination among a wide range of values for each hyperparameter. A tenfold cross-validation scheme, iterated a hundred times, was applied during the tuning process.
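The three metrics can be written out directly; the true and predicted values below are illustrative, not results from the NHS models.

```python
import math

# The three error metrics defined above, for paired lists of true values y
# and predictions p.
def mre(y, p):
    return sum(abs(yi - pi) / abs(yi) for yi, pi in zip(y, p)) / len(y)

def mae(y, p):
    return sum(abs(yi - pi) for yi, pi in zip(y, p)) / len(y)

def rmse(y, p):
    return math.sqrt(sum((yi - pi) ** 2 for yi, pi in zip(y, p)) / len(y))

y_true = [7.0, 6.8, 7.2, 6.9]  # illustrative engagement scores
y_pred = [6.9, 6.9, 7.0, 7.1]
print(round(mae(y_true, y_pred), 3), round(rmse(y_true, y_pred), 3))
```

Note that RMSE exceeds MAE whenever the errors are unequal, which is exactly the outlier-emphasizing behavior described above.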
Following the prediction models, a feature importance analysis is conducted by computing the impurity reduction attributable to each feature across all trees in the forest. In other words, the most important features are those used for splits near the root nodes. This function was used successfully in earlier studies [51], [68] to identify leading variables; it quantifies the contribution of each variable to predicting the outcome and assigns relative weights to each variable accordingly.
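The impurity-reduction quantity being averaged can be shown for a single split; the node values below are invented for illustration, and for regression trees "impurity" is the node variance.

```python
# Impurity (variance) decrease of one split -- the quantity that, summed per
# feature over every split in every tree and normalized, yields the relative
# feature importances. All values here are illustrative.
def variance(g):
    m = sum(g) / len(g)
    return sum((v - m) ** 2 for v in g) / len(g)

y = [6.8, 6.9, 7.1, 7.2]               # engagement scores at a parent node
left, right = [6.8, 6.9], [7.1, 7.2]   # children after splitting on a feature

n, nl, nr = len(y), len(left), len(right)
impurity_decrease = (variance(y)
                     - (nl / n) * variance(left)
                     - (nr / n) * variance(right))
print(round(impurity_decrease, 5))
```

A feature that repeatedly produces large decreases near the root accumulates a large share of the total, which is why it ranks high in the importance analysis.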

A. Basic Statistical Results
The NHS data consist of 413 hospital-level observations over a five-year period. Table II provides a detailed description of the variables. "Safe environment-violence" has the highest mean, whereas "Quality of Appraisals" has the lowest. The widest range among the variables is 2.37 for "Quality of Appraisals," while the narrowest is 0.54 for "Safe environment-violence." Moreover, while the variables are scored out of 10, the lowest observed score is 4.21, and the highest is 9.67.
To better understand the relationship between the nine themes, the strength of correlation was analyzed by calculating the Pearson correlation coefficient among the themes (see Fig. 2).
Due to the survey nature of the data, multicollinearity is expected, which may affect the performance of the algorithms. As shown in Table III, we calculated the variance inflation factor (VIF) to check for multicollinearity among variables. All VIF values are under 10, indicating no severe multicollinearity among the input features. Moreover, since tree-based algorithms are used in the analysis, the algorithms will not be significantly affected by multicollinearity [69].
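The VIF check can be sketched with NumPy on synthetic data (the features below are not the NHS themes): regress each feature on the others and compute $\mathrm{VIF}_j = 1 / (1 - R_j^2)$.

```python
import numpy as np

# Synthetic two-feature example with moderate correlation (illustrative only).
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)  # partially driven by x1
X = np.column_stack([x1, x2])

def vif(X, j):
    """VIF of feature j: 1 / (1 - R^2) from regressing it on the others."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])   # add intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - resid.var() / y.var()
    return 1.0 / (1.0 - r2)

vifs = [vif(X, j) for j in range(X.shape[1])]
print([round(v, 2) for v in vifs])
```

A VIF of 1 means a feature is uncorrelated with the rest; values approaching the conventional threshold of 10 flag the severe multicollinearity the table rules out here.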
It can be noted that most of the correlations are high, except for a few. Tree-based ensemble algorithms can help handle and minimize the effect of correlation between variables due to their random nature.
To measure the significance of the independent variables, the p-value of each theme was calculated by performing a linear regression analysis on the NHS data with the nine themes as independent variables (see Fig. 3). Results show that all themes are statistically significant (p < 0.05). Residual analysis indicates that the residuals may be assumed to be normally distributed and independent. However, the residuals-versus-fitted and scale-location graphs indicate that the linearity and equal variance assumptions may be violated. These remarks led us to use tree-based algorithms, which make no assumptions about residual distributions and do not assume a specific relation between predictors and response.
Randomized grid search (RandomizedSearchCV) was used to select the optimal set of values for each hyperparameter in the parameter-tuning phase. The range of values for each parameter was chosen to cover a wide span with small increments, so that numerous parameter combinations could be tried and results as accurate as possible achieved.
After the three tree-based ensemble algorithms were developed, the error was calculated using the three error metrics. Table V summarizes the results obtained by the three models. Errors were calculated on the test set to demonstrate the generalizability of each model. As shown in Table V, the differences in error values between algorithms are minimal. However, ET consistently achieved the lowest error values among the three algorithms, while RF achieved the highest MRE.
After obtaining the prediction results of the models, the next step is to scrutinize the variables driving the change in the outcome. The feature importance function of each algorithm was applied to identify the level of importance of each variable in predicting staff engagement. Fig. 4 shows the most influential variables after applying the feature importance functions in RF, GB, and ET. In all three algorithms, safety culture is identified as the most influential theme, achieving 36% in RF, 68% in GB, and 46% in ET. The second leading variable is the team working theme, which achieved 15% in RF, 8% in GB, and 17% in ET. It can be noted that in GB, the difference between the leading theme and the rest of the themes is quite substantial, whereas this difference is considerably smaller in RF and ET. Moreover, the ordering of the leading themes is the same across all algorithms, and the three least important themes are also shared by all algorithms. The results are consistent across all three tree-based ensemble algorithms, supporting the validity of the analysis.

A. Contributions and Managerial Implications
In this study, we aimed to explore the effect of various organizational factors on staff engagement in healthcare organizations using tree-based ensemble learning algorithms. In line with this aim, we contribute empirical evidence to the engineering management research field. As highlighted in an earlier study [20], the engineering management field needs more empirical studies in this area. Recent studies also showed the importance of empirical studies in understanding the relationship between specific organizational factors (e.g., communication [70] and leadership [11]) and technologies (e.g., AI and digitization) [14], [71] on one hand, and employees' psychological behavior and engagement on the other. As a subset of stakeholder engagement, staff engagement was also identified as the most critical criterion in technology management by panels of healthcare subject matter experts [15]. Our study applies ML algorithms in the specific context of staff engagement in healthcare. Healthcare organizations conduct surveys to gain insights into their internal operations, staff experience, and opinions on the procedures in their organizations [25], [72]. In this study, we leverage survey data to develop different ML algorithms for the prediction model and feature importance analysis.
This article provides empirical support for the notion of a relationship between organizational factors and staff engagement in healthcare organizations. The developed ML algorithms can be used as standalone decision support tools to better represent and visualize the impact of different organizational factors on staff engagement. This research provides valuable guidance for healthcare managers and decision makers to consider various organizational factors affecting staff engagement. Our analysis showed that safety culture is the most important factor associated with staff engagement.
The results are aligned with earlier studies showing the significance of a positive safety culture in healthcare organizations [73]. A recent study showed the association between safety climate and job satisfaction [74]. Another study highlighted that giving staff more freedom to speak up when observing medical errors or wrong practices is vital for engagement [75]. Such results were also acknowledged in the engineering management field [76]. Based on the feature importance ranking, the results can help managers prioritize improvement activities, such as safety culture enhancement. It is vital for managers to understand the importance of various resources and interactions of capabilities within healthcare systems [10]. Because safety culture is relatively more important than the other factors, healthcare managers may consult with hospital safety units to better understand how safety policy and practice can be further improved for staff engagement.
As earlier studies highlighted the importance of a strong data-driven decision-making culture in hospitals [24], this study can also help healthcare executives better facilitate the use of survey data in resource allocation and prioritization. Considering that staff involvement and engagement positively affect technology management [15], service innovation [26], and digital transformation [71], this study can also encourage healthcare managers to look for further opportunities to leverage the organizational culture in their organizations.

B. Limitations and Future Work
Our study has limitations and methodological constraints that offer opportunities for future research. First, this study is intraindustry research focusing solely on healthcare. Moreover, since the data come from U.K. hospitals, generalizability to other countries, domains, and industries may be limited. Although we provide context-rich healthcare survey data, other industries may need further validation to see what drives their staff engagement in their unique contexts. While our survey data are comprehensive, there might also be other dimensions that could affect staff engagement in organizations, such as employees' personalities (e.g., extroverts versus introverts), language, cultural barriers, and other sociodemographic information. Moreover, the NHS staff survey data did not exhibit strongly nonlinear behavior, which may mean that the full capabilities of the algorithms were not exercised.
Moreover, while the VIF analysis showed little multicollinearity, the variables were still correlated; although tree-based algorithms can handle collinearity, such correlations might still affect the results of the feature importance analysis. Another possible limitation is the use of randomized grid search (RandomizedSearchCV). While we preferred it over a full grid search (GridSearchCV) due to computational power limitations, future research may also test and validate hyperparameter values through a full grid search.

C. Conclusion
In this article, the main objective was to use NHS staff survey data to scrutinize the importance of several dimensions of staff engagement in healthcare organizations. This was achieved using three tree-based algorithms: RF, GB, and ET. The results achieved by the three algorithms were consistent, with similar error values. Moreover, when performing feature importance analysis using RF, GB, and ET, all of the algorithms provided similar feature importance rankings. The top three themes were common to all algorithms, with safety culture being the most important.
Healthcare organizations should pay more attention to staff engagement in the work environment. Our results suggest that the best way of ensuring better staff engagement is to improve the safety culture within healthcare organizations, which is a subset of organizational culture and a paramount concern [77]. Employees feel more engaged [73] and more innovative [78] in their work environment when their respective organization responds to their concerns.
Based on the study findings, we recommend leveraging survey data with ML algorithms to better understand what drives staff engagement. Considering healthcare organizations' dynamic and complex nature, decision makers and healthcare managers may need to prioritize investment options and resource allocations across different organizational factors. Based on the results of this study, safety culture can be a priority for healthcare organizations to better benefit from such investments and enhance staff engagement.