A Deep Learning H2O Framework for Emergency Prediction in Biomedical Big Data

Recently, the design and implementation of new healthcare systems have gained interest in both industry and academia. The amalgamation of the Internet of Things, cloud, edge computing, and big data fosters new scenarios for smart medical services and applications. Deep learning is currently receiving a lot of attention for its use with big healthcare data. To this end, the main objective of this study is to propose a Deep Learning H<sub>2</sub>O (DLH<sub>2</sub>O) framework that improves performance and selects the optimal features for predicting emergency cases. The proposed DLH<sub>2</sub>O framework consists of a data preprocessing layer, a feature selection layer, and a deep learning layer. The DLH<sub>2</sub>O framework aims to find the optimal subset of features and minimize the classification error through a proposed new variant of the Whale Optimization Algorithm (WOA) called ACP-WOA. The proposed changes are made to the parameters <inline-formula> <tex-math notation="LaTeX">$a$ </tex-math></inline-formula>, <inline-formula> <tex-math notation="LaTeX">$a2$ </tex-math></inline-formula>, <inline-formula> <tex-math notation="LaTeX">$A$ </tex-math></inline-formula>, and <inline-formula> <tex-math notation="LaTeX">$C$ </tex-math></inline-formula>, which affect both exploration and exploitation in WOA. Experiments are conducted to test the validity of the DLH<sub>2</sub>O framework; five experiments are performed on the datasets for this purpose. The results demonstrate the superiority of ACP-WOA over other state-of-the-art meta-heuristic algorithms in terms of time, error, and scalability. The proposed ACP-WOA is also tested on the CEC2017 benchmark functions, where it proves more accurate than WOA.


I. INTRODUCTION
Throughout the world, there has been a gradual increase in life expectancy, leading to a dramatic increase in the number of elderly people [1]. Extensive research has indicated that approximately 89% of the elderly are likely to live independently, and that 80% of people over the age of 65 suffer from at least one chronic disease, making it difficult for them to take care of themselves [2]. Emergency cases emerge very quickly and need immediate intervention and medical support. A faster response in such cases can be lifesaving, and technology can play a major role in achieving it [3].
Healthcare systems are an important field in which the Internet of Things (IoT) promises substantial changes and has a major impact [4]. IoT has become one of the 21st century's most effective connectivity paradigms. IoT expands the definition of the Internet, as all things in our everyday life become part of the Internet due to their communication and computational capacities [5], [6]. The Body Sensor Network (BSN) is one of the most important technologies used in the modern IoT-based healthcare system. It is essentially a collection of low-power and lightweight wireless sensor nodes used to monitor the functions of the human body and its surroundings [7]. The synergy between healthcare and technology has taken a big leap across the world. For example, IoT and Big Data Analytics are increasingly gaining popularity for the next generation of eHealth and mHealth services [27].
The associate editor coordinating the review of this manuscript and approving it for publication was Chun-Wei Tsai.
In the world of the IoT, the things and the persons in charge are at the center of a complex network of device-based interactions, so it is no surprise that IoT has a complex ecosystem. The interconnected devices produce large amounts of data and require extreme-scale parallel computing systems for processing [8]. The convergence of IoT with computing technologies such as cloud, edge, and fog computing has therefore become necessary. The modern healthcare system is also closely connected with research in big data. However, research on modern healthcare systems has not considered the issue of real-time requirements. Training data mining and machine learning algorithms on the generated data is a key challenge because this process can take a very long time [9]. The main challenge here is the time and the accuracy of the learning process. The accuracy should be maximized while the time should be minimized, so this can be seen as an optimization problem.
Evolutionary Algorithms (EAs) have been proposed for solving big data optimization problems, which include many variables and must be solved in a short period of time; however, most of them lack the scalability needed to handle big data problems [10]. Nature-inspired metaheuristic algorithms imitate biological or physical phenomena when solving optimization problems. They can be classified into three groups: physics-based, swarm-based, and evolution-based methods. Physics-based techniques imitate the physical rules of the universe. Swarm-based methods imitate the behavior of animals. Evolution-based techniques are derived from the laws of natural evolution [11].
The H2O framework is a basic framework for multi-layer neural networks that can be used to perform Deep Learning (DL) tasks. Deep Learning Architectures (DLA) are hierarchical feature extraction models, usually including multiple levels of non-linearity. DL models are able to learn useful representations of raw data and demonstrate high performance on complex data such as images, voice, and text [12].
The main objective of this study is to predict emergency cases in intensive care units. Accordingly, this study introduces a Deep Learning H2O (DLH2O) framework that can deal with the big data generated by the BSN and provides a new variant of the Whale Optimization Algorithm (WOA) called AC-Parametric WOA (ACP-WOA). The ACP-WOA extracts the most important features that will be used in the learning process. The new variant of the WOA is compared to the original WOA and to other state-of-the-art metaheuristic algorithms. The proposed algorithm is also tested on the CEC2017 benchmark functions, where it proves its superiority.
The paper is organized as follows: In Section 2, the related work is reviewed. In Section 3, the WOA is discussed. In Section 4, the proposed DLH2O framework is described. The proposed ACP-WOA is described in Section 5. In Section 6, the experiments are presented and the results are analyzed. In Section 7, the paper is concluded.

II. RELATED WORK
In recent years, there has been a massive growth of data, resulting in the creation of big data. Big data is characterized by large volume and velocity; thus it needs new high-performance processing. Classical distributed computing models include MPI (Message Passing Interface), MapReduce, and Dryad. One of the most widespread open-source implementations of MapReduce is Apache Hadoop [23]. Despite its great popularity, MapReduce (including Hadoop) does not scale well when dealing with online and iterative processes. Apache Spark is considered an alternative to Hadoop, as it can perform distributed computing faster using in-memory primitives [24].
Alfian et al. [28] proposed a personalized healthcare monitoring system for diabetic patients that uses Bluetooth Low Energy (BLE)-based sensors and real-time data processing. Elhoseny et al. [29] proposed a cloud-IoT model in an integrated Industry 4.0 environment for HCS. The proposed architecture is executed using three different algorithms: Genetic Algorithm (GA), Particle Swarm Optimization (PSO), and Parallel Particle Swarm Optimization (PPSO). The proposed fitness function is composed of three essential attributes: CPU utilization, turn-around time, and waiting time. Ijaz et al. [30] proposed a hybrid prediction model (HPM) that combines DBSCAN-based outlier detection, SMOTE, and an RF classifier. The proposed model is intended to help users identify the risk of diabetes and hypertension at an early stage.
The H2O framework provides modern Deep Learning. Unlike the neural networks of the past, it performs very well on a variety of problems, which makes it a strong choice for such problems. The H2O framework is big data-friendly: its fast in-memory distributed parallel processing can be used for better predictions [25]. H2O can be used from R, Java, and Python to enhance machine learning for big data. For production deployment, a developer need not worry about differences between the production and development environments: once H2O models are created, they can be used as Java objects. H2O algorithms can be employed by many kinds of users, from business analysts to developers acquainted with various programming languages. By applying in-memory compression techniques, H2O can handle billions of data rows in memory, even with a small cluster. H2O includes many common machine learning algorithms, such as Naive Bayes, generalized linear modeling (for example, logistic regression and linear regression), time series, gradient boosting, k-means clustering, principal components analysis, random forest, and deep learning.
Modern DL differs from past neural networks in that it provides scalability, training stability, and generalization with big data. Because it performs well on a number of different problems, DL is often the choice that provides the highest accuracy. DL has many theoretical frameworks; however, this study concentrates on the feed-forward design used by H2O. The neuron, a model that imitates the biological human neuron, is the fundamental unit of the model. In humans, output signals of varying strength travel through the synaptic junctions and are then combined as input to activate a connected neuron [26].
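The feed-forward idea described above can be illustrated with a minimal sketch: each artificial neuron combines its weighted inputs with a bias and passes the sum through a non-linear activation. The function names, the choice of tanh as the activation, and the list-based layer layout are assumptions made for illustration only; they are not taken from the H2O implementation.

```python
import math


def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through tanh (assumed activation)."""
    alpha = sum(w * x for w, x in zip(weights, inputs)) + bias
    return math.tanh(alpha)


def forward(inputs, layers):
    """Propagate inputs through a feed-forward network.

    `layers` is a list of layers; each layer is a list of (weights, bias)
    pairs, one pair per neuron. This layout is hypothetical.
    """
    activations = inputs
    for layer in layers:
        activations = [neuron_output(activations, w, b) for w, b in layer]
    return activations
```

A real classifier would typically replace the activation of the last layer with a softmax over the classes; the sketch applies tanh everywhere to stay short.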
There has been extensive research on biomedical big data sets. Mirjalili and Lewis [11] demonstrated the superiority of WOA over other meta-heuristic algorithms and conventional techniques. Nakamura et al. [13] conducted experiments comparing several meta-heuristic algorithms (BAT, FFA, and PSO) but did not include WOA in the comparison. Sharawi et al. [14] introduced a feature selection approach using WOA and showed that WOA can find the best features with maximum accuracy. Hassan et al. [15] showed that a hybrid algorithm of WOA and Naïve Bayes (NB) saves storage space and accelerates the classification process. Several methods for enhancing WOA have appeared in the literature. These methods can be classified into hybridization, improvement, and variation of the traditional whale algorithm [16]. Gharehchopogh and Gholizadeh [16] have proposed a new hybrid PSO-WOA approach for some unconstrained benchmark functions. Hybrid PSO-WOA uses PSO for the exploitation phase. The weakness of PSO is that, with a constant inertia weight, it covers only a limited search space when solving higher-order or complex design problems. WOA is used for the exploration phase because its logarithmic spiral feature ultimately covers a vast area of the unknown search space. Therefore, WOA guides the particles toward an optimal value more efficiently and reduces the processing time [17].
With regard to improving the WOA, the Levy flight trajectory-based WOA (LWOA) has been proposed. The LWOA makes WOA more efficient and quicker and avoids premature convergence. The Levy flight trajectory is used to improve population diversity and the ability to escape local optima. This method obtains a better trade-off between the exploration and the exploitation of the WOA [18]. Elhosseini et al. [19] have introduced a new variant of WOA that tunes a small number of control parameters, namely A and C, which affect exploration and exploitation. This new variant of the WOA is applied to benchmark functions and compared to some well-known state-of-the-art algorithms. The results show the superiority of their proposed WOA over the other compared techniques.

III. WHALE OPTIMIZATION ALGORITHM (WOA)
The Whale Optimization Algorithm (WOA) is a heuristic search algorithm that was proposed in 2016 by Seyedali Mirjalili and Andrew Lewis [16]. WOA was proposed for continuous optimization problems. It is also used as a wrapper-based feature selection algorithm and has been shown to perform better than some existing algorithms. WOA is inspired by nature: it mimics the hunting behavior of humpback whales. Each solution in WOA is considered to be a whale, and each whale tries to find prey. Whales use two mechanisms to locate and attack prey: first, they encircle the prey, and second, they create a bubble net around it before feeding. In optimization terms, exploration corresponds to a whale searching for prey, and exploitation corresponds to attacking it.

A. INSPIRATION
Whales are fascinating creatures and are considered to be the largest animals in the world. An adult whale can grow up to 30 meters long and weigh up to 180 tons. Whales are generally considered predators. Whales never fully sleep because they need to breathe at the surface of the ocean; only half of the whale's brain sleeps at a time. Whales are also known to be very intelligent and emotional creatures [16].
According to Hof and Gucht [20], certain regions of whale brains contain cells similar to those found in human brains. These cells are referred to as spindle cells. In humans, they are responsible for judgment, sentiment, and social behavior. In other words, spindle cells differentiate humans from other organisms. The main reason for the whales' intelligence is that they have more of these cells than an adult human. Whales can think, understand, judge, communicate, and even become as emotional as a person, though at a lower level of intelligence. It has been noted that whales (mostly killer whales) are also capable of developing their own dialect.
One of the most interesting behaviors of whales is their social behavior. They are usually seen in groups, although they can also live alone. Some species (such as killer whales) can live their whole life in a family [11]. Humpback whales are characterized by their distinct hunting method. They create distinctive bubbles along a circle or a '9'-shaped path. This foraging behavior is named the bubble-net feeding method [21].

B. A MATHEMATICAL-BASED MODEL AND OPTIMIZATION ALGORITHM
The model of the WOA involves three phases. The first phase mimics the search for prey (the exploration phase), the second phase is prey encircling, and the third phase represents the bubble-net feeding method of humpback whales (the exploitation phase). The pseudocode of the WOA is shown in Algorithm 1.

1) PREY ENCIRCLING
In this phase, WOA begins with an initial best candidate solution, which is assumed to be the target prey or close to the optimum. The remaining search agents then update their locations with respect to the best search agent. This is expressed by equations 1 and 2 as follows:

$$\vec{D} = \left| \vec{C} \cdot \vec{X}^{*}(t) - \vec{X}(t) \right| \tag{1}$$

$$\vec{X}(t+1) = \vec{X}^{*}(t) - \vec{A} \cdot \vec{D} \tag{2}$$

where $t$ is the iteration number, $\vec{A}$ and $\vec{C}$ are coefficient vectors, $\vec{X}$ is the location vector, and $\vec{X}^{*}$ is the location vector of the best solution obtained so far. $\vec{X}^{*}$ is updated in every iteration in which a better solution is found. The vectors $\vec{A}$ and $\vec{C}$ are computed by equations 3 and 4:

$$\vec{A} = 2\vec{a} \cdot \vec{r} - \vec{a} \tag{3}$$

$$\vec{C} = 2 \cdot \vec{r} \tag{4}$$

where $\vec{r}$ is a random vector in [0, 1] and $\vec{a}$ is linearly decreased from 2 to 0 over the iterations. This model permits any search agent to update its location in the neighborhood of the current best solution and thus mimics prey encircling. The search agents can move in hyper-cubes around the best solution obtained so far.

2) BUBBLE-NET ATTACKING METHOD (EXPLOITATION PHASE)
This phase involves two methods as follows:
1) Shrinking encircling mechanism: The value of $\vec{a}$ in equation (3) is decreased, which also shrinks the fluctuation range of $\vec{A}$. Figure 1 displays the possible positions from (X, Y) towards (X*, Y*) that can be attained by 0 ≤ A ≤ 1 in a two-dimensional space.
2) Spiral updating position: As explained in figure 2, the distance between the prey and the position of the whale is measured; a spiral equation is then generated between the prey and whale positions to imitate the helix-shaped movement of whales. This can be modeled by equations 5 and 6:

$$\vec{X}(t+1) = \vec{D}' \cdot e^{bl} \cdot \cos(2\pi l) + \vec{X}^{*}(t) \tag{5}$$

$$\vec{D}' = \left| \vec{X}^{*}(t) - \vec{X}(t) \right| \tag{6}$$

Equation 6 gives the distance of the i-th whale to the prey (the best solution obtained so far), $b$ is a constant that defines the shape of the logarithmic spiral, and $l$ is a random number in [−1, 1]. The whale swims around its prey within a shrinking circle and along a spiral-shaped path simultaneously. A probability of 50% is assumed for selecting between the two modes, and the whale's next position is found by equation 7 as follows [11]:

$$\vec{X}(t+1) = \begin{cases} \vec{X}^{*}(t) - \vec{A} \cdot \vec{D} & \text{if } p < 0.5 \\ \vec{D}' \cdot e^{bl} \cdot \cos(2\pi l) + \vec{X}^{*}(t) & \text{if } p \geq 0.5 \end{cases} \tag{7}$$

where $p$ is a random number in [0, 1].

3) EXPLORATION PHASE
In this phase, the whale performs a global search. Humpback whales search for prey randomly, as shown in figure 3. The vector $\vec{A}$ takes random values greater than 1 or less than −1 to move the search agent away from the reference whale. The location of a search agent is updated with respect to a randomly chosen agent rather than the best one, which permits the whale algorithm to carry out a global search. The exploration mechanism is mathematically described by equations 8 and 9 as follows:

$$\vec{D} = \left| \vec{C} \cdot \vec{X}_{rand} - \vec{X} \right| \tag{8}$$

$$\vec{X}(t+1) = \vec{X}_{rand} - \vec{A} \cdot \vec{D} \tag{9}$$

where $\vec{X}_{rand}$ represents the position of a random whale chosen from the current population.
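The three phases described in this section can be put together in a short sketch of the standard WOA loop for a minimization problem. This is an illustrative sketch rather than the paper's Algorithm 1: the function name `woa_minimize`, its default parameters, the clipping of positions to the bounds, and the use of one scalar A and C per agent (instead of full vectors) are assumptions made to keep the example compact. The `sphere` objective is only a stand-in test function.

```python
import math
import random


def sphere(x):
    """Simple stand-in objective: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def woa_minimize(objective, dim, bounds, n_agents=20, max_iter=100, b=1.0, seed=0):
    """Sketch of the standard WOA; returns (best_position, best_score)."""
    rng = random.Random(seed)
    lo, hi = bounds
    agents = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    best = min(agents, key=objective)[:]
    best_score = objective(best)

    for t in range(max_iter):
        a = 2.0 - 2.0 * t / max_iter          # decreases linearly from 2 towards 0
        for i in range(n_agents):
            x = agents[i]
            A = 2.0 * a * rng.random() - a    # equation (3), scalar simplification
            C = 2.0 * rng.random()            # equation (4), scalar simplification
            p = rng.random()
            l = rng.uniform(-1.0, 1.0)
            if p < 0.5:
                # |A| < 1: encircle the best whale (equations 1 and 2);
                # otherwise explore around a random whale (equations 8 and 9).
                ref = best if abs(A) < 1.0 else agents[rng.randrange(n_agents)]
                new_x = [ref[j] - A * abs(C * ref[j] - x[j]) for j in range(dim)]
            else:
                # Spiral update around the best whale (equations 5 and 6).
                spiral = math.exp(b * l) * math.cos(2.0 * math.pi * l)
                new_x = [abs(best[j] - x[j]) * spiral + best[j] for j in range(dim)]
            # Keep the whale inside the search bounds (an added assumption).
            agents[i] = [min(max(v, lo), hi) for v in new_x]
            score = objective(agents[i])
            if score < best_score:            # refresh the best solution at once
                best, best_score = agents[i][:], score
    return best, best_score
```

For example, `woa_minimize(sphere, dim=3, bounds=(-10.0, 10.0))` searches a three-dimensional box; the returned score never exceeds that of the best whale in the initial population, because the best solution is only ever replaced by a better one.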

IV. THE PROPOSED DEEP LEARNING H2O (DLH2O) FRAMEWORK
The proposed DLH2O framework is used to optimize the multiclass classification problem for the intensive care unit. The DLH2O framework, as shown in figure 4, consists of three layers. The first layer is the pre-processing layer, which is mainly responsible for data integration and data cleaning. In this layer, the dataset is also divided into training, validation, and testing subsets. The second layer is the feature selection layer, where the proposed ACP-WOA is used to select the best features for use in the third layer, the DL layer. The DL layer uses the selected features along with the best configuration of the neural network to train the network. In the following subsections, the layers of the proposed DLH2O framework are explained in detail.

A. THE PREPROCESSING LAYER
Raw data is often incomplete and inconsistent and may contain many errors. This layer is responsible for handling these issues. The data in hand is divided into three datasets. The first dataset is for a hypertensive patient, the second is for a hypotensive patient, and the third is for a normotensive patient. Each dataset covers a period of one year for its patient. In the first phase of preprocessing, normalization is applied to some columns in these datasets; each value in these columns is rescaled. The second phase of preprocessing removes useless columns such as the time column. In the third and last phase of preprocessing, the labels of the entries in the dataset are converted from string values to numeric values. The classes (normal, warning, alert, emergency) are converted to 1, 2, 3, 4, respectively.
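The three preprocessing phases can be sketched as follows. The text does not state which rescaling is used, so min-max scaling to [0, 1] is an assumption here, as are the column names (`time`, `label`) and the dictionary-per-row representation; only the class-to-number mapping comes from the description above.

```python
# Class labels and their numeric codes, as listed in the text.
LABELS = {"normal": 1, "warning": 2, "alert": 3, "emergency": 4}


def min_max_rescale(values):
    """Rescale values to [0, 1]; min-max scaling is an assumed choice."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def preprocess(rows, numeric_columns, drop_columns=("time",), label_column="label"):
    """Apply the three phases to a list of row dictionaries (hypothetical layout)."""
    rows = [dict(row) for row in rows]          # work on copies
    # Phase 1: normalize the selected numeric columns.
    for col in numeric_columns:
        scaled = min_max_rescale([row[col] for row in rows])
        for row, value in zip(rows, scaled):
            row[col] = value
    # Phase 2: remove columns that carry no predictive information.
    rows = [{k: v for k, v in row.items() if k not in drop_columns} for row in rows]
    # Phase 3: convert string labels to numeric classes.
    for row in rows:
        row[label_column] = LABELS[row[label_column]]
    return rows
```

In practice the same steps would more likely be applied to a data frame, but the order of operations would be the same.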

B. THE FEATURE SELECTION LAYER
In this layer, the proposed ACP-WOA is used in the feature selection process. Representing the columns as zeros and ones, that is, as a binary string, not only simplifies the problem but also turns it into a one-dimensional problem. The proposed algorithm is then used to determine a value from 0 to $2^{NTF} - 1$, where NTF is the total number of features. The generated value is converted to a sequence of zeros and ones in binary format. A zero marks a column (feature) that will not be taken into consideration in the learning process carried out later by the neural network. A one means that the corresponding column is added to the list of selected features. The selected features are then used to train the neural network in order to reach the most accurate model according to the fitness function described in equation 10.

$$\text{Fitness} = \frac{1}{M} \sum_{i=1}^{M} MCE_{i} \tag{10}$$

where $MCE_{i}$ is the Mean Per-Class Error of the i-th trial. The learning process takes place for M trials, and the average of the MCE over these trials is taken as the fitness value. The main goal is to minimize this value.
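The encoding described in this subsection can be sketched as follows: an integer in the search range is decoded into a binary mask, the mask selects feature names, and the fitness is the average error over repeated trials, as in equation 10. The bit order (most significant bit maps to the first feature), the callback `train_and_score`, and the penalty returned for an empty subset are assumptions introduced for illustration.

```python
def decode_feature_mask(value, n_features):
    """Convert an integer in [0, 2**n_features - 1] to a list of booleans."""
    if not 0 <= value < 2 ** n_features:
        raise ValueError("value outside the search range")
    bits = format(value, "0{}b".format(n_features))
    # Assumed convention: the leftmost bit corresponds to the first feature.
    return [bit == "1" for bit in bits]


def select_features(value, feature_names):
    """Return the feature names whose bit is set to one."""
    mask = decode_feature_mask(value, len(feature_names))
    return [name for name, keep in zip(feature_names, mask) if keep]


def fitness(value, feature_names, train_and_score, trials=10):
    """Average mean per-class error over `trials` runs (equation 10).

    `train_and_score` is a hypothetical callback that trains a model on the
    given features and returns its mean per-class error.
    """
    selected = select_features(value, feature_names)
    if not selected:
        return 1.0  # assumption: an empty feature subset gets the worst error
    errors = [train_and_score(selected) for _ in range(trials)]
    return sum(errors) / len(errors)
```

With 11 features the search range is 0 to 2047, so the value 2047 selects every feature and the value 0 selects none.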

C. THE DEEP LEARNING LAYER
The proposed ACP-WOA, the WOA, the PSO, the BAT, and the Firefly (FFA) algorithms are investigated in this layer. The objective is to assess the performance of the proposed ACP-WOA.

THE CLINICAL DATASET
The datasets belong to elderly patients suffering from blood pressure disorders, and they have been taken from PhysioNet MIMIC-II [22].

V. THE PROPOSED ACP-WOA
The whale algorithm is a very powerful meta-heuristic algorithm that works in two phases: exploration and exploitation. In these two phases, some parameters are used to configure the algorithm, and some of them affect the performance dramatically. Most of the whale variants can achieve better quality. However, researchers have paid less attention to preserving the simplicity of the standard WOA. In this section, the work focuses on tuning the parameters of the standard WOA that govern its two main phases. The proposed changes are made to the parameters a, a2, A, and C, which affect both exploration and exploitation in WOA. These changes are described in the following equations.
As noted, the parameters a and a2 vary with time and decrease more slowly because the denominator is changed to the square of the maximum number of iterations, which narrows their range of change. The updates of A and C are changed to a sinusoidal form, so that their fluctuation is more efficient than plain randomization. The proposed modified whale algorithm is given in Algorithm 2.
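To make the effect of the squared denominator concrete, the sketch below compares the standard linear schedule of a with one possible reading of the description above. The second function is a hypothetical interpretation, not the authors' equation, which is not reproduced in this text; the sinusoidal forms of A and C are not sketched because their exact expressions are not given here.

```python
def a_standard(t, max_iter):
    """Standard WOA schedule: a decreases linearly from 2 to 0."""
    return 2.0 - 2.0 * t / max_iter


def a_squared_denominator(t, max_iter):
    """Hypothetical reading: the same schedule with max_iter squared in the denominator."""
    return 2.0 - 2.0 * t / max_iter ** 2


if __name__ == "__main__":
    max_iter = 50
    for t in (0, 25, 50):
        print(t, a_standard(t, max_iter), a_squared_denominator(t, max_iter))
```

Under this reading, a stays much closer to 2 over the whole run, which matches the statement that the parameter decreases more slowly over a narrower range.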

VI. EXPERIMENTAL RESULTS
The following experiments are performed on a VMware virtual machine running Linux with 2.8 GB of RAM and 2 processor cores. The number of features is 11, so the range of values generated by the optimization algorithm is from 0 to 2047. Each trial is repeated 10 times and the average is taken as the fitness score. A neural network with a single hidden layer of 10 neurons is used. The experiments are conducted to test the validity of the DLH2O framework; five experiments are performed for this purpose. Three public datasets with 35233 samples are used in these experiments: hypertensive patient, hypotensive patient, and normotensive patient. The number of classes is 4.
The datasets have been partitioned into three disjoint subsets Z1, Z2, and Z3, which correspond to the training, validation, and test sets, respectively. We have tested the proposed ACP-WOA against WOA, PSO, FFA, and BAT. The final subset of features is the one that minimizes the error over Z3. The related discussion is presented in the next subsections. The number of neurons in the hidden layer is set to 10.

2) EXPERIMENT 2
The proposed optimizer is tested against the other optimizers to check its performance. This process is run for 50 iterations to reach a stable and good solution. The results of these experiments are shown in figures 6, 7, and 8 and summarized in figure 9 and Table 1.
The ACP-WOA and WOA outperform the other algorithms on the hypertensive patient and normotensive patient datasets. On the hypotensive patient dataset, the ACP-WOA and WOA achieve the second-best result.

3) EXPERIMENT 3
This experiment compares the number of features selected by each optimizer. The results are shown in figure 10 and Table 2. The ACP-WOA algorithm selects an acceptable number of features and avoids the extremes, unlike BAT, which selects all eleven features, and PSO, which selects four features.

4) EXPERIMENT 4
This experiment compares the time taken by each algorithm to run. The results are displayed in figure 11 and Table 3. Time is critical in this application because it concerns emergency case prediction. The error must also be taken into consideration, because an algorithm may produce very good predictions but take too much time. Conversely, results may be produced very quickly but with poor performance and wrong predictions. The ACP-WOA offers a favorable combination of time and error and has shown its superiority over the other state-of-the-art algorithms. Although the ACP-WOA is ranked second-best in time, it is ranked first when error and time are considered together.

5) EXPERIMENT 5
This experiment checks the scalability of the DLH2O framework using different optimization techniques. The results are displayed in Table 4 and figures 12, 13, 14, 15, and 16. The ACP-WOA, WOA, and firefly algorithms are scalable, but PSO and BAT are not. This means that the former group can be trained with more data and generate more accurate results.
For the CEC2017 benchmark functions, the tests are run while changing the number of search agents to 100, 50, and 25. F2 has been deleted from the benchmark, so this function is skipped in the tests.

VII. CONCLUSION
The healthcare environment is considered to be 'information rich' yet 'knowledge poor'. Healthcare systems contain a huge amount of data. However, there is a shortage of effective analysis tools for finding hidden relationships in the data. A deep learning framework for the prediction of emergency cases in elderly people is introduced. This DLH2O framework consists of three layers: a pre-processing layer, a feature selection layer, and a deep learning layer. A new variant of the Whale Optimization Algorithm (WOA) called AC-Parametric WOA (ACP-WOA) is introduced. Five experiments have been performed to validate the DLH2O framework using three public datasets. ACP-WOA has been compared against WOA, PSO, FFA, and BAT. ACP-WOA has achieved acceptable time, error, scalability, and accuracy. The proposed algorithm proved to be more scalable and faster than the original whale optimizer. ACP-WOA is also more accurate than most of the compared algorithms. As future work, new metaheuristic algorithms will be compared against the proposed whale algorithm. The effect of increasing the complexity of the datasets used in deep learning also remains to be investigated.
VOLUME 8, 2020