Stroke Screening Feature Selection for Arm Weakness Using a Mobile Application

This work studies the features of a proposed automated stroke self-screening application that utilizes the gyroscope and accelerometer devices in smartphones to determine the possible onset of a stroke by assessing arm muscle weakness. The application requires users to perform two arm movements to evaluate arm weakness and pronation: Curl-up and Raise-up. For the purpose of the study, 68 subjects, consisting of 36 stroke patients with symptoms of arm weakness and 32 healthy subjects, consented to participate. A total of 78 handcrafted features were proposed, 26 of which were extracted from Curl-up and Raise-up for each arm. Then, the differences between corresponding features for each arm were calculated. These features were then tested on 63 combinations of three classical feature selection methods, three feature sets (i.e., Curl-up-only features, Raise-up-only features, and both-exercises combined features) and seven well-known classification methods. The results from ten runs of 10-fold cross-validation showed that Curl-up-only features achieved an average sensitivity of 83.3%, significantly higher than those of the Raise-up-only features or both-exercises features. From all possible combinations, the random forest classification based on information gain feature selection from Curl-up-only features achieved the most efficient results for arm-weakness-stroke screening. It achieved an average sensitivity of 94.8%, an average specificity of 75.2%, an average accuracy of 84.1%, and an average area under the receiver operating characteristic curve of 85.0%. Our work proposes a novel accessible method to screen symptoms of arm weakness that may indicate the onset of a stroke using a single mobile device. In the future, we can combine this method with other methods of evaluating facial drooping and slurred speech to create a complete Face, Arm, Speech, Time (FAST) assessment application.


I. INTRODUCTION
According to the World Health Organization, 15 million people worldwide are diagnosed with strokes each year, 6.2 million of whom die. Stroke is the leading cause of disability in the United States and the second leading cause of death globally [1]. Annually, approximately 800,000 people in the United States alone suffer from this illness. Globally, more than 10% of deaths are due to stroke, making stroke the third leading cause of death worldwide. A stroke is a condition in which poor blood flow to the brain causes brain cells to die [2]. There are two main types of strokes: ischemic stroke and hemorrhagic stroke. Ischemic stroke occurs due to the stoppage of blood flow to the brain, while hemorrhagic stroke occurs due to bleeding in the brain [3]. A stroke causes abnormal functioning of the brain and, thus, the entire body.
The associate editor coordinating the review of this manuscript and approving it for publication was Praveen Gunturi.
All signs and symptoms related to a stroke are sudden and unexpected. Several severe signs, if noticed in time, can save a substantial number of lives and decrease the severity of the effects of the attack. Immediate action is required; otherwise, there is a risk of loss of brain function. The signs are as follows [4]:
1) Sudden numbness in the face, leg, or arm region
2) Numbness that can occur on only one side of the body
3) Inability to speak or understand speech
4) Sudden vision issues
5) Abrupt disorientation or dizziness along with loss of balance
6) Intense headache with no plausible reason
Treatment is imperative once a stroke occurs; victims need immediate and proper medical treatment before the damage becomes permanent. Early detection of stroke symptoms can prevent much more severe problems. To this end, lists of stroke symptoms were developed so that people could identify strokes and receive treatment immediately after onset.
In 1998, a method to improve stroke symptom identification, FAST, was developed. FAST (Face, Arm, Speech, and Time) is a stroke early-identification method that assesses the patient's capability of performing certain tasks associated with stroke symptoms [5]. This method of identification primarily consists of four tasks. First, the patient is asked to smile to examine signs of facial drooping, an indicator of muscle weakness. Second, the patients are asked to raise both of their arms, parallel to the floor, to determine whether they encounter muscle weakness. Third, the patients are instructed to repeat a certain phrase in order to recognize speech difficulties. Finally, if the patients show any of these symptoms, they should call the ambulance immediately to have the shortest possible treatment delay [5].

A. STROKE SCREENING METHODS
Aroor et al.'s study results reflected an interesting aspect of the relationship between age and FAST symptom coverage [6]. The study showed that younger stroke patients tend to be missed by the FAST method more often than older patients. Corroborating this information, another study was conducted using 5,023 stroke patients aged 18-55 years, aiming to evaluate the effectiveness of FAST for younger patients with stroke. The results of the study showed that at least one FAST symptom is identified in 69.1% of young patients (18-24 years old), 74% of middle-aged patients (25-34 years), 75.4% of those between 35 and 44 years, and 77.8% of 45-55 year-old patients. The proportion of stroke patients identified by the FAST method is lower for younger patients than for older patients [7].
To improve the identification accuracy of FAST, researchers revised the mnemonic and added two more symptom indicators, B (Balance) and E (Eye-vision), to create BE-FAST. According to a study conducted by Aroor et al., 14.1% of patients who suffered from strokes did not possess any FAST symptoms. However, this proportion of missed patients significantly decreased by 9.6% when aspects of balance loss (B) and ocular complications (E) were added [6]. In another similar study, Berglund et al. analyzed the effectiveness of the FAST method for stroke identification. The research was conducted using a sample of 179 emergency calls from patients who reported conditions of strokes. Of the 179 patients, 64% were actually reported to have strokes; however, only 90% of them exhibited FAST symptoms [8].
Another type of stroke-screening scale is known as the RACE scale (Rapid Arterial Occlusion Evaluation) [9]. It is used for stroke detection and is simpler to use. It is mainly used for acute stroke patients. It is a simple scale with more inclination towards occlusion. It has five main items, with a score ranging from zero to nine. Zero means that the patient is normal, and nine indicates that occlusion is present and a severe stroke has occurred [9].
CPSS, also known as the Cincinnati Prehospital Stroke Scale, is a system used to diagnose the onset of a stroke without the presence of a medical examiner [10]. It is used to evaluate three major signs of a stroke: facial palsy, arm weakness, and speech abnormalities. If abnormalities in these signs are observed, the patient needs to be taken to a medical center for immediate care. Patients detected by this test have a 72% chance of having an ischemic stroke if only one of the signs is present [11].
Another useful tool to detect the possible onset of a stroke is transcranial Doppler ultrasound (TCD). TCD is a medical imaging tool that allows measurement of the velocity of blood flow in intracranial arteries. It can be used to detect circulating cerebral emboli and could enable rapid treatment and prevention of embolus-related stroke. However, while using TCD, the embolic signals (ES) share very similar characteristics with those of artifacts (AF) caused by patient movements. Therefore, human experts are required to analyze the audio and spectral characteristics, making it prone to human errors. An efficient automated algorithm to distinguish ES is greatly desired, but no systems have been unanimously agreed upon for routine clinical use. In our previous works [12]- [14], we proposed an automated algorithm based on an adaptive neuro-fuzzy inference system (ANFIS) that allows for real-time detection of embolic signals (ES) when used with TCD. The system achieved results of 91.5% sensitivity, 90.0% specificity, and 90.5% accuracy. Prior to that, we reported on the use of ANFIS as a classifier. We performed feature extraction on captured ES candidates using the adaptive wavelet packet transform (AWPT) and the fast Fourier transform (FFT) and proposed the seven best features for the system. ANFIS was then used to classify the extracted features as non-ES or ES. The results of this study significantly outperformed the results from the combination of features and algorithms of Karahoca and Tunga [15]. In another prior study, we investigated the use of a deep convolutional neural network (CNN) as a tool for cerebral ES detection [13]. The study did not yield better results compared to using ANFIS.

B. STROKE DETECTION USING MOBILE DEVICES
Even though tools such as CPSS are available for diagnosing the severity and probability of a stroke, the detection of stroke symptoms has been more focused on for mobile applications. Compared to wireless medical devices, mobile applications on smartphones offer a more efficient, portable, and accessible alternative that could help detect FAST symptoms. Nougeira et al.'s study developed an application that incorporates an automated decision-making algorithm in combination with a series of clinical questions to assess the likelihood of a patient experiencing a stroke. Upon detection, the application analyzes its database of all regional stroke centers and directs Emergency Medical Services (EMS) to the most suitable stroke center for a given case. However, the application is only targeted for EMS physicians to handle and not the general public [16]. With every smartphone's gyroscope and accelerometer, an application can be developed to measure the movement of a patient's arm to allow detection of arm weakness. Applications also allow users to pinpoint their exact location to help find nearby emergency hospitals in case of a stroke. The American Stroke Association has developed a mobile application known as ''Spot a Stroke F.A.S.T.'' This application allows the user to recognize the signs of a stroke and respond quickly by getting in touch with the emergency services [17].
Accelerometers are instruments that quantitatively measure the acceleration of a body in movement of any form. They are unique tools to measure physical activities. In mobile phones, accelerometers are used to detect the orientation of the mobile device, as well as the pressure applied to fingerprint scanners and touch screens [18].
Nam et al. used mobile phone devices as a qualitative method for detecting arm weakness, drift, and pronation caused by the onset of a stroke. Two mobile devices were placed on each forearm and held firmly in place with straps above the wrists. A support system was developed as a part of the application for decision-making situations and emergencies [17]. Mukherjee and Arvind proposed a method to classify muscle strength. The method uses a wearable device that combines a triaxial gyroscope, an accelerometer, and a magnetometer. Ten healthy subjects simulated four levels of muscle strength of the biceps brachii by performing the biceps curl motion with different resistance loads [19].
A gyroscope is a device that spins around freely on its axis and can alter the direction in which it is spinning. If the device is tilted, this does not affect the axis orientation. This allows the device to provide stability. A gyroscope is used as a reference when finding the direction by navigation systems. Gyroscopes are used in mobile phones to detect when users change orientation, such as tilting the screen during gaming and shaking [20].
Gyroscopes and accelerometers are inexpensive and small enough for easy transportation. Capela and Lemaire carried out a study that utilized a wearable smartphone for physical activity detection in stroke patients [21]. A comparison was also made with other stroke-free individuals in the populace. Certain parameters were chosen to be tested for each populace. Physical activity detection outside the medical center is a valuable parameter to be monitored after a stroke. The use of accelerometers and gyroscopes has made it easier and more affordable to shift studies outside of controlled environments. The study measured the total activity per day of the subjects through the body-worn sensors. Physical activity included walking on flat ground, standing, sitting on furniture, climbing staircases, etc. Other mundane tasks were also taken into account, such as washing dishes, doing laundry, combing hair and brushing teeth. A control sensor was used as a standard, and videos of activities were recorded. The data collected were processed. A mean and a standard deviation were given for each populace. The results showed that the gyroscope was capable of distinguishing between phases of idle sitting and standing up. Accelerometers measured movement and acceleration of the subjects. Moreover, the gyroscope detected minute movements by the subjects, making it suitable for neurological examination. Additionally, the accelerometer could detect changes in acceleration when climbing up and down stairs [21]. This confirms the validity and reliability of both sensors.
Previous works [22]- [24] attempted to utilize the gyroscope and accelerometer as separate wearable devices. However, using these miniaturized wearable devices raises concerns regarding battery life and signal fusion. Maximizing energy efficiency to extend battery life is vital for medical media technologies, especially due to the emerging trends in miniaturized wearable devices. Various wireless body sensor networks have different requirements regarding power, data rate, and other parameters. Conventional approaches with constant transmission power are inappropriate for use in healthcare purposes due to their inefficient power management and energy-saving capability. Sodhro et al. proposed a transmission power control (TPC)-based energy-efficient algorithm (EEA) for use in wireless body sensor devices that track a subject in three different postures (i.e., walking, running, and standing). Compared with the previous adaptive TPC (ATPC) method, their experimental results showed that EEA achieved 42.5% energy savings with an acceptable packet loss ratio (PLR). Despite enhanced energy savings, the main limitations of the newly proposed EEA are its higher packet loss ratio and high standard deviation. Zhang et al. proposed a self-adaptive power control-based enhanced efficient-aware approach (EEA) to reduce energy consumption, extend battery lifetime, and improve battery reliability. They evaluated the proposed method by analyzing real-time data traces of static and dynamic postures, comparing it to conventional constant TPC methods. Their experimental results showed that the proposed EEA enhances the energy efficiency, reliability, and sustainability, while constant TPC does not.
When using wearable wireless body sensors, data may be obtained from multiple devices. However, data may be less meaningful when derived from an array of individual signals. This highlights the need for a multisensor fusion method to connect data from a multitude of sensor sources and transform them into high quality fused data that can predict events with higher confidence. Muzammal et al. proposed a data fusion enabled ensemble approach to fuse together medical data obtained from a collection of wireless body sensor networks (BSNs) to predict the presence of heart disease. They developed a fog-based computing environment that facilitates communication between the wearable sensors and the system. For classification, a kernel random forest ensemble was used, which produced better quality results than random forest. Given these limitations, we strive to find a more efficient and accessible method to utilize the gyroscope and accelerometer in measuring arm weakness in one device.
In our recent work [25], we utilized the gyroscope and accelerometer in mobile devices to collect arm movement data, aiming to detect early symptoms of arm weakness for stroke patients. The novelty of this method is the utilization of the gyroscope and accelerometer signals in creating an accessible self-screening application that detects arm weakness to screen the possible onset of a stroke. Focusing on detecting the arm factor of FAST, subjects were asked to perform two arm exercises while carrying a mobile device and using the MAWD application. The application has been designed to appear as a game so that patients would not be stressed during data collection. With only results from the arm factor of FAST, the study yielded an accuracy of 61.7%-74.1% and an average area under the ROC curve (AUC) of 66.2%-81.5%.

C. AIM OF THIS STUDY
The primary objective of this study is to further develop the information and analysis presented in our previous work [25]. We aim to follow the same methods as in our previous work [25], with the addition of increased numbers of subjects, feature selection methods, and classification methods. Moreover, a more in-depth graphical analysis will be presented.
The main contributions of this study are as follows:
1) A novel stroke-screening method is proposed. The proposed method analyzes the gyroscope and accelerometer signals of a smartphone, collected from patients performing two arm movement exercises (Raise-up and Curl-up), and assesses arm muscle weakness. The data from these signals are extracted into 78 features. Due to the low complexity of these features, most modern smartphones are able to calculate them independently. The novelty of this method lies in the utilization of gyroscopes and accelerometer devices, available in most modern smartphones, to detect arm muscle weakness and predict the possible onset of a stroke. By using measuring tools that are commonly available in mobile phones, we are able to diagnose arm muscle weakness by using a single device. This contributes to the accessibility of our proposed method, as most people have access to some type of smartphone. Ultimately, we plan to combine features of facial drooping and slurred speech factors to craft a complete FAST stroke-screening application as illustrated in Fig. 1.
2) Two arm exercises (Raise-up and Curl-up) were crafted to detect arm weakness and pronation. The study focuses on assessing arm weakness as a possible indicator of a stroke (different nerves). The Curl-up movement is designed to measure muscle strength of the biceps brachii [19]. The Raise-up movement is designed to evaluate pronator drift, an indicator of mild arm weakness.
3) The highest accuracy achieved from this study, despite only assessing the arm weakness symptom as an indicator of a stroke, is better than that achieved from the assessment of normal witnesses. From all possible combinations investigated in this study, the random forest classification based on information gain feature selection from Curl-up-only features achieved 84.1% accuracy. In the general population, people are able to identify whether an observed symptom is an indicator of stroke only approximately 60 to 80 percent of the time [26].
FIGURE 1. The proposed approach of this particular study is boxed in red. The blue box includes features that are planned to be added in future studies to create the complete application.
Table 1 summarizes four distinct stroke screening methods and highlights the advantages and disadvantages of the proposed approach in comparison to previous relevant methods. Section II describes some related theories of feature extraction. Section III proposes our handcrafted feature extraction. Section IV describes the process of data collection and the use of the numerous results for cross-validation and multiple types of feature selection and classification. Section V presents the results. Section VI presents the discussion and conclusion of this study. Lastly, Section VII proposes future works and possible developments.
VOLUME 8, 2020

A. VELOCITY AND DISPLACEMENT CALCULATION
Acceleration data were measured with the accelerometer sensor module. In general, velocity is the first derivative of displacement and acceleration is the second. In a continuous-time system, the acceleration must be integrated to compute the velocity. However, the data were collected in a discrete-time system. Therefore, we used the trapezoidal method to estimate the velocity v[n] from the retrieved acceleration data a[n] with sampling interval T [27] as

$$v[n] = v[n-1] + \frac{T}{2}\left(a[n] + a[n-1]\right) \tag{1}$$

One concern was that the acceleration value may include noise. As a solution, we set the value to zero if the absolute value of the acceleration was less than the noise level threshold (ε_a) [27]. Our selected threshold is ε_a = 0.5 m/s^2. Another concern was that the noise in the acceleration caused unwanted ramping in the integrated velocity. To correct this, the velocity was integrated using the Omega Arithmetic method [28]. The Omega Arithmetic method applies the discrete-time Fourier transform (DTFT) and integrates in the frequency domain to adjust the signal. The acceleration is defined by Eq. (2), and combining Eq. (2) and Eq. (3) gives Eq. (4). This method is not suitable for some low-frequency signals due to the appearance of interference signals, called ''1/f noise'', and we assume that no subject can Curl-up over ten times per second. To ameliorate this issue, we added cutoff frequencies (f_cL for the lower cutoff frequency and f_cH for the upper cutoff frequency) into Eq. (4). Based on the preliminary experiment, we suggested using 1 Hz for the lower cutoff frequency and 10 Hz for the upper cutoff frequency. The relationship between displacement and velocity is the same as the relationship between velocity and acceleration, so the displacement is obtained from the velocity in the same way.

B. GYROSCOPE SIGNAL ADJUSTMENT
Due to a limitation of the gyroscope, the output of the gyroscope is limited to (−π, π] for yaw (α) and roll (γ) and to (−π/2, π/2] for pitch (β). In addition, the gyroscope signals must be normalized due to the varying initial angles, θ, at which users hold their mobile phones while performing the Raise-up or Curl-up exercises. After normalizing the gyroscope signals, incorrect angles occasionally occurred, as shown in Fig. 2. To correct these angle shifts, we applied the arctan and arctan2 functions after the gyroscope signal normalization [29].
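Returning to the velocity estimate of Section II-A, the three steps described there (noise thresholding, trapezoidal integration, and band-limited integration in the frequency domain) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the frequency-domain step assumes that the Omega Arithmetic method amounts to dividing each retained frequency bin by j2πf, and a naive DFT is used only to keep the sketch free of external libraries.

```python
import cmath
import math


def threshold_noise(acc, eps=0.5):
    """Zero out samples whose magnitude is below the noise threshold eps (m/s^2)."""
    return [0.0 if abs(a) < eps else a for a in acc]


def integrate_trapezoid(acc, dt):
    """Cumulative trapezoidal rule: v[n] = v[n-1] + dt/2 * (a[n] + a[n-1])."""
    vel = [0.0]  # assumption: the subject starts at rest
    for n in range(1, len(acc)):
        vel.append(vel[-1] + 0.5 * dt * (acc[n] + acc[n - 1]))
    return vel


def integrate_band_limited(acc, dt, f_low=1.0, f_high=10.0):
    """Integrate in the frequency domain, keeping only f_low <= |f| <= f_high.

    Each retained DFT bin is divided by j*2*pi*f (integration); every other
    bin is set to zero, which suppresses low-frequency drift. The naive
    O(N^2) DFT is an illustration choice, not a performance recommendation.
    """
    n = len(acc)
    spectrum = [
        sum(acc[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        # Signed frequency of bin k in Hz.
        f = (k if k <= n // 2 else k - n) / (n * dt)
        if f_low <= abs(f) <= f_high:
            spectrum[k] /= 2j * math.pi * f
        else:
            spectrum[k] = 0.0
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]
```

With the 1 Hz and 10 Hz cutoffs suggested in the text, a constant offset in the acceleration contributes nothing to the result, whereas the plain trapezoidal rule would turn it into a ramp.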

C. FINDING THE PERIODICITY
A correlation signal is an indicator that shows the relationship or similarity between signals. An autocorrelation signal (R_xx) is the correlation of a signal with a delayed copy of itself. It shows the similarity of the signal to itself, and hence its periodicity. In its definition, N is the signal sample length, and 0 ≤ m < n < N. Chatfield [30] showed that the time period of the signal is the time at which the correlated value first peaks. We used Algorithm 2 to find the periodicity of a signal, such as an acceleration or gyroscope signal.
Algorithm 1 Calculate the First Maximum of x With Starting Index i_0, Window Length Th_w, and Threshold Th_h
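The idea of reading the period from the first peak of the autocorrelation can be sketched as below. The algorithm bodies are not shown in the text, so this is an assumed reading: the autocorrelation is mean-removed and normalized so that the zero-lag value is 1, and a lag is accepted as the first maximum when it exceeds a threshold and is the largest value within a window of neighboring lags. The parameter names `start`, `window`, and `threshold` loosely mirror the starting index, window length, and threshold named in the algorithm caption; their default values are hypothetical.

```python
def autocorrelation(x):
    """Normalized autocorrelation for lags m = 0 .. N-1 (value 1.0 at lag 0)."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    energy = sum(v * v for v in centered)
    if energy == 0.0:
        return [0.0] * n
    return [
        sum(centered[i] * centered[i - m] for i in range(m, n)) / energy
        for m in range(n)
    ]


def first_maximum(r, start=1, window=5, threshold=0.2):
    """Return (lag, value) of the first local maximum of r at or after start.

    A lag qualifies when its value reaches the threshold and is the largest
    value within +/- window lags. Returns None when no lag qualifies.
    """
    for m in range(max(start, 1), len(r) - 1):
        lo = max(0, m - window)
        hi = min(len(r), m + window + 1)
        if r[m] >= threshold and r[m] == max(r[lo:hi]):
            return m, r[m]
    return None
```

Multiplying the returned lag by the sampling interval gives the estimated period in seconds; for a Curl-up recording this would correspond to the duration of one curl cycle.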

D. CURL-UP AND RAISE-UP EXERCISES
The pronator drift test, in which a patient is asked to lift both arms in the air with forearms supinated and eyes closed, is a neurological examination to detect signs of cerebral damage affecting the motor nerve for pronation [31]. In patients with arm weakness, the hand on the affected side tends to turn over. This test is similar to the standard arm test of FAST. We adapted this test such that the pronator drift can be measured with a single mobile phone, measuring one arm at a time, and named it the ''Raise-up'' exercise. The accelerometer and gyroscope were used to measure the lifting distance and stability of the patient's arms.
The arm strength test used by Mukherjee and Arvind [19] focuses on measuring the strength of biceps brachii muscles. Because weakness rather than spasticity is the main factor interfering with voluntary force control in chronic stroke [32], we adopted this test's primary motion of 'elbow flexion against gravity' to measure arm weakness for each arm by using a mobile device, and we called this the ''Curl-up'' exercise. The accelerometer and gyroscope were used to measure the applied force and the periodicity of cycling of the biceps curl.

A. DATA PREPROCESSING
The gyroscope and acceleration signals collected from our application were enhanced by the Savitzky-Golay smoothing filter, the same configuration as in [33], and the DC offset removal method, called normalization. After normalizing the gyroscope signal, we replaced the incorrect angle as described in Section II-B.
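The DC offset removal and the angle correction mentioned above can be illustrated with a short sketch. It rests on two assumptions that the text does not spell out: that gyroscope normalization means subtracting a reference angle (for example, the initial holding angle), and that the arctan2 correction is the usual wrap of an angle difference onto its principal value. The function names are hypothetical, and the Savitzky-Golay smoothing step is left out because its configuration is given only by reference to [33].

```python
import math


def remove_dc_offset(signal):
    """Subtract the mean so that the signal is centered on zero."""
    mean = sum(signal) / len(signal)
    return [v - mean for v in signal]


def normalize_angle(angles, reference):
    """Subtract a reference angle and wrap each result into [-pi, pi].

    atan2(sin(d), cos(d)) maps any angle difference d onto its principal
    value, removing the 2*pi jumps that plain subtraction can introduce.
    """
    wrapped = []
    for a in angles:
        d = a - reference
        wrapped.append(math.atan2(math.sin(d), math.cos(d)))
    return wrapped
```

For example, a reading of 3.0 rad taken against a reference of −3.0 rad differs by 6.0 rad before wrapping and by about −0.28 rad after it, which is the physically meaningful rotation.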

B. ARM WEAKNESS SCREENING POSTURES
Before the exercises, all subjects consented to the agreements and conditions of the ''Software Usage Agreement'' in the application and documented personal information including age, gender, and dominant hand. Then, they classified themselves by selecting their health conditions:
• Level 1 - Stroke patients with arm muscle weakness
• Level 2 - Regular healthy subjects
• Level 3 - Healthy subjects who regularly exercise
After completing the introductory part of the application, subjects were instructed to perform four exercises as follows:
1) Curl-up - Right hand
2) Raise-up - Right hand
3) Curl-up - Left hand
4) Raise-up - Left hand
The Curl-up exercise was designed to measure the subject's arm strength: more specifically, the biceps and triceps muscles. As explained in the instruction part of the application, subjects are instructed to sit in a straight and relaxed posture with their mobile device held in the indicated hand. After clicking ''Start'' on the screen, each subject should be in the ''Prepare'' posture, placing their hands on their lap with their palms facing up as they wait for the ''Go'' signal. The ''Go'' signal will sound for five seconds after clicking. After the ''Go'' signal, subjects will begin to perform the biceps curl. To perform the Curl-up, subjects will contract their biceps to elevate their forearm towards their chest. Then, they will relax their biceps to allow the forearm to descend back towards their lap. These motions should be repeated as rapidly as possible. Subjects will eventually stop after time runs out at 15 seconds. This exercise is then repeated with the other arm.
After completing the Curl-up exercise, subjects will proceed to perform the Raise-up exercise. The Raise-up exercise was designed to follow the standard procedures to detect muscle weakness for the arm factor in FAST. For Raise-up, subjects will also begin in their ''Prepare'' posture as they wait for the ''Go'' signal. Once the ''Go'' signal is played, subjects will extend their arm horizontally, parallel to the floor, with their palms facing upwards. The phone should rest on the extended arm's palm. On the mobile application, subjects will see two circles on the screen: one white stationary circle on the center of the screen and one adjustable circle that correspondingly moves with the subject's palm angle. The objective of this exercise is to align the adjustable circle with the stationary center circle and maintain that balance for as long as possible. By doing so, subjects will earn points, and their scores will increase. The Raise-up exercise will end after 20 seconds.
While performing the exercises, accelerometer and gyroscope data are being collected by the application with a 20 millisecond sampling interval. The accelerometer collects data of the device's acceleration on the X , Y , and Z axes, represented as x, y, and z, respectively. The gyroscope provides data on the device's orientation in space; pitch (β), roll (γ ), and yaw (α) represent rotations around the X, Y, and Z axes, respectively.

C. DEFINITION OF SIGNAL PARTS
We are interested in analyzing three parts of the two exercises. The first point of interest is the last 10 seconds of the Curl-up exercise, called the ''curl part''. This part reflects the strength of the arm muscles and the consistency of exertion. The next point of interest is the first 2 to 12 seconds of the Raise-up exercise, called the ''raise part'', which indicates the distance that the subjects can lift and the time needed for the subject to maintain stability after lifting the mobile phone. The third point of interest is the last 10 seconds of the Raise-up exercise, called the ''stable part''. During this part, the subjects must try to balance their phone, parallel to the ground, at all times.
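The three signal parts can be cut out of a recording by converting the stated times into sample indices. The sketch below assumes the 20 ms sampling interval given in the text and recordings that start at the ''Go'' signal; the function and key names are hypothetical.

```python
SAMPLE_INTERVAL_S = 0.02  # 20 ms sampling interval stated in the text


def slice_seconds(signal, start_s, end_s, dt=SAMPLE_INTERVAL_S):
    """Return the samples recorded in the half-open interval [start_s, end_s)."""
    start = max(0, round(start_s / dt))
    end = max(start, round(end_s / dt))
    return signal[start:end]


def split_parts(curl_signal, raise_signal, dt=SAMPLE_INTERVAL_S):
    """Cut the three analysis windows from one Curl-up and one Raise-up recording."""
    curl_len_s = len(curl_signal) * dt
    raise_len_s = len(raise_signal) * dt
    return {
        # Last 10 seconds of the Curl-up exercise.
        "curl": slice_seconds(curl_signal, curl_len_s - 10.0, curl_len_s, dt),
        # Seconds 2 to 12 of the Raise-up exercise.
        "raise": slice_seconds(raise_signal, 2.0, 12.0, dt),
        # Last 10 seconds of the Raise-up exercise.
        "stable": slice_seconds(raise_signal, raise_len_s - 10.0, raise_len_s, dt),
    }
```

Under these assumptions, a 15-second Curl-up and a 20-second Raise-up recording each yield three windows of 500 samples.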
Gyroscope and accelerometer data were retrieved with a 20 millisecond sampling interval to ensure that the method can be run with the majority of mobile devices in the global market. According to portrait orientation, the X axis runs from left to right, the Y axis runs from bottom to top, and the Z axis runs from back to front. The acceleration data display the device's acceleration on three axes, x, y, and z, which constitute the device acceleration along the X , Y , and Z axes, respectively. Gyroscope data reveal the device's orientation in space; yaw (α), pitch (β), and roll (γ ) represent the rotations around the Z , X , and Y axes, respectively.
We defined the sum of the rotation signals in both motions from θ_β and θ_γ, the orientations in pitch and roll. The sum is classified as follows: good level, a sum less than 1 degree; adequate level, a sum less than 3 degrees; moderate level, a sum less than 5 degrees; poor level, a sum greater than 5 degrees.
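The level thresholds translate directly into a small classifier. The sketch takes the sum rotation as an input in degrees rather than computing it, because its defining equation is not shown here. It also assumes that the levels are mutually exclusive bands, that the magnitude of the sum is what matters, and that a value of exactly 5 degrees counts as poor; all names are hypothetical.

```python
def stability_level(sum_rotation_deg):
    """Map a sum-rotation value in degrees to one of the four levels."""
    s = abs(sum_rotation_deg)  # assumption: only the magnitude matters
    if s < 1.0:
        return "good"
    if s < 3.0:
        return "adequate"
    if s < 5.0:
        return "moderate"
    return "poor"  # assumption: exactly 5 degrees is treated as poor


def level_percentages(sum_rotation_series_deg):
    """Percentage of samples spent at each level over one signal part."""
    counts = {"good": 0, "adequate": 0, "moderate": 0, "poor": 0}
    for value in sum_rotation_series_deg:
        counts[stability_level(value)] += 1
    total = len(sum_rotation_series_deg)
    return {level: 100.0 * count / total for level, count in counts.items()}
```

Because the samples are equally spaced in time, the share of samples at a level equals the share of time spent at that level.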

1) Curl Part's Feature Extraction
• First local maximum (delay time and value) and number of local maxima of the autocorrelation of the Z-axis acceleration (first(T_z^C) and first(S_z^C))
• First local maximum (delay time and value) of the autocorrelation of the pitch (first(S_β^C) and first(T_β^C)) and the roll (first(S_γ^C) and first(T_γ^C))
• Maximum acceleration and maximum jerk (derivative of acceleration)
• Absolute values of maximum positive velocity and maximum negative velocity and their sum
• Range of Z-axis displacement in 500 ms
• Standard deviation of the sine of the sum rotation
2) Raise Part's Feature Extraction
• Range of Z-axis acceleration in 500 ms
• Time used to stabilize the device to the proper or adequate level after maximum Z-axis acceleration, called ''peak-to-stable''
3) Stable Part's Feature Extraction
• Percentage of time at the good, adequate, and moderate levels
• 25th, 50th, and 75th percentiles of the degree of stabilization (P_25, P_50, and P_75)
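Two of the listed features, the maximum jerk and the range within 500 ms, can be sketched as follows. The list does not state how jerk is differentiated or how the 500 ms ranges are aggregated, so the sketch assumes a first difference for jerk and the largest range over all sliding 500 ms windows; both function names are hypothetical.

```python
def max_jerk(acc, dt):
    """Largest absolute jerk, estimating jerk as a first difference of acceleration."""
    return max(abs(acc[n] - acc[n - 1]) / dt for n in range(1, len(acc)))


def max_windowed_range(signal, dt, window_s=0.5):
    """Largest (max - min) found inside any sliding window of window_s seconds."""
    width = max(1, round(window_s / dt))
    best = 0.0
    for start in range(0, max(1, len(signal) - width + 1)):
        chunk = signal[start:start + width]
        best = max(best, max(chunk) - min(chunk))
    return best
```

At a 20 ms sampling interval, the default window covers 25 samples.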

E. ''DELTA OF'' FEATURES
Stroke patients are expected to experience difficulty in performing both exercises efficiently with the same arm. For example, in the Curl-up exercise, arm weakness can be indicated by a drastic difference in the acceleration of each arm. Since force is directly proportional to acceleration, the weaker arm would yield a significantly lower acceleration. All extracted features are computed for each individual arm, and the difference between both arms is then calculated. This difference is called ''Percent Delta of'' (%Δ). In its definition, F_X,L is a feature from the X part for the left arm and F_X,R is a feature from the X part for the right arm. If %Δ is a Not-a-Number (NaN) value, caused by a zero denominator, then this feature is set to the ''ignore'' value.
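A left/right percent difference with an ''ignore'' fallback might look like the sketch below. The normalization is an assumption: the text does not show the denominator, so the sketch divides by the mean magnitude of the two arm values, and it uses `None` as a stand-in for the ''ignore'' value. The function name is hypothetical.

```python
import math


def percent_delta(left, right):
    """Relative left/right difference of one feature, in percent.

    Assumption: the difference is normalized by the mean magnitude of the
    two values; the exact denominator is not given in the text.
    Returns None (standing in for the "ignore" value) when the denominator
    is zero or undefined.
    """
    denom = (abs(left) + abs(right)) / 2.0
    if denom == 0.0 or math.isnan(denom):
        return None
    return 100.0 * (left - right) / denom
```

With this normalization, equal values for both arms give 0, and the sign indicates which arm produced the larger feature value.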

IV. EXPERIMENTAL SETUP
A. DATA COLLECTION
In this study, we collected the gyroscope and acceleration signals from the Curl-up and Raise-up exercises by using an iPhone 6, a smartphone device, with a 40 Hz sampling frequency. With the approval of the Institutional Review Board, Chulalongkorn University No. 242/61, the data from 68 participants (32 healthy subjects and 36 chronic stroke patients) were collected at the King Chulalongkorn Memorial Hospital, Bangkok, Thailand. All chronic stroke patients have been diagnosed and confirmed to have symptoms of arm weakness by the hospital's neurologists. However, they could speak and understand speech. Facial drooping and other stroke-related damage were present in some patients. All patients were assisted in the documentation and information process; however, all patients performed the exercises unassisted.

B. CROSS-VALIDATION
Cross-validation is used in machine learning to estimate the performance of a classifier on unseen data and is often used when the amount of sampled data is limited. We used the k-fold cross-validation method, which splits the data randomly into k equal parts. One part is held out to test the performance of the model trained on the remaining k - 1 parts, and this process is repeated for all k parts. To be able to compare the methods, the data must be split in the same way for every classifier. No two data points in this study come from the same subject, so no subject appears in both the training and testing data. In this study, we performed ten runs of 10-fold cross-validation for all combinations of classification and feature selection methods with randomized initial values.
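The splitting procedure described above can be sketched as follows. This is a minimal illustration in plain Python, not the Weka configuration used in the study; the function name `repeated_kfold_splits` and the fixed seed are assumptions introduced here. The splits are generated once and stored so that every classifier can be evaluated on exactly the same partitions.

```python
import random

def repeated_kfold_splits(n_subjects, k=10, runs=10, seed=0):
    """Yield (run, fold, train_idx, test_idx) for repeated k-fold cross-validation.

    Hypothetical helper: each subject contributes one data point, so splitting
    by subject index keeps every subject in exactly one fold per run.
    """
    rng = random.Random(seed)
    for run in range(runs):
        order = list(range(n_subjects))
        rng.shuffle(order)
        # Deal the shuffled subjects into k folds of near-equal size.
        folds = [order[i::k] for i in range(k)]
        for f, test_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield run, f, sorted(train_idx), sorted(test_idx)

# Generate the splits once and reuse them for every classifier,
# so that all methods are compared on identical partitions.
splits = list(repeated_kfold_splits(68))
```

With 68 subjects and k = 10, the folds cannot be exactly equal, so this sketch produces test folds of six or seven subjects.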

C. SELECTED FEATURES
In each fold of the cross-validation, we augmented the training dataset by swapping the data between the left and right hands to avoid an imbalance of the weak side in the randomized data. Then, the gyroscope and acceleration signals from both arms were adjusted and extracted into 78 features, as described in Section III-D. We chose two of the most common feature selection methods: Information Gain (InfoGain) [34] and Correlation-Based Feature Selection (CFS) [35].
InfoGain is an entropy-based feature selection method that measures the amount of information a feature provides about the class, that is, how useful the feature is for classification. It is frequently employed in machine learning as a term-goodness criterion in text categorization, where it measures the number of bits of information gained for category prediction by knowing the presence or absence of a term in a document. InfoGain ranks the features in decreasing order of information gain.
CFS is an evaluation method that scores subsets of features according to how strongly the features are correlated with the class. It operates on the original feature space, so the knowledge induced by the learning algorithm remains interpretable in terms of the original features rather than a transformed space. A good feature subset contains features that are highly correlated with the class but not correlated with each other. In principle, CFS requires only a means of measuring the correlation between any two variables and can therefore be applied to a variety of supervised classification problems. CFS does not require the user to specify any thresholds or the number of features to be selected, although these can optionally be provided. Moreover, being a filter, CFS incurs a low computational cost because it does not require repeated invocation of a learning algorithm.
TABLE 4. Average and standard deviation (italics) of percentage accuracy (ACC), sensitivity (SENS), specificity (SPEC), and area under the ROC curve (AUC) for ten runs of 10-fold cross-validation.
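To make the entropy-based ranking concrete, the following minimal sketch computes the information gain of a feature as the class entropy minus the class entropy conditioned on the feature value. It assumes that each feature has already been discretized into a small number of bins; the binning step, the function names, and the toy data are illustrative assumptions, not the Weka implementation used in the study.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """H(class) - H(class | feature) for an already-discretized feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def rank_features(columns, labels):
    """Return (feature name, gain) pairs sorted by decreasing information gain."""
    gains = {name: information_gain(vals, labels) for name, vals in columns.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)

# Toy data (hypothetical): feature "a" separates the classes, feature "b" does not.
labels = ["stroke", "stroke", "healthy", "healthy"]
columns = {"b": ["low", "high", "low", "high"], "a": ["low", "low", "high", "high"]}
ranking = rank_features(columns, labels)
```

In this toy example, feature "a" is ranked first because knowing its value removes all uncertainty about the class, whereas feature "b" provides no information.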
The purpose of this study is to find features that can help screen for stroke by analyzing arm weakness. Based on the features, each subject was classified as either a stroke patient or a healthy subject. In each fold, the training models were produced from the training dataset using 63 combinations of three classical feature selection methods (InfoGain, CFS, and no feature selection), three feature sets (Curl-up-only features, Raise-up-only features, and both-exercises features), and seven well-known classification methods (naive Bayes (NaiveB), Bayes network learning (BNet), random forest (RF), J48 decision tree, k-nearest neighbors (kNN), locally weighted learning (LWL), and multilayer perceptron (MLP)). We used the Weka tool [36], a well-known software package that collects data mining and machine learning tools and methods, to perform all of the combination tasks of feature selection and classification. We then used the testing dataset to measure the performance of all combinations of models and selected features according to the statistics of each feature selection.
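The experimental grid described above is the Cartesian product of the three factors, which can be enumerated in a few lines. The string labels below are shorthand chosen for this sketch and are not Weka class names.

```python
from itertools import product

# Shorthand labels for the three factors named in the text.
FEATURE_SELECTION = ["InfoGain", "CFS", "None"]
FEATURE_SETS = ["Curl-up", "Raise-up", "Both"]
CLASSIFIERS = ["NaiveB", "BNet", "RF", "J48", "kNN", "LWL", "MLP"]

# 3 feature selection methods x 3 feature sets x 7 classifiers = 63 combinations.
combinations = list(product(FEATURE_SELECTION, FEATURE_SETS, CLASSIFIERS))
```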
The performance measures were the accuracy (ACC), sensitivity (SENS), specificity (SPEC), and area under the receiver operating characteristic curve (AUC). These measures reflect the classification without regard to class distribution and error costs [19]. To calculate them, we must count the true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN): the number of correctly classified stroke patients, the number of correctly classified healthy subjects, the number of healthy subjects incorrectly classified as stroke patients, and the number of stroke patients incorrectly classified as healthy subjects, respectively. Because the costs of misclassifying stroke patients and healthy subjects are unequal, i.e., misclassifying a stroke patient could be harmful whereas misclassifying a healthy subject could merely waste time, we adjusted the cost function ratio of false positives to false negatives in the training step to two to one.
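The three count-based measures follow directly from the confusion counts, as the minimal sketch below shows, with stroke patients treated as the positive class. The function name and the example counts are hypothetical and are not results from the study; the AUC is omitted because it requires ranked classifier scores rather than counts.

```python
def screening_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, and specificity from confusion counts.

    Assumes at least one stroke patient (tp + fn > 0) and at least one
    healthy subject (tn + fp > 0) in the test data.
    """
    total = tp + tn + fp + fn
    return {
        "ACC": (tp + tn) / total,   # fraction of all subjects classified correctly
        "SENS": tp / (tp + fn),     # fraction of stroke patients detected
        "SPEC": tn / (tn + fp),     # fraction of healthy subjects cleared
    }

# Illustrative counts only (36 patients, 32 healthy subjects).
example = screening_metrics(tp=34, tn=24, fp=8, fn=2)
```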

D. EVALUATION
In the ten runs of 10-fold cross-validation, the selected features are counted in every round because the two methods select features differently: the information gain method evaluates every feature by its relation to the entropy of the output class, whereas CFS considers a feature subset whose features are uncorrelated with one another but correlated with the output class. The differences between the features selected by the two methods could describe some characteristics of each feature.
The average performance is evaluated by multivariate analysis of variance (MANOVA) to analyze whether the independent grouping variables (feature sets, feature selection, and classification model) simultaneously explain a statistically significant amount of variance in the dependent variables (accuracy, sensitivity, specificity, and AUC).
Primarily, we select the appropriate model for stroke screening by choosing the combination that yields the highest accuracy. However, if two or more combinations yield similarly high accuracy, we select the one with the highest sensitivity.
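The selection rule above can be sketched as a two-step filter. In this sketch, a fixed accuracy tolerance stands in for the notion of ''similar'' accuracy; that tolerance, the function name, and the example entries are assumptions made for illustration and do not reproduce the statistical comparison used in the study.

```python
def select_combination(results, tolerance=1.0):
    """Pick the combination with the highest sensitivity among the top-accuracy group.

    results: list of dicts with keys 'name', 'acc', and 'sens' (percentages).
    tolerance: hypothetical accuracy margin (percentage points) within which
    combinations are treated as having similar accuracy.
    """
    best_acc = max(r["acc"] for r in results)
    contenders = [r for r in results if best_acc - r["acc"] <= tolerance]
    return max(contenders, key=lambda r: r["sens"])

# Illustrative entries only; "C" has high sensitivity but its accuracy is too low.
results = [
    {"name": "A", "acc": 85.0, "sens": 80.0},
    {"name": "B", "acc": 84.1, "sens": 94.8},
    {"name": "C", "acc": 70.0, "sens": 99.0},
]
chosen = select_combination(results)
```

Here combination "B" is chosen: its accuracy is within the tolerance of the best accuracy, and it has the highest sensitivity among the remaining contenders.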

V. EXPERIMENTAL RESULTS
The data from Tables 2 and 3 [...] The results for the InfoGain and CFS feature selection methods are summarized in Table 5 and Table 6.

VI. DISCUSSION AND CONCLUSION
The percentages of the selected features for both feature selection methods showed that the essential factors of arm-weakness screening come from the curl part, which is related to the biceps brachii strength test, and the stable part, which is related to the pronator drift test. The displacement and velocity of the raise part are quite difficult to detect; this could be attributed to slow changes in displacement, which result in small accelerations that resemble noise and are removed by the smoothing filter. However, the raise part is still essential because the time needed to stabilize after the raise part, which belongs to the Raise-up exercise, was selected in 79% of instances by the information gain method, making it the first of the features that were not always selected.
Both feature selection methods selected almost the same features of both exercises. They preferred features from the stable part over those from the curl part, especially features from the acceleration signal, and did not select features from the raise part. Table 4 shows the average and standard deviation of all performance measures for ten runs of 10-fold cross-validation. The five combinations of classification and feature selection methods with the highest average accuracy are not significantly different from one another: the naive Bayes model with all features from the Raise-up exercise (average accuracy of 85.0%), the naive Bayes model with information gain feature selection from the Raise-up exercise (84.2%), the naive Bayes model with information gain feature selection from both exercises (84.2%), the random forest model with information gain feature selection from the Curl-up exercise (84.1%), and the naive Bayes model with information gain feature selection from the Curl-up exercise (84.2%). The random forest model with information gain feature selection from the Curl-up exercise yielded the highest average sensitivity of 94.8%, significantly different from the first three combinations. Based on our collected data, we suggest that the random forest model with information gain feature selection from the Curl-up exercise should be used for automated stroke screening.

VII. FUTURE WORK
First, we plan to collect more data, especially from patients who exhibit signs of pronator drift, to increase the sensitivity of the Raise-up features because the two exercises test the strength of different muscles. Then, we will collect data from patients who have arm weakness resulting from other diseases to compare and analyze the differences between the data. Finally, we will combine features from other factors, such as facial drooping and speech, to complete the full FAST test. The time required and the robustness in open-environment situations will also be considered.

NOTATIONS
Abbreviations and symbols in this paper are explained in Table 7.

ACKNOWLEDGMENT
The content is solely the responsibility of the authors and does not necessarily represent the official views of TRF or the Faculty of Engineering of Thammasat University. The authors thank all subjects who participated in the data collection.
PHONGPHAN PHIENPHANICH (Student Member, IEEE) received the B.E. degree (Hons.) in computer engineering from the Suranaree University of Technology, Nakhon Ratchasima, Thailand, in 2009, and the M.E. degree in electrical engineering from Thammasat University, Bangkok, Thailand, in 2012, where he is currently pursuing the Ph.D. degree in computer engineering. He was a Co-Researcher with the National Electronics and Computer Technology Center (NECTEC), Thailand, from 2010 to 2012. His research interests include signal and speech processing, pattern recognition, and machine learning.
NATTAKIT TANKONGCHAMRUSKUL (Associate Member, IEEE) is currently pursuing the degree with the Ruamrudee International School, Bangkok, Thailand. His research interests include signal and speech processing, pattern recognition, and machine learning.
NIJASRI CHARNNARONG SUWANWELA received the M.D. degree (Hons.) from Chulalongkorn University, Bangkok, Thailand, in 1989. She held a Residency Training with the King Chulalongkorn Memorial Hospital, Bangkok, in 1993. She is currently a Professor and the Director of the Chulalongkorn Comprehensive Stroke Center, Bangkok. She is also the Former Head of the Neurology Division, Chulalongkorn University. She has published more than 50 peer-reviewed articles and many book chapters. She has pioneered the thrombolysis use and neurosonology. She is the President of the Neurological Society, Thailand, the Vice President of the Neurological Society, Thailand, the Vice President of the Thai Stroke Society, and the President of the Asian Stroke Advisory Panel. She received the King Anandamahidol Foundation to study as a Fellow in cerebrovascular diseases from the Massachusetts General Hospital, USA, in 1996.