A Depression Recognition Method for College Students Using Deep Integrated Support Vector Algorithm

Population growth, the pressure of survival, and the pressure of study make competition between people increasingly fierce. Some college students have been in a state of anxiety and panic for a long time, and mental health disorders have shown explosive growth. The development of social networks such as Weibo, QQ, and WeChat not only gives college students more convenient ways to communicate but also gives them a new outlet for their emotions. They can record their living conditions in real time through social networks and interact with friends to express emotions and relieve stress. At the same time, social networks have opened a new way to detect depressed users: computer technology can analyze a user's social network data to detect signs of depression. This study mines the text of Sina Weibo posts by college students to detect depression among them. First, text information from college student users of Sina Weibo is collected and converted into input data suitable for machine learning. Deep neural networks are used for feature extraction. A deep integrated support vector machine (DISVM) algorithm is then introduced to classify the input data and thereby recognize depression. DISVM makes the recognition model more stable and improves the accuracy of depression diagnosis to a certain extent. Simulation experiments verify that the proposed scheme can detect potential depression patients in the college student population through Sina Weibo data.


I. INTRODUCTION
The psychological impact of depression is huge. The concentration and learning ability of people with depression decrease, and their work efficiency drops sharply, which greatly affects their lives. Of the top ten major diseases that disable or incapacitate people in the world, five are mental illnesses, and depression ranks first. This is enough to show that depression can cause huge harm to society. At present, depression is detected mainly through questionnaire surveys. Hospitals or psychological testing agencies issue questionnaires to users who take part in psychological surveys. Methods based on psychological evaluation scales can predict well whether a user has a psychological disorder, and the score on a psychological self-evaluation scale can roughly indicate whether the user has a disorder such as depression. However, this method is only suitable for one-to-one surveys. For large-scale data, conducting a population census with such questionnaires would consume a great deal of manpower and material resources.

The associate editor coordinating the review of this manuscript and approving it for publication was Yizhang Jiang.
Weibo is one of the most popular personal and media publishing platforms in China. Because it is a platform where users share their feelings, express their opinions, and interact with others, users' Weibo posts contain a large amount of personal information and emotional dynamics. Deep mining of this content makes it possible to analyze the emotions of individual users. The social network platforms selected in existing research on the psychological health of social network users are mostly Facebook, Twitter, Weibo, Renren, and similar networks. Among them, Weibo has a huge user base. Since most of Weibo does not require real-name authentication, users can express their truest selves on the platform and publish their most authentic thoughts anytime, anywhere. This suggests that public social networks can effectively reflect the mental health of users, which is why this study does not use data from private social networks such as WeChat and QQ. Social networks such as QQ and WeChat, which have emerged in recent years, require the consent of both parties to become friends before information can be shared. People with depression often tend not to add friends, so they usually have few friends and rarely publish personal updates in their circle of friends. The characteristics of Weibo keep it popular among young people, especially college students. Text data on Weibo can therefore reflect a user's mental health better than networks such as WeChat. This article first analyzes, at a theoretical level, the feasibility of detecting users with depression from text data in Sina Weibo. The overall framework of the feasibility analysis is shown in FIGURE 1.
The above analysis verifies in theory the feasibility of depression recognition using Weibo text data. In this paper, we collected information on university student Weibo users, pre-processed the data and extracted features, and finally used DISVM to classify the data in order to identify college students with depression. The innovations in this article are: (1) The feasibility of detecting users with depression based on Weibo social network data is discussed theoretically. In existing research that uses users' behaviors on social networks and their published texts to analyze mental health, the social network platforms selected mainly include QQ and WeChat. College students who are potentially depressed generally do not like social activities. Such students therefore have relatively few QQ and WeChat friends, rarely publish comments and feelings on these platforms, and are not very active there. Weibo, however, has a huge user base and does not require real-name authentication. Depression patients are more inclined to publish what they really think on this platform and are relatively active on it. Therefore, the Weibo platform can more effectively reflect a user's living state and psychological state, including depression and other mental health issues.

2) DIVIDED BY NUMBER OF ATTACKS
According to the number of episodes, depression is divided into single-episode depression and recurrent depression. In single-episode depression, the individual has only one episode in a lifetime, and depression does not recur after successful treatment; this situation is not common clinically. Recurrent depressive disorder refers to symptoms that return repeatedly after successful treatment, so that the condition is not effectively alleviated; this is more common.

3) DIVIDED BY INDIVIDUAL AGE LEVEL
According to the age of the individual, depression is divided into emotional disorders of children and adolescents, adult depression, and senile depression. The age of onset spans a wide range, and depression can also occur at a young age. Therefore, this classification is made according to the age level of the subject.

B. DEPRESSION RECOGNITION VECTOR
1) PHYSIOLOGICAL SIGNAL
Survey results show that people with depression differ physiologically from healthy people. For example, their brain wave signals [1]-[3] and the serotonin content in the body are different. EEG signals, nuclear magnetic resonance signals, and other medical image signals captured by professional medical equipment can characterize the physiological characteristics of test subjects. These data can be used to identify and classify depression with good recognition results. The rapid development of information technology has made the analysis of EEG signals more reliable. For example, SPSS software is very useful for data analysis [4], and EDFbrowser can perform more detailed and specific analysis of EEG signals [5]. Research abroad on the EEG signals of depression includes the analysis of the four basic rhythm waves of EEG signals. For depression identification, Shamla et al. [6] proposed using the FFT signal processing technique together with an SVM machine learning model to distinguish people with depression from healthy people. This approach can not only identify depression but also help in taking appropriate measures based on the level of depression. Hosseinifard et al. [7] studied the non-linear characteristics of depression EEG signals and achieved a classification rate of 85% with LR, which is superior to the classification results of the KNN and LDA algorithms.

VOLUME 8, 2020
Researchers also use eye-tracking technology to identify depression tendencies. Eye tracking uses eye trackers to record eye movements and extracts eye movement data to identify emotions. Koster et al. [8] used eye-tracking technology to explore the tendency toward depression. Their results show that depression tends to be accompanied by problems such as attention bias. In addition, Joorman and Gotlib [9] found that individuals with depression tend to show a noticeable bias towards negative stimuli, whereas normal individuals show a noticeable bias towards positive stimuli. Heller et al. [10] found that individuals with depression tend to spend more time viewing pictures with negative emotions in eye movement experiments. Fritzsche et al. [11], who studied depression tendency using emotional faces, found that individuals with depression tend to be disturbed by sad expressions and that their response time is significantly longer. The collection of physiological signals is often tedious, the equipment used is expensive, and real-time monitoring by professionals is required. Therefore, this paper did not select physiological signals for research.

2) BEHAVIORAL SIGNAL
Depressed and non-depressed individuals differ in behaviors such as facial expressions, eye movements, and movements of the hands and legs. Real-time behavior signals collected by sensing devices worn by the subject are therefore used as the modeling input for depression classification. Speech, movement, and facial behaviors can identify depression. The "lens model" theory proposed by Brunswik [12] pointed out that individual behavior can reveal emotional changes. As an important part of personal behavior, online behavior can be used to identify depression. Moreno et al. [13] analyzed Facebook data and found that individuals with depression tend to post negative photos on social platforms. This further illustrates that social networking sites can provide a basis for studying depression. Li et al. [14] analyzed 547 Weibo behavioral features extracted from 547 active Chinese Weibo users to predict the personality characteristics of users. Katikalapudi et al. [15] studied the identification of depression tendency from online social activity. Their experimental results showed that individuals with depression tend to reduce their social activities and social circles. Dao et al. [16] found that social media can be a platform for the treatment of depression.

3) VOICE SIGNAL
Patients with depression show distinctive characteristics in their speech signals, and research on depression recognition based on speech signals has a solid theoretical basis, so speech-based depression recognition is considered to be the best method. Compared with the cumbersome data collection required for physiological and behavioral signals, open-source and authoritative depression speech data sets are currently available. Ambady and Rosenthal pointed out, through the study of human behavior, that a brief observation of a person's behavior allows that behavior to be predicted with a higher probability than random guessing [17]. In speech recognition, there are many studies on speech slicing. In speech-based emotion recognition, the recognition effect of relative time slices is the best [18]. Alghowinem et al. found experimentally that, in read speech, using the beginning of the speech gives a better depression recognition effect than using the entire speech [19]. Moore's research on glottal features indicates that, when glottal features are used for depression recognition, speech fragments of only 10 to 20 seconds can achieve a recognition effect similar to that of fragments of 2 to 3 minutes [20].

4) OTHER SIGNALS
Saki et al. [21] found that depression-prone individuals have reduced memory and slower responses. Other researchers have studied depression tendency from the perspective of cognitive style. Carver et al. [22] found among college students that the lower the cognitive level of depressed individuals, the greater the possibility of negative responses when they encounter difficult problems. Major [23] proposed that people with different cognitive styles adopt different coping styles when facing the same pressure. Eysenck et al. [24] studied the relationship between college students' cognitive style and emotion, and the results showed that cognitive style is closely related to emotional state.

C. DEPRESSION RECOGNITION METHOD
1) TRADITIONAL DEPRESSION RECOGNITION
Traditional diagnostic tools include interview assessments, the Hamilton Depression Rating Scale, and the Beck Depression Inventory. Based on these scales and the severity of behavioral symptoms, patients with depression can be given a graded score. Diagnosis by this method is extremely complicated. It depends heavily on the patient's ability, patience, and willingness, and on whether the patient is honest and serious during the interview. By the definition of depression, however, patients' thinking and motivation are impaired, so diagnostic information takes a long time to collect. Patient training, practice, and validation are also needed to produce acceptable results. Diagnosing depression in this way is very difficult at the primary treatment stage. The range of possible depression diagnoses can also be very wide. Moreover, the results of such classification are very general and can only tell patients that they are suffering from a high or low level of depression, which may easily lead to misdiagnosis and consume diagnostic time. In addition, because the physical symptoms of depression are hidden, depression may not be detected successfully. These are the disadvantages of this detection method.

3) RECOGNITION MODEL BASED ON DEEP LEARNING
As data become higher-dimensional and more diverse, data processing becomes more and more important. Deep neural networks are powerful tools for processing complex data [39]. Pitts [40] simulated the way the human brain stores and processes complex information. Such a network can process data in parallel and dig more useful information out of the data. The convolutional neural network [41]-[43] is one of the commonly used depression recognition models.

... participant will be recorded one to four times, with an interval of about two weeks between recordings. Subjects are given different tasks in different language environments to stimulate their emotional state and thereby mobilize their personal emotions. Whether a subject has a tendency toward depression can be expressed through the source speech. The distribution of depressed and non-depressed patients in this data set is shown in FIGURE 4.

2) DAIC-WOZ DEPRESSION SPEECH DATA SET
The DAIC-WOZ [45] data set is mainly for audio, video, and questionnaire feedback data collected for the diagnosis of anxiety, depression, and other psychological diseases. The distribution of depression and non-depression individuals in this data set is shown in FIGURE 5.

3) TEXT DATA IN SOCIAL NETWORKS
Contemporary college students like social networks such as Weibo, where they post, repost, and comment on all kinds of information. From this text it is possible to analyze and identify whether a user has a tendency toward depression. The question is how to translate the text into input vectors that a classifier can use. There are generally three text representation models: the vector space model, the topic model, and word embedding. In word embedding, the proposal of word2vec enables text to be converted into word vectors.

CBow is a model that predicts the current word from its context. It has the same structure as an ordinary feed-forward neural network and uses a local sliding window mechanism. The window size selected in FIGURE 6 is 5, where w(t) is the word vector at position t and is also the central word to be predicted. The model has three layers. The first is the input layer: the context word vectors w(t-2), w(t-1), w(t+1), and w(t+2) of the word at position t are represented by a distributed representation. The second is the projection layer, which sums the input word vectors. The main idea of the projection layer is the Huffman tree. The third is the output layer, which produces the word vector at position t.
The Skip-gram model is the inverse process of CBow: it predicts the probability of the context words from the central word. The word vectors obtained by the Skip-gram model capture the text more deeply at the semantic level, which makes them perform better on tasks related to semantic expression. The Skip-gram model also has three layers. The input layer takes the word vector at position t. The main idea of the projection layer is again the Huffman tree. The output layer produces the context word vectors of the word at position t.
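To make the difference between the two models concrete, the following minimal sketch builds the training examples that each model would see for a window size of 5 (two words on each side). It only illustrates the sliding-window pairing; it does not train a network or build a Huffman tree. The function names and the example sentence are hypothetical and are not taken from the paper.

```python
def context_window(tokens, t, half_window=2):
    """Return the context words around position t (window size 5 when half_window=2)."""
    left = tokens[max(0, t - half_window):t]
    right = tokens[t + 1:t + 1 + half_window]
    return left + right


def cbow_pairs(tokens, half_window=2):
    """CBow: each example maps the context words to the central word."""
    return [(context_window(tokens, t, half_window), tokens[t])
            for t in range(len(tokens))]


def skipgram_pairs(tokens, half_window=2):
    """Skip-gram: each example maps the central word to one context word."""
    return [(tokens[t], c)
            for t in range(len(tokens))
            for c in context_window(tokens, t, half_window)]


tokens = ["i", "feel", "so", "tired", "today"]  # hypothetical tokenized post
print(cbow_pairs(tokens)[2])      # (['i', 'feel', 'tired', 'today'], 'so')
print(skipgram_pairs(tokens)[:2])  # [('i', 'feel'), ('i', 'so')]
```

For the central word at position t = 2, CBow receives w(t-2), w(t-1), w(t+1), w(t+2) as input and predicts w(t), while Skip-gram receives w(t) and predicts each context word in turn.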

E. DEEP NEURAL NETWORK
In this study, deep neural networks [44] are used for feature extraction. In pattern recognition, features are mainly used to describe and represent the things studied. The features extracted by the algorithm from the data directly affect the performance of the model. How to effectively extract features that conform to the data distribution structure has always been a hot issue in this field. Traditional feature extraction methods mostly select features based on different tasks and data designs, such as Gabor features [45], SIFT [46], local binary patterns [47], etc. However, designing a good feature is not simple, especially in the case of a large amount of data, it takes time and effort. How to find a method that can automatically learn feature description is very critical for this research.
From the perspective of multi-layer nonlinear function relationships, deep neural networks are a subclass of deep learning models. A neural network that maps the learned function to a more complex function through nonlinear combination is called a deep neural network. FIGURE 8 shows a neural network with three hidden layers. "Depth" refers to the number of levels of non-linear operations composed in the function learned by the network, so a deep neural network can also be understood as a neural network with multiple hidden layers. Deep neural networks can use fewer neurons to achieve better generalization.
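The idea of depth as a composition of non-linear operations can be sketched as a forward pass through three hidden layers. This is a minimal illustration only: the layer sizes, the tanh activation, and the random initialization are assumptions chosen for the example, not the architecture used in this study, and no training is performed.

```python
import math
import random


def dense(inputs, weights, biases):
    """One fully connected layer followed by a tanh non-linearity."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (an arbitrary illustrative choice)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, layers):
    """Compose the layers: each hidden layer adds one level of non-linear combination."""
    for weights, biases in layers:
        x = dense(x, weights, biases)
    return x


rng = random.Random(0)
sizes = [8, 6, 5, 4]  # input dimension 8, then three hidden layers (hypothetical sizes)
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
features = forward([0.1] * 8, layers)
print(len(features))  # 4
```

The output of the last hidden layer plays the role of the extracted feature vector; its length is set by the size of that layer.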

III. DEPRESSION RECOGNITION OF COLLEGE STUDENTS BASED ON WEIBO DATA
A. RECOGNITION MODEL DESIGN IDEAS
The functional modules involved in the depression prediction model are shown in FIGURE 9, which can be divided into three layers as a whole: the acquisition of raw data, data preprocessing, and training recognition model. Acquisition and preprocessing of raw data: This part is the basis of the entire technical framework. The user's raw data on social media is mainly obtained through the Internet. Then through step-by-step processing, the user network behavior characteristics are finally stored in the database. Construction of the recognition model: This part is the core of the entire technical framework. After the user's network behavior characteristics are stored in the database, this paper uses the integrated SVM to build a depression recognition model.

B. OBTAINING DATA SETS AND PREPROCESSING
This step obtains the user data required to build the depression recognition model and preprocesses it through de-duplication, cleaning, and sorting. The data source for the depression recognition model constructed in this paper is the online behavior of Internet users on social media. Specifically, we use the third-party interface provided by Sina Weibo and our own crawler software to obtain the online behavior data of college student Weibo users. The network behavior characteristics specific to depression are then extracted according to the network behavior system.
First, we selected the college student users among all Sina Weibo users and randomly selected 1,000 of them as the target sample. The data records of the target users were downloaded through the Application Programming Interfaces provided by Sina Weibo. Since depression is a chronic, long-term condition, we captured user data over a long period, from January 1, 2015 to December 31, 2017. For each user, 20 Weibo posts or reposts were captured every month. For users with fewer than 20 Weibo posts in a month, all of their Weibo data were captured. Some users had too few posts and reposts to meet the research needs, and such users were deleted. The specific information of the college student Sina Weibo users finally selected is shown in TABLE 3.
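The monthly sampling rule described above can be sketched as follows. This is a minimal sketch under stated assumptions: the text does not say how the 20 posts per month are chosen, so random sampling is assumed here, and the `min_posts` threshold for deleting sparse users is a hypothetical parameter because the source gives no number. No real Weibo API call is shown.

```python
import random
from collections import defaultdict
from datetime import date


def sample_posts(posts, per_month=20, seed=0):
    """posts: list of (date, text). Keep at most per_month posts per calendar month."""
    rng = random.Random(seed)
    by_month = defaultdict(list)
    for day, text in posts:
        by_month[(day.year, day.month)].append((day, text))
    kept = []
    for month in sorted(by_month):
        items = by_month[month]
        if len(items) > per_month:
            # Assumption: pick posts at random when a month has more than per_month.
            items = rng.sample(items, per_month)
        kept.extend(items)  # months with fewer posts are kept in full
    return kept


def keep_user(sampled_posts, min_posts=50):
    """min_posts is a hypothetical threshold; the source does not state one."""
    return len(sampled_posts) >= min_posts


posts = [(date(2015, 1, d), f"post {d}") for d in range(1, 26)]
posts += [(date(2015, 2, d), f"post {d}") for d in range(1, 4)]
sampled = sample_posts(posts)
print(len(sampled))  # 23: 20 kept from January, all 3 from February
```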

C. FEATURE ANALYSIS AND EXTRACTION
1) ANALYSIS OF HIGH-FREQUENCY WORDS IN WEIBO
According to the research results in the literature [48], [49], depression patients mostly use the first person when speaking. In addition, depression patients like to post and repost negative messages. Therefore, this paper constructs input data by counting the frequency of specific words in Weibo. Specific words generally fall into 6 categories: degree words, negative evaluation words, negative emotional words, positive evaluation words, positive emotional words, and propositional words. Some words belong to several categories at the same time; for example, the word "very" appears among the degree words, the positive evaluation words, and other categories. To simplify the problem, this article ignores the part of speech of words. After de-duplication, each word appears only once in the vocabulary. We extracted the 20 words with the highest frequency in the Weibo content of each of the two types of users, which are called high-frequency words. Table 4 shows that the words often used by the two types of users are different. FIGURE 10 shows how often the high-frequency words of normal users appear in the microblogs of the two types of users. Depression-prone users use these words more frequently than normal users. This shows that, on Sina Weibo, people's language style is related to depression, and the content of Weibo potentially contains information about whether a user has a tendency toward depression. This information can therefore be mined by analyzing Weibo content manually or by machine.
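The word-frequency statistics described above can be sketched with the standard library. The sketch assumes that each post has already been segmented into words separated by spaces (Chinese text would first need a word segmenter, which is not shown), and the function names and example posts are hypothetical.

```python
from collections import Counter


def count_words(posts):
    """Count every word across a user's (pre-segmented) posts."""
    return Counter(word for post in posts for word in post.split())


def top_words(posts, k=20):
    """Return the k most frequent words, i.e. the high-frequency words."""
    return count_words(posts).most_common(k)


def frequency_vector(posts, vocabulary):
    """Relative frequency of each vocabulary word in a user's posts."""
    counts = count_words(posts)
    total = sum(counts.values()) or 1  # avoid division by zero for empty input
    return [counts[word] / total for word in vocabulary]


posts = ["i feel very sad", "i am very very tired"]  # hypothetical posts
print(top_words(posts, k=2))                     # [('very', 3), ('i', 2)]
print(frequency_vector(posts, ["very", "sad"]))
```

Each vocabulary word appears once in `vocabulary`, which matches the de-duplicated vocabulary described in the text.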

2) EXPRESSION ANALYSIS IN WEIBO
In addition to text content, Sina Weibo also supports pictures, videos, music, emojis, and more. Content such as pictures and videos is too large and difficult to obtain, so this study does not involve that part of the data. The Weibo content data obtained in this article does include emoji. For example, the symbol "candle" is represented as "[candle]" in the text. Emoji are a very popular form of expression on the Internet; they are static or dynamic pictures of people's expressions embedded in text. For example, the expression "dagger" on Sina Weibo is usually used to express anger.

Through analysis of the Weibo data of the experimental sample throughout 2017, we found that college students with depression use negative and neutral expressions more frequently, whereas normal college students use positive and neutral expressions more often. The comparison of the two types of college student users in their use of positive, negative, and neutral expressions is shown in FIGURE 11.
According to the previous feature analysis, the word frequency of Weibo users, the time distribution of the number of Weibo posts, the number of fans, and the number of followers all contain information on the user's depression tendency and are all useful features. This article uses these features to describe a user sample. The i-th user $T_i$ is described as:

$T_i$ = [emoji frequency, frequency of Weibo content words, number of Weibo posts, number of fans, number of followers]

After this combination, the dimension of each sample is as high as 1325, which is not conducive to subsequent processing. It is necessary to reduce the dimension of the sample by finding the "most useful" features and deleting the others. Here, a deep neural network [44] is used to perform feature extraction on the original data. The classic SVM classifier is used to classify samples of different dimensions, and the relationship between accuracy and feature dimension is shown in FIGURE 12. According to the results, the sample dimension is reduced to 36. The specific information represented by each dimension is shown in TABLE 6.
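The sample description above amounts to concatenating several feature groups into one flat vector. The following minimal sketch shows that layout with toy values; the group sizes and numbers are hypothetical (the real vector has 1325 dimensions), and the function name is not taken from the paper.

```python
def build_sample(emoji_freq, word_freq, n_posts, n_fans, n_followers):
    """Concatenate all feature groups into one flat vector T_i."""
    return [*emoji_freq, *word_freq,
            float(n_posts), float(n_fans), float(n_followers)]


# Hypothetical toy values for one user.
emoji_freq = [0.10, 0.05, 0.00]        # one entry per tracked emoji
word_freq = [0.02, 0.00, 0.07, 0.01]   # one entry per vocabulary word
t_i = build_sample(emoji_freq, word_freq, n_posts=412, n_fans=150, n_followers=98)
print(len(t_i))  # 10 = 3 emoji entries + 4 word entries + 3 counts
```

Keeping a fixed order of groups means that every user is described by a vector of the same length, which is what the later dimensionality reduction and the classifier require.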

D. BUILDING A RECOGNITION MODEL
The performance of a single SVM is unstable and its accuracy is not high enough. In this paper, an integrated SVM is selected as the depression recognition model. The integration strategy uses the AdaBoost algorithm, which repeatedly modifies the weight distribution of the training sample set to fit a series of weak learners [50]. At the beginning of the loop, each sample has equal weight. As the iterations progress, the weights of misclassified samples increase and the weights of correctly classified samples decrease. This reduces the classification error rate of the classifier and improves the classification performance. FIGURE 13 shows a framework for depression recognition based on DISVM. SVM is used as the base classifier of AdaBoost, and the grid search method [51] is used to optimize the parameters. The base classifiers $\mathrm{SVM}_t$ ($t = 1, 2, \ldots, T$) are learned on the original data set. In each successive iteration, the weights of the samples misclassified by $\mathrm{SVM}_t$ increase, while the weights of the correctly classified samples decrease, so misclassified samples play a greater role in the next round of learning. Finally, the AdaBoost algorithm linearly combines the T base classifiers.
The running steps of the DISVM algorithm are summarized as follows.

Step 1: For $t = 1, 2, \ldots, T_{\max}$:
Step 2: Learning under $X_t$, the $t$-th weak learner $L_t = L(X, X_t)$ is obtained from the training set under the current weight distribution, starting from the initial weights $X_1(i)$.
Step 3: Calculate the error rate $\zeta_t$ of $L_t$ by $\zeta_t = \ldots$
Step 4: If $\zeta_t \geq 0.5$, break.
Step 5: Calculate the weight $w$ of $L_t$ according to the formula ...
Step 6: Update the weight of each sample: $X_{t+1}(i) = X_t(i) \ldots$
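The loop structure of these steps can be illustrated with a minimal AdaBoost sketch. Two assumptions should be read loudly: first, the formulas for the error rate, the learner weight, and the sample-weight update are the standard AdaBoost ones, because the corresponding formulas are incomplete in the text above; second, a one-dimensional threshold classifier stands in for the SVM base learner so that the example stays self-contained. The function names and the toy data are hypothetical.

```python
import math


def train_stump(xs, ys, weights):
    """Weighted 1-D threshold classifier, a stand-in for the SVM base learner."""
    best = None
    for threshold in sorted(set(xs)):
        for sign in (1, -1):
            preds = [sign if x >= threshold else -sign for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, threshold, sign)
    _, threshold, sign = best
    return lambda x: sign if x >= threshold else -sign


def adaboost(xs, ys, t_max=3):
    n = len(xs)
    weights = [1.0 / n] * n                      # every sample starts with equal weight
    ensemble = []
    for _ in range(t_max):
        learner = train_stump(xs, ys, weights)
        preds = [learner(x) for x in xs]
        err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
        if err >= 0.5:                           # Step 4: stop if no better than chance
            break
        err = max(err, 1e-10)                    # guard against division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # assumed standard learner weight
        ensemble.append((alpha, learner))
        # Assumed standard update: misclassified samples gain weight, others lose it.
        weights = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Linear combination of the base classifiers, thresholded at zero."""
    score = sum(alpha * learner(x) for alpha, learner in ensemble)
    return 1 if score >= 0 else -1


xs = [1, 2, 3, 4, 5, 6]           # toy 1-D samples
ys = [1, 1, -1, -1, 1, 1]         # labels that no single threshold can separate
ensemble = adaboost(xs, ys, t_max=3)
print([predict(ensemble, x) for x in xs])  # [1, 1, -1, -1, 1, 1]
```

On this toy data no single threshold classifies every sample correctly, but the weighted combination of three base learners does, which illustrates why an ensemble can be more stable than one base classifier.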

A. EXPERIMENT RELATED SETTINGS
Details of the experimental data set are shown in TABLE 7. This section uses precision P to evaluate the classification models. The formula for P is as follows:

$$P = \frac{a}{a + b}$$

where a is the number of users correctly classified as depressed and b is the number of users misjudged as depressed. The comparison models are the Radial Basis Function Neural Network (RBF-NN) [52], [53], SVM, and K-Nearest Neighbor (KNN) [54].

Overall, each classifier performs better on the training set than on the test set. The DISVM algorithm proposed in this paper can give low-quality samples low weight, thereby improving the model recognition rate. The data in the table verify that the proposed model achieves good recognition results on both the training set and the test set.
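The evaluation measure can be computed directly from predicted and true labels. The sketch below assumes the usual definition of precision, P = a / (a + b), with a the number of users correctly classified as depressed and b the number misjudged as depressed; the label encoding (1 for depressed, 0 for normal) and the example labels are hypothetical.

```python
def precision(predicted, actual, positive=1):
    """P = a / (a + b): share of predicted-depressed users who are truly depressed."""
    a = sum(1 for p, y in zip(predicted, actual) if p == positive and y == positive)
    b = sum(1 for p, y in zip(predicted, actual) if p == positive and y != positive)
    return a / (a + b) if (a + b) else 0.0


predicted = [1, 1, 1, 0, 0]   # hypothetical classifier output
actual = [1, 1, 0, 1, 0]      # hypothetical ground truth
print(precision(predicted, actual))  # a = 2, b = 1, so P = 2/3
```

Note that this measure ignores depressed users who are missed by the classifier (the fourth user in the example), so it is usually read together with other measures.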

C. RECOGNITION ACCURACY IN DIFFERENT TIME DIMENSIONS
The experimental data used in the previous section are data captured from January 2016 to December 2017. In order to analyze the recognition efficiency of depression on Weibo ...

Once the time span of the crawl reaches 24 months, the recognition rate no longer changes much, and it later shows a downward trend. This is because depression is an unstable psychological variable that changes over time. As the amount of original data grows further, the correlation between network user data and mental health gradually weakens, and so does the ability of network behavior to predict mental health. Therefore, an excessively long observation period is not conducive to identifying the depression tendency of Weibo users. For classification models, the same method can be used to describe the accuracy trend over different time periods, as shown in Figure 16. The trend in the accuracy of the classification model is similar to the trend of continuous value prediction.

V. CONCLUSION
In an era of increasing competition, the number of college students with depression is increasing. This paper proposes a method that can effectively identify college students with depression. Many kinds of media can be used to identify depression, and this article chose Sina Weibo data as the original data. The article first analyzes, at a theoretical level, the possibility of depression recognition based on Weibo data and then verifies it in practice. In the practical part, we first analyze the user data of Sina Weibo and derive the differences between depressive users and normal users in language style, use of emojis, number of Weibo posts, followers, and so on. Based on these differences, feature extraction and dimensionality reduction for identifying depression-prone users were performed, and input data suitable for the classifier was constructed. Second, this paper introduces a more stable and more accurate classification model called DISVM, which uses the AdaBoost integration strategy and a deep neural network. At the beginning of the DISVM algorithm, each sample is given equal weight. As the iterations progress, the weights of misclassified samples increase and the weights of correctly classified samples decrease. This reduces the classification error rate of the classifier and improves the classification performance. Experiments further verify that the DISVM algorithm achieves better recognition efficiency than the other traditional classifiers. Given the long-term nature and huge volume of Weibo data, a key issue is how much data to select so that the running time is reduced while the identification efficiency is maintained. In the experimental part, a time span of 36 months is selected for analysis. It is verified that a better recognition result is obtained when 24 months of data are collected, and the amount of data is not particularly large at that point. This study verifies the feasibility of identifying depression among college students and has certain reference value. However, the choice of data features in this article is relatively subjective. Further research will address how to screen out the most suitable data features more objectively in order to achieve the best recognition accuracy.