Research on User Experience Evaluation of Mobile Applications in Government Services

Mobile applications in government services provide a good platform for improving government credibility and social governance. This study evaluates the user experience of mobile applications in government services from the perspective of users and offers suggestions for improving that experience. The research was conducted in three stages. First, a user experience evaluation index system for mobile applications in government services was preliminarily constructed by combining a literature review and user interviews with Donald Norman's emotional design theory. In the second stage, data were collected through questionnaires, the reliability and validity of the data were tested, and the weights of the indexes were determined by the entropy method. In the third stage, the mobile applications in government services of four provinces were selected as the objects of empirical research, and their user experience evaluation scores were calculated by the grey correlation analysis method. The problems in each province's mobile applications in government services were analysed according to the scores, and corresponding suggestions or solutions were proposed.


I. INTRODUCTION
In recent years, with the advance of mobile communication technology, the number of mobile applications in government services has exploded. As of March 2020, the number of mobile Internet users in China reached 897 million, accounting for 99.3% of all Internet users. The number of online government service users reached 694 million, accounting for 76.8% of all Internet users [1]. Mobile applications in government services are changing with each passing day. Mobile applications such as government affairs apps, microblogs, WeChat, and short videos are increasingly becoming important channels for the government to communicate with, serve, and unite the public [2], [3]. They provide a good platform for improving government credibility and social governance [4].
Over time, the public has reported a number of problems with government affairs new media, such as unclear function positioning, non-authoritative information, loose supervision, insufficient interaction, and insufficient service feedback [5]. In 2018, an article in People's Forum on mobile applications in government services pointed out that most grassroots government affairs new media are prone to one of two extremes: "zombification", caused by a prolonged lack of content and interaction, or "personalization", caused by the publication of highly personal remarks [6]. Xin, S.Y. pointed out that the user experience of government affairs apps is poor [7]: functions are mixed together and their classification is unclear. Such applications are often handed to the public by government agencies without considering what the public really wants, and basic services have not been implemented. Therefore, the user experience of mobile applications in government services deserves attention [8].
The associate editor coordinating the review of this manuscript and approving it for publication was Giuseppe Desolda.
On the research of user experience theory model, Donald Norman analyzes the layers of user experience from the perspective of cognitive psychology, classifying the user experience into visceral, behavioral, and reflective [9]. Jesse James Garrett divides the elements of a website's user experience into five distinct layers: strategic, scope, structure, framework, and presentation [10]. Peter Morville proposes a honeycomb model that depicts the elements of user experience, namely, usefulness, availability, discoverability, satisfaction, reliability, and value. Based on the research on Affordance of Gibson et al., Zhao et al. divide perceptive Affordance into four parts, namely, perceptive physical Affordance, perceptive cognitive Affordance, perceptive emotional Affordance, and perceptive control Affordance, so as to guide the research on user experience design in social media.
At present, there is little research on the user experience of mobile applications in government services. Some studies considered that the objective of mobile applications in government services was to deliver cost-effective, efficient and effective service to a number of stakeholders within the government [11], [12]. Through a survey, Colesca found that the public's confidence in the structure of government social media, expectation confirmation, reputation perception, reciprocity perception and trait similarity perception enhance the public's trust in the government agency [13]. Li and Cao held that the user experience of e-government refers to the subjective psychological feelings that users form about interface design, functionality, operation convenience, interactive responsiveness and other aspects when they visit official government websites [14]. Xu constructed an evaluation index system of e-government user experience covering effectiveness, technical guarantee, information interaction, ease of use, information content, and website design [15], [16].
Therefore, how to evaluate the user experience of mobile applications in government services, explore the influencing factors on its user experience, analyze the existing problems, and then put forward targeted countermeasures and suggestions have become urgent problems to be solved. Based on the existing research results of user experience of mobile applications in government services, this study builds a reasonable evaluation index system of user experience of mobile applications in government services from the perspective of users. The weight of the index was determined by entropy method, and samples were selected for empirical analysis. The correlation degree of the sample index was determined by grey correlation analysis method. Finally, corresponding optimization strategies and suggestions were put forward according to the empirical results.

II. CONSTRUCTION OF USER EXPERIENCE EVALUATION SYSTEM
A. CONSTRUCTION IDEA
When evaluating the user experience of mobile applications in government services, it is necessary to follow the principles of systematization, comprehensiveness, scientificity, applicability and typicality, and the index system should conform to the scientific structure level.
Taking Donald Norman's emotional design as its theoretical basis, this project draws on the evaluation indexes constructed in existing user experience evaluation research and summarizes the evaluation dimensions of user experience of mobile applications in government services [16]. The evaluation indexes of user experience are selected based on the characteristics of mobile applications in government services and on literature analysis. To ensure that the indicators are scientific and reasonable, 14 senior users of mobile applications in government services were invited to in-depth interviews to further screen the evaluation indicators and preliminarily establish the user experience evaluation system.
A questionnaire was designed according to the preliminarily constructed user experience evaluation index system, and reliability and validity tests were conducted on the collected questionnaire data to ensure the reliability and accuracy of the indexes. Finally, the user experience evaluation index system of mobile applications in government services was formed. The construction idea is shown in Figure 1.

B. CONSTRUCTION INDEX
In order to provide theoretical reference for the selection of evaluation indicators and system construction, as shown in Table 1, the existing representative research results of user experience in related fields were sorted out, and the specific evaluation indicators were listed.
In terms of the selection of evaluation index dimensions, it was not difficult to find, based on the existing research theories, that the evaluation dimensions of user experience were all developed from the surface to the inside around user perception, which can be essentially summarized into the three levels of design psychology proposed by Donald Norman, namely the visceral level, the behavior level, and the reflection level. Therefore, the evaluation dimension of user experience of mobile applications in government services was explained from the visceral experience, behavior experience, and emotion experience.
Relevant evaluation indexes were selected according to the literature analysis of mobile applications in government services. To ensure that the evaluation indexes were scientifically sound, 14 senior users (7 male and 7 female) were selected as interviewees, including 4 public officials, 2 scientific researchers, 4 Internet industry staff and 4 college students, who came from Shanghai, Heilongjiang, Qinghai and Yunnan. One-on-one interviews with the 14 interviewees were conducted around the evaluation indexes, using questions such as: 1) Which functions do you use on government affairs new media platforms, and what path do you follow to use them? After the interviews, the contents were transcribed and sorted to extract the relevant conversation elements, and the evaluation indicators were screened to preliminarily form the user experience evaluation index system of mobile applications in government services, as shown in Table 2.

C. RELIABILITY AND VALIDITY TEST
The questionnaire was divided into two parts and consisted of 22 questions. The first part covered the basic information of the respondents, and the second part contained questions written according to the 17 indicators selected above. The questionnaire mainly measured how important users consider each influencing factor of user experience when using mobile applications in government services. A five-point Likert scale was adopted, with 1-5 indicating unimportant, less important, general, important and very important respectively. The specific question options were used to obtain the respondents' evaluation of the user experience of government new media.
The survey targeted users of government affairs new media in different regions and was carried out in two parts. In the first part, electronic questionnaires were sent to colleagues in Nanjing government affairs publicity units; 30 questionnaires were sent out and 27 valid questionnaires were collected. The second part addressed users of government affairs new media in different regions; 100 electronic questionnaires were issued on the Questionnaire Star platform and 91 valid questionnaires were collected. In total, 130 questionnaires were sent out and 118 valid questionnaires were collected. Reliability and validity analyses were used to test the questionnaire data, so as to modify the index system and determine the final indexes.
Cronbach's alpha is the most commonly used method for reliability testing [26]. A coefficient between 0.7 and 0.8 is generally considered acceptable, while a value below 0.6 indicates that the data are not reliable enough for further study. SPSS 25 was used for the reliability and validity tests, and the test results are shown in Table 3. Table 3 shows that the Cronbach's alpha coefficients of the three dimensions are 0.895, 0.934 and 0.959 respectively, all greater than 0.8, indicating that the questionnaire is highly reliable.
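The reliability coefficient discussed above can be illustrated with a minimal sketch. It uses the usual definition of Cronbach's alpha, k/(k-1) times one minus the ratio of summed item variances to the variance of the total score. The function name `cronbach_alpha` and the sample responses are hypothetical and are not the study's data; the study itself computed the coefficient with SPSS 25.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])  # number of items in the scale
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 1-5 Likert answers: five respondents, three items of one dimension.
scores = [
    [4, 5, 4],
    [3, 3, 3],
    [5, 5, 4],
    [2, 2, 3],
    [4, 4, 5],
]
print(round(cronbach_alpha(scores), 3))
```

In practice the function would be applied once per dimension (visceral, behavior, emotion), using only the items that belong to that dimension.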
Validity is the degree to which a measurement accurately reflects the characteristic it is intended to measure. Validity analysis is divided into three categories: content validity, construct validity and criterion validity. Construct validity is usually the one tested for user experience indicators, and factor analysis is the usual method. KMO and Bartlett tests are needed before factor analysis, and their results are shown in Table 4. As Table 4 shows, the KMO value is 0.832, higher than 0.8, the approximate chi-square value of the Bartlett sphericity test is 3012.722, and the sig value is 0.000 < 0.05, which reaches the significance level, so the data are suitable for factor analysis. Next, the 118 data samples were processed in SPSS 25 to obtain the factor analysis results, as shown in Table 5. Based on these factor analysis results, this paper tests the evaluation indexes of government new media user experience initially constructed in the previous section, and finally obtains the evaluation index system of user experience of mobile applications in government services, as shown in Figure 2.
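To make the two pre-tests above more concrete, the following sketch computes the overall KMO measure and Bartlett's sphericity statistic from raw scores using only the Python standard library. It assumes the textbook definitions: KMO compares squared correlations with squared partial correlations taken from the inverse correlation matrix, and Bartlett's chi-square is -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom. All function names and the sample data are hypothetical; the study obtained its values from SPSS 25, and the p-value lookup is omitted here.

```python
import math


def correlation_matrix(data):
    """Pearson correlation matrix of the columns of data (rows = respondents)."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    cent = [[row[j] - means[j] for j in range(p)] for row in data]
    cov = [[sum(r[a] * r[b] for r in cent) for b in range(p)] for a in range(p)]
    return [[cov[a][b] / math.sqrt(cov[a][a] * cov[b][b]) for b in range(p)]
            for a in range(p)]


def inverse_and_det(matrix):
    """Gauss-Jordan elimination with partial pivoting: (inverse, determinant)."""
    p = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(p)]
           for i, row in enumerate(matrix)]
    det = 1.0
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(aug[r][c]))
        if abs(aug[pivot][c]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != c:
            aug[c], aug[pivot] = aug[pivot], aug[c]
            det = -det  # a row swap flips the sign of the determinant
        det *= aug[c][c]
        pv = aug[c][c]
        aug[c] = [v / pv for v in aug[c]]
        for r in range(p):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[p:] for row in aug], det


def kmo_and_bartlett(data):
    """Return (overall KMO, Bartlett chi-square, degrees of freedom)."""
    n, p = len(data), len(data[0])
    corr = correlation_matrix(data)
    inv, det = inverse_and_det(corr)
    sum_r2 = sum_partial2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                sum_r2 += corr[i][j] ** 2
                sum_partial2 += partial ** 2
    kmo = sum_r2 / (sum_r2 + sum_partial2)
    chi2 = -(n - 1 - (2 * p + 5) / 6) * math.log(det)
    return kmo, chi2, p * (p - 1) // 2


# Hypothetical Likert answers: six respondents, three items.
sample = [[4, 5, 4], [3, 3, 3], [5, 5, 4], [2, 2, 3], [4, 4, 5], [1, 2, 2]]
print(kmo_and_bartlett(sample))
```

The chi-square value would then be compared with the chi-square distribution at the returned degrees of freedom to obtain the significance level reported in Table 4.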

III. ENTROPY METHOD AND GREY CORRELATION ANALYSIS

A. ENTROPY METHOD
In the comprehensive decision-making of the index system, different weights should be given reasonably according to the importance of the index. The weight of the index reflects the influence of the index on the overall contribution. Therefore, determining the weight of each comprehensive index is the basis of comprehensive decision-making.
To obtain the weights of the indices, both subjective and objective methods have been introduced in the literature. Subjective methods, such as Delphi [27] and AHP [28], capture the preferences of experts. Objective weighting methods, which derive the weights only from the evaluation data set, include entropy [18]- [20], DEA [29], CRITIC [30] and CCSD [31]. AHP is a powerful tool for making complicated and often irreversible decisions, because it decomposes a complex problem into multiple layers and treats complex, multi-criteria systems quantitatively, while the entropy method is one of the most popular objective weighting methods. Han et al. conducted a comprehensive evaluation of high-quality economic development through the entropy method, and put forward policy suggestions from four aspects of production, distribution, consumption and risk [32], [33]. In order to reduce the influence of subjective factors and improve the accuracy of evaluation conclusions, Guo et al. evaluated the service quality of the beverage industry through the entropy method [33].
As the information entropy of an information system measures the degree of information disorder, it can be used to evaluate the degree of order and the utility value of the system information obtained. The larger the information entropy, the higher the information disorder, the smaller the information utility value, and the smaller the weight. Conversely, the smaller the information entropy, the lower the information disorder, the greater the information utility value, and the greater the weight. The entropy method determines the utility value of each index from its information entropy and allows a judgment matrix to be constructed from the index values in order to determine the index weights. Because the entropy method reduces evaluator subjectivity and determines the index weights objectively from survey data, its results are more in line with reality. This paper therefore uses the entropy method to determine the weights of the indexes, which can take the intention of the decision-maker into account while resting on a strong mathematical basis. The specific steps are as follows.
For a decision matrix $X$ of $n$ alternatives and $m$ evaluation indices, formulated as

$$X = (x_{ij})_{n \times m} = \begin{pmatrix} x_{11} & \cdots & x_{1m} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nm} \end{pmatrix} \quad (1)$$

the entropy method, in its standard form, proceeds in three steps.
The first step, normalizing the matrix:

$$p_{ij} = \frac{x_{ij}}{\sum_{i=1}^{n} x_{ij}}, \quad i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m \quad (2)$$

The second step, calculating the entropy of each index:

$$e_j = -\frac{1}{\ln n} \sum_{i=1}^{n} p_{ij} \ln p_{ij}, \quad j = 1, 2, \ldots, m \quad (3)$$

The third step, calculating the entropy weight value:

$$W_j = \frac{1 - e_j}{\sum_{k=1}^{m} (1 - e_k)}, \quad j = 1, 2, \ldots, m \quad (4)$$

B. GREY RELATIONAL ANALYSIS
The grey system was proposed by Deng Julong [23]. The method is mainly used to solve problems with incomplete factors, uncertain information, unclear behavior paths, and so on. Grey relational analysis (GRA) is a quantitative analysis of how different factors in the system change. According to the differences in development trend between the indicators of different schemes, the correlation coefficient polyline of each scheme is drawn. The more similar the polylines, the closer the change trends and the greater the correlation [34]. This study bases the user experience evaluation of mobile applications in government services on GRA for the following reasons. User experience has both determined and undetermined evaluation elements, and the grey system is suited to problems in which external information is clear and internal information is unclear, so the two agree closely in their attributes. This study explores the undetermined aspects of user experience evaluation of government affairs new media on the basis of the determined objective operations and subjective feelings.
Using GRA to evaluate user experience requires a small number of samples, and experimenters will use the evaluation data of these sample indicators to judge the grey correlation level of the scheme.
Using GRA to analyze the experimental results can be demonstrated in the following steps [35].
Step 1: Define the sample comparison series. The result of the user experience evaluation experiment is defined as the comparison sequence $X_i$, and $X_i(k)$ represents the $k$-th value of the $i$-th scheme. The reference sequence is denoted $X_0$.
Step 2: Normalization. The purpose of normalization is to reduce the differences in the absolute values of the data, bring them into a comparable range, and focus on their changes and trends. Each sequence value is divided by the mean of its sequence:

$$x_i(k) = \frac{X_i(k)}{\frac{1}{m} \sum_{k=1}^{m} X_i(k)}$$

Step 3: Calculate the grey correlation coefficient. In the standard formulation of GRA, the grey correlation coefficient is

$$\zeta_i(k) = \frac{\min_i \min_k |x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}{|x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}$$

where $\rho$ is an adjustable coefficient that controls how strongly the $\zeta$ coefficients are discriminated, with a value range of (0, 1).
Step 4: Calculate the grey correlation degree. The correlation coefficients $\zeta_i(k)$ of each scheme are averaged over its indicators:

$$r_i = \frac{1}{m} \sum_{k=1}^{m} \zeta_i(k)$$

The mean value represents the degree of correlation between the comparison sequence and the reference sequence.
Step 5: Determine the optimal solution. The grey correlation degrees of the schemes are compared. If $r_i > r_j$, then the $i$-th scheme is better than the $j$-th scheme. In this way, the optimal scheme can be determined.
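The five steps above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the authors' code: it uses mean normalization as described in Step 2, the standard grey relational coefficient with a default distinguishing coefficient `rho=0.5` (a common convention, not a value given in the paper), and an unweighted average for Step 4. The function names and the sample scores are hypothetical.

```python
def mean_normalize(sequence):
    """Step 2: divide every value by the mean of its own sequence."""
    mean = sum(sequence) / len(sequence)
    return [value / mean for value in sequence]


def grey_relational_coefficients(reference, comparisons, rho=0.5):
    """Step 3: coefficient of every comparison sequence at every indicator k."""
    x0 = mean_normalize(reference)
    xs = [mean_normalize(seq) for seq in comparisons]
    # Absolute differences between the reference and each comparison sequence.
    deltas = [[abs(a - b) for a, b in zip(x0, x)] for x in xs]
    d_min = min(min(row) for row in deltas)
    d_max = max(max(row) for row in deltas)
    if d_max == 0:
        # Every sequence coincides with the reference after normalization.
        return [[1.0] * len(row) for row in deltas]
    return [[(d_min + rho * d_max) / (d + rho * d_max) for d in row]
            for row in deltas]


def grey_relational_grades(coefficients):
    """Step 4: average the coefficients of each scheme over its indicators."""
    return [sum(row) / len(row) for row in coefficients]


# Hypothetical scores of two schemes on three indicators (Step 1),
# compared against an ideal reference sequence.
reference = [5, 5, 5]
schemes = [[4, 4, 4], [5, 3, 4]]
coeffs = grey_relational_coefficients(reference, schemes)
grades = grey_relational_grades(coeffs)
best = max(range(len(grades)), key=lambda i: grades[i])  # Step 5
print(grades, "best scheme index:", best)
```

Note that mean normalization keeps only the shape of each sequence: a scheme that scores uniformly 4 looks identical to the uniform reference after Step 2, which is why the first scheme in this example reaches a grade of 1.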

IV. CASE ANALYSIS

A. CASE SELECTION
Provincial government new media platforms were surveyed with regard to regional, economic and cultural differences, and the most-downloaded government app in each province was identified. On this basis, the government apps of four provinces were selected as the empirical objects for analysis: the Shanghai government service app "Citizen Cloud" in the eastern region, the Qinghai government service app "Qinghai Renshetong" in the western region, the Yunnan government service app "Banshitong" in the southern region, and the Heilongjiang government service app "Longjiang Renshe" in the northern region. The provincial government app samples are shown in Figure 3.

B. WEIGHT ANALYSIS OF USER EXPERIENCE EVALUATION INDICATORS
In order to ensure the reliability of the weights, we invited 6 senior users from different regions, occupations and platforms, 2 experts in the field of user experience and 2 government new media operators to form an expert group. The questionnaire was sent to the expert group, and a 5-point Likert scale was used to measure the indicators.
The data in Tables 6-8 are substituted into formulas 1-4 to obtain the weights of the user experience evaluation indicators of mobile applications in government services, as shown in Table 9.
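The weighting step can be illustrated with a short sketch of the entropy method from Section III-A: normalize each index column, compute its entropy, and turn the divergence 1 - e into a weight. The function name `entropy_weights` and the score matrix are hypothetical; the actual inputs are the expert ratings in Tables 6-8, which are not reproduced here. The sketch assumes strictly positive scores, which holds for 1-5 Likert ratings.

```python
import math


def entropy_weights(matrix):
    """Entropy weights for matrix[i][j] = score of row i on index j (scores > 0)."""
    n, m = len(matrix), len(matrix[0])
    divergences = []
    for j in range(m):
        column = [matrix[i][j] for i in range(n)]
        total = sum(column)
        # Normalization: share of each row in the column total.
        shares = [value / total for value in column]
        # Entropy of the index, scaled by ln(n) so that it lies in [0, 1].
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    spread = sum(divergences)
    if spread <= 1e-12:
        # All columns are uniform, so the data cannot separate the indices.
        return [1.0 / m] * m
    return [d / spread for d in divergences]


# Hypothetical expert ratings: four raters (rows) on three indicators (columns).
ratings = [
    [5, 4, 3],
    [5, 2, 4],
    [5, 5, 3],
    [5, 1, 4],
]
weights = entropy_weights(ratings)
print([round(w, 3) for w in weights])
```

An indicator on which all raters agree (the first column here) carries no discriminating information and receives a weight of zero, while the indicator with the most dispersed ratings receives the largest weight.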
The comparison series and reference series are substituted into formulas 5-6 to calculate the grey correlation coefficients $\zeta_i(k)$ of the four design schemes. The calculation results are shown in Tables 10-12. Finally, the correlation coefficients are substituted into formula 7 and combined with the scheme index weights $W_j$ to obtain the grey weighted correlation degree $r_i$ of each design scheme, as shown in Table 13.
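As a rough illustration of this last step, the sketch below combines a table of grey correlation coefficients with index weights into a weighted correlation degree per scheme and then ranks the schemes. It assumes the weighted degree is the weight-by-coefficient sum over the indicators, which is the usual way of combining the two; the paper's formula 7 is not reproduced in this text, so this form is an assumption. The names and numbers are hypothetical and are not the values of Tables 9-13.

```python
def weighted_grades(coefficients, weights):
    """Weighted grey correlation degree: sum of weight * coefficient per scheme."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights are expected to sum to 1")
    return [sum(w * z for w, z in zip(weights, row)) for row in coefficients]


def rank_schemes(names, grades):
    """Order schemes from the highest to the lowest correlation degree."""
    return sorted(zip(names, grades), key=lambda pair: pair[1], reverse=True)


# Hypothetical coefficients for two schemes on two indicators, with weights.
example_coefficients = [[1.0, 0.5], [0.5, 1.0]]
example_weights = [0.75, 0.25]
example_grades = weighted_grades(example_coefficients, example_weights)
print(rank_schemes(["Scheme A", "Scheme B"], example_grades))
```

Because the first indicator carries three times the weight of the second, the scheme that does well on it ranks first even though both schemes have the same unweighted average.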

D. ANALYSIS OF EVALUATION RESULTS
The grey correlation analysis shows that the Shanghai government services app has the highest correlation degree, 0.050, followed by Qinghai and Yunnan, which are close to each other at 0.042 and 0.041 respectively. The Heilongjiang government services app lags slightly behind the first three, with a correlation degree of 0.034. Therefore, the Shanghai government services app is the best solution, but on the whole the government services apps of all four provinces still have many deficiencies.
In the dimension of visceral experience, on the factors related to interface design and content classification, the Shanghai government services app performs better than the other three, mainly because of its innovative interface design: the interface color and pattern change over time, which attracts users and gives them a sense of novelty. The other three government services apps use standardized interface design templates, and their visual effect is not outstanding.
VOLUME 9, 2021
In the dimension of behavior experience, users should be able to fully understand and participate in government affairs and become the masters of the application. In terms of function satisfaction and operation, all four apps place the entries for political participation and political inquiry in a prominent position of the interface, but in terms of operation process and form-filling fluency, the Shanghai government services app has a shorter operation path and smoother form filling than the other three. In terms of information consultation, the Yunnan government services app offers a 12345 mailbox, a hotline and real-time online consultation, while the other three mainly offer online consultation, whose timeliness is not guaranteed, so the consultation effect of the Yunnan government services app is better in this respect.
In the dimension of emotional experience, there is only a small gap in the usefulness scores of the government apps of the four provinces, which indicates that the government services apps are useful to local users. However, in terms of ease of use and the final feeling of use, there is still much room for optimization in the government services apps of Qinghai, Yunnan and Heilongjiang. A government services app should be not only useful but also easy to use.

V. DISCUSSION
Through the analysis of the above four samples of user experience evaluation results, it is not difficult to find that there are still some problems in the current construction of mobile applications in government services in China, such as homogenization of interface design, complex participation process, lack of interactive function, lack of interpretation of government information, etc. Based on this, this study puts forward the following suggestions, hoping to provide reference and inspiration for mobile applications in government services.
Interface design is an important part of visceral experience and shapes the user's first impression of mobile applications in government services. The "order effect" theory holds that the order in which information is presented affects the overall impression a user forms, so the quality of the first impression determines whether users are willing to continue using mobile applications in government services. In the design, the layout of the platform's functional modules should be simple and orderly with a clear hierarchy, so that users who are searching need only a short time to understand and master the functional division of the platform. The style should be simple and elegant, and the layout, images and text should maintain a unified style.
Function coverage and function operation path are important components of behavior experience. At present, most mobile applications in government services have not yet opened comment and real-time communication functions, which reduces users' sense of participation. Therefore, the number of content auditors should be increased to meet the need for timely communication. In the operation path, the steps should be visualized, the content should be lightweight, users should be told what to expect, and the principles of controllability and humanization should be followed. Mobile applications in government services can also deepen the interaction between the platform and users through Q&A with prizes, voting, and offline activities, so as to shorten the distance between mobile applications in government services and users and enhance user stickiness.
The content of government information is an important part of government new media. Government information is usually released according to professional and rigorous standards. It is difficult for ordinary users to understand such content. The secondary interpretation of government information can help these users get a better reading experience, so that they can participate in political discussion more deeply.

VI. CONCLUSION
With the development of mobile government services and the promotion of district-level integrated media, mobile applications in government services have become significant platforms for people to follow current affairs and use online services. Based on user experience theory and the characteristics of mobile applications in government services, this study constructed a set of user experience evaluation indexes for mobile applications in government services and determined their weights by the entropy method. To verify that the evaluation indexes are scientifically sound, the government services apps of four provinces were selected as experimental samples, and the grey correlation coefficients of the user experience indexes of the four samples were calculated through grey correlation analysis in order to evaluate the schemes. The evaluation index system proposed in this study will make it easier for decision makers to identify the optimal scheme and ultimately to improve user value.
The measurement standard of user experience should keep pace with changing technology and environments. In future design decisions, more dimensions of indicators will be explored so that operational data can be collected from both the test environment and the real environment. For example, in the test environment user operation data are obtained through focus groups, questionnaires, interviews, etc., whereas in the real environment user operation data are obtained without intervention through computer programs. Broadening the indicator dimensions will uncover more operational data in different environments, which will allow the user experience to be optimized more precisely and create higher value for users.

JIANGANG ZHU received the Ph.D. degree from Nanjing Forestry University (NFU), Nanjing, China, in 2004.
He is currently an Associate Professor with the College of Furnishing and Industrial Design, NFU. Since 2014, he has been the Dean of the College of International Education, NFU. He has more than 20 scientific publications, as the first author. His research interests include furniture design and manufacturing, informatization in the furnishings manufacturing industry, and so on. He received the Best Paper Award from the 10th China Forestry Youth Academic Annual Conference, in 2012. He serves as the International Students Session Co-Chair for the 7th China Forestry Science Conference (CFSC 2019).
HONGPING HOU is currently pursuing the master's degree in industrial design and engineering with Nanjing Forestry University, Nanjing, China. His research interests include user experience and product design.