Evaluating Web and Mobile User Interfaces With Semiotics: An Empirical Study

Computer interface signs, such as navigational links, thumbnails, small images, command buttons, symbols, icons, etc., which serve as communication artifacts between designers/systems and users, constitute an under-researched area. To design and evaluate intuitive interface signs, the Semiotic Interface Sign Design and Evaluation (SIDE) framework was developed. The aim of this study is to test the framework’s applicability to evaluate web and mobile user interfaces. To that end, two empirical user studies were conducted among a total of 86 practitioners (n1 = 58, n2 = 28). The results show that the SIDE framework helps identify unique usability problems, such as the intuitiveness of interface signs in terms of their referential meaning, which would not have been detected with traditional heuristic evaluation methods. The paper increases our understanding of the intuitive nature of interface signs of web and mobile interfaces, and of the practical use of intuitive signs.


I. INTRODUCTION
A user interface (UI) is a crucial component of any software system, with usability being the key factor determining its effectiveness. Ease-of-use and user-friendliness are the key terms associated with the concept of usability [1], [2]. In fact, usability refers to the extent to which a software system, or any product, can be used efficiently, effectively and satisfactorily within a specified user context [3]. The evaluation of usability is one of the main cornerstones of designing intuitive user interfaces [4], [5]. Although personal computers with graphical user interfaces have been widely used since the 1980s, technological innovations, societal changes and dynamic user preferences mean that our knowledge regarding UI design in the domain of human-computer interaction (HCI) is in need of continuous improvement [6]-[8]. One of the under-researched areas in this context is the use of interface signs, including navigational links, thumbnails, small images, command buttons, symbols, icons, etc. [9]-[11], which are used to locate the content or functionality for which the users are looking.
The associate editor coordinating the review of this manuscript and approving it for publication was Liang-Bi Chen.
Semiotics can be used to understand the signs of a UI. Semiotics is the science of signs, and it serves as the basis of designing interface signs, focusing on interpretation and sense production [12]-[14]. According to Nadin [15], semiotics includes all activities in HCI, from designing interfaces to testing their usability. The theory of semiotic engineering proposed by De Souza [16], and related methods [17], [18], focus on analyzing system communicability, which measures how efficiently and effectively user interfaces convey the logic of a given system; improving communicability therefore also improves the system's usability [17], [19].
Interface signs facilitate interaction or communication between the UIs/systems and their end-users [15], [20]-[24]. As such, earlier studies have argued that designing intuitive interface signs is essential to improving the learnability of a system, as well as ensuring that the tasks involved are understood, supported and completed. For example, Derboven et al. [25] note that "interpretation is central to human-computer interaction. Users interpret icons, buttons, and other controls to make sense of the functionality offered by an application" (p. 367). Thus, if the end-users' understanding (interpretation) matches the meaning of a sign as intended by the designers, they can perform the task involved efficiently and effectively. As such, the interpretation of signs is central to HCI and the intuitiveness of interface signs is an explicit element that helps improve effectiveness (accuracy and completeness to achieve the goals), efficiency (efforts required to successfully complete a task) and satisfaction (users' comfort and positive attitudes towards the application), or system usability. For example, a sign called 'Faculties' on a university website can be interpreted differently by different end-users, either as 'list of academic staff' or 'list of academic divisions', which means it can be confusing, and some people may not be able to interpret the meaning of this sign at all. If a user's understanding matches the intended meaning of this sign, the user can perform the task accurately within the shortest possible time and with a minimum number of clicks (thus maximizing effectiveness). By contrast, when people are unable to interpret the sign accurately, they will take more time, generate a navigational error, have to ask for help or even fail to accomplish the task in question (leading to user dissatisfaction and reduced efficiency).
To summarize, semiotic aspects are required to build well-designed user interfaces and achieve the desired quality of communicability, which in turn improves usability.
There are two major research gaps that are addressed in this paper. First, usability research focuses mainly on the information architecture, navigation, layout and content of web applications, paying little attention to the interface signs applied in user interfaces [9]. Although some usability evaluation methods (UEM) emphasize the importance of semiotics in the evaluation tools [26], the question remains why some signs are more intuitive than others. The Semiotic Interface Sign Design and Evaluation (SIDE) framework was developed to design and evaluate the use of interface signs in web interfaces [11]. The applicability of the SIDE framework in the context of mobile user interfaces is unknown. Although a number of semiotic studies have been conducted so far, only a few of them have focused explicitly on interface signs, and even fewer have included mobile user interfaces [26].
Second, prior research has compared different usability evaluation methods to assess their performance [27], [28]. However, to the best of our knowledge, no study has compared a semiotics-based approach with a traditional usability evaluation method. Therefore, more empirical research is needed to investigate the performance of a semiotics-based approach in comparison with other usability evaluation methods. This will also help to understand whether a semiotics-based approach can complement other usability evaluation methods by identifying unique usability problems.
Consequently, the aim of this paper is to assess the applicability of the SIDE framework to determine the usability of web and mobile user interfaces and to compare it with other usability evaluation methods. The overall research question (RQ) addressed in this paper is: Does using the SIDE framework to evaluate mobile and web interfaces yield more detailed insights into usability issues compared to existing evaluation methods?
To answer this question, we conducted two studies to compare existing usability evaluation methods to the SIDE approach, using the Heuristic Evaluation proposed by Nielsen [29] as the candidate usability evaluation method in our studies. Heuristic Evaluation helps researchers detect usability problems on the basis of ten heuristics. It has some limitations, for instance a lack of user involvement and the need to include multiple experts in the evaluation. Despite these limitations, Nielsen's [29] heuristic evaluation technique is a widely used and cited approach to evaluating user interfaces. In addition, we use a set of validated heuristics proposed by Bertini et al. [30], [31], an extension of Nielsen's evaluation technique that has been customized explicitly to evaluate mobile user interfaces.

II. RELATED WORK
A. SEMIOTIC THEORIES, FRAMEWORKS, METHODS FOR UI DESIGN AND EVALUATION
Some studies have proposed semiotic theories, frameworks and methodologies for UI design and evaluation. De Souza [16], for instance, introduced a theory of semiotic engineering designed to analyze the connection between system signs (e.g., links, icons, buttons, etc.), semantics and functions, with the aim of understanding the meta-communication between designers and users [16], [17], [20], [32]-[34]. She proposed two methods to assess the communicability of software artifacts: (i) the semiotic inspection method (SIM) [17], and (ii) the communicability evaluation method (CEM) [18]. A number of other studies, based on semiotic concepts and semiotic engineering theory, have assessed the communicability issues of software user interfaces [35]-[38], measured the applicability of SIM in evaluating software systems [39], [40] and human-robot interaction interfaces [41], assessed the branding and communicability of tourism websites through semiotic analysis of their visual and verbal signs [42], and evaluated usability [43].
Another approach is the W-SIDE (Web-Semiotic Interface Design Evaluation) framework, which focuses explicitly on the intuitiveness of interface signs (the smallest element of user interfaces) to design and evaluate information-intensive web user interfaces [9], [21], [44]. According to Nadin [15], the designers first have to determine a sign's content or function and semantics appropriately, and then determine how the content or function can be represented. Another framework, which was first proposed by Andersen [45], uses semiotics to show how signs help users interact with software systems.
Barr et al. [46] proposed a semiotic model based on Peirce's triad (representamen, object and interpretant), while a semiotic model for interactive media was proposed by O'Neill and Benyon [47], [48] based on Eco's model of semiotics [49] and Andersen's concept of computer semiotics [45]. In a similar vein, French et al. [50] proposed a shared meaning design framework (SMDF) based on the concept of semiotics. Goguen [51] introduced algebraic semiotics for designing user interfaces, while Malcolm and Goguen [52] applied algebraic semiotics for UI design. Other studies have examined the semiotic guidelines or principles for user interface design and evaluation, including Bolchini et al. [21], De Souza [53], Amare and Manning [54], Ferreira et al. [55], Ferreira et al. [56] and Liu et al. [57].
Finally, the SIDE framework, which extended the semiotic engineering and W-SIDE approaches, was developed to design and evaluate interface signs, and to improve overall system usability by making them more intuitive [11]. The framework consists of a set of determinants, attributes and heuristics related to design, evaluation and user interpretations. In a related study, Islam and Bouwman [58] assess the feasibility of the SIDE framework to evaluate user-intuitive web interface signs. The framework is based on empirical data and semiotic layers (or constructs), which distinguishes it from other frameworks, and it was developed to analyze signs on different semiotic layers. Table 1 provides a summary overview of relevant studies.
All the above-mentioned studies focus on the web interfaces of PCs. Despite the widespread adoption and use of mobile devices and tablets, very few studies have so far analyzed mobile interfaces with the use of semiotics, even though signs play a key role in a mobile context. The available studies focus on: adopting semiotic concepts to design a graphical interface for mobile phones [62]; proposing an analytical approach to analyze interface signs (icons) for mobile user interfaces [63]; exploring the semiotic engineering theory in the design of mobile user interfaces to control mobile robots [64]; simplifying mobile interfaces by increasing the spatial and temporal indexicality from a semiotic perspective [65]; evaluating the multimodal design of a language learning app (Memrise app) through semiotic analysis [66]; analyzing a wearable app based on the SAPAD (Semiotic Approach to Product Architecture Design) model to improve the application's interactions for elderly people [67]; and presenting a semiotic analysis of mobile interfaces, based on Eco's revised KF model [49], to show how mobile signs represent information and how an interface functions [68].

B. USABILITY EVALUATION: COMPARATIVE STUDY AND INTEGRATED USABILITY STUDY
There are a number of evaluation methods that focus on usability [69], [70], including heuristic evaluation, task analysis, think-aloud usability testing, questionnaires, cognitive walk-through and interviews [71]-[76]. These methods have been compared to each other to assess their performance [27], [28], [77]-[79]. For example, two analytical methods were compared by Bekker et al. [27] to assess the usability of computer games developed for young children.
Other studies proposed and evaluated integrated usability evaluation methods. For example, Ternauciuc & Vasiu [28] proposed an integrated usability testing tool to replace a certain type of laboratory testing, where users' actions on the real platform are measured and analyzed, while Brejcha & Marcus [80] compared two heuristic inspection methods, one of which used the heuristics proposed by Marcus [81], while the other used the semiotic heuristics proposed by Brejcha and Marcus [80].
There are also some studies that combine usability studies and semiotic approaches. A study by Islam & Tetard [82], for example, showed that integrating semiotic perception into laboratory-based usability testing improves the performance of the usability evaluation. Silva [83] combined a usability evaluation technique with semiotic analysis to design meaningful interfaces in game development, while Bolchini et al. [21] proposed applying a set of semiotic guidelines as a complementary toolkit for heuristic evaluation. Other studies have highlighted the importance of integrating semiotic concepts into usability testing [11], [12], [58].
The brief discussion presented above indicates that semiotic research in HCI builds on earlier work, for instance in providing semiotic frameworks, models, design principles, heuristics and analysis methods for UI, as well as describing new concepts, theories and properties. Semiotics can play a significant role and become an accepted approach in HCI research, especially with regard to designing and evaluating user interfaces. So far, only a few semiotic studies have focused on examining mobile user interfaces or on comparing semiotic and non-semiotic approaches, which is why we adopted the SIDE framework and contrasted it with existing methods, to determine whether or not the SIDE framework yields new or more detailed insights. As such, in this study, we compare and observe the evaluation performance of a semiotic evaluation, using the SIDE framework in combination with a heuristic evaluation (carried out using the non-semiotic heuristics) to assess the usability of both web interfaces and mobile interfaces. In the following section, we describe the SIDE framework and explain why we decided to use it in this study.

III. THE SIDE FRAMEWORK
Although the semiotic theories, frameworks, models and methods discussed above all have their own merits, the SIDE framework [11], [58], [84] is different because it focuses explicitly on interface signs, rather than on other dimensions of UI design. The SIDE framework was developed based on a longitudinal research design over a period of three years [11], [58]. De Souza's [16] semiotic engineering theory and Andersen's [45] computer semiotics theory were used as background theories to develop the SIDE framework. Furthermore, SIDE is an extension of W-SIDE. Taken together, we believe that SIDE is an extension of prior semiotic frameworks and has been developed in a rigorous way. Therefore, we employed the SIDE framework in our research. SIDE has five levels (Semantic, Environment, Social, Pragmatic and Syntactic), each of which is defined by determinants (themes), which in turn have attributes. In addition, the framework provides a set of semiotic heuristics for the design and evaluation of interface signs, which are mapped onto different levels of the framework (see Appendix), each of which is briefly discussed below. The SIDE framework, its applications and its uses are discussed in greater detail in Islam & Bouwman [11], [58].
The syntactic level contains features of interface sign presentation. It consists of six determinants: (a) Interactivity refers to the types of interactivity with the interface signs: decorative, indicative, indicative-interactive, functional, navigational and hybrid-interactive. (b) Color refers to the color being used (sign color), as well as to brightness and contrast. (c) Clarity and readability include attributes like overlap, obscure, distract, closeness, distance and conciseness, which indirectly help participants interpret the meaning of the signs in question. (d) Presentational aspects refer to the sign labels, what the signs look like (pictorial view) and what their structure is. (e) Context includes attributes like the web page encompassing the sign, the name of the website and the web domain. (f) Consistency refers to the uniform design strategy for a web application.
The pragmatic level refers to the relationship between a sign and its use or interpretation, and has four determinants: (a) Position has four main attributes: user habits, neighboring signs, user attention and common positions. (b) Amplification has the following attributes: appended thumbnail, appended icon, appended small image, appended short text, appended indicative text and appended abbreviated letter(s), which indirectly help users understand the meaning of the sign. (c) Relations include the relationships among different interface signs within a web page or user interface, which can be paradigmatic, syntagmatic, concurrence and dependence. (d) Coherence refers to the logical relation with the real-world facts.
The social level refers to the meaning of a sign within its social context. This level has four determinants: (a) Cultural marker refers to the color, language, and labels of the sign for a specific cultural context. (b) Matching refers to reality, convention and real-world objects. (c) Mapping refers to the metaphors that resemble a user's real-world experiences. (d) Organization refers to the category, name and products or services of an organization (interlocutor or owner of the website).
The environment level deals with the surrounding factors that, collectively, can affect user behavior, building on the user's presupposed knowledge or ontology, and representing (i) the user's knowledge and memory, and (ii) an association of the user's interpretation with the actual meanings of the interface signs. This level also contains the attribute 'ontology', which refers to the skills, knowledge and concepts that the user requires to interpret the meaning of an interface sign [9], [21]. The framework includes a number of ontologies, like an Internet Ontology (user's skills in relation to web surfing, the online world, etc., for example, the 'Sign out' sign) and a Current Web Domain Ontology (concept related to the sign that is very specific for the current web domain, e.g., the 'Shopping Basket' in an e-commerce application domain).
The semantic level refers to the meaning of a sign and the relationships between the sign itself and its meaning from the user's and the designer's perspectives, respectively. 'Interpretation accuracy', which is an attribute of this level, refers to how accurately users interpret interface signs in the way the designers intended, with the following options: accurate, moderate, conflicting, erroneous and unable.
We are now in a position to compare the SIDE framework and its concepts with other usability evaluation methods, and to discuss how they are used in the research project.

IV. RESEARCH METHOD
Two empirical user studies were conducted. Study I was designed to show how useful the SIDE framework is in terms of assessing the usability of web user interfaces: web user interfaces were evaluated using both a semiotic evaluation and a heuristic evaluation, and the results were compared. Study II focused on the evaluation of mobile user interfaces.
A. STUDY I: ANALYTICAL EVALUATION OF WEB INTERFACES
1) PARTICIPANTS
A total of 58 students (32 male, 26 female) from the postgraduate programme of the Computer Science and Engineering department of the Military Institute of Science and Technology (MIST) participated in the study. The average age of the participants was 30 years. During the period of data collection, the participants were enrolled in a human-computer interaction course. They had all completed their Bachelors in Computer Science and had taken several academic courses related to UI design and evaluation. Thirty-one students had 3-5 years of professional experience in software development, twelve students had 10-12 years of experience, eight students were recent graduates, and the remaining seven had 2-3 years of experience as IT staff in different organizations. None of them had any experience with the websites involved in the study, although they all had broad experience in terms of accessing computers and the mobile Internet. Although they all had experience with UI design and evaluation, they were not familiar with the concepts of semiotics in UI/HCI.

2) PROCEDURE
Data was collected through two semesters taught in two academic sessions. Throughout the entire semester (14 weeks, 3 hours per week), the participants were taught the basics of usability evaluation, the SIDE framework (semiotic evaluation), cognitive walkthrough and heuristic evaluation, based on a hands-on approach to teaching the various evaluation techniques. A within-subject study was designed. At the end of the semester, the participants were asked to evaluate six e-government websites (see Table 2) using a semiotic evaluation (SE) and a heuristic evaluation (HE) technique.
The Bangladesh National Portal website was assigned to eighteen evaluators. Each of the remaining websites was assigned to eight evaluators. The participants were asked to conduct the semiotic evaluation based on the SIDE framework [11], [58]. For the heuristic evaluation, participants were asked to use Nielsen's [29] heuristics. Nielsen's [29] concept of severity rating (0 to 4) was adopted in both methods, with 0 signifying not a usability problem at all; 1, a cosmetic problem only; 2, a minor usability problem; 3, a major usability problem; and 4, a catastrophe from a usability perspective. Templates for recording the findings were provided. Both templates (for HE and SE) included the following fields: problem number, where the problem is located, what the problem is, why it is a problem, how many heuristics are violated, what the severity rating is and what the recommendations for a possible solution are. The average of the severity ratings of each usability problem was calculated for both methods. For example, if a problem was identified by three evaluators using the heuristic evaluation, with severity ratings 3, 2, 3, the average severity score would be 2.67.
The severity ratings of the usability problems were classified as follows: (a) an average rating below 1.5 is Cosmetic; (b) an average rating between 1.5 and 2.5 is Minor; (c) an average rating between 2.5 and 3.5 is Major; and (d) an average rating equal to or above 3.5 is Catastrophic.
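The averaging and classification rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the function name `classify_severity` is hypothetical, and because the text does not say which class a value exactly on 1.5 or 2.5 belongs to, the sketch assumes that each boundary belongs to the higher class (consistent with "equal to or above 3.5" for Catastrophic).

```python
from statistics import mean


def classify_severity(average_rating: float) -> str:
    """Map an average severity rating (0 to 4) to a severity class.

    Assumption: boundary values (1.5, 2.5, 3.5) fall into the higher class.
    """
    if average_rating < 1.5:
        return "Cosmetic"
    if average_rating < 2.5:
        return "Minor"
    if average_rating < 3.5:
        return "Major"
    return "Catastrophic"


# Worked example from the text: three evaluators rate one problem 3, 2 and 3.
ratings = [3, 2, 3]
average = mean(ratings)
print(round(average, 2), classify_severity(average))  # 2.67 Major
```

Under this assumption, the example problem with an average score of 2.67 would be classified as Major.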
The different websites were randomly assigned to the various participants. To avoid order effects, the method (HE and SE) to be used first was also randomly selected. Finally, the participants were asked to answer a set of closed questions, to collect their opinions about the SE approach in terms of its ease of use, contribution, how the framework (SE) may be used (or conducted) and future use, and one open question related to the SIDE framework, to collect generic feedback. In short, both qualitative and quantitative data were collected and analyzed.
B. STUDY II: ANALYTICAL EVALUATION OF MOBILE INTERFACES
1) PARTICIPANTS
A total of 28 students (18 male, 10 female) from a postgraduate program of Åbo Akademi University (ÅAU) and the University of Turku took part in the test, with an average age of 27. At the time, the students were enrolled in a course called User Centered Design of Information Systems. They had all completed some (1-5) academic courses related to UI design and evaluation. Nine participants had one to five years' experience and had contributed to several UI-related projects, six participants had contributed to three to five projects related to UI design and evaluation and had six to twelve months' experience, and the others had some experience with UI design and evaluation. Although none of the participants had any experience with the mobile application under study, they all had extensive experience accessing computers and the mobile Internet. Like the participants in the first study, none of them were familiar with the concepts of semiotics in UI/HCI.

2) PROCEDURE
This study adopted an approach similar to the one adopted in Study I. The participants were lectured on the SIDE framework and usability evaluation methods, including heuristic evaluation, cognitive walkthrough and diary method. The SIDE framework was customized to the evaluation of mobile interfaces. For example, the attributes of the 'representamen context' of the syntax layer of the SIDE framework were customized as apps' domain, apps' name, and apps' page name instead of web domain, web name, web page, respectively [11]. Moreover, the heuristics of the SIDE framework are defined as context-independent, to make them suitable for the evaluation of both web and mobile interfaces (see Appendix). In addition, for each technique, the participants attended a four-hour practice session, where they evaluated mobile interfaces using the approaches listed above. After the theoretical and practical training, the participants were asked to evaluate the interfaces of the Wellmo Mobile Application (www.wellmo.com), an app aimed at health professionals and designed to track people's health-related behavior. The Wellmo app aggregates data from leading wearable devices and apps, such as Fitbit and iHealth. Service providers can add their own services to the app and launch campaigns. The participants evaluated the Wellmo app based on the two techniques (heuristic evaluation and semiotic evaluation) in random order. They were asked to conduct the Semiotic Evaluation based on the SIDE framework. With regard to the Heuristic Evaluation, they were asked to use the set of heuristics proposed by Bertini et al. [30], [31] to evaluate mobile user interfaces. Templates for recording the findings were provided for each type of evaluation. The templates included fields similar to the templates used in Study I.
Finally, at the end of the course, the participants were asked to provide feedback about the SE technique with regard to its ease of use, contribution, how the framework (SE) may be used (or conducted) and future use. Two open questions related to the SIDE framework were also used to collect feedback: (a) Please provide overall comments on the use of the SIDE framework to evaluate the mobile interface signs; and (b) Please provide comments on the possible issues to refine the SIDE framework to make it more effective for evaluating the mobile interfaces. Qualitative and quantitative data were collected and analyzed. Since all the participants evaluated the same mobile application, a paired t-test was used to assess whether the HE findings were significantly different from the SE findings.
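To illustrate the kind of comparison a paired t-test makes, the following Python sketch computes the paired t statistic from per-participant problem counts. It is a minimal illustration using only the standard library. The function name `paired_t_statistic` and the five pairs of counts are hypothetical placeholders and are not data from this study; in practice a statistics package (for example, `scipy.stats.ttest_rel`) would typically be used to obtain the p-value as well.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for two paired samples.

    Each position in `first` and `second` must belong to the same evaluator.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    differences = [a - b for a, b in zip(first, second)]
    spread = stdev(differences)  # sample standard deviation of the differences
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(differences)
    t = mean(differences) / (spread / math.sqrt(n))
    return t, n - 1


# Hypothetical counts of problems found per evaluator (NOT the study's data).
he_counts = [5, 7, 6, 8, 4]
se_counts = [3, 4, 5, 5, 2]
t_value, dof = paired_t_statistic(he_counts, se_counts)
print(f"t = {t_value:.3f}, df = {dof}")
```

The pairing matters because each evaluator applied both techniques to the same application, so the test is run on the within-evaluator differences rather than on two independent groups.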

A. STUDY I
We examined the usability-related problems mentioned by the evaluators (participants) for both techniques. Different measures of the predicted usability problems were calculated, aggregated and presented in Tables 3 and 4, where Table 3 presents the evaluation results in detail for a specific website and Table 4 presents the summary findings for all websites. The evaluation of website W1 (Bangladesh National Portal), which was evaluated by 18 participants (P1-P18) using both techniques, is presented in Table 3. Columns B and D summarize the number of problems predicted by the various participants using HE and SE, respectively. We examined whether these problems were false positives, in some cases consulting our department/laboratory colleagues to validate false positives. The actual numbers of problems, or hits (found after deducting the false positives from the list of predicted problems), for HE and SE are listed in columns C and E, respectively.
The number of problems identified using the two methods is listed in column F, while the total number of problems predicted using both techniques by each participant is listed in column G (see Table 3). Finally, the overall number of actual problems (see final row) was measured by combining the two problem sets predicted by the eighteen participants using the two evaluation methods. In W1, the participants identified 130 problems (87 + 43) in all, 16 of which were commonly predicted; as such, a total of 114 distinct problems were identified using both methods (see Table 3). The different measures of predictive usability problems for the other five websites were calculated in the same way, and the results are summarized in Table 4. Column 2 shows the total number of usability problems (hits) found by each participant using both methods, while the total number of problems for each website is presented in column 6. The percentages of problems identified using HE and SE (in columns 7 and 8) were calculated using Equations 1 and 2, respectively, and the percentage of common problems identified using both techniques (in column 9) was calculated using Equation 3.
$$\text{Percentage of distinct problems found by HE} = \frac{X - C}{T} \times 100 \quad (1)$$
$$\text{Percentage of distinct problems found by SE} = \frac{Y - C}{T} \times 100 \quad (2)$$
$$\text{Percentage of problems found by both techniques} = \frac{C}{T} \times 100 \quad (3)$$
where X is the number of problems found by HE, Y is the number of problems found by SE, C is the number of common problems found using both methods, and T is the total number of problems found using both methods.
The results showed that most participants identified more problems when using the heuristic evaluation, with the exception of participants P28, P38 and P45. Participant P28 identified the same number of problems, while P38 and P45 identified some more problems using SE when evaluating the websites of Bangladesh Computer Council (W3), Bangladesh National Portal (W1) and Bureau of Educational Information & Statistics (W4), respectively. However, all the participants identified usability problems for every website using SE. Table 4 also shows that although each group of participants identified more problems using HE, they also found problems when using SE. In some cases, the same problems were identified with both methods. The table shows the differences between the number of problems identified when combining the findings and the number of problems found using either HE or SE for the different websites. The results indicate that a combination of HE and SE produced better results.
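The percentage measures defined by X, Y, C and T can be illustrated with the W1 figures reported above (130 = 87 + 43 predicted hits, 16 common, 114 in total). The Python sketch below rests on two assumptions that the text does not state explicitly: that "distinct" means problems found by one method only, so that the three percentages sum to 100, and that 87 is the HE count and 43 the SE count. The function name `overlap_percentages` is hypothetical.

```python
def overlap_percentages(x: int, y: int, c: int) -> dict:
    """Split the combined problem set into HE-only, SE-only and shared parts.

    x: problems found by HE, y: problems found by SE,
    c: problems found by both. Assumes T = X + Y - C.
    """
    total = x + y - c
    if total <= 0 or c > min(x, y):
        raise ValueError("inconsistent problem counts")
    return {
        "total": total,
        "he_only_pct": 100 * (x - c) / total,
        "se_only_pct": 100 * (y - c) / total,
        "both_pct": 100 * c / total,
    }


# W1 figures from the text; assigning 87 to HE and 43 to SE is an assumption.
w1 = overlap_percentages(x=87, y=43, c=16)
print(w1["total"])  # 114
print({k: round(v, 2) for k, v in w1.items() if k != "total"})
```

With these inputs the combined total matches the 114 problems reported for W1, which is a useful sanity check on the bookkeeping of common problems.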
A few examples of problems predicted using the Heuristic Evaluation and the Semiotic Evaluation, respectively, are described here. In HE, participants P35 and P39 found that the e-Survey page of the site of Bangladesh Bureau of Educational Information & Statistics (W4) (see Figure 1) does not provide the facilities required to navigate to other pages or to return to the home page/previous pages, violating the heuristics of 'user control and freedom' and 'flexibility and efficiency of use'. The severity of this problem was rated 4 and 3 by participants P35 and P39, respectively. In the Semiotic Evaluation, none of the participants identified any problem in relation to the 'e-Survey' sign, and, on the e-Survey page, they found no indication (or problematic interface sign) to highlight this navigational problem. Again, in SE, participant P1 found that the sign 'Online Registration with icon' (see Figure 2) on Bangladesh National Portal (W1) is confusing and not intuitive. Here, the sign is designed to provide online services to citizens, although the icon associated with this sign represents having to walk to obtain the service, while online services are in fact designed so people do not have to come (walk) to the government office. So the icon that is used conflicts with the text and fails to match real-world conventions. Moreover, the icon is not logically connected to the text of the sign. As such, this particular sign violates semiotic heuristics 8, 10, 11 and 12 (see Table 10 in Appendix). The severity of this problem was rated as 2.
The problem related to this sign was raised by 11 participants in the Semiotic Evaluation, while only two participants identified this problem in the Heuristic Evaluation, claiming that this sign violates the heuristic of providing a 'match between the system and the real world'. In SE, each small element is investigated in different layers. Therefore, although no problem was identified at the syntactic level for the sign 'Online Registration with icon', the participants identified problems at the pragmatic level (heuristic numbers 8 and 10) and the social level (heuristic numbers 11 and 12). As such, the results indicate when and in what capacity one method performed better than the other. Firstly, the Semiotic Evaluation can be used in an evaluation setting in which a sign is present in a UI, while the Heuristic Evaluation can identify a problem when a sign is missing; because HE focuses on the elements of control and freedom, as well as on the feedback of the system status, it can be used to identify problems even when there is no relevant sign (as observed in evaluating the navigational facility and the 'e-Survey' sign). Secondly, the Heuristic Evaluation focuses on different dimensions of the interface, including the information architecture, navigation, layout and content, and indicates problems in relation to all of these dimensions, while paying little attention to interface signs (as observed in the evaluation results of navigational status, the 'e-Survey' sign, and the 'Online Registration with icon' sign). Thirdly, compared to the Heuristic Evaluation, the Semiotic Evaluation succeeded in revealing problems involving interface signs, providing insight into why problems occur and suggesting alternative solutions (as observed in the evaluation process of the 'Online Registration with icon' sign). Figure 3 shows the severity ratings of the usability problems identified using the two methods.
For example, when evaluating the website W5 (Dept. of Immigration and Passport), evaluators identified nine catastrophic, 13 major, eight minor and 11 cosmetic usability problems using the Heuristic Evaluation method, while identifying four catastrophic, six major, 11 minor and 10 cosmetic usability problems using the SE (see Figure 3). In terms of the severity of the usability problems, the results in Figure 3 indicate that the SE method helps identify usability problems of all severity ratings. The SE method performs comparatively well when it comes to detecting minor and cosmetic problems, and it helped the participants identify about 26.6% and 30.8% of all catastrophic and major problems, respectively. The Heuristic Evaluation method performs comparatively well in terms of identifying catastrophic and major problems, which is consistent with the measure in terms of the number of usability problems. Heuristic Evaluation detects a relatively larger number of usability problems and more severe problems, whereas Semiotic Evaluation detects a large number of cosmetic and minor problems, as well as a reasonable number of catastrophic and major problems.
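The severity pattern reported for W5 can be checked with a few lines of arithmetic. The sketch below uses only the W5 counts quoted above; the function names are illustrative, and the shares it computes are within-method shares for this one website, not the cross-website percentages (26.6% and 30.8%) reported in the text.

```python
# Severity counts for website W5, taken from the text above.
SEVERITIES = ("catastrophic", "major", "minor", "cosmetic")
w5_counts = {
    "HE": {"catastrophic": 9, "major": 13, "minor": 8, "cosmetic": 11},
    "SE": {"catastrophic": 4, "major": 6, "minor": 11, "cosmetic": 10},
}


def severity_shares(counts):
    """Return each severity level's percentage share of one method's problems."""
    total = sum(counts.values())
    return {level: 100.0 * counts[level] / total for level in SEVERITIES}


def severe_share(counts):
    """Percentage of one method's problems rated catastrophic or major."""
    total = sum(counts.values())
    return 100.0 * (counts["catastrophic"] + counts["major"]) / total


for method, counts in w5_counts.items():
    shares = severity_shares(counts)
    summary = ", ".join(f"{level}: {shares[level]:.1f}%" for level in SEVERITIES)
    print(f"{method} (n = {sum(counts.values())}): {summary}")
    print(f"  catastrophic + major: {severe_share(counts):.1f}%")
```

For this website, more than half of the HE findings fall in the two most severe categories, against roughly a third of the SE findings, which matches the qualitative description above.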
As mentioned in the method section, the feedback form consisted of two parts: a set of closed items using a 7-point response scale (from 1: strongly disagree, to 7: strongly agree) and an open-ended question related to the practices of SE, based on the SIDE framework. A summary of the results regarding the closed statements is shown in Table 5, which indicates that the participants agreed with all the statements. They found that the heuristics of the SIDE framework are easy to use (m = 6.42) and indicated they would use the Semiotic Evaluation (SE) in the future (m = 6.10), although not always as a stand-alone technique (m = 4.38), and that they felt that SE should be integrated with other usability evaluation methods to evaluate web interfaces (m = 6.00).

B. STUDY II
Study II used a similar approach to the one used in Study I to identify usability problems and assess the performance of mobile UIs. A combined overview of the usability problems predicted by each participant using both techniques for the Wellmo application is shown in Table 6. The participants found a total of 24 distinct problems using the HE and 21 distinct problems using the SE, of which four problems were found with both techniques. In total, 41 problems were predicted using the two techniques. Eight participants identified no problems that the two techniques had in common, while the other 20 participants identified one to four common problems (see Table 6).
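The overlap between the two techniques can be worked through with the aggregate figures above. The sketch below assumes that the HE-only, SE-only and shared percentages are (X − C)/T, (Y − C)/T and C/T, each multiplied by 100, with T = X + Y − C; this reading is inferred from the definitions of X, Y, C and T in Study I and from the reported total of 41, and the function name is illustrative.

```python
def overlap_percentages(x, y, c):
    """Split the union of two problem sets into HE-only, SE-only and shared shares.

    x: problems found by HE, y: problems found by SE, c: problems found by both.
    Assumes T = X + Y - C (size of the union), an inference from the text.
    """
    if c > min(x, y):
        raise ValueError("common problems cannot exceed either method's count")
    t = x + y - c
    return {
        "total": t,
        "he_only_pct": 100.0 * (x - c) / t,
        "se_only_pct": 100.0 * (y - c) / t,
        "both_pct": 100.0 * c / t,
    }


# Aggregate Study II figures from the text: 24 (HE), 21 (SE), 4 in common.
study2 = overlap_percentages(x=24, y=21, c=4)
print(f"Total distinct problems: {study2['total']}")
print(f"Found only by HE: {study2['he_only_pct']:.1f}%")
print(f"Found only by SE: {study2['se_only_pct']:.1f}%")
print(f"Found by both:    {study2['both_pct']:.1f}%")
```

Under this reading, fewer than one in ten of the 41 problems is found by both techniques, which is why combining them adds so much coverage.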
The percentages of usability problems identified by the participants in the Wellmo mobile app using both techniques are calculated on the basis of the data presented in Table 6 and shown in Figure 5. Figure 5 shows that combining the evaluation results of both methods yields better results than using either of the two evaluation techniques alone. Most participants (17 out of 28) identified more problems using the HE, while nine participants identified more problems using the SE, and two participants identified the same number of problems in both cases. A paired t-test showed a significant difference between the number of problems found using HE and the number found by combining both techniques (t = −6.171, p < 0.001). Similarly, a significant difference was found between the findings of SE and the results of combining the two techniques (t = −8.176, p < 0.001). In contrast, there was no significant difference between the number of problems found using HE and the number found using SE (t = 1.249, p = 0.216).
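For readers unfamiliar with the paired t-test used here, the following sketch shows how the statistic is obtained from per-participant problem counts. The counts below are hypothetical and are not the data in Table 6; the function name `paired_t` is illustrative, and the combined count per participant is assumed to be HE + SE minus the problems the participant found with both.

```python
import math
import statistics


def paired_t(a, b):
    """Return (t statistic, degrees of freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# HYPOTHETICAL per-participant counts, for illustration only.
he = [5, 3, 4, 6]
se = [3, 4, 2, 3]
common = [1, 0, 1, 2]
combined = [h + s - c for h, s, c in zip(he, se, common)]

t_stat, df = paired_t(he, combined)
print(f"HE vs. combined: t = {t_stat:.3f}, df = {df}")
```

Because the combined count can never be lower than the HE count, the differences are zero or negative, which explains the negative t values reported above. Obtaining a p-value additionally requires the t distribution, for example via `scipy.stats.ttest_rel`.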
A few examples of problems predicted using the HE and SE are shown here. In HE, seven participants found that data input is only possible with a sign, but not via the keyboard, on an Android HTC device (see Figure 4(a)), which violates heuristic number 5 of the set of heuristics proposed by Bertini et al. [30], [31] (see also Appendix). The average severity of this problem was rated as 3, while none of the participants identified this as a problem in the Semiotic Evaluation. Some other participants found that there was no clear navigation to the homepage (see Figure 4(a)), which violates heuristic number 6 of the heuristics proposed by Bertini et al. The average severity of this problem was rated as 2. Again, this problem was not identified by any of the participants using the Semiotic Evaluation. A total of 13 participants found that the application crashes if more numbers are entered (see Figure 4(a)), which violates heuristic 8 of the set of heuristics proposed by Bertini et al. The average severity of this problem was rated as 4, and again, it was not detected by participants using the Semiotic Evaluation, which analyzes the referential meaning and intuitiveness of a sign at different levels, and which does not look at the reliability of app functionality or the possibility of generating an input error by entering a different set of trial data.
An almost equal number of participants identified problems related to the help page (see Figure 4(b)) with both techniques. In this case, the Help document is inconvenient for users because it contains too much text, its instructions are not set in bold print, and it offers no search option, which violates heuristics 4 and 5 of the set of heuristics proposed by Bertini et al. [30], [31], as well as heuristics number 2, 3 and 4 of the syntactic layer (see Table 10 in the Appendix). The average severity of this problem was rated as 2.
Once more, in SE, no fewer than 18 participants found that three icons in 'My own tracker' are not intuitive (see Figure 4(c)). For example, 'Track your success with traffic lights' does not indicate what the purpose of this sign is, which is potentially confusing. The sign violates semiotic heuristics no. 9, 12 and 16 (see Table 10 in the Appendix). The average severity of this problem was rated as 3. Another three participants found that the sign 'Past 14 Days' (see Figure 4(d)) looks like something that you should be able to click on, but that was not the case (it was actually the caption of the graph), which violates semiotic heuristic no. 1. The average severity of this problem was rated as 1. Interestingly enough, none of the participants identified the above-mentioned problems in the Heuristic Evaluation. These example findings are similar to those of Study I (the website evaluation study), in that the SE can reveal problems that may not be found using the HE. The SE primarily explores the underlying reasons for the problems associated with interface signs; problems related to the reliability of a system's functionalities or to a missing sign cannot be identified using SE. The examples also indicate that many problems can be identified in HE that may not be detected using the SE, especially if the problems are associated with the information architecture, content, navigation and layout of the UI (i.e., with elements other than the interface signs). In some cases, both techniques can detect the same problems.
The synthesized results of the severity ratings of the usability problems identified in Study II are shown in Figure 6.
The results indicate that both methods helped the participants identify problems at each severity level. The results also show that, although the Semiotic Evaluation identified fewer severe (catastrophic) problems, the method is able to identify catastrophic problems that are not identified by the Heuristic Evaluation (see Figure 6). Figure 6 also shows the low percentage of usability problems identified through both methods, which indicates that Semiotic Evaluation detects a reasonable number of distinct catastrophic and major problems, and a large number of distinct cosmetic and minor problems.
We collected feedback in the same way we did in Study I. A summary of the feedback provided after Study II is shown in Table 7. The scores show that the participants agreed with all the statements. Although the participants assigned lower scores compared to the feedback in Study I, they agreed more strongly with regard to the ease-of-use of the SIDE framework, and they were more positive about the likelihood of using the framework in future evaluations of mobile UIs. Similar to the feedback in Study I, the participants in Study II indicated they were less inclined to use SE as a stand-alone technique (m = 4.11), and agreed more strongly with regard to integrating Semiotic Evaluations with other UEMs to evaluate mobile interfaces (m = 6.11).

C. QUALITATIVE DATA ANALYSIS AND RESULTS
In both studies, the responses to the open-ended questions were analyzed and coded using thematic analysis [85]. The researchers went through the data carefully and assigned codes to the portions of data that represent a common theme. Three researchers were involved in this coding process. First, two researchers coded and categorized the data separately. After completing the coding, the researchers came together to compare the coding. The inter-coder agreement, calculated as the sum of all the agreements divided by the sum of all agreements and disagreements, was 0.88. The disagreements were resolved via discussions. We observed that the codes could be categorized into four broad categories similar to the categories used in the assessment survey: ease of use, contribution of the framework, how the framework may be used, and future use. The categories and codes are listed in Table 8, with sample quotes from the data.
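The agreement measure described above (agreements divided by agreements plus disagreements) can be sketched as follows. The coded segments are hypothetical: the study's actual coded data are not given, and the example lists were chosen only so that the ratio happens to equal 0.88. The function name and the shortened category labels are likewise illustrative.

```python
def percent_agreement(codes_a, codes_b):
    """Agreements divided by (agreements + disagreements) for two coders."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of segments")
    agreements = sum(a == b for a, b in zip(codes_a, codes_b))
    disagreements = len(codes_a) - agreements
    return agreements / (agreements + disagreements)


# HYPOTHETICAL codes for 25 response segments, for illustration only.
coder_a = (["ease of use"] * 8 + ["contribution"] * 7
           + ["how to use"] * 5 + ["future use"] * 5)
coder_b = list(coder_a)
coder_b[3] = "contribution"   # coder B disagrees on segment 3
coder_b[10] = "how to use"    # coder B disagrees on segment 10
coder_b[22] = "how to use"    # coder B disagrees on segment 22

agreement = percent_agreement(coder_a, coder_b)
print(f"Inter-coder agreement: {agreement:.2f}")
```

This simple ratio does not correct for agreement that would occur by chance, unlike chance-corrected statistics such as Cohen's kappa.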
The responses highlighted the issues involving the ease-of-use of the SIDE framework, the contribution of the framework, its future use and how the participants intended to use the framework. The participants indicated that the SIDE framework is easy to understand and apply in assessing both web and mobile interfaces. Because many of the heuristics proposed in the SIDE framework helped them identify usability problems and understand the root causes associated with any problematic interface sign, and with less effort, they suggested that the SIDE-based semiotic evaluation is an effective approach to evaluating mobile and web interfaces. Many of the participants expressed an interest in using the SIDE framework in the future. They also mentioned that, since SE based on the SIDE framework focuses mainly on the smaller elements (i.e., interface signs) of UIs, Heuristic Evaluation proved useful in focusing on all other issues involving the evaluation of web interfaces (content, navigation, etc.). As a result, integrating Semiotic Evaluation with other usability evaluation methods, like HE, is a must if the aim is to obtain maximum results, which is very much in line with the assessment feedback included in Tables 5 and 7.
The participants also highlighted some limitations of the SIDE framework. The framework has a total of 67 features (attributes) and five levels, all of which have to be taken into account during the evaluation process. In both studies, the participants indicated that, although the framework covers the broader aspects of sign evaluation, it contains many features, and it takes time to master and learn to apply them. Although this appears to contradict the claim that the method is 'easy to learn' and 'easy to apply', some of the participants raised this issue because the semiotic concept was new to them. Moreover, they had received less training and were given less time to learn and practice using the SIDE framework. Some participants also noted that the SE may not reveal all kinds of usability problems, only those associated with the interface signs. Thus, for the best results, SE needs to be applied together with another usability evaluation method, which indicates that SE depends on other methods. Furthermore, while many participants indicated that using many features in the SIDE framework helped them analyze the interface signs on different levels and reveal the root causes of problematic signs, other participants argued that using too many features (sub-heuristics) produces many false-positive results if the method is not learned and applied properly. In the case of the evaluation of mobile interfaces, the responses to the open-ended question also highlighted issues related to the further refinement of the framework, to make it more suitable for the SE of mobile user interfaces. The highlighted issues regarding further refinement are synthesized and presented in Table 9.

VI. DISCUSSION
Based on the findings of this study, we can state that the Semiotic Evaluation based on the SIDE framework is easy to learn and apply. The framework helps improve web and mobile user interfaces, and, based on evaluations, it helps improve interface performance. The Semiotic Evaluation detects usability problems at all severity levels, including problems that may not be identified using the Heuristic Evaluation method, which are mostly related to interface signs. In some cases, while a problem is predicted by both methods, the SE provides a better explanation of why the problem occurs and offers an alternative design recommendation to solve the problem. Integrating the SE with other tools, like the Heuristic Evaluation, yielded a significantly higher number of usability problems compared to when the individual methods (i.e., either HE or SE) are used on their own. Semiotic Evaluation based on the SIDE framework appears to be insufficient on its own when it comes to evaluating the overall usability of web and mobile interfaces. On the other hand, the large number of features of the SIDE framework, while adding to its scope and usefulness, may lead practitioners to explore too many features, which also means it takes longer to apply the framework. The SIDE framework has to be refined to make it more suitable for evaluating mobile interfaces.
The contribution of this study to the area of human computer interaction is threefold. Firstly, because there are a number of semiotic and non-semiotic usability evaluation methods that are available, developing a comparative and integrative view on evaluation performance has been an important issue for both researchers and practitioners. This may also show the significance and effectiveness of semiotic over non-semiotic evaluation for web and mobile interfaces. This paper presents two studies highlighting the effectiveness of the Semiotic Evaluation compared to the more traditional heuristic evaluation method. As discussed earlier, semiotic studies focused on human computer interaction have covered several areas, such as proposing semiotic methods to analyze user interfaces, developing frameworks or models grounded on semiotic concepts for designing user interfaces, and providing semiotic guidelines or heuristics for evaluating interface signs. Indeed, Scolari [86], Speroni [9], Bolchini et al. [21], Speroni et al. [44], and Gray & Salzman [79] suggest that semiotic evaluation can be used as a complementary or integrated toolkit, together with the existing heuristic evaluation methods. However, to the best of our knowledge, very few studies so far have compared the evaluation performance of semiotic and non-semiotic evaluation methods [77] or integrated semiotic concepts and non-semiotic approaches [79], [80] to evaluate web interfaces. Furthermore, no study has compared the evaluation performance of semiotic and non-semiotic methods for mobile user interfaces. Therefore, our paper contributes to this body of literature by comparing the performance of the semiotics-based SIDE framework and the traditional Heuristic Evaluation approach. Our results clearly show the effectiveness of using both approaches together in order to find more usability problems.
Secondly, the study shows that the SIDE framework is useful for evaluating mobile user interfaces. The SIDE framework was originally developed to evaluate web user interfaces exclusively [58]. The literature shows that only a few similar studies have evaluated mobile user interfaces using semiotics, and that none of them have proposed a set of semiotic guidelines or a specific framework/model to evaluate mobile user interfaces. Each study has been conducted based on an alternative semiotic theory and showed that semiotic concepts could be helpful in analyzing mobile user interfaces. In this study, we compared the outcomes of an existing HE method that was originally developed to evaluate mobile user interfaces with those of the semiotics-based SIDE framework. As such, our results confirm that the SIDE framework can be applied to mobile UIs, although some further extensions are needed. An integration of Semiotic Evaluation and Heuristic Evaluation in mobile UI evaluation is recommended to secure the maximum benefit of UI evaluation.
Finally, this study suggests that determinants related to the syntactic, pragmatic, social, and environment levels of the SIDE framework require further investigation to make the SIDE framework more effective in evaluating mobile interfaces. As mentioned in the related work section, although some studies have analyzed mobile interfaces using semiotics [61], [62], [68], no specific framework, model or set of heuristics has so far been developed in the literature. This means that the findings of this study can serve as a starting point for developing a new (or upgrading the existing) framework aimed at mobile interfaces.
This study has a number of practical implications as well. Firstly, the results raise awareness of the concept of semiotics in the design of interface signs and the evaluation of both web and mobile interfaces. Secondly, the SIDE framework provides a set of heuristics that evaluators can apply in practice and that are easy to use. And, finally, the results indicate that integrating Semiotic Evaluation with other methods provides additional value with regard to usability evaluation. Both techniques used in this study (Semiotic Evaluation and Heuristic Evaluation) can be combined in the following way. Firstly, it is important to understand the application under study (i.e., the purpose and functionalities of the application, the application domain and what it is that the owners of the website/application need to communicate). Secondly, the user profiles have to be modeled with respect to their familiarity with ontologies. Thirdly, the application under study has to be evaluated or examined using the traditional approach of a heuristic evaluation (i.e., identify the problem, rate its severity and provide a possible design solution based on a specific set of heuristics/guidelines). To combine the two techniques, practitioners can use the heuristics of the SIDE framework to evaluate the selected interface signs of the website/application, while other elements or usability issues of the studied website/application can be evaluated based on a specific set of standard heuristics, for instance the heuristics provided by Nielsen [29] or Bertini et al. [30], [31].

VII. CONCLUSION
This research contributes to the existing literature that has compared the performance of different usability evaluation methods [27], [28], by focusing on the outcomes of a semiotic-based approach in comparison to a heuristic evaluation approach that does not explicitly pay attention to the use of interface signs from a semiotic perspective. We have shown empirically that the SIDE framework can identify some unique usability problems that cannot be detected using a heuristic approach. Therefore, the SIDE framework can complement the heuristic approach for developing intuitive user interfaces. In addition, the results showed that a reasonable number of major and catastrophic problems were detected using both techniques in different websites and mobile applications, and that these problems need to be addressed to improve usability and acceptability. The results also indicated that the selected e-government websites, as well as the Wellmo application, have significant usability problems and that the intuitiveness of their interface signs needs to be improved to increase the overall accessibility, user experience and acceptability.
This study has some limitations that can serve as avenues for future research. Firstly, the analysis involved qualitative data related to the description of usability problems and the response of participants to the open-ended questions. Qualitative analysis is often subjective in nature, as it depends on a person's knowledge, inferences and assumptions. Secondly, we did not produce a standard usability problem set for each application based on user testing, against which Hits and False Positives could be measured, which leads to validation issues of the findings for each evaluation method. In the research process, we tried to address this limitation by conducting a collaborative investigation of each predicted problem, to determine the Hits and False Positives. In some cases, we discussed the results with experts and users to verify our assessments. Although we are confident of our results, future studies could focus on validating the results by using a standard set of problems (based on user testing). Finally, the numbers of websites and mobile apps evaluated in the two studies were somewhat unbalanced, with six websites and one mobile app being inspected. Furthermore, one website was inspected by eighteen participants and five other websites by eight participants, due to the different numbers of students enrolled in the courses being offered at different times and places. However, the focus of this study was not on how the evaluation performance may vary among the selected websites, the aim being to observe how the SIDE framework performs compared to the heuristic evaluation, which means that, although we recognize this imbalance, it is still acceptable for this research. It may be worthwhile to include a more balanced number of participants and applications in future studies.
Finally, the study also includes avenues for future research, including, but not limited to, research designed to support the revision of the SIDE framework for mobile interfaces. In this study, the semiotic evaluation was compared to heuristic evaluations based on the heuristics proposed by Nielsen [29] and Bertini et al. [30], [31]. Further studies can examine the differences between the semiotic evaluation and other available heuristic evaluation methods. In addition, assessing the performance of usability evaluation by integrating the Semiotic Evaluation with other usability evaluation methods, like laboratory-based usability testing, opens up new research opportunities.
To conclude, the area of semiotics deserves to become a broadly accepted approach in HCI research, especially when designing and evaluating user interfaces. Our findings clearly show that Semiotic Evaluation based on the SIDE framework helps improve the performance of the usability evaluation of web and mobile interfaces. However, based on our findings, we realize that the SIDE framework needs to be refined for mobile user interfaces. The intention would be to conduct an extensive empirical study to identify more precise determinants and attributes with regard to mobile interface signs.

APPENDIX
A. SEMIOTIC HEURISTIC [11]
See Table 10.

B. HEURISTICS FOR EVALUATING MOBILE UIs [30], [31]
1) Visibility of system status and findability of the device
2) Match between system and the real world
3) Consistency and mapping
4) Good ergonomics and minimalist design
5) Ease of input, readability and glanceability
6) Flexibility, efficiency of use and personalization
7) Aesthetic, privacy and social conventions
8) Realistic error management