Selected Code-quality Characteristics and Metrics for Internet of Things Systems

Software code is present at multiple levels within current Internet of Things (IoT) systems. The quality of this code impacts system reliability, safety, maintainability, and other quality aspects. In this paper, we provide a comprehensive overview of code quality-related metrics, specifically revised for the context of IoT systems. These metrics are divided into main code quality categories: size, redundancy, complexity, coupling, unit test coverage and effectiveness, cohesion, code readability, security, and code heterogeneity. The metrics are then linked to selected general quality characteristics from the ISO/IEC 25010:2011 standard by their possible impact on the quality and reliability of an IoT system, the principal layer of the system, the code levels, and the main phases of the project to which they are relevant. This analysis is followed by a discussion of code smells and their relation to the presented metrics. The overview presented in the paper is the result of a thorough analysis and discussion by the authors' team, with the involvement of external subject-matter experts, in which a defined decision algorithm was followed. The primary result of the paper is an overview of the metrics accompanied by applicability notes related to the quality characteristics, the system layer, the level of the code, and the phase of the IoT project.


I. INTRODUCTION
Throughout the last decade, Internet of Things (IoT) systems have grown in popularity and user base. IoT systems are also increasingly being used to effectively support various enterprise domains [1]. One of the specifics of current IoT systems is the potentially high heterogeneity of the technologies used, which could negatively impact the interoperability, seamless integration, and general reliability and security of an IoT system [2]-[4]. So far, the problem has been discussed from multiple perspectives, including an overall review [3], [5], integration [6], and security [7], [8]. However, this problem has not been adequately examined from the code-level point of view.
The programming code is an integral part of the IoT system and is usually present at various levels. Similarly to software systems [9], [10], poor code quality and heterogeneity in the technologies and coding practices used have the potential to impact the overall reliability and quality of an IoT system. The problem of code quality has been addressed in various software studies, and an established set of metrics has been proposed [11]-[13] and consolidated [9], [10], [14], [15]. However, such a consolidated view has not been provided for IoT systems, despite code quality's relevance in this field.
IoT systems differ from software systems in a number of specific ways [1], [3]. As examples, we can discuss (1) a generally larger heterogeneity of used technologies, protocols, devices, and programming languages, which creates a significantly higher number of possible system configurations to be tested, (2) as a consequence, problematic interoperability and seamless integration of individual system parts, (3) a currently lower degree of standardization of communication protocols, as there are ongoing discussions and intensive work in this field, and (4) an increased risk of various privacy and security issues in current IoT systems, arising from many factors [16], [17]. Despite a certain overlap, established software metrics might be insufficient for current IoT systems and should be revised with respect to all of these differences.
A systematic analysis and review of the problem and available metrics must be performed to reflect the unique technicalities of IoT systems, including the proposal of new IoT code-specific metrics. In this paper, we summarize the relevant available code-quality metrics for IoT systems and analyze their possible impact on the general system quality characteristics from the ISO/IEC 25010:2011 standard. We also discuss the role of code smells in the reliability and quality of the system and their relation to the presented metrics.
The motivation for this mapping is the following. First, it is a useful tool for the definition of a test strategy for an IoT system. Test strategy guidelines typically operate with quality aspects or quality characteristics (e.g. Business Driven Test Management [18], part of TMAP Next [19]). For the definition of static testing levels (code reviews, inspections, etc.), the test engineer is naturally interested in knowing which code quality metrics may relate to particular system quality characteristics. Second, the aspect of test reporting and the impact analysis of poor quality also play a role. When some deficiency in source code quality is detected, the natural question is what the particular impact of this deficiency would be. The quality characteristics of the system serve well in this point, since they are general categories in which the matter can be reported and explained at a managerial level, for example.
The code quality metrics presented and discussed in this paper have an inevitable overlap with the general code quality metrics that can be applied in the case of general software, firmware, control software in an IoT device, an infrastructural part, or an IoT backend system. This situation is logical, as good coding practices are generally valid. Only specific details would differ for specific types of IoT systems. However, some of the metrics are more specific to distributed IoT systems, reflecting their context. The paper is organized as follows. Section II discusses related work in the fields of code quality metrics and quality metrics for IoT systems. Section III explains the methodology of this study and provides a consolidated overview of code quality metrics applicable to IoT systems and their components. This section explores the potential impact of the discussed metrics on more general quality characteristics of the IoT system, levels of the IoT system, code levels, and principal project phases. Section IV discusses the issues related to the conducted consolidation of code quality metrics. Section V analyzes the possible threats to validity and limitations of this study. The last section concludes the paper.

II. RELATED WORK
In the software-related literature, the first suggestions of code-quality metrics appeared in the 1970s. As a classic example, we can give metrics for sequential code, such as cyclomatic complexity by McCabe [11], or later object-oriented code metrics such as coupling between objects by Chidamber and Kemerer [12], or Lack of Cohesion of Methods by Henderson-Sellers et al. [13]. Later, published papers aggregated the metrics and demonstrated them in particular case studies. For example, Rosenberg and Hyatt [14] discussed selected aspects of system quality: efficiency, complexity, understandability, reusability, testability, and maintainability. In more recent studies, Heitlager et al. [20] discuss metrics that impact the maintainability of software systems. The list provided by Heitlager et al. was further extended by Baggen et al. [15] in a study that also focuses mainly on system maintainability aspects. Established metrics such as code volume, code redundancy, unit size, code complexity, unit interface size, and component coupling are discussed [15]. This list partially overlaps with a selection by Pantiuchina et al., who add cohesion and code readability to the discussion [9]. Code cohesion and coupling indicators are also the subject of an empirical study by Chaparro et al., which examines the impact of refactoring operations on code quality [10]. These two measures were also adopted by common Software Engineering textbooks, such as the one by Larman [21].
Nunez et al. [22] have recently presented a systematic overview of software quality metrics. However, the study focuses primarily on software, particular metric definitions are not provided, and a revision would be needed for the IoT context. Also, the code quality of web page user interfaces has been subject to investigation, and several quality metrics have been proposed, such as testability [23], maintainability [24], or feasibility of test automation [25].
Although all previously discussed code-related metrics are primarily aimed at software, after revision they can be applied to code in IoT systems while reflecting the specifics of these systems. This applies to procedural code in the firmware of various devices and to object-oriented code usually used in gateways, server-side applications, or web- or mobile-based front-end user applications.
Kaur et al. [26] provided a broad and comprehensive account of various metrics for Service-Oriented Architecture (SOA), particularly for its code, which is also an area closely related to IoT systems. Their work provides details of different types of cohesion (e.g., coincidental, logical, temporal, procedural, communication, sequential, or functional) and their ratio-based perspectives (e.g., Ratio of Cohesive Interactions, Neutral Ratio of Cohesive Interactions, Pessimistic Ratio of Cohesive Interactions, Optimistic Ratio of Cohesive Interactions). Similarly, various coupling levels are given, ranging from Content Coupling (high), Common Coupling, External Coupling, Control Coupling, Stamp Coupling, and Data Coupling to Message Coupling (low), along with details of the available metrics Coupling Between Objects and Response for a Class. Besides, they also consider size-based metrics (size of modules) based on counts such as source lines of code, volume, size, effort, length, etc. In addition, other metrics are mentioned, such as control-flow-based and data-flow-based metrics. We include this work in our overview as relevant for IoT systems as well. Another related work on SOA by Kumar et al. [27] targeted source code metrics. They highlight that maintainability prediction of SOA solutions is relatively unexplored and emphasize maintainability prediction using several source code metrics. They proposed a maintainability prediction model using the Multivariate Adaptive Regression Splines (MARS) method. This method is based on a divide-and-conquer approach; in particular, it separates the data into various regions. Compared with other models, in particular Multivariate Linear Regression (MLR) and Support Vector Machine (SVM), this method provided better results. They underline that software maintainability comprises (1) analysability, (2) changeability, (3) stability, and (4) testability.
Kreinovich et al. [29] recently proposed a new set of metrics based on the complexity and vulnerability of the code; however, a more thorough empirical or experimental evaluation of the proposal is not present in their paper.
The impact of a particular programming language on code quality was recently examined by Berger et al.; however, the metrics for this evaluation are based on the density of defects and the number of commits to a source code repository [30]. Another proposal in this field is the idea of fuzzy software quality metrics by Masmali and Badreddin, focusing mainly on code complexity [31].
Wyrich et al. examined code comprehensibility metrics and their relationship with code readability, concluding that the examined metric significantly influences developers' perception of code readability but, surprisingly, not their performance [32].
In the area of IoT-specific code-related metrics, significantly less work can be found. Most studies focus on IoT quality-related topics, which differ from actual code-level metrics. As an example, we can give the quality models by Kim [33] and Kim et al. [34], which serve to evaluate IoT applications and services. The metrics presented there focus on different aspects in the fields of functionality, reliability, efficiency, and portability of an IoT system, but do not explicitly focus on the code level [33].
Voas et al. commented on general IoT metrology [35]. In their work (affiliated with NIST, https://www.nist.gov/), they provide multiple perspectives on possible metrics. The driving question of their work is how current knowledge can be leveraged for the Internet of Things (IoT). They mention general options to count items such as nodes, clients, gateways, and servers, which can give us a sense of size and complexity. However, in their article, particular suggestions of metrics expressed by formulas are not provided.
Devices connected to IoT systems have also been discussed from the point of view of quality and the respective metrics provided by Arauz and Fynn-Cudjo [36]. However, this work focuses on the network level of an IoT system. Regarding the network level, the quality of service (QoS) for IoT systems has been examined by several studies [37], also discussing relevant metrics for this field. These works discuss general IoT systems [38], or, specifically, wireless sensor networks [39].
From the related areas, cloud service quality, a topic also relevant to IoT systems, is quantified in a study by Zheng et al. [40], where aspects such as availability, reliability, usability, responsiveness, security, and elasticity are discussed.
In 2022 Fizza et al. published a paper on evaluating data quality in IoT smart agriculture applications [41]. They proposed and defined three sensor data quality metrics: suitability, accuracy, and completeness [41].
There already exist some practical examples of using quality metrics to evaluate the quality of IoT systems. Chanhee et al. propose a Device-to-Device Framework for the Internet of Things, for which metrics such as Lines of Code, code complexity, duplication lines, maintainability, degree of branch usage, duplicated source code, and modification effort are measured using the SonarQube platform [42].
Another way of using quality metrics was presented by Mishra and Kertesz. The authors discuss taxonomy categories to compare and analyze features of various MQTT protocol implementations using source code metrics. The measurement is performed using a source code analyzer called CLOC, and the individual metrics are the number of files, blank lines, comment lines, and lines of code [43].
Minovski et al. introduce the term Quality of IoT experience (QoIoT), comprising Quality of Experience (QoE) and Quality of Machine Experience (QoME). To measure the QoME, the authors propose some code quality metrics, such as reliability or efficiency [44].
Matheus et al. utilized "long established code quality metrics", without stating which they used, to evaluate a proposed semantic network management prototype, called SNoMAC. This system should help developers or system providers to easily interconnect the variety of heterogeneous devices that make up the IoT [45].
To measure the quality and reliability of information processed by IoT sensors, Kuemper et al. propose several quality metrics in their Valid.IoT framework. The individual definitions are based on several Quality of Information and Quality of Service metrics, such as completeness, timeliness, plausibility, artificiality, or concordance [46]. Several metrics to evaluate the performance of an IoT system were specified by Alexopoulos et al. They divide them into the following categories: (1) Processing speed/capacity, (2) Power Consumption, (3) Memory Requirements, and (4) Functionality Accuracy [47].
Although various aspects of the quality of IoT systems have already been discussed by recent studies, and particular metrics have been suggested for these aspects, no paper has specifically focused on a consolidated overview of code-quality metrics for IoT systems.

III. METRICS OVERVIEW
We start the overview of code-related quality metrics by summarizing our methodology. Then, we provide a consolidated overview of the metrics, followed by mapping the metrics to general quality characteristics of an IoT system by their possible impact.

A. METHODOLOGY
In the search for relevant literature on IoT-related metrics and code quality metrics, we followed the simplified methodology suggested by Kitchenham et al. [48]. We used six primary databases: IEEE Xplore, ACM Digital Library, Springer Link, Elsevier ScienceDirect, Web of Science, and Scopus. We performed a general search query, which we further adapted to the specifics of individual databases. Apostrophes denote an exact string to be searched for. No time or media type (e.g., journal or conference paper) restrictions were applied during the search.
In the literature search, we applied the workflow depicted in Figure 1. First, we downloaded all papers returned from the databases after the initial query and preliminarily filtered them for relevance by title and abstract. This resulted in an initial set of 371 papers. Subsequently, two members of our lab independently analyzed the acquired papers and selected those that contained relevant information on the scope of this study. This process resulted in a set of 79 papers to be further analyzed. We also added other relevant papers using a snowball sampling procedure during the reading phase. During the snowball sampling, 14 more papers were added. The papers were further analyzed in detail, and particular metrics were identified, analyzed for duplicity, and extracted. The identified metrics were then stored in a collaborative spreadsheet. Then, the final metrics selection and mapping of the impact on quality characteristics were made through the authors' thorough discussion and consensus.
In the presented consolidated overview of code-quality metrics applicable to IoT systems, we use a two-level hierarchy: 1) We define code quality categories, which capture particular aspects of code quality (e.g., complexity, coupling, or cohesion) and can be expressed by several metrics. 2) For each of the code quality categories, we provide a set of possible code quality metrics to quantify the discussed code properties. In our overview, we use the following code quality categories:
• Size determines the extent of the source code of the examined system in various measures (e.g., total size, size of a unit, size of its interfaces),
• Redundancy captures the extent to which the system code is duplicated,
• Complexity measures the degree of code complexity from various viewpoints,
• Coupling measures the degree to which software components depend on each other and how closely individual components are connected,
• Unit Test Coverage and Effectiveness captures the extent to which the source code is covered by unit tests, also assessing the effectiveness of these tests,
• Cohesion measures the extent to which elements of a software module belong together; a different, but in principle inversely correlated, concept to coupling,
• Code Readability captures the ease of understanding the code and making changes to it,
• Security expresses how many vulnerabilities and code security anti-patterns are present in the source code, or how large the attack surface is,
• Code Heterogeneity captures how many languages and coding styles are employed in the complete source code used in an IoT solution.
In Section III-B, each code quality category is given a separate subsection, in which relevant metrics for this category are suggested. The suggestions are accompanied by references to studies that have previously defined and discussed these metrics. Apart from those found in the recent literature, we have also suggested metrics that were not yet mentioned in recent works.
In Section III-C, we discuss the impact of the presented code quality metrics on the selected general characteristics of the system [49], as well as their relevance to the general layers of an IoT system, the levels of the source code, and the main development phases of an IoT project. Then, in Section III-D, we discuss the relation of the presented metrics to code smells, as these two fields overlap.

B. CODE QUALITY METRICS
We discuss the individual code quality metrics in subsections dedicated to each defined code quality category. If no reference is given, the metric is our suggestion.

1) Size
Larger systems might be harder to extend and maintain, and more prone to defects [50]. To express the size of the system under test (SUT) code, we have several options: The Lines of Code (LOC) metric counts all lines of source code that are not comments or blank lines [20]. Several variants of the metric can be used: Physical Lines of Code (PLOC), which captures the total number of lines in the source code, or Logical Lines of Code (LLOC), which counts the actual number of instructions (one physical line of code may translate to several instructions). LLOC can be considered a more accurate variant. Various open tools, in which the particular definition of LOC could differ, can be used to calculate LOC [51]. However, controversies are discussed regarding the application of the LOC metric, as this metric by itself can have limits in expressing the quality of the system [51], [52]. The discussions concluded that the LOC metric could be useful when combined with other source code metrics.
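The distinction between physical and effective line counts above can be sketched as follows; this is a minimal illustration assuming Python-style `#` line comments, not a full lexer (it does not handle block comments or comment markers inside strings, and LLOC would require actual parsing):

```python
def count_loc(source: str, comment_prefix: str = "#") -> dict:
    """Count physical lines (PLOC) and non-blank, non-comment lines (LOC)."""
    lines = source.splitlines()
    ploc = len(lines)  # every physical line, including blanks and comments
    loc = sum(
        1
        for line in lines
        if line.strip() and not line.strip().startswith(comment_prefix)
    )
    return {"PLOC": ploc, "LOC": loc}

sample = "# demo module\n\nx = 1\ny = 2\n"
print(count_loc(sample))  # {'PLOC': 4, 'LOC': 2}
```

For LLOC, a parser-based tool (e.g., one built on the language's own syntax tree) is needed, since several logical instructions may share one physical line.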
Estimated Rebuild Value (ERV) captures the effort in man-years needed to build the system and could be more accurate than the sole LOC metric, due to the known fact that the LOC needed to perform a particular task differs between programming languages [15], [20]. This is especially the case for low-level languages like C versus object-oriented programming languages like Java. Therefore, the estimated rebuild value is calculated from LOC, taking into account the productivity of a particular programming language. For this, data from Software Productivity Research are used [15].
Average Unit Size (US) is another option to express the code size of a SUT. The structure of the system units has an impact on its overall maintainability [53]. Ideally, individual units should be small, focused, and easy to test and understand. The average size of a unit can be measured as follows:
US = (1000 · KLOC) / n,
where KLOC denotes the number of code lines in thousands and n is the number of subsystem units [15], [20].
Unit Interface Size (UIS), i.e., the number of parameters declared in the unit's interface, evaluates the unit interfacing property. A unit with an interface with many parameters may be a signal of bad encapsulation [15].
Number of Distributed Architectural Components (DAC) (or blocks) can be quantified as the number of nodes, clients, gateways, or servers in the overall system. Although there is a lack of work on metrology in IoT, NIST recognizes the Network of Things (NoT) [35] with five primitive building blocks to which things can be mapped: sensors, aggregators, communication channels, e-Utilities, and decision triggers. Counting these primitives can be a good fit. The authors of [35] thus propose to extend the general countable options with things, and highlight that interconnection perspectives matter as well, since counting things is crude: some things are large and others small.
Number of NoT Architectural Components (NAC) (or blocks) [35] extends the DAC metric, utilizing the number of the following Network of Things (NoT) components:
• sensors - measuring physical properties,
• aggregators - components that transform data from sensors,
• communication channels - for transmission,
• e-Utilities - agents that execute processes,
• decision triggers - producers of final results.
This metric can easily be applied to general IoT systems as well to provide a crude static view of system size and complexity.

2) Redundancy
Source code duplication naturally leads to inefficient maintenance and is the source of possible defects during system updates [54].
The Code Duplication (CD) metric is one of the options that we can use to quantify code redundancy in a SUT:
CD = N_dup / KLOC,
where N_dup denotes the number of duplicated code blocks longer than six lines and KLOC denotes the number of code lines in thousands [15], [20].
Percentage of Redundant Code (RC) might be another option to express code duplication. Code is considered duplicated when a block of more than six lines appears unmodified more than once in the source code [15], [20]. The number of code lines from which a block is considered duplicated is a result of empirical observations [15], [20] and can be modified by project context or programming language. VOLUME 4, 2016
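A naive detector for the duplicated blocks underlying CD can be sketched with a sliding window over the source lines; this is an illustrative simplification (exact text match after whitespace stripping, fixed six-line window), not the algorithm of any particular tool:

```python
from collections import defaultdict

def duplicated_block_count(lines, window=6):
    """Count positions of `window`-line groups that appear verbatim
    (ignoring leading/trailing whitespace) more than once."""
    seen = defaultdict(list)
    for i in range(len(lines) - window + 1):
        key = tuple(line.strip() for line in lines[i:i + window])
        seen[key].append(i)
    # positions belonging to any block that occurs at least twice
    return sum(len(pos) for pos in seen.values() if len(pos) > 1)

def code_duplication(lines, window=6):
    """CD = N_dup / KLOC, following the definition above."""
    kloc = len(lines) / 1000
    return duplicated_block_count(lines, window) / kloc
```

Real clone detectors additionally normalize identifiers and literals to catch near-duplicates, which this sketch deliberately omits.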

3) Complexity
Not only size and duplication play a role in system maintainability and reliability; complexity is another potential factor. To measure complexity, we can employ several options.
Cyclomatic Complexity (CYC) is the number of linearly independent paths through a source code [11]. The higher this number, the more complex the code. For example, the cyclomatic complexity of a method m is defined as
CYC(m) = E - N + 2P,
where, when constructing a control flow graph for m, E is the number of its edges (transfers of control), N is the number of its nodes (sequential groups of statements containing only one transfer of control), and P is the number of its connected components.
Cyclomatic Complexity per Unit (CCU) is defined as the sum of CYC(m) for all methods of a unit:
CCU(u) = Σ_{m ∈ u} CYC(m).
Practically, CCU(u) expresses the number of linearly independent paths in all methods of unit u [9], [14], [15], [20]. For object-oriented languages, the same metric as CCU(u) is also known as Weighted Methods per Class [9], [12], where a class is considered a code unit.
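For a single connected function (P = 1), CYC(m) = E - N + 2 equals one plus the number of decision points, which can be approximated for Python code with the standard `ast` module; this is a simplified sketch that counts `if`/`elif`, loops, exception handlers, conditional expressions, and short-circuit `and`/`or` operands, not a full control-flow-graph construction:

```python
import ast

def cyclomatic_complexity(func_source: str) -> int:
    """Approximate CYC as 1 + number of decision points in the source."""
    tree = ast.parse(func_source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, (ast.If, ast.For, ast.While,
                             ast.ExceptHandler, ast.IfExp)):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # each extra operand of `and`/`or` adds a short-circuit path
            decisions += len(node.values) - 1
    return 1 + decisions

src = """
def classify(x):
    if x > 0 and x % 2 == 0:
        return "positive even"
    elif x > 0:
        return "positive odd"
    return "non-positive"
"""
print(cyclomatic_complexity(src))  # 4: two ifs + one `and` + 1
```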
Depth of Inheritance Tree (DIT), relevant for objectoriented languages, expresses the number of ancestor classes of an analyzed class. In terms of the inheritance tree, it is the distance from the discussed class to the root of the tree [14]. The maximum depth of the inheritance tree for all classes in the unit can express an inheritance complexity for a unit of code. An averaged depth of the inheritance tree can be employed as a complementary metric to get better accuracy.
Number of Children (NOC) is another inheritance-related metric, defined as the number of direct descendants of a class in the inheritance tree [14]. Interpretation of this metric requires caution: a higher value might indicate misuse of sub-classing, but at the same time it signals code reuse, since inheritance is a form of reuse. Moreover, a more complex class hierarchy might require more tests.
Number of Thing Interconnections (NTI) brings insight into reliability in case a thing fails, and into possible composability [35]. It also has implications for performance. The interconnections are implemented via integration interfaces related to the code level.

4) Coupling
Coupling is the degree to which software components depend on each other. It measures how closely components are connected. Low coupling is often a sign of a well-structured system and good design. When combined with high cohesion, it supports the general goals of high readability and maintainability [55]. Several options are available to measure coupling: The Coupling Between Objects (CBO) metric can be used to measure coupling based on class usage. Two classes are considered coupled when methods of one class call methods or access instances of the other class [10], [12]. For a class c, the coupling is defined as
CBO(c) = |C_coup|,
where C_coup is the set of classes that are coupled to c. A low value of CBO(c) indicates low coupling [10], [12].
Response for a Class (RFC) is defined as the number of distinct methods and constructors invoked by a discussed class [9], [14]. This metric can act as a coupling degree indicator as well as might indicate code complexity.
Number of Incoming Calls per Module (ICM) is another option to express the coupling property. A module is considered a delimited group of units (e.g., classes) [15]. A higher value indicates a higher degree of coupling in the analyzed module.
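Given a call graph between classes, CBO as defined above can be computed by collecting, for each class, the other classes it uses or is used by; the call graph below is a hypothetical example for illustration:

```python
def cbo(class_name: str, calls: dict) -> int:
    """CBO(c) = |C_coup|: the set of other classes that c calls
    or that call c (coupling is counted in both directions)."""
    coupled = set(calls.get(class_name, ()))           # outgoing
    coupled |= {other for other, targets in calls.items()
                if class_name in targets}              # incoming
    coupled.discard(class_name)                        # ignore self-calls
    return len(coupled)

# Hypothetical call graph: class -> classes whose methods it invokes
calls = {
    "SensorReader": {"Decoder", "Logger"},
    "Decoder": {"Logger"},
    "Logger": set(),
}
print(cbo("Logger", calls))  # 2: used by SensorReader and Decoder
```

The incoming-edge part of the same graph (how many classes call into a module) also gives a direct reading of the ICM metric.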

5) Unit Test Coverage and Effectiveness
Unit test coverage of source code is not exactly a quality of the analyzed source code itself. However, it is an aspect highly relevant for the final reliability of a system. Heitlager et al. suggested including the unit testing level in code quality measures [20], and we follow their suggestion. Unlike Heitlager et al., we name the discussed property unit test coverage and effectiveness because, in our opinion, such a name better complies with standard terminology.
Line Coverage (LC) is an elementary metric that captures the execution of the source code by unit tests. A line of code is considered covered by a unit test when there exists a unit test that executes the code on that particular line. Line coverage is then defined as
LC(u) = LOC_ex(u) / LOC(u),
where u stands for a unit of code (e.g., class, package, or routine), LOC_ex(u) is the number of lines of code executed by unit tests, and LOC(u) is the number of code lines of u. However, caution must be taken when interpreting line coverage figures, as no information is given about the unit tests' real power to detect relevant defects (i.e., the number of assertions or the amount of exercised test data combinations are not measured) [56]. Hence, for higher accuracy, this metric should be accompanied by the Density of Assert Methods metric defined later.
Branch Coverage (BC) works on the same principle as line coverage; only the execution branches of the code are measured. A branch is considered to be covered by a unit test when there exists a unit test that executes that particular branch at least once [57]. A definition can be made in analogy to LC(u).
Method Coverage (MC) for object-oriented languages is a more high-level metric expressing the ratio of methods, which are called by unit tests to the total number of methods. A definition can be made in analogy to LC(u).
Density of Assert Methods (DA) is a metric that increases the accuracy of the insight into the probability that unit tests detect defects [20]. We suggest using this metric to complement the line, branch, and method coverage metrics. The density of assert methods can be expressed as
DA(u) = A(u) / LOC(u),
where u stands for a unit of code (e.g., class, package, or routine), A(u) is the number of assertion methods applied to the unit u, and LOC(u) is the number of code lines of u.
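The two ratios above are straightforward to compute once the raw counts are available from a coverage tool; the numbers below are a hypothetical unit used purely for illustration:

```python
def line_coverage(loc_executed: int, loc_total: int) -> float:
    """LC(u) = LOC_ex(u) / LOC(u)."""
    return loc_executed / loc_total

def assert_density(num_asserts: int, loc_total: int) -> float:
    """DA(u) = A(u) / LOC(u): assert calls per line of unit code."""
    return num_asserts / loc_total

# Hypothetical unit: 200 LOC, 170 executed by tests, 25 assert calls
print(line_coverage(170, 200))   # 0.85
print(assert_density(25, 200))   # 0.125
```

Reading the two together illustrates the caveat in the text: a unit can reach high LC(u) while DA(u) stays near zero, i.e., the tests execute the code but verify little.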

6) Cohesion
Cohesion is the assessment of the degree to which the responsibilities implemented in a class belong together. High cohesion is desirable, as it promotes the principle of single responsibility, promoting code maintainability [55]. Cohesion and coupling are in inverse correlation [58]. Several options can be employed to measure cohesion: The Lack of Cohesion of Methods (LCOM) metric is based on shared variables among the methods of a class C and is defined as
LCOM(C) = ((1/a) Σ_{j=1}^{a} μ(A_j) - m) / (1 - m),
where a stands for the number of variables in class C, μ(A_j) is the number of methods of C accessing the variable A_j, and m stands for the number of methods in C [9], [13].
The value of LCOM (C) ranges from 0 to 1, where 0 indicates the highest cohesion of C and 1 indicates its lowest cohesion.
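The LCOM formula above can be computed from an attribute-access map; the map below (attribute → set of methods that access it) is a hypothetical example, and the two inputs demonstrate the extremes of the 0-to-1 range:

```python
def lcom(accesses):
    """Henderson-Sellers LCOM from a map attribute -> methods using it.
    accesses[A_j] is the set of methods accessing variable A_j."""
    a = len(accesses)                              # number of attributes
    methods = set().union(*accesses.values())
    m = len(methods)                               # number of methods
    mean_mu = sum(len(ms) for ms in accesses.values()) / a
    return (mean_mu - m) / (1 - m)

# Perfectly cohesive: every method touches every attribute -> 0.0
cohesive = {"x": {"f", "g"}, "y": {"f", "g"}}
# No sharing: each attribute used by a single distinct method -> 1.0
disjoint = {"x": {"f"}, "y": {"g"}}
print(lcom(cohesive), lcom(disjoint))
```

Note that the formula is undefined for a single-method class (m = 1), so tools typically skip or special-case such classes.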
Lack of Cohesion of Methods 1 (LCOM1) is an alternative definition of the lack-of-cohesion property. Given n methods M_1, M_2, ..., M_n contained in a class C_1 that also contains a set of instance variables {I_i}, for any method M_j we can consider the set of instance variables it uses. The similarity of a pair of methods is the intersection of their instance variable sets; LCOM1 is then the count of method pairs whose similarity is zero, i.e., whose instance variable sets are disjoint [26].
Lack of Cohesion of Methods 5 (LCOM5), another possible alternative, is based on the field usage of the methods of class C and is defined as
LCOM5(C) = (NA(C) · NM(C) - TMA(C)) / (NA(C) · NM(C) - NA(C)),
where NA(C) stands for the number of attributes of C, NM(C) stands for the number of methods of C, and TMA(C) = Σ_i μ(a_i) is the total number of field accesses by these methods, with μ(a_i) being the number of methods that use a field a_i of class C. A zero value of LCOM5(C) indicates a cohesive class C, while higher values indicate lower class cohesion [10], [13].
Conceptual Cohesion of Classes (C3) is based on the measurement of textual similarity between class methods. The concept of Marcus et al. employs a latent semantic indexing approach [59] to determine this textual similarity. The definition of the metric is composed of several steps and is explained in [59].
Ratio of Cohesive Interactions (RCI) [26] is the ratio of the number of the actual interactions to the number of all possible interactions [60] between data declarations themselves or the methods. A data declaration a DD-interacts with another data declaration b, if a change in a's declaration or use may cause the need for a change in b's declaration or use. We say that there is a DD-interaction between a and b.
There is a DM-interaction between a data declaration a and a method m if a DD-interacts with at least one data declaration of m. CI(c) (CI for cohesive interactions) is the set of all DD- and DM-interactions of a class c, and Max(c) is the set of all possible DD- and DM-interactions on the class interface [26]. For all classes c ∈ C, RCI(c) = |CI(c)| / |Max(c)| lies in the interval [0, 1], where 0 indicates that no cohesive interactions were found, while 1 indicates that all possible cohesive interactions are present. The RCI metric does not take indirect interactions into account [61].
There are three extended measures to RCI: neutral RCI, pessimistic RCI, and optimistic RCI. These capture the degree to which the software modules share features. The underlying idea is that cohesion only matters with respect to exported or public interfaces: a module has low cohesion if its public procedures and data (properties/variables) have few interactions with each other or among the data themselves. The definitions are as follows.

Neutral Ratio of Cohesive Interactions (NRCI): for all classes c ∈ C, the set of known cohesive interactions of c is denoted by K(c) and the set of unknown cohesive interactions by U(c), where Max(c) is explained in the definition of RCI; then NRCI(c) = |K(c)| / (|Max(c)| − |U(c)|). According to [62], an interaction is unknown if it is not detectable from the high-level design and is not signaled by the designers. In general, |Max(c)| ≥ |K(c)| + |U(c)|, since some interactions are not detectable from the high-level design and the designers explicitly exclude their existence [26], [62], [63].

Pessimistic Ratio of Cohesive Interactions (PRCI) treats the unknown interactions as non-cohesive: for all classes c ∈ C, PRCI(c) = |K(c)| / |Max(c)|, where K(c) and Max(c) are explained in the NRCI and RCI definitions [26].

Optimistic Ratio of Cohesive Interactions (ORCI) can be used as another option to express the cohesion level of classes [26]. It treats the unknown interactions as cohesive: for all classes c ∈ C, ORCI(c) = (|K(c)| + |U(c)|) / |Max(c)|.

The definitions imply that if PRCI(c), NRCI(c), and ORCI(c) are defined, then for all c it holds that 0 ≤ PRCI(c) ≤ NRCI(c) ≤ ORCI(c) ≤ 1. ORCI(c) and PRCI(c) provide the bounds of the admissible range for cohesion, and NRCI(c) takes a value in between [64]. According to Briand et al. [65], a low value of PRCI(c), NRCI(c), and ORCI(c) indicates that no cohesive interaction is present, while a high value indicates that all possible cohesive interactions are present. As the set of unknown interactions becomes smaller, the interval [PRCI, ORCI] also shrinks accordingly [62], [63].
However, in a situation with a single isolated object, the measures become PRCI(c) = 0, NRCI(c) = undefined, and ORCI(c) = 1; that is, NRCI(c) is undefined when all interactions are unknown and no information on cohesive interactions is available. In this case, with PRCI(c) = 0 and ORCI(c) = 1, PRCI(c) and ORCI(c) do not provide bounds for cohesion stricter than the interval [0, 1] itself. The fact that NRCI(c) is not defined can be interpreted as NRCI(c) being able to take any value in the interval [0, 1].
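To illustrate the bounds discussed above, a small sketch computing PRCI, NRCI, and ORCI from interaction counts, assuming the ratio forms |K|/|Max|, |K|/(|Max|−|U|), and (|K|+|U|)/|Max|, respectively:

```python
# Sketch of the three RCI variants from counts of known (k), unknown (u),
# and maximally possible (mx) cohesive interactions.

def rci_bounds(k: int, u: int, mx: int):
    prci = k / mx
    nrci = k / (mx - u) if mx != u else None  # undefined when all unknown
    orci = (k + u) / mx
    return prci, nrci, orci

prci, nrci, orci = rci_bounds(k=3, u=2, mx=10)
print(prci, nrci, orci)      # 0.3 0.375 0.5
print(prci <= nrci <= orci)  # the ordering PRCI <= NRCI <= ORCI holds
```

Shrinking the unknown set narrows the interval [PRCI, ORCI], matching the discussion above; the single-isolated-object case (k = 0, u = mx) yields PRCI = 0, NRCI undefined, ORCI = 1.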

7) Code Readability
The readability of the source code potentially enables easier maintenance and decreases the potential number of regression defects in the system [66]. Although such a property of code might be difficult to measure exactly, several attempts at quantification have been made [14], [66], [67].
Buse and Weimer's Model (BW) employs quantification of structural aspects (e.g., number of branches, loops, etc.) to express the code readability [66]. No simple formula can be presented as a definition; refer to the original study for details.
Model by Scalabrino et al. (SCA) takes a different approach and expresses the readability of the code using the lexical analysis of the source code [67]. For details, see the original study.
Comment Percentage (CP) is calculated as the total number of comments divided by the total number of lines of code [14]: CP = N_comment / LOC, where N_comment is the total number of comments and LOC denotes the number of code lines.
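A minimal sketch of the CP computation for a Python-style source; the "#" comment marker is an assumption tied to the example language:

```python
# Minimal sketch of Comment Percentage: CP = N_comment / LOC.
# Only full-line comments are counted here, a simplifying assumption.

def comment_percentage(source: str) -> float:
    lines = [ln.strip() for ln in source.splitlines() if ln.strip()]
    comments = sum(1 for ln in lines if ln.startswith("#"))
    return comments / len(lines) if lines else 0.0

src = """# read sensor value
value = sensor.read()
# convert to Celsius
temp = value / 10.0
"""
print(comment_percentage(src))  # 2 comment lines out of 4 -> 0.5
```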

8) Security
Poor coding practices make a system prone to broken data integrity, compromised privacy, or even data breaches. Security metrics that capture these risks, and ways to measure them, can be formulated.
Number of Security Antipatterns per Line of Code (SAP) expresses the density of security antipatterns in the source code: SAP = n_antipattern / LOC, where n_antipattern denotes the number of security antipatterns and LOC stands for the number of code lines. Generally, antipatterns in the code can be searched for manually based on their description [68], [69], or they can be searched for automatically [70]. Security antipatterns can be detected by analogy [71].
Number of Common Vulnerabilities and Exposures (CVE) measures the density of Common Vulnerabilities and Exposures (CVEs) currently known in the source code: CVE = n_CVE / LOC, where n_CVE represents the number of CVEs in the code and LOC is the number of lines of code.
For instance, a database managed by the non-profit organization MITRE, which acts as the top-level editor and numbering authority for CVEs, can be used as a reference 3 .
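Both SAP and the CVE density share the same shape, a finding count normalized by code size. A hypothetical helper, here scaled per 1000 lines (KLOC) for readability; the finding counts are assumed to come from a scanner report:

```python
# Illustrative sketch: SAP and CVE density have the same shape, a finding
# count normalized by lines of code. Scaling per KLOC is our assumption
# for readability; the raw metrics divide by LOC directly.

def density_per_kloc(findings: int, loc: int) -> float:
    """Findings per 1000 lines of code."""
    return 1000.0 * findings / loc if loc else 0.0

# e.g. 3 security antipatterns and 1 known CVE in a 12 000-line codebase
print(density_per_kloc(3, 12_000))  # SAP -> 0.25 per KLOC
print(density_per_kloc(1, 12_000))  # CVE -> ~0.083 per KLOC
```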
Number of Zero-day Vulnerabilities per Day (ZDV) measures the density of zero-day vulnerabilities [72] detected per day of system operation: ZDV = n_ZDV / (LOC · t_operation), where n_ZDV represents the number of zero-day vulnerabilities found during t_operation days of system operation and LOC is the number of lines of code.
Attack Surface Size (AS) quantifies the extent of the system accessible to various types of cyberattacks [73]. Manadhata et al. base this metric on an automata model and present an abstract method to quantify the attack surface. For more details, refer to their study [73].
Stall Ratio (SR) expresses the extent to which the source code contains activities that are not necessary to achieve the goal of the program [74]. Chowdhury et al. define SR for loops as SR = NPLOC_loop / LOC_loop, where NPLOC_loop denotes the nonproductive lines of code in a loop and LOC_loop denotes the lines of code in this loop.
Coupling Corruption Propagation (CCP) expresses how many methods can be impacted by an error in the original method if these methods rely on each other due to coupling in the object structure of a program [74]. The metric is defined as the number of child methods called with parameters based on the parameters of the original method call [74].

Critical Element Ratio (CER) measures how many critical data elements, whose corruption might cause a security breach, are present in a class: CER = n_critical / n_all, where n_critical denotes the number of critical data elements in a class and n_all denotes the number of all data elements in this class [74].
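The CCP idea of error propagation through parameter-forwarding calls can be sketched as a reachability count over a call graph; the graph and method names below are illustrative assumptions:

```python
# Hedged sketch of Coupling Corruption Propagation: count the methods
# transitively reachable from an origin method through calls that forward
# its parameters. The call graph here is a hand-made assumption.

def ccp(forwarding_calls: dict, origin: str) -> int:
    """forwarding_calls maps a method to the methods it calls with
    parameters derived from its own parameters."""
    seen, stack = set(), list(forwarding_calls.get(origin, []))
    while stack:
        m = stack.pop()
        if m not in seen:
            seen.add(m)
            stack.extend(forwarding_calls.get(m, []))
    return len(seen)  # the origin itself is not counted

calls = {
    "handle_request": ["parse", "dispatch"],
    "dispatch": ["store"],
    "store": [],
    "parse": [],
}
print(ccp(calls, "handle_request"))  # 3 methods could be corrupted
```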
Ratio of Third-party Code (3PC) measures the extent of code produced and/or maintained by an external party in a SUT. This extent can be measured in the number of lines of code, for instance, as 3PC = LOC_external / LOC_total, where LOC_external is the number of lines of code produced by an external party and LOC_total is the total number of code lines.
Number of Outdated Code Components (OCC) gives an indication of how many components of the SUT are out of date [75]. An outdated component might contain security flaws known in the cybersecurity community, or might cause integration defects during system deployment or a production run. The more outdated a component is, the higher the risk; to express this risk, a metric based on the aging time can be used, namely the ACC metric presented next.
Aging of Code Components (ACC) expresses the overall aging of the code components that are relevant to be updated in a SUT: ACC = (1/|C|) Σ_{c_i ∈ C} time_{c_i}, where C is the set of components relevant to be updated and time_{c_i} denotes the time for which a component c_i ∈ C has already been awaiting an update.

9) Code Heterogeneity
The heterogeneity of source code, for instance, in the programming languages used and in coding styles and conventions, has a potential negative impact on maintainability. These properties can be quantified in an approximate way. Number of Different Programming Languages (PL) is a rough indication of the heterogeneity of the overall IoT system. With a growing number of different programming languages, maintainability can be expected to degrade; however, other factors might also play a role.
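A rough sketch of estimating PL over a source tree by mapping file extensions to languages; the extension-to-language table and file list are illustrative assumptions:

```python
from pathlib import Path

# Rough sketch: estimate PL as the number of distinct languages across a
# source tree, inferred from file extensions (the mapping is an assumption).

EXT_TO_LANG = {".c": "C", ".h": "C", ".py": "Python",
               ".java": "Java", ".js": "JavaScript"}

def count_languages(paths) -> int:
    langs = {EXT_TO_LANG[Path(p).suffix]
             for p in paths if Path(p).suffix in EXT_TO_LANG}
    return len(langs)

files = ["firmware/main.c", "firmware/driver.h",
         "backend/Server.java", "ui/app.js"]
print(count_languages(files))  # 3 distinct languages
```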
Number of Different Programming Styles (PS) can be considered a more useful metric influencing the maintainability of the system. Heterogeneity of coding styles and conventions inside one block of code (class, module, or library) negatively impacts the readability of the code and makes it prone to defects. However, this heterogeneity is difficult to express by an exact formula, as distinguishing coding styles in the source code is difficult to quantify. Hence, we propose to rate this property on a scale of no, low, medium, or high, based on the judgment of a senior developer or architect during a code review.
Number of Different Communication Protocols (DCP) is another indicator that reflects the heterogeneity of an IoT system. The larger this number is, the more integration issues will have to be solved, and the attack surface of the system will also increase.

Source code quality, which can be expressed by the metrics suggested in Section III-B, potentially impacts the quality, reliability, and security of an IoT system. This potential is significant, as source code is present at many levels of a system, spanning from device firmware to server-side code.

C. IMPACT OF METRICS ON SYSTEM QUALITY AND THEIR RELEVANCE TO SYSTEM LAYERS, CODE LEVELS, AND PROJECT PHASES
Although it might be difficult to express an exact correlation between the quality of the source code and the overall quality of an IoT system (e.g., in terms of detected defects), this impact exists and can be analyzed. We do so in Table 1: the individual code quality metrics discussed in this study are linked to the general quality characteristics of an IoT system on which an impact can likely be expected. A cross mark in the table captures this fact.
The quality characteristics discussed are a subset of the quality characteristics defined in the product quality model of the ISO/IEC 25010:2011 standard, revised in 2017 4 . From this standard, we have selected a set of quality characteristics that can be impacted by the underlying source code quality, namely Analysability, Changeability, Stability, and Testability from the Maintainability general category, Functional correctness from the Functional suitability general category, and Confidentiality and Integrity belonging to the Security general category.
The quality attributes of the ISO/IEC 25010:2011 taxonomy are designed for general software systems. The programming code in IoT systems that is the subject of our study is a variant of a general software system, so the ISO/IEC 25010:2011 taxonomy is relevant for use in this case. Figure 2 provides an overall overview of the ISO/IEC 25010:2011 quality characteristics; those potentially impacted by the code quality aspects discussed in this study are distinguished by underlined bold text.
The mapping given in Table 1 is the result of the following algorithm. (1) Based on the analysis of the literature and on their experience with IoT systems, system architectures, code quality, and the software development process, the authors provided individual opinions on the mapping. In this process, the authors were encouraged to consult other independent experts in the field; in total, nine more specialists from research and industry were involved in these consultations. (2) The data were consolidated into a shared table document. (3) If consensus was reached on a particular relation between a code quality metric and an affected system quality characteristic, in the sense that more than 75% of the participants had the same opinion, the relation was established as final. (4) If fewer than 75% of the participants had the same opinion, a discussion was started in which arguments for and against were raised; after this discussion, the participants expressed their opinions independently again, and the result was consolidated. Approximately 15% of the mapping items were subjected to this discussion, and consensus was found after the completion of Step (4).

4 https://www.iso.org/standard/35733.html

Regarding the literature analysis, which is one of the inputs to the mapping (see Step 1 of the previous algorithm), several sources have been utilized. Savchenko et al. propose an architecture for a maintenance metrics collection and analysis system, where the metrics are presented as a set of 8 scores based on the quality characteristics of ISO/IEC 25010 (suitability, performance, compatibility, usability, reliability, security, maintainability, and portability) [76]. Motogna et al. use several object-oriented metrics to propose an approach to assess the maintainability characteristic through its sub-characteristics as defined by ISO 25010 [77].
The metrics they use are Depth of Inheritance Tree (DIT), Coupling Between Objects (CBO), Weighted Methods per Class (WMC) (which is similar to the Cyclomatic Complexity per Unit described in this paper), and Tight Class Cohesion (TCC). Colakoglu et al. recently performed a systematic mapping study in which they mention both the quality characteristics defined in ISO 25010:2011 and code quality metrics. They provide answers to several interesting research questions, such as which metric levels are applicable to which application domains, or which quality models have been used in the papers to measure software quality [78].
Another question is to which principal layers of an IoT system the discussed metrics are relevant. An overview is provided in Table 2. The number of system layers is open to discussion; in this overview, we selected five principal types.
These types are: device firmware code using a non-object-oriented programming language, device firmware code using an object-oriented (OO) programming language, the back-end server application part, where we assume the usage of an object-oriented language, the system user interface implemented as a web application, a mobile application, or a thick-client application, and the network infrastructure part.
The overview given in Table 2 was carried out as a consensus of the expert group following the same algorithm as employed in the case of mapping the code quality metrics to the ISO/IEC 25010:2011 quality characteristics (Table 1).
Apart from the principal SUT layers, the code level at which metrics can be applied is another relevant topic to be discussed. We do this analysis on the left-hand side of Table 3. We use three principal levels: (1) the project level, where the whole IoT project is considered comprehensively, (2) the class level, focusing on the object structure of the code, and (3) the function level, where we consider individual class methods, or functions in the case of non-object-oriented code.
The right part of Table 3 shows the mapping of the code quality metrics to the main phases of IoT project development in which they can be applied. We consider four principal development life cycle phases of an IoT project: (1) technical design and architecture definition, (2) system coding, (3) system testing, and (4) system operation (live run in a production environment).
To construct the mappings given in Table 3, we employed the same method as was used in the previous analyses presented in Tables 1 and 2. In the construction of this mapping, several sources in the literature have been used. Sae-Lim et al. conducted an investigation study to find out how developers filter and prioritize code smells [79]. They found that the most important factors for filtering are (1) task relevance, (2) smell severity, and (3) task implementation cost. For prioritization, the most important factors are (1) module importance, (2) task relevance, and (3) testability. Moreover, individual source code metrics are mentioned as being used to detect code smells in several papers [80]-[83]. We used the analysis of these papers when constructing the map presented in Figure 3.

D. RELATION BETWEEN METRICS AND CODE SMELLS
An important point to mention and discuss is the relation of code smells [84] to the discussed code quality metrics. Our study does not explicitly approach the problem from the code smells viewpoint. Programming styles for IoT systems might differ slightly by technology or in particular details implied by the type of system or hardware; however, the general principles related to code quality and smells are valid for source code in general. As code metrics naturally overlap with code smells, this field is relevant to discuss, especially for IoT systems.
According to Fowler [68], a code smell is a surface indication in the source code that usually corresponds to a deeper problem in the system. Code smells are not necessarily a problem, but rather an indicator of a problem. They can be seen as code structures that indicate a violation of fundamental design principles and negatively impact the quality of the design [85].
Some code smells are difficult to quantify by exact metrics; however, some can be considered closely related to the metrics discussed in this study.
Based on [86], urgent maintenance activities and prioritizing delivered features over code quality often lead to code smells. Code smells are thus codebase anomalies that do not necessarily impact performance or correct functionality; they are patterns of bad programming practice that can affect a wide range of areas in a program, including reusability, testability, and maintainability. Code smells must be detected and managed by refactoring [68]. Similarly, Gupta et al. [87] stress that it is essential to identify and control code smells during the design and development stages to achieve higher code maintainability and quality. Given the growing complexity of IoT systems and the potential problem of the number of possible system configurations, code maintainability is especially important at this point.
In the reverse direction, the presence of code smells in the source code has a potential negative impact on system maintainability [84]. As observed, code smells are strongly influenced by LOC [84].
Gupta et al. [87] identify 18 common code smells and determined the interrelationship and driving power among them. The motivation behind their work was to identify code smells with high driving power to improve overall code maintainability and to identify which smells are dependent. The effect of this dependency is that developers can refactor one of the dependent smells with higher driving power, rather than addressing all found smells, and still significantly improve code maintainability. It is common to identify code smells in monolithic systems using code analysis. For instance, tools like SpotBugs 5 , FindBugs 6 , CheckStyle 7 , or PMD 8 can detect code patterns that resemble a code smell. However, their common limitation is that they fit a single code base. This can be limiting in the case of IoT systems, which typically consist of multiple components and individual system parts employing programming styles appropriate for different levels (practically from device firmware to a server-side enterprise application).
In a distributed environment, particularly for microservices, commonly used for IoT middleware, multiple code smells have been identified [88], [89]. These smells include improper module interaction, modules with too many responsibilities, or a misunderstanding of the microservice architecture. A good example is the existence of a legacy mediator module (similar to an enterprise service bus component in a service-oriented architecture) to pass messages between modules, which is unintended. There may be too many standards involved across discrete teams of developers in the overall project when a single standard should be established for consistency across the modules. From the communication perspective, there might be a wrong cut in the layers. There might be a missing manager component for connections between modules (API gateway), with direct communication used instead, which is a bad practice. Hardcoded endpoints with IP addresses and ports are another example of malpractice. Regarding the development and design process, there might be no versioning of the API, or there might exist too many MSA modules, each serving a rather small purpose. Another issue that can be found is shared persistence, which happens when a single database is used for multiple modules instead of individual data storage. When one module accesses encapsulated data of another service, this can be seen as inappropriate service intimacy. Similarly, access to shared libraries is not preferred. Another smell to be found is a cyclic call dependency.
These smells can be detected manually, which requires assessment and a basic understanding of the system and demands considerable effort. With code analysis instruments, these smells could be discovered almost instantly and with no previous system knowledge; however, such a tool is not yet present. Similar to microservices, these smells could be identified for IoT, but this direction remains an open research challenge.
To summarize, the discussed metrics overlap with code smells to some extent. A poor value of a particular code quality metric can, in some cases, be considered a code smell.
We analyze the situation in Figure 3. The discussed code properties (and hence the code-quality metrics that express these properties) are presented on a scale. The left side represents close relevance to code smells; for the right side, our opinion is that the relation is more remote. However, due to a certain overlap, the absence of exactly established relations, and different possible opinions on this problem, the overview presented in Fig. 3 might be subjective and also depends on the particular situation. To give an example regarding code heterogeneity: different programming languages (and different programming styles) might not necessarily imply problems if an IoT system consists of different devices, each having firmware written in a different programming language, as long as these source codes are well written and reliable. On the contrary, when different programming styles are mixed in one logical block of code (e.g., one device), such a situation is usually considered a code smell. The consequences of such a mixture of coding styles in an IoT system are typically lower maintainability of the complete source code base and potential issues with interoperability and integration when individual components are changed during system development and operation.

IV. DISCUSSION
This section discusses some issues arising from creating a consolidated view of code quality characteristics. A significant part of the presented metrics overlaps with the metrics for standard software systems, which is logical: software development for IoT employs current programming styles and programming languages. This also implies that the requirements for good source code overlap between general software and IoT. Part of the presented metrics are specific to distributed and IoT systems; as examples, we can give DAC, NTI, NAC, PL, or PS.
The given overview may not be complete. Following our goal to include definitions and explanations of the metrics in the paper, a comprehensive overview of all available metrics would be somewhat outside the limits of a standard journal article. However, to the best of our knowledge of the field, we tried to select a representative sample of metrics useful for IoT system development.
Regarding the presented categorization of the metrics into code properties (see Table 1), some metrics might potentially belong to more code property categories, depending on the context. As an example, UIS can be given: this metric might belong to the Size as well as the Complexity category. Other examples are NTI and NAC, which could also be assigned to the Size as well as the Complexity category. Scaling questions also arise when applying the suggested metrics to an IoT system. In some parts of the IoT system, low-level languages such as C are used (and hence only some of the presented metrics are relevant). On the contrary, object-oriented languages are usually used for the back-end (BE) server parts of IoT systems. Typically, LOC for C code in an IoT device has to be interpreted differently than the size of Java EE code on the BE server part.
Regarding code duplication, in our proposal we have considered code blocks that are identical, or identical after removing comments and white space. However, more hidden code duplication can be present in code: blocks might not match syntactically, but in the semantic sense they might perform the same action. Recently, we observed this effect in the case of automated tests [90], for example. When code duplication is detected by more advanced methods that also consider semantic duplication, practically the same metrics can be applied; only the duplicate blocks will be identified by a different method.
Generally, the PS metric may be subjective, as it may be difficult to distinguish individual coding styles; sometimes there is no clear boundary between the styles. To mitigate this problem, we used a four-level rating of heterogeneity (no, low, medium, high), which has to be determined by a specialist developer during a code review.
When discussing the general quality characteristics of the ISO/IEC 25010:2011 taxonomy, we understand the possibility of effectively extending a discussed system as part of Changeability.
Unit test coverage and effectiveness (Section III-B5) is discussed as a relevant category of code quality, and a further detailed discussion can be conducted on the quality of the code of the created tests. As typical examples, unit test code smells [91] and their fragility [92] can be expressed by metrics and further discussed. However, this field is outside the scope of our study. The presented unit test coverage metrics LC, BC, and MC shall be accompanied by the DA metric, because LC, BC, and MC themselves do not explicitly capture the number of assertions in the unit tests, let alone the relevance of the performed unit tests in terms of the number of tested variants and data combinations.
Regarding the CVE metric, there are also alternatives for individual languages, for example, for C++ [93]. This proposal is based on research into common vulnerabilities in C++ code. It considers generic code constructs such as the maximum nesting of cycles or the number of parameters of methods, the amount of pointer arithmetic, and other language-specific constructs. However, due to this language specificity, we have not included such proposals in the general overview provided in this study.
From a distributed perspective, we cannot expect metrics to be additive when combined to obtain a holistic view. For instance, the SLOC of two modules or things of an IoT system can be easily measured and summed; however, we cannot expect this to apply in general, since the reliability of one module or thing summed with that of another does not behave this way. Queuing theory for static networks could be applied to this distributed perspective [35] to quantify reliability. Besides reliability, there are many dynamic metrics relevant to IoT and other distributed systems, such as maintainability, testability, security, safety, privacy, performance, sustainability, fault tolerance, and resilience. Trust and security are even more challenging: in particular, in IoT, trust does not have an accepted definition, and security is unbounded as it suffers from unknown threats.

V. THREATS TO VALIDITY
Several concerns may be raised about the completeness of the overview provided. This section discusses these possible limits and describes the actions taken to mitigate these issues.
The analysis of the impact of metrics on system quality (see Table 1 in Section III-C) could be subjective and incomplete. The mapping was created using the algorithm described in Section III-C to minimize this effect. During this process, whenever a difference in opinion was found, a discussion was initiated, and the final version was settled as a consensus. The same applies to the relevance of the discussed metrics to the main layers of an IoT system (see Table 2) and to the mapping of the metrics to the code levels and main phases of the project (see Table 3). In these cases, we took the same measures to minimize possible bias.
A similar concern might apply to the presented relation of the discussed code properties to code smells (Figure 3), which is subjective and based on the analyzed literature and the experience of the authors. To minimize subjectivity, each author independently reviewed this positioning, and in case of differing opinions, a consensus was settled through discussion.
Another concern can be raised regarding the completeness of the code quality metrics presented. However, a thorough search process and intensive discussions during the preparation of this study minimized this possibility.
Another possible issue might be that some metrics are included in the list under a different name. We have identified such an issue in the case of CYC, which has been published under different names with practically the same meaning, only in different contexts. Other examples might probably be found among the other metrics discussed in this study. However, we believe that the descriptions and explanations provided for the individual metrics are sufficient to identify such possible cases.

VI. CONCLUSION
This study provided a comprehensive overview of code quality metrics relevant to IoT systems. In this effort, we consolidated various code quality metrics from the available literature, revised them in the context of IoT systems, and further suggested our metrics to capture the properties of the source code of the IoT system. This overview categorized the metrics into nine major code property categories: size, redundancy, complexity, coupling, unit test coverage and effectiveness, cohesion, code readability, security, and code heterogeneity.
A significant part of the presented metrics overlaps with previously published metrics for software systems, which is logical, as IoT systems are built from software code. The remaining metrics are aimed more at IoT and distributed systems.
The second part of the article analyzes the possible impact of the code quality aspects captured by the metrics on the general quality characteristics of an IoT system. To use a unified and established classification of general system quality characteristics, we used the ISO/IEC 25010:2011 standard, recently revised in 2017. Furthermore, the relevance of the analyzed metrics to the principal layers of an IoT system is discussed.
Finally, we discussed the code smells area and its relation to the presented metrics, as there are significant overlaps; in some cases, a lack of code quality expressed by a particular metric can be considered a code smell.
We believe that this overview will be helpful to researchers and IoT testing practitioners in planning quality assurance strategies and discussing the impact of code quality on the overall quality and reliability of a system.

MATEJ KLIMA is currently a Ph.D. student in the Software Testing Intelligent Lab (STILL), Dept. of Computer Science and Engineering, Czech Technical University in Prague. In his research, Matej focuses on the reliability of Internet of Things and critical systems, for instance, model-based testing techniques aimed at situations of limited or disrupted network connectivity or system component outage, IoT system quality and related metrics, and IoT reliability models.
MIROSLAV BURES leads the System Testing Intelligent Lab (STILL) at the Dept. of Computer Science, Faculty of Electrical Engineering, Czech Technical University in Prague. He has held an appointment at the Czech Technical University in Prague since 2010, where he is currently an Associate Professor of Computer Science.
His research interests include quality assurance and reliability methods, model-based testing, path-based testing, combinatorial interaction testing, and test automation for software, Internet of Things, and mission-critical systems. He leads several projects in the field of test automation for software and Internet of Things systems, covering the topics of automated generation of test scenarios as well as automated execution of the tests.

MICHAL TRNKA obtained his master's degree in computer science from the Faculty of Electrical Engineering at the Czech Technical University in Prague, where he is currently awaiting his Ph.D. defense. He is also a Fulbright scholar. His area of interest is software engineering, especially the security of context-aware applications. Lately, he has focused on using context information to enhance security rules, as well as on methods to obtain context from trusted sources, such as devices on the Internet of Things.

KAREL FRAJTAK
XAVIER BELLEKENS (BSc, MSc, PhD, FHEA, MBCS, MIEEE) is the CEO and co-founder of Lupovis.io, a company that provides dynamic cyber-deception; a Nonresident Senior Fellow of the Scowcroft Center for Strategy and Security at the Atlantic Council, advising on critical infrastructures and maritime and naval cyber-security; and a Lecturer (Assistant Professor) and Chancellor's Fellow in the Institute for Signals, Sensors and Communications with the Department of Electronic and Electrical Engineering at the University of Strathclyde. His current research interests include critical infrastructure protection and defence, as well as cyber deception and deterrence.
TOMAS CERNY is a Professor of Computer Science at Baylor University. His areas of research are software engineering, code analysis, security, aspect-oriented programming, user interface engineering, and enterprise application design. He received his Master's and Ph.D. degrees from the Faculty of Electrical Engineering at the Czech Technical University in Prague and his M.S. degree from Baylor University.
From 2009 to 2017, he was an Assistant Professor of Computer Science at the Czech Technical University, FEE, Prague, Czech Republic. In 2017, he was a Postdoctoral Fellow at Baylor University, Texas, USA, and since the same year he has continued there as an Assistant Professor with a concentration on Software Engineering.

BESTOUN S. AHMED obtained his Ph.D. from Universiti Sains Malaysia (USM) in 2012. Currently, he is an Associate Professor at the Department of Mathematics and Computer Science, Karlstad University, Sweden, and a researcher with the Czech Technical University in Prague. His main research interests include combinatorial testing, search-based software testing (SBST), machine learning testing, applied soft computing, and quality assurance of smart devices and IoT systems.