A Catalogue of Agile Smells for Agility Assessment

Background: The Manifesto for Agile Development has inspired many software development methods such as Scrum, XP, and Crystal. However, being "agile" is not trivial and only a few companies are capable of mastering so-called agile practices. Failure to apply the agile approach properly can do more harm than good and may jeopardize the benefits of an agile method. Thus, evaluating an organization's ability to apply agile practices using an agility assessment tool is critical. Aims: In this paper, we extend the metaphor of code smell and introduce the term agile smell to denote the issues and practices that may impair the adoption of the agile approach. The focus of the paper is defining and validating a catalogue of agile smells that can support agility assessment. Method: A literature review and a survey were conducted to identify and confirm the characterization of agile smells. Once identified, the agile smells were organized in a structured catalogue. Results: The literature review found 2376 references published between 2001 and 2018. We selected 55 papers for full consideration and identified 20 agile smells. The survey consulted 20 participants to determine the relevance of the selected agile smells. Conclusion: We have identified a set of 20 agile smells that were ranked according to their relevance. For each smell, we proposed at least one strategy to identify the smell's presence in real projects. The catalogue can be used by companies to support the assessment of their agility.


I. INTRODUCTION
The adoption of agile methods by the software development industry has increased significantly in recent years. Almost all software companies claim they are ''agile'' at some level and that they are using agile practices in their software processes [1]. Being agile has become a critical factor for the software industry. Among the expected benefits of being agile are the acceleration of software delivery, the ability to manage requirement changes, and productivity gains [2].
However, the proper adoption of an agile method (or of individual agile practices) is not straightforward, and the misuse of agile practices should not be ignored, since it may jeopardize the benefits that an agile method should bring to the organization. It is quite common to find organizations that are new to agile software development adopt a few agile practices, adapt them in whatever way they prefer, and convince themselves they are doing agile software development, until they eventually realize there are few or no improvements in their software processes [3]. Ambler [4] revealed numerous project failures associated with agile development. In the 2018 IT Project Success Rates Survey™, 36% of the participants reported that they had experienced challenges in an agile project, and 3% of the participants reported complete failure [5].
Thus, an Agility Assessment (AA) tool is a critical means of assisting projects, organizations and even individuals in understanding their agility skills and identifying potential problems that should be resolved to improve the adoption of agile methods [6].
The problem we have observed is the lack of objective criteria for conducting an agility assessment. Despite the substantial amount of content about agile development in both academic forums and industry, few contributions focus on providing elements to support agility assessment. The Manifesto for Agile Development [7], for example, proposed a set of values and principles that have inspired many agile methods. However, using these values and principles as a basis to assess the agility of a given organization or project is quite difficult. It is challenging and subjective to assess whether an organization or project is properly applying values such as ''individuals and interactions over processes and tools''. Agile methods such as Scrum [8], XP [9], the Crystal Family Methods [10] and OpenUp [11], as well as other studies that consolidated the body of knowledge around agile development, do not provide objective requirements for assessing the adoption of agile practices. The so-called agile values, principles, practices and characteristics are typically described: (a) in a generic way; (b) to be used as a reference for projects or organizations that aim to adopt agile; or (c) to inspire discussions among the team in retrospective meetings. Agility assessment approaches need objective criteria; otherwise, the assessment may be threatened by biases imposed by the person(s) conducting the assessment. This paper tries to fill this gap by proposing a set of practices focused on agility assessment. We borrowed the term code smell [12] and extended it to agility assessment. A code smell denotes an indication that may correspond to a deeper problem in the software source code or architecture. The term was popularized by Fowler and Beck in [12].
The authors used this metaphor and proposed a catalogue of code smells that can be used to guide the identification of potential problems that could be fixed through the application of refactoring techniques. We are using the term agile smell to denote a practice that may impair the proper adoption of agile development.
This paper aims at identifying a set of agile smells and organizing them in a structured format. We are also proposing, for each agile smell, at least one strategy that guides the identification of the occurrence of that agile smell in an agile project. The methodology of this study is divided into three phases: (a) an elicitation phase that includes a literature review; (b) a confirmation phase that includes a survey with practitioners; and (c) a cataloging phase that attempts to organize the agile smells in a structured format.
This study tries to answer the following research questions:

RQ1: What are the practices that impair the proper adoption of agile development and can be used to support the agility assessment of organizations, projects, iterations and agile teams?

RQ2: How can we identify the occurrence of such practices?
The aim of RQ1 is to identify a set of items that we are naming agile smells, which are: practices that may jeopardize the adoption of agile development and that can also be used to support organizations and agile teams to assess how they are using agile practices. To answer RQ1, we are proposing a catalogue of agile smells that were identified through a literature review and confirmed by a survey. The aim of RQ2 is to propose strategies to identify the occurrences of agile smells. An identification strategy is important to make agility assessment less subjective and less compromised by evaluator bias. These strategies will aid and guide practitioners to quickly spot the occurrence of agile smells in an agile project. We sought to answer RQ2 by proposing at least one identification strategy for each agile smell. By answering these two questions, we expect to provide a baseline to support agility assessment at organizational and project levels.
An early version of this study was introduced in [13], which presented a preliminary version of the catalogue and a small set of agile smells. We have improved on the study in [13] by:
• broadening the literature review;
• expanding the number of participants in the survey;
• consolidating the set of agile smells;
• adding more information into the catalogue; and
• adding the catalogue use guideline.
The remainder of this paper is organized as follows: Section II presents the background for this research. Section III describes the study methodology. Sections IV and V describe, respectively, the literature review and the survey conducted to identify and confirm the agile smells. Section VI describes the catalogue design and presents a subset of the resulting catalogue. The catalogue use guideline is introduced in Section VII. Section VIII discusses the related work and Section IX presents the results and the threats to the validity of this study. Section X concludes the paper.

II. BACKGROUND
In this section, concepts that include Agile Development, Agility Assessment and Code Smells are discussed to provide background material for the reader.

A. AGILE DEVELOPMENT
In 2001, as a response to a community that demanded more flexible processes, a group of 17 practitioners and consultants in software development produced what they named the Manifesto for Agile Development [7]. The foundation of the manifesto, proposed as an attempt to influence the software development community, comprises four values and 12 underlying principles, presented in Figure 1. While there is no formal agreement on the meaning of the concept of ''agile'', in this research ''Agile Development'' means software development processes or methods that are shaped around the values and principles in Figure 1. These methods include, but are not limited to: XP [9], [14], [15], Scrum [8], [16]-[18], the Crystal Family [10], Feature Driven Development (FDD) [19], [20], the Dynamic Systems Development Method [21], Adaptive Software Development [22], and OpenUp [23].

B. AGILITY ASSESSMENT
The adoption of so-called agile practices may not be straightforward [24]-[26]. The 13th Annual State of Agile Survey™ [1] revealed that, although 94% of the companies surveyed claimed they are using agile practices, only 4% of the companies indicated they are mastering agile practices. In this scenario, it is important for organizations to identify their gaps in agile practices; otherwise, the organization may not receive the benefits of adopting them [4].
Agility Assessment (AA) comprises assessment techniques, models and tools that focus on indicating problems in adopting agile practices at the project level, organization level or individual level. There are many approaches for AA, such as agility assessment models, agility checklists, agility surveys, and agility assessment tools [27]. In Section VIII we present some of these AA approaches and discuss how they relate to this study.

C. CODE SMELLS AND AGILE SMELLS
The term code smell was popularized by Fowler and Beck [12] to describe poor design solutions and code structures that should be analyzed carefully. The authors proposed a baseline catalogue of code smells that is divided into three categories (Application-level, Class-level and Method-level) and includes 22 smells such as Duplicated Code (identical or very similar code exists in more than one location), Shotgun Surgery (a single change needs to be applied to multiple classes at the same time), Large Class (a class that has grown too large), Feature Envy (a class that uses methods of another class excessively), Inappropriate Intimacy (a class that has dependencies on implementation details of another class), Cyclomatic Complexity (too many branches or loops), Too Many Parameters (a long list of parameters), and Long Method (a method, function, or procedure that has grown too large).
In this study, we extend the term Code Smell to the context of agile development and propose the term Agile Smell. An Agile Smell denotes a practice likely to impair the proper adoption of agile development.

D. AGILITY ASSESSMENT AND AGILE SMELLS
The catalogue of code smells [12] proposed by Fowler and Beck and other contributions [28]-[30] have been broadly used by the software industry to assess the quality of source code. However, manually identifying the occurrence of code smells in projects that may have hundreds of thousands of lines of code is costly, and neither effective nor efficient; a more scalable technique is needed [31], [32]. One approach to optimizing source-code quality assessment is automatic detection of code smells [29], [32]-[34]. These techniques require the specification of a code smell in a specific language. The DECOR [32] method proposed by Moha et al., for example, is organized into the following steps: Description Analysis, Specification, Processing, Detection, and Validation. In the Specification step, the smells are coded in a specification language. These specifications are then used as input for the Detection step, which assesses the code and finds potential code smells. The code smells identified in this step are confirmed in the Validation step.
In this sense, the catalog of agile smells proposed in this study could be used to guide agility assessments and be a prelude for automatic detection of agile smells.
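By analogy with DECOR's Specification and Detection steps, an agile smell could be encoded as a predicate over data exported from a project tracker. The sketch below is illustrative only: the record fields and the data layout are our assumptions about what such an export might contain, using two smells from the catalogue (Concurrent Iterations and Absence of Timeboxed Iteration) as examples.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Callable, Dict, List, Optional

# Hypothetical tracker export; the field names are assumptions,
# not part of the catalogue itself.
@dataclass
class Iteration:
    name: str
    start: date
    end: Optional[date] = None  # None means the iteration is still open

@dataclass
class Project:
    iterations: List[Iteration] = field(default_factory=list)
    planned_iteration_days: int = 14  # the team's agreed timebox

# A smell specification is a named predicate over the project data,
# mirroring DECOR's move from a textual description to a checkable rule.
SMELL_CHECKS: Dict[str, Callable[["Project"], bool]] = {}

def smell(name: str):
    """Register a detection rule under the smell's catalogue name."""
    def register(check):
        SMELL_CHECKS[name] = check
        return check
    return register

@smell("Concurrent Iterations")
def concurrent_iterations(project: Project) -> bool:
    # Detected when two or more iterations are open at the same time.
    return sum(1 for it in project.iterations if it.end is None) >= 2

@smell("Absence of Timeboxed Iteration")
def absence_of_timeboxed_iteration(project: Project) -> bool:
    # Detected when any closed iteration ran shorter or longer than
    # the predefined duration.
    return any(
        it.end is not None
        and (it.end - it.start).days != project.planned_iteration_days
        for it in project.iterations
    )

def detect(project: Project) -> List[str]:
    """Return the names of all smells whose check fires on the project."""
    return [name for name, check in SMELL_CHECKS.items() if check(project)]
```

The registry pattern keeps each smell's textual definition and its mechanical check side by side, so new catalogue entries can be added without touching the detection loop.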

III. STUDY ORGANIZATION
This section presents the research methodology followed to identify and confirm a set of agile practices that may impair the adoption of agile methods (i.e., agile smells). The methodology of this research was based on the method proposed by Spinola et al. [35] and consists of four steps divided into three phases, as depicted in Figure 2:

Phase 1 - Elicitation: The elicitation phase was divided into two steps: (1) an informal literature review conducted to identify basic concepts that supported the definition of an accurate and comprehensive systematic literature review protocol; and (2) a systematic literature review planned and executed to identify a set of agile smells. The systematic literature review design details, the mechanisms and collected data, and the set of identified agile smells are described in Section IV.

Phase 2 - Confirmation: In the confirmation phase, we conducted a survey with industry practitioners to confirm the agile smells identified in the elicitation phase and reveal their relevance. The survey is described in Section V.

Phase 3 - Consolidation: In this phase, the most relevant agile smells were organized in a structured format named the Catalogue of Agile Smells. This catalogue is presented in Section VI.

IV. ELICITATION PHASE: SYSTEMATIC LITERATURE REVIEW
In the elicitation phase, a systematic literature review (SLR) was conducted to explore the existing body of knowledge and identify a set of agile smells (i.e., practices that may impair the proper adoption of agile development).
The methodology of the SLR was based on the method proposed by Kitchenham and Charters [36] and consists of three main phases: planning, execution and reporting.

A. SYSTEMATIC LITERATURE REVIEW PLANNING 1) AIM, RESEARCH QUESTIONS AND SCOPE
The aim of the literature review is to identify elements that allow us to answer RQ1 and RQ2. In other words, the goal of the SLR is to discover (i) a set of practices that may impair the adoption of agile development (i.e., agile smells) and (ii) strategies for checking for the occurrence of these practices in real projects. Since the literature does not use the term ''Agile Smell'', we extracted the agile smells from agile practices, rules, constraints and restrictions. The research questions for this SLR were derived from RQ1 and RQ2 (presented in Section I) and can be summarized as:

SLR-RQ1: What are the practices that impair the proper adoption of agile development?

SLR-RQ2: How can we identify the occurrence of such practices?
The scope of this review was defined based on the population, intervention, comparison and outcome (PICO [37]) approach. The population is the set of software development projects. The intervention is the collection of agile software development processes. There is no comparison. The outcome is a set of agile rules, constraints, practices and techniques. Three papers obtained from a previous conventional literature review were used as control studies. The sources were collected from the following digital databases, covering conferences, journals and technical reports: ACM Digital Library, IEEE Xplore, Scopus, and Web of Science. The search string taken as the basis for all search engines, structured according to Pai et al. [37], was: (''software process'' or ''software project'' or ''software systems'' or ''software development'' or ''software engineering'') and (''agile methods'' or ''agile processes'' or ''agile approaches'' or ''agile methodologies'' or ''agile development'') and (''rules'' or ''constraints'' or ''restrictions'' or ''practices'' or ''technics'' or ''techniques'' or ''classification''). The set of formal literature studies includes all articles returned by the protocol that meet at least one of the following inclusion criteria (IC): (IC1) documents must address one or more agile methods; (IC2) documents must discuss practices, characteristics, rules or constraints related to an agile method.
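The structure of this search string can be reproduced mechanically: each PICO facet is an OR-group of synonyms, and the facets are AND-ed together. A minimal sketch (the variable and function names are ours, not part of the review protocol):

```python
# Each facet of the query is an OR-group of synonyms; the facets are
# then AND-ed together, matching the string used in this review.
POPULATION = ["software process", "software project", "software systems",
              "software development", "software engineering"]
INTERVENTION = ["agile methods", "agile processes", "agile approaches",
                "agile methodologies", "agile development"]
OUTCOME = ["rules", "constraints", "restrictions", "practices",
           "technics", "techniques", "classification"]

def or_group(terms):
    """Render one facet as a parenthesized OR-group of quoted terms."""
    return "(" + " or ".join(f'"{t}"' for t in terms) + ")"

def build_search_string(*facets):
    """AND the OR-groups together into the full query string."""
    return " and ".join(or_group(f) for f in facets)

query = build_search_string(POPULATION, INTERVENTION, OUTCOME)
```

Keeping the synonym lists as data makes it easy to re-run the same protocol against each engine's syntax, which helps the review stay reproducible.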
Publications that satisfy at least one of the following exclusion criteria (EC) were omitted: (EC1) documents not written in English; (EC2) documents whose full text is not available; (EC3) documents clearly dealing with topics irrelevant to the purpose of this review; (EC4) documents merely reporting the use of individual software processes in development projects; (EC5) if the same study has been published more than once, the most relevant version (e.g., the one explaining the study in greatest detail) was retained and the others were excluded.

2) DATA EXTRACTION CRITERIA
To identify and extract the agile smells from the selected studies, we defined two data extraction criteria (DEC): (DEC1) an agile smell is a practice that may impair the adoption of agile methods; and (DEC2) the occurrence of an agile smell should be objectively verifiable. The DEC1 criterion defines an agile smell as a negative practice that should be avoided. The DEC2 criterion was introduced to reduce the risk of identifying agile smells that are vague or hard to verify through objective strategies. Note that the gap this research is trying to fill is the lack of objective criteria to perform an agility assessment. The values and principles proposed by the Manifesto for Agile Development and the methodologies derived from the manifesto are described in a vague way [41], [42]. Therefore, identifying agile smells that are difficult to verify objectively would not differentiate them from the body of knowledge already consolidated in this area.
The following information was extracted from each paper selected after running the data extraction process: document title, author(s), source, year of publication, agile method, agile smell name and agile smell description. The results were tabulated. Analysis was carried out to identify duplication.

B. SYSTEMATIC LITERATURE REVIEW EXECUTION
After the planning phase, seven steps were applied in the execution phase to select the primary studies: • Step 1: Initial Search. We applied the search string to the selected digital databases. A large number of studies was retrieved in this phase: ACM Digital Library (438), IEEE Xplore (564), Scopus (2233), and Web of Science (1592).
• Step 2: Combination. Since the digital databases index many of the same publications [43], we combined the results and the total number of studies after this step was 2376. All the control studies were retrieved.
• Step 3: Filter by Title. This step aimed at applying the exclusion criteria EC1, EC2, EC3 and EC4 by reading the title of the studies. After this step, the number of papers was reduced to 261.
• Step 4: Filter by Abstract. This step aimed at applying the exclusion criteria EC3 and EC4 by reading the abstract of the studies. At the end of this step, 127 studies remained.
• Step 5: Filter by full text. It consisted of filtering the selected studies by reading their full text and applying the exclusion criteria EC3 and EC4. At the end of this step, 42 studies remained.
• Step 6: Removal of repeated studies. We applied the exclusion criterion EC5 and removed two studies. After this step, the number of papers selected for full consideration was reduced to 40.
• Step 7: Addition by Heuristic. We inserted 15 relevant studies from other sources, totaling 55 studies. These studies were added manually, based on our background knowledge. Appendix X shows the final list of studies considered in this literature review. Figure 3 shows the process and the results obtained in each step. The selected documents were fully read and the data extraction criteria applied to identify the agile smells.

C. LITERATURE REVIEW REPORTING
During the SLR, we identified many agile values, practices and characteristics. However, none of the studies investigated agile methods from the perspective of this study, namely, trying to identify a set of agile practices that may impair the adoption of agile methods. The SLR confirmed that most of the body of knowledge around agile development focused on adoption of agile development rather than agility assessment. The studies neglected to describe explicitly how to verify whether the values, practices and characteristics of agile development have been properly adopted.

3) Absence of Timeboxed Iteration: The Timeboxed Iteration practice defines that all iterations should have a fixed time duration. Thus, an iteration should not be extended or shortened to fit planned or unplanned features. The Absence of Timeboxed Iteration smell is detected when an iteration is shorter or longer than the predefined duration. The presence of this smell may indicate the timeboxed iteration practice has not been applied properly. References: [8], [10], [16]-[18], [22], [38], [46], [48]-[50], [53]-[55].

4) Absence of Timeboxed Meeting: This smell derives from an agile practice that states the meetings prescribed by the agile method (iteration planning, review, retrospective, etc.) should have a predefined duration, and the duration should preferably be the same during the entire software project. The Absence of Timeboxed Meeting smell is detected when a given meeting (prescribed by the agile method) is shorter or longer than the predefined duration. The presence of this smell may indicate the team is not conducting or planning the meetings properly. References: [18], [44], [46], [48], [52], [54], [57].

5) Complex Tasks: Complex tasks should be avoided in agile projects. They should be decomposed by the development team into simpler tasks. The Complex Tasks smell is detected when there are complex tasks in a given iteration. The presence of this smell may indicate that the developers are not properly breaking complex tasks into simpler tasks. References: [8], [16], [17], [40], [46], [47], [49], [52], [55], [61], [62].

6) Concurrent Iterations: In an agile project, the entire team should focus on the same iteration goal. Running two (or more) concurrent iterations means the team is divided and focused on different goals. The Concurrent Iterations smell is detected when there are two (or more) open iterations in the same project. The presence of this smell may indicate the development team is not focused on the same goal. References: [18], [55], [63], [64].

7) Dependence on Internal Specialists: One characteristic of an ideal agile team is that any participant can work on any feature. Thus, the team should avoid the situation where a member becomes the only specialist in a feature or technology. The Dependence on Internal Specialists smell is detected when all tasks related to a given feature were assigned to the same developer. The presence of this smell may indicate the creation of an internal specialist, meaning the project is becoming dependent on a specific developer. References: [2], [8], [9], [14], [15], [17], [18], [44], [45], [47]-[50], [54], [55], [65]-[69].

8) Goals Not Defined: Agile development teams need to know exactly what they are working on, and the goals of the project and iterations should be clear and well-defined. The Goals Not Defined smell is detected when the goals of the project or of a given iteration are not defined. The presence of this smell may indicate the development team does not have a clear view of the goals and therefore cannot choose the most important work to do. References: [8], [10], [16], [17], [19]-[23], [38], [46], [48], [51], [52], [69].

9) Iteration Started without an Estimated Effort: The scope and duration of the iterations in an agile project are typically defined by the development team, which must commit to the iteration goals and deadlines. The Iteration Started without an Estimated Effort smell is detected when an iteration that contains non-estimated tasks is started. The presence of this smell may indicate that the development team has committed to a deadline without a good understanding of the effort needed to deliver the iteration scope. References: [18], [61], [63], [64], [67], [69].

10) Iteration Without a Deliverable: The practice of delivering products continuously and frequently is very important to agile methods and can be considered a mantra among agile software developers. The agile methods state the development team should deliver a new version of the software at the end of each iteration. The Iteration Without a Deliverable smell is detected when an iteration does not have an associated deliverable product. The presence of this smell may indicate that the continuous and frequent delivery practice has been jeopardized. References: [8]-[10], [14]-[23], [40], [44]-[46], [48], [49], [51], [53].

11) Iteration Without an Iteration Planning: Iteration planning is an important success factor in agile methods.

16) Long Break Between Iterations: To promote sustainable development and understand its productivity, the development team must measure all the work done. Since the work done during the interval between iterations is typically not counted in the productivity assessment, long breaks may impact the way the team measures its productivity. The Long Break Between Iterations smell is detected when there is a break between two consecutive iterations longer than a predefined and recommended size. The presence of this smell may indicate the development team is working on untraceable work, which can harm the calculation of team productivity. References: [18], [46], [55], [57], [60], [63], [64].

17) Lower Priority Tasks Executed First: In an agile project, the development team should focus on higher priority tasks. The Lower Priority Tasks Executed First smell is detected when tasks with lower priority are executed before tasks with higher priority. The occurrence of this smell may indicate that the development team has not worked on the highest priority tasks. References: [8]-[10], [14]-[17], [19]-[23], [44]-[46], [48], [49], [51]-[53], [55], [60], [69].

18) Shared Developers: In an agile project, business people and developers must work together daily throughout the project. Developers are expected to become experts in the project scope, and switching a developer across multiple projects does not contribute to the involvement of that developer in the project. The Shared Developers smell is detected when a developer is working on more than one project at the same time or is frequently switching between different projects. The presence of this smell may indicate the organization is not allocating the developers properly. References: [2], [18], [46], [52], [53], [55], [61], [65].

19) Unfinished Work in a Closed Iteration: The entire scope of an iteration should preferably be delivered at the end of the iteration. However, as the iteration should be timeboxed, the development team must finish the iteration by the predefined deadline even if there is unfinished work. In that case, the unfinished work should be moved to the product backlog to be considered in a future iteration planning. The Unfinished Work in a Closed Iteration smell is detected when a given iteration is closed even with unfinished tasks. The presence of this smell may indicate the team is not managing the backlog items properly and not moving unfinished work to the product backlog. References: [18], [46], [52], [55], [63], [64], [69].

20) Unplanned Work: Agile teams usually commit to delivering a set of features before an iteration begins. To achieve the agreed commitment, the teams must work without interference, following the iteration plan, and unplanned work should be avoided. The Unplanned Work smell is detected when tasks are included in a given iteration after it starts. The presence of this smell may indicate the unplanned tasks are jeopardizing the commitment to the iteration deadline. References: [18], [44], [46], [48], [51], [52], [55], [60], [69].

To answer SLR-RQ2, we propose at least one identification strategy for each of the agile smells identified in the literature review. For example, for the agile smell Complex Tasks, the following identification strategy was proposed:

1) Identification Strategy for the Complex Tasks smell: A strategy to identify the presence of the ''Complex Tasks'' smell is to verify whether a task's estimate exceeds an allowable threshold. The identification strategies for the other agile smells are presented in the catalogue in Section VI.
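This strategy amounts to a one-line threshold check. In the sketch below, the 16-hour default and the data layout are illustrative assumptions on our part; the catalogue leaves the allowable threshold to each team:

```python
def complex_tasks(estimates_by_task, threshold_hours=16.0):
    """Identification strategy for the Complex Tasks smell.

    `estimates_by_task` maps a task name to its effort estimate in hours
    (a hypothetical flattening of tracker data). The 16-hour default is
    an illustrative choice, not a value prescribed by the catalogue.
    Returns the tasks whose estimate exceeds the allowable threshold.
    """
    return [task for task, estimate in estimates_by_task.items()
            if estimate > threshold_hours]
```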

V. CONFIRMATION PHASE: SURVEY
In order to confirm the results of the literature review, we conducted a survey with practitioners based on semi-structured interviews [72]. The remainder of this section presents the survey, which was based on the protocol proposed by Oishi [73]. The survey was divided into three phases: planning, execution and reporting.

A. SURVEY PLANNING 1) AIM AND RESEARCH QUESTIONS
The aim of the survey was to evaluate the relevance of the identified agile smells for an agility assessment, that is, how relevant each agile smell is for assessing how an organization is using agile practices. The research questions for the survey are: The questionnaire accepted the following answers: Each answer has an associated value that varies from 0 to 3 (based on the degree of relevance) and that is used to calculate the relevance of the agile smell. The relevance of an agile smell, for a given participant, is the sum of the answers to Question 1 and Question 2, as shown in Fig. 4. Thus, for a given participant, the most relevant agile smell achieves a 6-point score and the least relevant agile smell has a 0-point score.
The final relevance of an agile smell is the sum of its relevance over all participants, as illustrated by the formula presented in Fig. 5.
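The scoring scheme in Figs. 4 and 5 amounts to the following computation (a sketch; the function name is ours):

```python
def smell_relevance(answers):
    """Final relevance of one agile smell across all participants.

    `answers` is a list of (q1, q2) pairs, one per participant, where
    each value is in 0..3 (Not relevant .. Absolutely relevant). Per
    participant the smell scores q1 + q2 (0..6); the final relevance is
    the sum over participants, so with 20 participants the maximum
    possible score is 120.
    """
    for q1, q2 in answers:
        if not (0 <= q1 <= 3 and 0 <= q2 <= 3):
            raise ValueError("each answer must be in the 0..3 range")
    return sum(q1 + q2 for q1, q2 in answers)
```

Under this scheme, the 96-to-86-point range reported for the three top-ranked smells sits well below the 120-point ceiling, leaving room to discriminate among smells.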

3) PARTICIPANTS SELECTION
We applied a convenience sampling approach [74], and participants were selected from our professional and academic networks. The criteria for the selection of participants were: (a) the participant should have at least five years of experience as a Project Manager or Quality Assurance Consultant; and (b) the participant should work or have worked in an organization that adopts a software process based on agile methods. We avoided selecting participants who were aware of this research, so we excluded coauthors and coworkers. During the planning phase, we conducted a preliminary analysis using subjects from inside our research group. The data from this execution were not considered in the final results. Our goal was to collect feedback from the participants and assess the interview plan.

B. CHARACTERIZATION OF PARTICIPANTS
During the analysis phase, 20 candidate subjects were chosen to be interviewed. We focused on practitioners working on agile projects with relevant experience in this topic. Table 1 presents a summary of the participants' characteristics.
The selected subjects included 15 Project Managers and 5 Quality Assurance Consultants. Regarding the highest schooling degree, 4 participants have a doctoral degree, 6 have a master's degree, 7 have a bachelor's degree and 3 have an associate degree. The distribution of years of professional experience is: 12 participants have between 5 and 10 years of professional experience, 12 have between 11 and 20 years, and three have more than 21 years. Regarding the geographic distribution, 16 participants are from Brazil and four from Canada.

C. SURVEY REPORTING
In the last phase, the data collected in the survey were organized, tabulated, and analyzed. Table 2 presents a summary of the data collected and analyzed in the survey. The table shows the agile smells in relevance order (the most relevant smells are shown first), and the Rank column indicates the order in the list. Columns S1 to S20 represent the raw data collected in the survey (i.e., the answers that each participant provided). Each of these columns holds two values: the left value refers to the answer to Survey-RQ1 and the right value to the answer to Survey-RQ2. As explained in the research protocol section, the values vary from 0 to 3 (Not relevant to Absolutely relevant). The Total column is the final degree of relevance for the agile smell, calculated according to the formula in Fig. 5.
Note that, as we did not define any tiebreaker criterion, the agile smells Shared Developers and Unplanned Work are technically tied. The same issue occurs with the agile smells Large Development Team and Long Break Between Iterations.

D. DATA ANALYSIS DISCUSSION
Most of the agile smells received a positive value for the degree of relevance (i.e., they were considered Slightly relevant, Relevant, or Absolutely relevant). The top 10 ranked agile smells in Table 2 were considered at least Slightly relevant by all participants. Figure 6 shows the distribution of the degrees of relevance the agile smells received in the survey. The number of Not relevant answers was considerably low (only 4.25%, or 34 of 800 responses). The shares of Slightly relevant, Relevant, and Absolutely relevant answers were, respectively, 32.25% (258 of 800), 43.75% (350 of 800), and 19.75% (158 of 800). These data reveal that the identified agile smells are coherent with practices adopted by industry and could ultimately be used to assess how agile methods are being applied.
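The distribution above can be reproduced in a few lines. The counts are the ones reported in the text, and the 800-response total follows from 20 participants answering 2 questions for each of the 20 smells:

```python
# Checking the reported answer distribution:
# 20 participants x 20 smells x 2 questions = 800 responses in total.
counts = {
    "Not relevant": 34,
    "Slightly relevant": 258,
    "Relevant": 350,
    "Absolutely relevant": 158,
}
total = sum(counts.values())
assert total == 20 * 20 * 2  # 800 responses

for label, n in counts.items():
    print(f"{label}: {n / total:.2%}")
# Not relevant: 4.25%
# Slightly relevant: 32.25%
# Relevant: 43.75%
# Absolutely relevant: 19.75%
```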
Most participants assigned different degrees of relevance to the presented agile smells; in other words, for most participants, some agile smells are more relevant than others. That perception was crucial to building the ranking of the most relevant agile smells. Indeed, the difference in relevance between the agile smells at the top and at the bottom of the ranking is significant: while the three top-ranked agile smells range from 96 to 86 points, the three agile smells at the bottom of the ranking range from 57 to 49. This difference in the degree of relevance illustrates that the agile smells may impact the adoption of agile methods to different degrees.

VI. CONSOLIDATION PHASE
The aim of this phase was to consolidate, complement, and organize the agile smells obtained and confirmed in the previous phases into a structured catalogue. Regarding the catalogue structure, the agile smells were described using a template adapted from [75] and shown in Table 3.
The Id and Name sections indicate, respectively, the unique identifier and the name of the agile smell. The Description section presents a brief description of the agile smell and contains: (a) the motivation behind the agile smell; and (b) the likely consequences if the agile smell occurs. The Target section indicates which element is being assessed when an occurrence of the agile smell is identified; it can assume the values Organization, Project, Iteration, or Team. The Agile Methods section presents the agile method practices that motivated the agile smell, thereby establishing a connection between the agile methods analysed during the literature review and the agile smell. The Industry Perspective section discusses the relevance of the agile smell from the perspective of the consulted industry practitioners. The Relevance section reports the degree of relevance obtained in the survey, converted to the percent of maximum possible score (POMP) [76]. The Identification Strategy and Parameter sections describe strategies to detect the occurrence of the agile smell in real projects; these sections are designed to support approaches aimed at automatically detecting occurrences of agile smells.
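As a minimal sketch of the POMP conversion used in the Relevance section: with 20 participants and a 0–6 score per participant, the maximum possible score is 120, so a smell's total relevance is simply expressed as a percentage of that maximum (the participant count and score range are those defined in the survey protocol):

```python
# POMP conversion sketch: percent of maximum possible score [76].
# With a minimum score of 0, POMP reduces to score / maximum * 100.

def pomp(total_relevance: int, n_participants: int = 20,
         max_per_participant: int = 6) -> float:
    """Convert a raw relevance score to the percent of maximum possible score."""
    maximum = n_participants * max_per_participant  # 120 for this survey
    return total_relevance / maximum * 100

# A top-ranked smell with 96 points reaches 80% of the maximum possible score.
print(pomp(96))  # 80.0
```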
For brevity, we present the 10 highest-ranked agile smells in this paper (Tables 4 to 13).

VII. CATALOGUE USE GUIDELINE
In this section, we describe a usage guideline for the catalogue.
Who can use the catalogue? The catalogue can be used by researchers and practitioners who aim to execute an agility assessment; its users may include developers, project managers, and quality assurance consultants.
How to use the catalogue? The guideline is based on the method proposed by Moha et al. [32] and is composed of four steps, as depicted in Figure 7:
• Step 1: Selection: In this phase, the agile smells that will be used in the next phases are selected. The selection of an agile smell should be based on the relation between the smell and the agile method adopted by the project. Agile smells that are not related to any agile practice adopted by the project should not be selected in this phase.
• Step 2: Identification: This phase aims at identifying occurrences of the agile smells selected in the previous phase. In the remainder of this guideline, an occurrence of an agile smell is denoted as an issue. The Identification Strategies (presented in the catalogue) are applied manually and the project data are assessed to identify the issues.
• Step 4: Reporting: The positive issues are consolidated and a report is generated.
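To make the Identification step concrete, the following is a hypothetical sketch of applying one identification strategy to project data. The smell chosen (Absence of Timeboxed Iteration), the `Iteration` data model, and the `timebox_days` parameter are illustrative assumptions on our part, not an API defined by the catalogue:

```python
# Hypothetical sketch of Step 2 (Identification): applying an
# identification strategy from the catalogue to project data.
# The data model and the parameter below are illustrative assumptions.
from dataclasses import dataclass
from datetime import date

@dataclass
class Iteration:
    start: date
    end: date

def absence_of_timeboxed_iteration(iterations: list[Iteration],
                                   timebox_days: int = 14) -> list[Iteration]:
    """Return the issues: iterations whose length deviates from the timebox."""
    return [it for it in iterations
            if (it.end - it.start).days != timebox_days]

iterations = [
    Iteration(date(2020, 1, 6), date(2020, 1, 20)),   # 14 days: OK
    Iteration(date(2020, 1, 20), date(2020, 2, 14)),  # 25 days: an issue
]
issues = absence_of_timeboxed_iteration(iterations)
print(len(issues))  # 1
```

Strategies parameterized this way are what makes the catalogue amenable to automatic detection, as discussed in the Consolidation section.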

VIII. RELATED WORK
Studies that aim at consolidating the body of knowledge around agility assessment are to some extent related to this paper. We divide those studies into two categories: (a) studies that identify agile practices and characteristics to support the adoption of agile methods; and (b) studies that identify requirements to support agility assessment.

A. MAPPING AGILE PRACTICES
This category includes studies that identified agile practices or characteristics to support the adoption of agile development. We consider these studies related to our research because the identified agile practices could ultimately be used as input to an agility assessment process.
Miller [38] conducted one of the first studies aimed at investigating characteristics of agile development. Among the identified characteristics are: modularity in the development process; iterative development with short cycles; timeboxed iteration cycles of one to six weeks; parsimony in development processes, removing unnecessary activities; adaptivity to possible emergent new risks; incremental process approaches that allow building a functioning application in small steps; a convergent (and incremental) approach; and people orientation. This method was one of the first structured approaches to guide the use of agile practices. However, it differs from our study mainly by specifying the practices in a generic way and by not giving an indication of how to check whether they were properly adopted.
In [78], Shore and Warden described a set of practices to guide the adoption of XP and other agile methods. The practices are divided into two categories: Practicing XP and Mastering Agile; the first category is focused on the XP method itself.
Although many other papers aim at mapping agile practices [46], [67], [79]-[82], these studies do not investigate agile methods from the perspective of this study, namely, trying to identify practices that impair the use of agile development and to indicate how these practices can be checked in real scenarios. The agile practices presented in the studies above are typically described for educational purposes rather than with a focus on agility assessment; that is, the intention of the authors is to guide development teams and organizations in applying the agile practices.
There is little focus on verifying whether the agile practices have been properly adopted.

B. AGILITY ASSESSMENT REQUIREMENTS
Studies in this category describe practices and requirements to be checked in an agility assessment.
Yatzeck proposed in [63] a two-checklist method to aid the adoption and assessment of agile processes in large companies. The first checklist focuses on guiding the adoption of Scrum and is composed of 10 items. The second checklist, called ''You Should Immediately Be Suspicious If'', describes 8 practices that may indicate misuse of agile practices: 1) ''There is no high-level architecture'', 2) ''There is no plan'', 3) ''There is no project dashboard, or you don't have access'', 4) ''You aren't invited to an iteration planning meeting and a showcase for every iteration'', 5) ''You don't get any escalations coming out of the planning workshop'', 6) ''The team performs perfectly in Iteration 1'', 7) ''You aren't welcome to join daily standup Scrum meetings as an observer'', and 8) ''You can't get metrics about software quality''. These items are similar to the agile smells proposed in this study, since they describe practices that may jeopardize the adoption of agile methods. However, the study differs from ours in three aspects: (a) the practices are described in a generic way and there is no indication of how they could be checked in real scenarios, so their verification may be threatened by the bias of the person performing the agility assessment, who has to interpret each practice and determine how to check it; (b) the items are focused on the Scrum method; and (c) there is no clear relation between the items and the agile practices that motivated them, as the author did not explain the origin of the items. The author also did not describe the criteria that define the target companies (''large companies'').
Hermida proposed an online agility assessment approach called ''Abetterteam'' [83]. The tool has a questionnaire composed of 30 three-option questions. The author claimed the tool is able to verify the adoption of the practices proposed by Shore and Warden in [78]. However, the author did not indicate how the questionnaire is related to those practices, nor the rationale behind the assessment result.
Krebs et al. proposed an agility assessment model called the Agile Journey Index (AJI) [84], which aids organizations in improving their application of agile methods. The model looks at 19 key practices and divides them into 3 categories: Plan, Do, and Feedback. The assessment consists of rating each practice on a scale of 1 to 10. Although the model specifies criteria for each score, the evaluation of these criteria depends on qualitative analysis, and there is no indication of how to identify the occurrence of these practices in real projects. Another drawback of this model is that it considers only Scrum practices and neglects other agile methods.
In [85], Williams et al. proposed the Comparative Agility (CA) method to aid organizations in determining their relative agile capability compared to other companies that have responded to CA. The method, available as a survey tool, assesses agility along seven dimensions: Teamwork, Requirements, Planning, Technical Practices, Quality, Culture, and Knowledge Creation. Each dimension has between three and six characteristics (32 in total), and each characteristic is made up of approximately four agile practices (125 in total). For each practice, the respondent indicates how true the practice is on a five-point Likert scale: True; More true than false; Neither true nor false; More false than true; or False. Although the approach uses an innovative assessment technique (comparing the answers given by a company with a global trend), the authors neglected to indicate how the practices were identified, how they relate to the agile methods, and how they can be verified. One of the questions that compose the method, for example, is ''Team members leave planning meetings knowing what needs to be done and have confidence they can meet their commitments''. There are no clear criteria to check the occurrence of this practice.
The Enterprise and Team Level Agility Maturity Matrix [86] is an agility assessment method available as a spreadsheet divided into two sections: one for describing the Organization and another for describing the Development Team. There are a number of agile indicators for each section (14 organizational indicators and 37 team indicators) and each indicator ranges from a '0' (impeded) to a '4' (ideal). For each cell in the matrix, there is a simple explanation of what it means to be at that level for that indicator.
IBM DevOps Practices Self-Assessment [87] is another agility assessment approach available as a web application. The solution contains 15 questions divided into 4 areas: Demographic, Practices, Strategies, and Motivation. The authors claimed the tool can ''evaluate the state of an organization's software delivery approach''. However, there are no indications of how the questions were formed, how the answers should be analyzed and how the results are related to agile practices.
The Scrum Checklist [88] is a tool to help development teams get started with Scrum, or assess their current implementation of Scrum. The checklist is made up of 80 items divided into five groups: The Bottom Line; Core Scrum; Recommended But Not Always Necessary; Scaling; and Positive Indicators. According to the author, the items on the checklist are not rules and therefore were not designed to be verifiable or to produce a measure that indicates the level of compliance with Scrum. Instead, they are guidelines that might be used by the team as a discussion tool at retrospective meetings. Examples of items on the checklist are ''Whole team believes plan is achievable?'' or ''Having fun? High energy level?''.
There are also models that aim to assess team members individually. In [89], Campbell and MacIver define a self-assessment model named Agility Maturity Self-Assessment that intends to identify the skills of individuals in six areas: Agile Teams, Agile Leadership, Agile Project Management, Agile Communication/Promotion, Business Value, and Risk Management. The questions have the following structure: ''How experienced are you in the given area. . . ''. The authors did not provide any indication of how to analyze the answers. In [90], Ribeiro proposed a survey in which participants can assess their skill in agile development by answering a questionnaire composed of 25 questions (including one open question). The author did not provide indications of how to analyze the answers or assess the skill of the individuals. Other self-assessment tools and methods are proposed in [91], [92], and [93]. In [94], an extensive case study was conducted to evaluate 22 agile maturity self-assessment surveys according to seven criteria, namely: a) comprehensiveness, b) fitness for purpose, c) ability to discriminate, d) objectivity, e) conciseness, f) generalizability, and g) suitability for multiple assessment. The authors concluded that none of the evaluated approaches fully satisfies all of the criteria, though each is helpful to some degree.
Although these studies may be useful for guiding the assessment of developers in specific skills related to agile development, they differ from this study in that they do not consider the assessment of projects and organizations and do not provide objective indications of how the practices can be checked in an agility assessment.
Other studies proposed methods, tools, or requirements to support agility assessment [95]-[99], and [100]. However, these proposals differ from the catalogue of agile smells in three ways: (a) the practices are usually described in a vague way, which leaves their verification subject to the bias of the professional performing the assessment; (b) there is no indication of how to verify the occurrence of these practices; and (c) there is no clear relationship between the proposed criteria for agility assessment and any agile method.
A comprehensive study conducted by Chronis [101] investigated three questionnaire-based approaches to measure the agility of development teams: Team Agility Assessment (TAA) [102], Perceptive Agile Measurement (PAM) [103], and Objectives Principles Strategies (OPS) [104]. These approaches do not focus on a specific agile methodology; instead, they try to evaluate the degree of agility based on generic agile principles and values. The approach proposed by Leffingwell [102] (TAA), for example, assesses the agility of development teams against such generic principles.
Other studies aimed at assessing the agility of a software development methodology. Nebe and Baloni [105] investigated the integration of agile development and User-Centred Design (UCD) and proposed a checklist to assess the user-centeredness of agile processes.
Qumer and Henderson-Sellers [106] introduce a framework to assist the selection and assessment of agile development methods. The framework is composed of seven components: 1) Knowledge-Base, 2) Process Fragment and Process Composer, 3) Publisher, 4) Registry, 5) Agility Calculator, 6) Knowledge-Transformer, and 7) Visualizer. The Agility Calculator, for example, generates a report and may assist organizations in making decisions about the selection or adoption of an agile method or method fragments.
It is important to note that most of the agile smells proposed in this study are intimately related to agile practices described in the studies presented previously. For example, some of the smells presented in Section VI are related to the Timeboxed practice (mentioned in several studies already discussed). This relationship is expected, as almost all agile practices derive from the same set of values and principles proposed in the Manifesto for Agile Development [7] and shown in Section VIII. However, we consider our study an important contribution because the proposed catalogue is focused on providing elements (agile smells) that can be directly checked in an agility assessment. Thus, for each smell, the catalogue indicates the motivating agile practice and at least one strategy to identify its occurrence in a real software project.

IX. RESULTS AND THREATS TO VALIDITY
In this section we present the results of this study and discuss some threats to validity.

A. RESULTS
The main contribution of this study is the catalogue of agile smells shown in Tables 4 to 13. These agile smells were organized to aid managers and developers in assessing the agility of their projects, iterations, and agile teams. This catalogue of agile smells fills a gap in the agile literature by making explicit an important set of agility assessment requirements. While the Manifesto for Agile Development [7] and other agile methods [8]-[10], [20]-[23] present the agile values, principles, and practices in rather vague descriptions [41], [42], this catalogue presents objective criteria to assess whether some agile practices have been properly adopted. We present the agile smells in a uniform vocabulary using templates that were designed to support practitioners in agility assessment. The Identification Strategy section provides a guideline to detect the occurrence of these agile smells in real projects.
The agile smells can be used to assess agility in the level of organization, project, iteration or team. Agile smells such as Goals Not Defined and Dependence on Internal Specialists are designed to assess the project. Occurrences of these agile smells indicate adjustments in the project may be necessary. The agile smell Dependence on Internal Specialists may reveal a failure in the project planning.
Agile smells such as Iteration Without a Deliverable, Iteration Without an Iteration Retrospective, Absence of Timeboxed Iteration and Iteration Without an Iteration Planning are designed to assess the iterations. Occurrences of these agile smells indicate adjustments in the iteration planning and iteration execution may be necessary. The agile smells Iteration Without an Iteration Retrospective and Iteration Without an Iteration Planning indicate the lack of important meetings prescribed by the agile methods.
The Lower Priority Tasks Executed First agile smell assesses the agile team. The occurrence of this agile smell may indicate the agile team is not working on the most important tasks as prescribed by almost all agile methods.
The Shared Developers agile smell targets the organization. An occurrence of this smell may indicate the organization is not properly allocating developers based on the agile values.

B. THREATS TO VALIDITY
This section discusses the threats to the validity of this research and the actions that were taken to avoid them.
External Validity. This refers to the degree to which the identified agile smells are relevant to industry. To confirm the lack of bias in the extraction method used in the literature review and to confirm the relevance of the identified agile smells to industry, a survey with experienced practitioners from two different countries was conducted.
Construct Validity. This concerns whether the research explores what it claims to be exploring. A threat in this category is not reaching the ''state of the art'' of agile development. As a significant part of the body of knowledge about agile methods is created by software engineering (SE) practitioners who usually do not publish in academic forums [107], we decided to include grey literature (non-peer-reviewed material) in the literature review.
Internal Validity. This concerns whether the agile smells identified in the literature review are internally valid. A risk to this validity comes from the fact that the term ''smell'' is not used in the current agile literature. We sought to mitigate this threat by defining objective criteria to extract the agile smells from the selected papers.
Conclusion Validity. This threat relates to problems that can impact the reliability of our conclusions. A risk in this category regards the survey sample size. The survey was conducted with a sample that is not representative enough to allow us to affirm that the set of identified agile smells is the most relevant one. Thus, the ranked list might vary if we conducted a survey with a more representative sample.

X. CONCLUSION
The goal of this study was twofold: (a) to identify a set of practices that may impair the proper adoption of agile development (which we call agile smells); and (b) to propose strategies to identify the occurrence of such practices in real projects.
The study was organized into three phases: Elicitation, Confirmation, and Consolidation. In the Elicitation phase, we conducted a literature review, including peer-reviewed academic publications and ''grey'' literature, from which we extracted an initial set of 20 agile smells. In the Confirmation phase, this set of agile smells was the subject of a survey that aimed at characterizing the smells according to their relevance. Finally, in the Consolidation phase, the agile smells were consolidated and organized as a catalogue.
The catalogue aims at helping practitioners conduct agility assessment by (a) describing a set of practices that may impair the proper use of agile methods and (b) presenting strategies to identify the occurrence of such practices in real projects.
The catalogue can be used by researchers and industry practitioners who intend to execute agility assessment of their projects or organizations. The catalogue can also be used in research that aims at automatically detecting agile smells. As proposed by methods such as DECOR [32], making the smells explicit through a structured catalogue is a prelude to an automatic approach toward detection.
In future work, we will (a) evaluate the practical use of the catalogue guideline with case studies in the context of industry; (b) investigate how to measure eventual ''technical debts'' caused by these smells; (c) extend the catalogue including new agile smells; (d) identify the most relevant smells through a more comprehensive survey with a more representative sampling; and (e) investigate the relationship between the agile smells and specific agile methods.

APPENDIX LITERATURE REVIEW SELECTED STUDIES
The 55 studies selected for full consideration in the systematic literature review are listed below.
He is also a Visiting Researcher with the University of Waterloo, Canada, and the Chief Technology Officer (CTO) at OWSE Web Software Engineering, Inc., and an experienced developer with more than 20 years working as a software architect. As a technology evangelist, he has actively participated in the promotion of Java User Groups across Brazil and around the world.
TOACY OLIVEIRA received the degree in electrical engineering and the M.Sc. and Ph.D. degrees from the Pontifical Catholic University of Rio de Janeiro, Brazil, in 1992, 1997, and 2001, respectively. He spent three years at the University of Waterloo as a Postdoctoral Fellow. He is currently an Assistant Professor with the Systems and Computing Engineering Program, COPPE/UFRJ, Federal University of Rio de Janeiro, Brazil. He is also an Adjunct Professor with the David R. Cheriton School of Computer Science, University of Waterloo, Canada. He has published more than 70 refereed publications and has been a member of the program committees of numerous conferences and workshops. Besides being a leading investigator in Brazilian research projects supported by CAPES and CNPq, he has also exercised entrepreneurship by founding several companies in Brazil. His current research interests focus on harnessing knowledge-intensive processes, such as those realized in modern software systems. His research attempts to envision new techniques to capture, assess, and implement such processes.
PAULO ALENCAR is currently a Research Professor with the David R. Cheriton School of Computer Science, University of Waterloo, and also an Associate Director of the Computer Systems Group (CSG). He has received international research awards from organizations such as Compaq and IBM. He has published more than 200 refereed publications and has been a member of program committees of numerous highly-regarded conferences and workshops. His recent research in information technology and software engineering has focused on highlevel software architectures, design, components, and their interfaces; software frameworks and application families; software processes, automated workflows and work graphs, and evolution; Web-based approaches and applications; open and big data applications; context-aware and event-based systems; software agents; machine learning; cognitive chatbots; artificial intelligence; and formal methods. He has been a Principal or a Co-Principal Investigator in projects supported by NSERC, ORF-RE, IBM, SAP, CITO, CSER, Bell, and funding agencies in Canada U.S., Brazil, Germany, and Argentina. He is a member of the Association of Computing Machinery (ACM), the Institute of Electrical and Electronic Engineers (IEEE), the Association for the Advancement of Artificial Intelligence (AAAI), the Waterloo Water Institute (part of the Global Water Futures initiative), and the Waterloo Artificial Intelligence Institute.

DON COWAN is currently a Distinguished
Professor Emeritus of computer science with the University of Waterloo and also the Director of the Computer Systems Group. He has made contributions in areas such as computer science, software engineering, and complex applications. He has authored, coauthored, or edited more than 300 refereed articles and 17 books in computer/communications, software engineering, education, environmental information systems, and mathematics. He has supervised more than 120 graduate students and postdoctoral fellows. His group has developed over 80 web-based and mobile software systems for many applications in areas such as volunteerism, the environment, socioeconomic development, tourism, population health, aboriginal affairs, arts and culture, and built heritage. The University of Waterloo recognized his contributions to the development of graduate students by presenting him with the Award of Excellence in Graduate Supervision. He was recognized for his research and support of the development of computer science in Brazil by being awarded the National Order of Scientific Merit (Grand Cross), the country's highest civilian scientific honor, by the President of Brazil. In 2009, he received the Waterloo Award, the City of Waterloo's highest civic honour, for his contributions to the City of Waterloo. In 2010, he was named a Distinguished Scientist by the Association for Computing Machinery and, in 2011, he received a Doctor of Science (Honoris Causa) from the University of Guelph for his contributions to computer science and software engineering.