A Systematic Study to Improve the Requirements Engineering Process in the Domain of Global Software Development

Software organizations are outsourcing their development activities across geographical borders because of the substantial business gains. However, the adoption of the global software development (GSD) paradigm is not straightforward; various challenges are associated with it, particularly in relation to the requirements engineering (RE) process. The objective of this study is to identify the barriers to the RE process faced during GSD. To achieve this, we conducted a systematic mapping study to identify the barriers of the RE process and a questionnaire survey to validate them with industry practitioners. A total of 20 barriers were identified and validated with the experts. Moreover, we analyzed the barriers by organization type (client and vendor), organization size (small, medium, and large), and expert level (junior, intermediate, and senior) to provide a clear understanding of the RE barriers in these three contexts. We also developed a theoretical framework by mapping the investigated barriers onto six core knowledge areas of software process improvement. The mapping results indicate that project administration is the most significant knowledge area of the investigated barriers. We believe that the findings of this study provide a framework that will assist GSD practitioners in developing effective plans and strategies to improve the RE process in the GSD context.


I. INTRODUCTION
Requirements engineering (RE) is a significant phase of the software development life cycle (SDLC). In the RE phase, the software requirements specification (SRS) is finalized, which acts as the foundation for all the other phases of the SDLC [1]. Kotonya and Sommerville [2] suggested that RE is the most important phase, as all development activities are based on it. According to the definition provided by Britton and Doake [3], requirements engineering concerns the features and the behavior of the system expected by the stakeholders. Thanasankit [4] described RE as the most important activity of the SDLC, one that establishes an overall view of the project. Khan et al. [1] emphasized that the RE process consists of five core phases (i.e., ''requirements extraction or elicitation, requirements analysis and design, requirements specification, requirements validation, and requirements management''). They further emphasized that all the phases of the RE process have a significant impact on the software requirements specification. According to Shafiq et al. [5], the activities of the RE process are entirely communication and coordination oriented. They further reported that requirements collection within one geographical area (a single country or state) is not an easy task, and that it becomes more complicated when carried out across geographical borders because of language and cultural differences.
The associate editor coordinating the review of this manuscript and approving it for publication was Xiao Liu.
Software organizations are increasingly outsourcing their development activities across geographical boundaries to gain the economic and strategic benefits of the global software development (GSD) paradigm [6], [7]. GSD refers to the ''plan of action in which the knowledge workers perform the software development activities beyond the geographical, cultural and temporal boundaries'' [8]. GSD is being adopted because it enables the development of quality products at low cost [9]. The GSD paradigm is also very beneficial for reducing development time by managing round-the-clock development activities across the world [10]. Moreover, Niazi et al. [11] indicated that GSD helps deliver software projects on time by making use of round-the-clock development. Khan et al. [12] highlighted that outsourced development activities give practitioners opportunities for the adoption of the latest technologies [13], the availability of skilled human resources, and attraction to the international market. However, software organizations also face many challenges while outsourcing their development activities to GSD sites [12], [14]. The RE process is considered more challenging in the GSD paradigm because it involves more communication and collaboration activities [12], [14], [15]. The physical distance between the development teams and client organizations makes the requirements engineering process more challenging [12], [16]. Ramasubbu [17] underlined that the lack of friendly relationships among distributed teams and the lack of face-to-face meetings are the main causes of poor RE.
In spite of the significance of RE in the SDLC, the RE process is not yet standardized in the context of GSD [9], [18], [19]. Very few empirical studies have been conducted to explore the barriers of the RE process in the domain of GSD [15], [20], [21]. We believe that a better understanding of the barriers is helpful in tackling the problems of the RE process in the GSD environment. This study has three main objectives: (i) to identify the requirements engineering barriers from the literature and to validate them with real-world practitioners; (ii) to check the significance of the identified barriers with respect to organization type, organization size, and expert level; and (iii) to classify the investigated barriers into six different knowledge areas of software process improvement. This analysis of barriers provides practitioners and researchers with a body of knowledge that helps address the problems associated with the successful implementation of requirements engineering activities in the domain of GSD. In addition, the reported barriers can help practitioners develop tactics to handle the problems faced by requirements engineering teams in a geographically distributed development environment.

A. STUDY OBJECTIVES AND RESEARCH QUESTIONS
Step-1: The objective of this step is to identify the barriers faced during the requirements engineering process in a geographically distributed environment by adopting a systematic mapping study.
RQ1. What are the barriers faced by requirements engineering practitioners in the context of GSD, as reported in the literature?
Step-2: The objective of this step is to validate the findings of Step-1 with real-world practitioners and to explore additional requirements engineering barriers of the GSD paradigm.
RQ2. What are the barriers faced by GSD practitioners in the requirements engineering process?
Step-3: The objective of this step is to gain a deep understanding of the investigated requirements engineering barriers from the following perspectives:
RQ3. Is there any difference between the findings of the systematic mapping study and the questionnaire survey?
RQ4. How are the identified barriers related to the types of organizations?
RQ5. How are the identified barriers related to the size of the organization?
RQ6. Do the investigated barriers vary across different levels of experts?
RQ7. How can the identified barriers be categorized into a theoretical framework?

B. STUDY STRUCTURE
The rest of the article is organized as follows. Section 2 contains the background and motivation of the study. The proposed research methodology is discussed in Section 3. The findings of the study are discussed in Section 4. The summary of the research findings is presented in Section 5, and the study's implications are discussed in Section 6. Section 7 contains the limitations of the study. The future work of the study is discussed in Section 8, and the conclusions of the study are summarized in Section 9.

A. REQUIREMENTS ENGINEERING PROCESS
According to Zave [22], ''requirements engineering is the branch of software engineering concerned with the real-world goals for functions of, and constraints on software systems. It is also concerned with the relationship of these factors to precise specifications of software behavior, and to their evolution over time and across software families.'' Furthermore, Britton and Doake [3] emphasized that in the RE process, the requirements are gathered, specified, and validated in order to engineer the set of expectations of the stakeholders. Carlshamre and Regnell [23] highlighted that the activities of the requirements engineering process are very critical, especially in the domain of GSD, because of the physical separation between practitioners. The requirements engineering phase has a long-term research background, and various studies have been carried out to address the complications of the RE process. Beecham et al. [24] developed a requirements process improvement model, namely the requirements capability maturity model (R-CMM). The model contains a detailed description of the RE process. The maturity levels of the R-CMM enable software firms to assess and improve their RE activities. To address the issues in the software industry, the R-CMM also provides best practices that are useful for addressing the complexities faced during the RE process [21], [25]. Pandey et al. [26] introduced a model for the requirements collection and management process. This model consists of four core phases, namely, ''requirements elicitation and development,'' ''documentation of requirements,'' ''validation and verification of requirements,'' and ''requirements management and planning.'' The model provides a brief mechanism for the requirements engineering process. Mellado et al. [27] introduced a security requirements engineering process (SREP), which is based on the ''common criteria (CC) (ISO/IEC 15408)''. The model provides a standards-based framework that deals with security requirements in the early phases of software development in an iterative and systematic way by using a security resource repository (SRR) together with the common criteria (ISO/IEC 15408). This model is useful for addressing the barriers of the RE process related to requirements security in the initial phase of the SDLC [18]. Various other studies have been carried out to manage the challenges of the RE process [28]-[32].
VOLUME 8, 2020

B. EVIDENCE-BASED STUDIES
To conduct an evidence-based study, a systematic literature review (SLR) is an important research approach. Kitchenham et al. [33] proposed a method to integrate practical experience and human values related to a specific research problem in the domain of software engineering. Dyba et al. [34] defined five core phases for collecting evidence-based data related to a specific research topic. The phases include: (i) ''Converting a relevant problem or information need into an answerable question,'' (ii) ''Searching the literature for the best available evidence to answer the question,'' (iii) ''Critically appraising the evidence for its validity, impact, and applicability,'' (iv) ''Integrating the appraised evidence with practical experience, and the values and circumstances of the customer to make decisions about practice,'' and (v) ''Evaluating software development performance and seeking ways to improve it.'' Several other studies were conducted to refine the SLR approach. For example, Zhang et al. [35] conducted a study to define the search process for collecting the literature most related to the research problem. They proposed two important steps (i.e., ''quasi-gold standard'' and ''quasi-sensitivity'') that assist researchers in developing and evaluating the search process of an evidence-based study. Furthermore, Afzal et al. [36] conducted an SLR study and identified the methods of software test improvement. We noted that they searched the related literature in three different steps. In the first step, they collected data from digital libraries and performed all the phases of the tollgate approach to refine the collected literature. In the second step, they contacted the authors of the studies finally selected in the first step, using the email addresses mentioned in the papers. The authors were requested to provide their published research work related to the research problem domain. In the third step, they used the snowballing approach by considering the reference lists of the studies finally selected in the first and second steps. We used the same process to collect the data to address the research questions of this study.

C. STUDY MOTIVATION
Most client software firms outsource their development activities to developing continents, which offer development costs almost one third lower than those of developed continents [11], [37]. Amongst the many other interests of software firms, economic gain is one of the main motives for GSD adoption [1], [12]. However, the GSD environment involves many risks, e.g., communication and coordination, physical separation of teams, cultural and social differences, and hidden costs [38], [39]. There are many other issues and solutions for the requirements engineering process in the GSD paradigm. One of the main problems faced by software firms in GSD is that client firms sign contracts with vendor firms without checking the criticality of the requirements engineering process at multiple GSD sites [11]. The requirements engineering process demands a rich communication infrastructure, but the available communication channels vary by continent; in China, for example, most social media are blocked. Moreover, frequent communication between the GSD sites is rare, which causes miscommunication and confusion among distributed teams. Understanding and managing the complexities of communication faced during the RE process is significant for the successful completion of the software requirements specification (SRS) [40], [41]. Cultural difference is also considered one of the main problems in the GSD environment [1], [42], [43]. Practitioners who belong to dissimilar cultures exhibit diverse working norms, habits, values, work ethics, patterns of behavior, types of communication, terminologies, types of hierarchy, quality standards, etc.
To the best of our knowledge, no explicit research work (i.e., a systematic mapping study combined with a questionnaire survey) has been conducted to indicate the barriers of the RE process in the GSD environment. The increasing trend of outsourcing encouraged us to investigate the barriers of the RE process in the context of GSD. We widely reviewed the existing literature and found the key barriers faced by the RE process in the context of software outsourcing. An in-depth study is important for researchers and practitioners to understand the state-of-the-art literature in the domain of GSD. The current study presents the requirements engineering barriers under various categorizations, which helps in understanding the barriers from different perspectives.
Moreover, the importance of software requirements engineering activities in the domain of GSD motivated us to develop a software requirements engineering maturity model (SRE-MM). The proposed model (SRE-MM) is based on the existing maturity models [44]-[47]. In this paper, we take a preliminary step towards the SRE-MM, i.e., identifying the barriers of the requirements engineering process in GSD. We believe that the proposed model will assist GSD practitioners in handling requirements engineering activities more effectively and efficiently.

III. RESEARCH METHODOLOGY
To meet the objective of the study, a systematic mapping study (SMS) and an empirical study (questionnaire survey) were adopted [11], [44]. The protocols of both methodologies are briefly discussed in the subsequent sections. Figure 1 shows the hierarchy of steps followed in the selected research methodologies.

A. SYSTEMATIC MAPPING STUDY
In the first step, we performed all the steps of a systematic mapping study by following the step-by-step guidelines of Petersen et al. [48] and Petersen et al. [49]. The purpose of this step is to provide a state-of-the-art overview of the requirements engineering process and to investigate the problems faced in GSD. An SMS provides a systematic way to explore the literature most relevant to the study objective. The steps adopted to conduct the SMS are shown graphically in Figure 1 and briefly discussed in the subsequent sections.

1) SELECTION OF DIGITAL LIBRARIES
Based on our personal research experience and the suggestions of Chen et al. [50], Niazi et al. [11], and Khan et al. [46], the most appropriate digital libraries were selected. The purpose of the digital library selection is to collect the most suitable and high-impact research studies related to the domain of the research questions. To reach the largest and most relevant population of published data, we selected the following seven well-known digital libraries:
• IEEE Xplore (http://ieeexplore.ieee.org)

3) INCLUSION AND EXCLUSION CRITERIA
To include the selected literature in the initial stage, we used the following criteria [11], [46]:
• The research paper should be published in a journal or conference.
• The research article should contain a deep analysis of the requirements engineering process.
• The study should highlight the important challenges of the requirements engineering process.
• The study should be in the English language.
To exclude studies, the following criteria were used [11], [46]:
• If the paper does not describe in detail the reasons for the highlighted challenges.
• If two papers have similar findings, only the complete study should be included.
• If the findings are not based on empirical evaluations.

4) STUDY QUALITY EVALUATION (QE)
The quality of the selected articles was measured by following the guidelines of Petersen et al. [48] and Petersen et al. [49]. The quality of the selected literature indicates the extent to which the selected studies are significant in addressing the proposed research questions. To evaluate the quality of the selected primary studies, we developed the following checklist.
• Does the adopted research method address the research questions?
• Does the study discuss any factor of RCM?
• Does the study discuss the RCM framework and its implementation in GSD?
• Is the collected data related to RCM in GSD?
• Are the identified results related to justification of the research questions?
If a study answers a checklist question, a score of 1 is assigned; if it partially answers the question, a score of 0.5 is assigned; and if it does not answer the question, a score of 0 is assigned.
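The scoring rule above can be illustrated with a minimal sketch. It assumes that each of the five checklist questions receives one of the three scores and that the cumulative score is reported as a percentage of the maximum; the function name `qe_score` and the example answers are hypothetical and are not taken from Appendix C.

```python
from typing import Sequence

# Allowed per-question scores: 0 = not answered, 0.5 = partially, 1 = answered.
ALLOWED_SCORES = {0.0, 0.5, 1.0}


def qe_score(answers: Sequence[float]) -> float:
    """Return the cumulative QE score as a percentage of the maximum score.

    Assumption: one score per checklist question, normalized by the number
    of questions so that the result is comparable to a percentage threshold.
    """
    if not answers:
        raise ValueError("at least one checklist answer is required")
    for score in answers:
        if score not in ALLOWED_SCORES:
            raise ValueError(f"invalid checklist score: {score}")
    return 100.0 * sum(answers) / len(answers)


# Hypothetical study: three questions answered, two partially answered.
example_answers = [1, 0.5, 1, 1, 0.5]
print(qe_score(example_answers))  # 80.0
```

A study scored this way could then be compared against whatever percentage threshold the review protocol sets.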

5) FINAL SELECTION OF STUDIES
The collected research papers were further refined by using the guidelines of Afzal et al. [36]. To refine the final primary studies for the data extraction process, all the phases of the tollgate approach proposed by Afzal et al. [36] were carefully performed, and the results are presented in Figure 2.
Initially, a total of 2348 studies were collected. They were selected by executing the developed search string (section 3.1.2) on the selected digital libraries (section 3.1.1), after applying the inclusion and exclusion criteria. All the phases of the tollgate approach [36] were performed, and finally, 87 studies were selected (Figure 2). These 87 selected primary studies were used for the data extraction process. The quality of the articles was measured simultaneously in the final study selection process. The results of the quality assessment are presented in Appendix C.

6) DATA EXTRACTION AND SYNTHESIS
The data were extracted from the 87 finally selected primary studies. The data extraction team consisted of three members (authors 1, 3, and 4). In the data extraction process, the statements, main themes, and highlighted challenges of the requirements engineering process were recorded in a Microsoft Excel sheet. Initially, 34 statements of challenges were recorded in the Excel sheet. After a discussion with the research advisor and research team, we merged the 34 statements of requirements engineering challenges into 20 core categories.
After the completion of the data extraction process, we performed an inter-rater reliability test to remove inter-person bias. To this end, three external reviewers were invited for the inter-rater reliability test. The external reviewers selected 15 studies from the initial phase (P1) of the tollgate approach [36] and carried out all phases of the SLR process. We determined the non-parametric Kendall's coefficient of concordance (W) to check the inter-rater agreement between the reviewers. A value of W=0 represents complete disagreement, and W=1 represents complete agreement. The results of the inter-rater reliability test for ten selected studies indicated that W=0.84 (p=0.0013), which showed significant agreement between the authors and the external reviewers. The code used to compute Kendall's coefficient of concordance is given at this link: https://tinyurl.com/sr8htq7.
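For readers unfamiliar with the statistic, the following is a minimal sketch of the textbook formula for Kendall's W without tie correction, W = 12S / (m²(n³ − n)), where m is the number of raters, n the number of ranked items, and S the sum of squared deviations of the item rank sums from their mean. It is not the code behind the link above; the function name `kendalls_w` and the example rankings are hypothetical.

```python
from typing import Sequence


def kendalls_w(rankings: Sequence[Sequence[int]]) -> float:
    """Kendall's coefficient of concordance for complete rankings without ties.

    rankings[r][i] is the rank (1..n) that rater r gives to item i.
    """
    m = len(rankings)
    if m < 2:
        raise ValueError("at least two raters are required")
    n = len(rankings[0])
    if n < 2:
        raise ValueError("at least two items are required")
    for ranks in rankings:
        # Assumption: every rater ranks all n items with no tied ranks.
        if sorted(ranks) != list(range(1, n + 1)):
            raise ValueError("each rater must supply a permutation of 1..n")

    rank_sums = [sum(ranks[i] for ranks in rankings) for i in range(n)]
    mean_rank_sum = m * (n + 1) / 2
    s = sum((total - mean_rank_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical example: three raters ranking four studies.
example_rankings = [
    [1, 2, 3, 4],
    [1, 3, 2, 4],
    [2, 1, 3, 4],
]
print(kendalls_w(example_rankings))
```

Identical rankings give W=1, and rankings whose rank sums are all equal give W=0, which matches the interpretation of the two extremes stated above.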

7) REPORTING THE REVIEW
a: QUALITY EVALUATION OF THE PRIMARY STUDIES
The QE score for each selected primary study was calculated based on the five QE questions (section 3.1.4). The list of the selected primary studies, along with their QE scores, is provided in Appendix C. The final score is the cumulative score over the QE questions. The results given in Appendix C show that 79% of the selected primary studies scored ≥ 70% against the QE questions. This analysis shows that the selected primary studies are sufficiently important to address the research questions of this study. Furthermore, we used a QE score of 40% as the threshold for selecting primary studies, as shown in Appendix C.
The aim of dividing the publication period into two sub-periods is to check the frequency of publications in the requirements engineering domain of GSD in recent years [12], [52]. Twenty-seven (31%) of the selected studies were published in the first sub-period and sixty (69%) in the second sub-period. The share of research papers about RE and GSD is therefore 38 percentage points higher in the second sub-period than in the first. The upward trend of the publication rate indicates the significance of the RE process in the software industry and academic research.
According to Figure 3, the research approaches adopted by the selected articles are questionnaire survey (QS) 17%, case study (CS) 41%, grounded theory (GT) 2%, Interview learning (ILR) 11%, content analysis (CA) 6%, action research (AR) 4%, and mixed method 19%. This analysis shows that the case study is the most widely used research methodology in the selected primary studies and that, apart from mixed-method studies (19%), the questionnaire survey (17%) is the second most adopted research method in requirements engineering research (Figure 3).

B. EMPIRICAL STUDY
To validate the findings of a literature survey and to explore the additional barriers of the requirements engineering process in the context of GSD, we applied an empirical study (questionnaire survey). The steps adopted in the empirical study are presented in Figure 1 and briefly discussed in the subsequent sections.

1) SURVEY QUESTIONNAIRE DEVELOPMENT
In light of the informal literature investigations, we developed a questionnaire for an online survey. To develop the online survey instrument, we used Google Forms (docs.google.com/forms). The survey method makes it possible to obtain information from a large population [52]. In addition, the survey method yields data that are hard to acquire by using observational methods [53]. We created a close-ended questionnaire to gather information from RE practitioners working in the software outsourcing paradigm. The survey questions were based on the barriers identified through the literature survey. We employed a five-point Likert scale with the following response options: ''strongly agree,'' ''agree,'' ''neutral,'' ''disagree,'' and ''strongly disagree.'' According to Finstad [54], a neutral option is significant for the collection of pure opinions. Moreover, most researchers agree with including the neutral option because, without it, respondents do not have any impartial option [55]. If there is no neutral option, respondents may be forced to make a negative or positive decision, which makes the result one-sided. Moreover, the survey questionnaire contained statistical information on the identified barriers. All participants were assured that the collected data would remain confidential. The information was utilized only for research purposes and will not be disclosed to anyone under any circumstances.

2) PILOT ASSESSMENT OF QUESTIONNAIRE
For the pilot assessment of the developed questionnaire, three experts were involved. In the expert selection process, we avoided random selection; instead, we sent invitation letters requesting the participation of top researchers and practitioners. Three experts were engaged from real-world practice (i.e., from Octal IT solution, QSoft Vietnam, and Affle Enterprise) and one from the educational sector (IIT (ISM), Dhanbad, Jharkhand, India). Based on the input given by the experts, the survey instrument was altered to enhance its clarity and protocol. An example of the survey questionnaire is shown in Appendix A.

3) DATA SOURCES
The purpose of this study was to explore the barriers to the RE process in the context of GSD. Hence, it was important to collect information from various professionals involved in the RE process in the context of GSD. Identifying an appropriate sampling frame is hard for surveys, as no exhaustive register of the target population is available [56], [57]. In particular, it is hard to identify the experts involved in requirements engineering activities in the GSD domain. Coolican [58] underlined that if it is impossible to collect a representative sample, the research should be conducted with as large a sample as possible. The survey participants were invited by using the snowball strategy [59], [60]. Snowball sampling is an effective technique for collecting data from dispersed and targeted populations [60]. All participants were contacted via different channels, e.g., email, Facebook, LinkedIn, and their professional contacts. The data were collected from April 2019 to August 2019. We collected a total of 82 responses, of which three were found to be incomplete; the remaining 79 complete responses were used for further data analysis. We observed that most of the client firms were located in either Asia or Europe. The roles of the development team members ranged from software developer to software project manager. All of them have some experience in the RE process as well as in GSD. Thus, we have confidence in the accuracy of their responses about the barriers that influence the RE process in the GSD context. The demographics of the respondents are given in Appendix B.

4) SURVEY DATA ANALYSIS
We applied a frequency analysis approach to statistically analyze the collected data. A frequency table is used to present the frequencies and percentages of the data. Frequency analyses are useful for the investigation of variable groups and of ordinal and numeric information [38]. To assess the importance of the identified barriers, we checked the agreement of the respondents on every barrier. We compared the responses of the survey participants with each other for every barrier listed in the questionnaire. This method has been utilized by different researchers in several research areas [11], [52], [54].

IV. RESULTS AND DISCUSSION
The findings of the study are discussed in this section.

A. FINDING OF LITERATURE REVIEW
Requirements engineering is an important and complex activity of the SDLC, especially in the GSD environment. According to Niazi et al. [9], the requirements engineering process is not standardized in the context of a geographically distributed environment. Because of the dispersed nature of development in the context of GSD, practitioners face several problems, and a lack of fluent communication was found to be one of the major barriers [9], [15], [31]. Niazi et al. [9] indicated that the majority of the organizations that carry out their development activities across the globe follow an informal RE process. Shafiq et al. [5] and Thanasankit [4] highlighted the importance of standards and procedures for the RE process while adopting a software outsourcing paradigm. Various barriers are associated with software outsourcing, particularly those related to the requirements engineering process [15], [30]. A study conducted by Khan et al. [38] indicated that the barriers of the RE process are the root causes of project failure.
Prikladnicki and Audy [61] and Vogel et al. [28] highlighted the social and cultural aspects of the RE process and underlined the lack of trust as a critical barrier between GSD practitioners (client and end-user). Similarly, Shameem et al. [62] highlighted the lack of trustworthiness of practitioners during the adoption of a geographically distributed development environment. Most of the existing literature indicated that the lack of economic maturity is a barrier to the execution of the RE process [39]. Khan et al. [38] indicated that small and medium vendor firms usually face budget problems and do not pay sufficient attention to the RE process. Because of the lack of budget, the RE process is badly affected, which is the main cause of poor requirements collection [38]. Through the literature survey, we found the lack of management relationships to be a critical barrier to the RE process in the GSD environment [9]. The physical distance between development sites is the root cause of the lack of a good relationship between the management of client and vendor organizations, which makes both managements hesitant to share confidential material [27], [38].
Furthermore, Minhas and Zulfiqar [63] indicated that the geographical distance between the development sites causes a lack of trust among GSD practitioners. Kumar and Kumar [64] and Lai and Ali [65] indicated that the lack of trust is a major barrier in software development, especially in the RE phases. Because RE depends on communication and coordination, the elicitation of pure requirements is very hard if there is a lack of trust among RE practitioners. We also found cultural differences to be a major barrier to the RE process in the software outsourcing paradigm [30], [52]. Overcoming these cultural differences is important for the successful execution of the RE process. For instance, distributed sites may not be able to communicate with each other using their native languages [1], [64]. Furthermore, misunderstandings may occur as a result of cultural differences, which can create confusion among different teams [20].
Khan et al. [12] highlighted that the language barrier between geographically distributed teams is a significant barrier to proper requirements collection. Šmite et al. [66] emphasized that RE is highly dependent on communication, so the language difference among geographically distributed practitioners is a major barrier. Face-to-face communication is a well-known practice of requirements collection [64]. However, because of the geographical distance between the RE teams, face-to-face communication is rare [28]. Niazi et al. [9], Niazi et al. [11], and Khan et al. [1] indicated that the lack of face-to-face communication is a critical barrier to the RE process while adopting the GSD paradigm. All the other investigated barriers are given in Table 1, and their frequency analysis is graphically presented in Figure 4.

1) TIME BASED ANALYSIS
We performed the time-based analysis by dividing the total publication period of the selected primary studies into two sub-periods, i.e., (2000-2009) and (2010-2018) [12], [52]. The key aim of this time-based analysis is to check the criticality of the identified barriers with respect to time. The same analysis has been conducted by various existing studies in other software engineering domains [12], [21], [67], [68]. The data collected via the systematic mapping study are ordinal in nature, so we applied the Chi-square test to analyze the significant differences in the identified barriers with respect to time [12], [21], [67], [68].
The results show that all the investigated barriers were reported in both sub-periods to some extent. According to the results of the chi-square test (Table 2), the barriers of both sub-periods are similar, with the exception of BA7 (political factor across the overseas sites, p=0.020). The results show that BA7 has its highest frequency in the second sub-period (2010-2018). This shows that the political factor across boundaries is one of the critical and challenging factors in the current era for the successful execution of requirements engineering activities in the GSD context.
We noted that the majority of the investigated barriers are more frequently reported in the first sub-period (Table 2). This suggests that technological advancement in the current era has a positive impact on the execution of requirements engineering activities in the GSD context. We further noted that BA4 (lack of standard and procedure of requirements engineering, 78% and 77%), BA6 (lack of trustworthiness, 33% and 32%), BA11 (lack of management relationship, 59% and 58%), and BA18 (time zone differences across the world, 37% and 35%) are the barriers most consistently reported across both sub-periods (2000-2009 and 2010-2018, respectively; Table 2). The results indicated that BA15 (lack of common communication infrastructure, 96% and 87%) is the most frequently reported barrier in both sub-periods (Table 2). This indicates that the advancement of information technology has not significantly resolved the communication issues across overseas development sites.
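The time-based comparison can be illustrated with a small sketch. The snippet below computes a Pearson chi-square statistic for a 2x2 table (barrier reported or not reported, by sub-period) and derives the p-value for one degree of freedom from the complementary error function. The counts and the function name `chi_square_2x2` are hypothetical and are not taken from Table 2; the exact test configuration used in the study may differ.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts (assumption, not study data).
# Rows: sub-periods 2000-2009 and 2010-2018.
# Columns: studies reporting the barrier, studies not reporting it.
counts = [[10, 20],
          [20, 10]]
chi2, p = chi_square_2x2(counts)
print(f"chi2={chi2:.3f}, p={p:.4f}, significant={p < 0.05}")
```

A barrier whose p-value falls below 0.05 in such a test would be treated as differing between the two sub-periods, as reported for BA7.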

B. RESULTS OF EMPIRICAL INVESTIGATION
To explore the barriers to the RE process, we used a questionnaire survey approach. The questionnaire is based on the investigated RE barriers faced in the GSD environment. The data collected from the survey participants were categorized as ''positive,'' ''negative,'' and ''neutral.'' The positive category consists of ''strongly agree'' and ''agree,'' and the negative category consists of ''strongly disagree'' and ''disagree.'' The positive category presents the responses of the participants who consider that the barriers identified through the literature survey harm the RE process. The negative category presents the responses of the participants who do not consider the investigated barriers a challenge for the RE process in GSD. The neutral category presents the responses of the participants who were not sure about the findings of the literature survey. The responses were evaluated by applying the frequency analysis method, and the outcome is presented in Table 3.
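The grouping of Likert responses into the three categories can be sketched as follows. This is a minimal illustration: the response labels follow the description above, while the sample responses and the helper name `frequency_analysis` are hypothetical and do not come from the survey data.

```python
from collections import Counter

# Mapping of Likert responses to the three analysis categories.
CATEGORY = {
    "strongly agree": "positive",
    "agree": "positive",
    "neutral": "neutral",
    "disagree": "negative",
    "strongly disagree": "negative",
}


def frequency_analysis(responses):
    """Return the percentage of responses in each category for one barrier."""
    counts = Counter(CATEGORY[r.lower()] for r in responses)
    total = len(responses)
    return {cat: round(100 * counts[cat] / total)
            for cat in ("positive", "negative", "neutral")}


# Hypothetical responses for a single barrier (assumption, not study data).
sample = (["strongly agree"] * 4 + ["agree"] * 3 + ["neutral"]
          + ["disagree"] + ["strongly disagree"])
print(frequency_analysis(sample))
```

Applying the same function to the responses for each barrier would yield one row of percentages per barrier, in the spirit of Table 3.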
The results of the survey study indicated that at least 70% of the respondents agreed with the findings of the literature survey for every barrier except one, i.e., BA16 (lack of client-vendor relationships in GSD, 66%). The practitioners' opinions in Table 3 indicated that the investigated barriers harm the requirements engineering process in the geographically distributed development environment. An important observation in the survey results is that all the respondents agreed that BA20 (lack of face-to-face communication between overseas teams) harms the requirements engineering process in the software outsourcing context. Khan and Keung [69] indicated that requirements engineering is the communication- and collaboration-oriented phase of the SDLC.
Niazi et al. [11] also highlighted the significance of face-to-face communication in the requirements engineering process. Niazi et al. [9] underlined that face-to-face communication is important to assess the pure expectations of the stakeholders. Due to the geographical distance between the practitioners, frequent face-to-face communication is very hard and costly. The second highest reported barrier to the RE process in the context of software outsourcing is BA10 (language barrier among distributed teams, 90%). As mentioned by various researchers [1], [40], [41], [44], requirements engineering is the most communication-oriented phase of the SDLC, so language differences are one of the main problems faced by geographically distributed practitioners. The language difference problem was also highlighted by Dingsoyr and Smite [70]. The survey results indicated that BA4 (lack of standard and procedure of requirements engineering, 89%) was the third highest reported barrier to the successful execution of RE activities in the software outsourcing paradigm. Geisberger et al. [71] emphasized that standards and procedures for RE define the road map to collect and manage the pure requirements. Lai and Ali [65] underlined that a comprehensive process framework is useful to assess and execute the RE activities successfully. Sangwan et al. [72] also highlighted the importance of standards and procedures for the requirements engineering process. Furthermore, the remaining highly reported barriers are BA15 (lack of common communication infrastructure, 81%), BA2 (lack of economic maturity, 80%), and BA6 (lack of trustworthiness).
Moreover, BA9 (lack of familiarity with tools and techniques, 16%) was the most highly reported barrier in the negative category (Table 3). This indicates that 16% of the survey respondents do not consider BA9 a barrier to the RE process in the context of software outsourcing. Similarly, BA1 (lack of knowledge management at distributed sites) and BA5 (new regulations and de-regulations across the boundaries, 15%) were the second most reported barriers in the negative category. According to the results presented in Table 3, BA16 (lack of client-vendor relationships in GSD, 20%) was the most highly cited barrier in the neutral category. This indicates that 20% of the respondents were not sure about the effect of BA16 on the requirements engineering process when adopting a software outsourcing development environment. BA7 (political factor across the overseas sites, 16%) and BA11 (lack of management relationship, 18%) were the second-highest reported barriers in the neutral category.
Besides, the questionnaire included an open-ended section in which we asked the survey participants to report additional barriers that were not listed in the questionnaire. During the survey analysis, we found the following four additional barriers reported by the practitioners:
• Inexperienced requirements engineering staff.
• Personality clashes.
• Extra workload on requirements engineering practitioners.
• Lack of feedback from overseas sites.

C. COMPARISON OF SMS AND EMPIRICAL RESULTS
We performed a comparative analysis between both data sets (literature review and questionnaire survey study). The frequency comparison of both data sets is presented in Figure 5. We further performed the Spearman rank-order correlation to measure the similarities and differences between the findings of both data sets. Because the findings of the two studies are based on two different analysis approaches (i.e., qualitative and quantitative), we determined the ranks of both data sets to bring them onto a similar scale (Table 4). The same analysis approach has been adopted by various studies in other software engineering domains [11], [44], [52]. The results of the Spearman correlation given in Table 5 (r_s(20)=0.501, p=0.025) show that there is a moderate positive correlation between the ranks of both data sets. The significance value p=0.025 indicates that this correlation is statistically significant, although there are variances between the ranks of both data sets. For example, the rank of BA1 (lack of knowledge management at distributed sites) is 13 in the literature and 8 in the empirical study, and the rank of BA3 (project specific constraints in GSD sites) is 3 in the literature and 11 in the empirical study. To graphically show the obtained ranks of both data sets, we drew a scatter plot, as shown in Figure 6. We also performed an independent t-test to evaluate the mean difference between the literature and empirical study ranks.
The results presented in Table 6 (t=0.811, p=0.023) show that there are more similarities than differences in the ranking order of both data sets. For example, BA2 (lack of economic maturity) is ranked 6 and 5, BA4 (lack of standard and procedure of requirements engineering) is ranked 2 and 3, and BA5 (new regulations and de-regulations across the boundaries) is ranked 10 and 10 in the SMS and the questionnaire study, respectively. The results of group statistics are given in Table 7.
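The rank comparison can be illustrated with a short sketch of the Spearman rank-order correlation, computed as the Pearson correlation of the ranks. The example uses only the five rank pairs quoted in this section (BA1 to BA5), so it is an illustrative subset and its result is not comparable to the coefficient reported in Table 5; the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 (smallest), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx)
    sy = sum((b - my) ** 2 for b in ry)
    return cov / (sx * sy) ** 0.5


# Ranks quoted in the text for BA1..BA5 (subset of the 20 barriers).
literature_ranks = [13, 6, 3, 2, 10]
survey_ranks = [8, 5, 11, 3, 10]
print(f"Spearman coefficient on the subset: {spearman(literature_ranks, survey_ranks):.3f}")
```

Because the subset is re-ranked internally, the function can also be applied directly to the full rank columns of Table 4.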

D. CLIENT-VENDOR ANALYSIS OF INVESTIGATED BARRIERS BASED ON SURVEY RESPONDENTS
The investigated barriers were categorized based on client and vendor GSD organizations. Khan et al. [52] identified the barriers to software process improvement (SPI) and reported that there are more similarities than differences between the barriers concerning client and vendor organizations. In contrast, Niazi et al. [11] conducted a study to investigate the success factors of software project management in the domain of GSD and reported that there is a significant difference among the success factors concerning client and vendor GSD organizations. Shameem et al. [62] classified the success factors of the agile software development paradigm in the distributed development environment and underlined that there is no significant difference among the investigated success factors across organization types (client and vendor). The key motive of the client-vendor based categorization of the investigated barriers was to check the significance of each barrier concerning the organization type. A total of 79 responses were collected during the questionnaire survey study, of which 31 were from client organizations and 48 were from vendor organizations (Figure 7). However, we observed a client-vendor relationship between the reported barriers in the empirical study.
The chi-square analysis technique (Linear-by-Linear Association) was used to check for significant differences between the barriers with respect to the types of organizations. The same classification approach was previously adopted by various other researchers [11], [12], [44], [52]. We developed the following hypotheses to check for significant variances between the RE barriers:
Null hypothesis (H0): There are no significant variances between the RE barriers concerning organization types.
The alternate hypothesis (H1): There are significant variances between the RE barriers concerning organization types.
If the significance value ''p'' of any barrier is >0.05, then H0 is accepted; otherwise, H1 is accepted. The results of the client-vendor classification are presented in Table 8.
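The decision rule above can be sketched with a linear-by-linear association test, whose statistic is commonly computed as M² = (N − 1)r², where r is the Pearson correlation between the row and column scores and M² is referred to a chi-square distribution with one degree of freedom. The response counts, the integer scores, and the function name below are assumptions for illustration only; only the group sizes (31 client and 48 vendor respondents) come from the text.

```python
import math


def linear_by_linear(table, row_scores, col_scores):
    """Linear-by-linear association test for an ordinal contingency table."""
    n = sum(sum(row) for row in table)
    cells = [(row_scores[i], col_scores[j], table[i][j])
             for i in range(len(table)) for j in range(len(table[0]))]
    mean_u = sum(u * c for u, _, c in cells) / n
    mean_v = sum(v * c for _, v, c in cells) / n
    cov = sum(c * (u - mean_u) * (v - mean_v) for u, v, c in cells)
    var_u = sum(c * (u - mean_u) ** 2 for u, _, c in cells)
    var_v = sum(c * (v - mean_v) ** 2 for _, v, c in cells)
    r = cov / math.sqrt(var_u * var_v)
    m2 = (n - 1) * r ** 2
    # With 1 degree of freedom, P(X > m2) equals erfc(sqrt(m2 / 2)).
    p_value = math.erfc(math.sqrt(m2 / 2))
    return m2, p_value


# Hypothetical responses for one barrier (assumption, not Table 8 data).
# Rows: client (31 respondents), vendor (48 respondents).
# Columns: negative, neutral, positive.
responses = [[2, 3, 26],
             [10, 8, 30]]
m2, p = linear_by_linear(responses, row_scores=[1, 2], col_scores=[1, 2, 3])
decision = "H0" if p > 0.05 else "H1"
print(f"M2={m2:.3f}, p={p:.3f}, accepted hypothesis: {decision}")
```

The same function applies unchanged to the three-row tables used for the organization size and experience level analyses, given suitable ordinal row scores.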
The results presented in Table 8 indicate that the null hypothesis (H0) is accepted for all the investigated barriers except two, namely BA15 (lack of common communication infrastructure, p=0.020) and BA17 (lack of training activities at distributed sites, p=0.049), for which the alternate hypothesis (H1) is accepted. This shows that both barriers differ significantly with respect to the types of organizations. According to the practitioners, BA15 (lack of common communication infrastructure) was most significant for vendor organizations. This indicates that client organizations have advanced and common communication technologies that are useful for smooth communication. Common communication channels make frequent communication easier. For example, in China (a vendor country), most of the social networking websites are banned (e.g., WhatsApp, Facebook, IMO, etc.), and such restrictions make frequent communication among the distributed teams hard. However, in developed (client) countries, all the communication channels are freely available, which makes communication easier.
Similarly, in the practitioners' view, BA17 (lack of training activities at distributed sites) is more significant for vendor organizations than for client organizations. Most of the vendor organizations are in developing countries, and organizations in developing countries face budget constraints. Due to the limited available budget, these organizations do not consider training and seminars significant activities in software development. Geisberger et al. [71] also highlighted that training sessions are ignored due to the budget limitations in vendor organizations. However, according to the practitioners, the most common requirements engineering barriers were BA5 (new regulations and de-regulations across the boundaries, 71% and 73%), BA13 (cultural differences, 71% and 69%), and BA16 (lack of client-vendor relationships in GSD, 74% and 73%) for client and vendor organizations, respectively.
An important observation in the client-vendor classification analysis (Table 8) is that BA20 (lack of face-to-face communication between overseas teams, 100%) was considered a critical barrier to the requirements engineering process by all the survey respondents from client organizations. Moreover, 85% of the respondents from vendor organizations reported that BA20 is a barrier to the requirements engineering process. So, we noted that BA20 is the highest reported barrier in both types (client and vendor) of GSD organizations. Khan et al. [1] and Niazi et al. [11] also highlighted BA20 as a critical barrier in geographically distributed teams.
Besides, we followed the classification model developed by Khan et al. [52] and Shameem et al. [62] for client and vendor GSD organizations to map the reported barriers to both types of organizations. To do this, we calculated the percentage of all the investigated barriers (Table 8) and mapped them based on their higher significance to client or vendor organizations, as shown in Figure 8. For example, 74% of the respondents from client organizations agreed that BA1 is a significant barrier to the requirements engineering process, whereas 79% of the respondents from vendor organizations considered BA1 a critical barrier. Because BA1 is more highly reported in the vendor organizations category, it is allotted to vendor organizations. The same procedure was adopted to map all the reported barriers to client and vendor organizations (Figure 8).

E. ORGANIZATION SIZE BASED ANALYSIS (BASED ON SURVEY RESULTS)
By following the studies conducted by Khan et al. [73] and Khan et al. [12], we categorized the investigated barriers concerning the size of software firms. The objective of the size-based categorization is to check the significance of each investigated barrier concerning the organization size. By considering the definition of the Australian Bureau of Statistics [74], we categorized the organizations as ''small organizations (0-19 employees),'' ''medium organizations (20-199 employees),'' and ''large organizations (≥200 employees).'' In the survey study, 20 respondents belonged to small, 28 to medium, and 31 to large organizations (Table 9). Moreover, we employed a chi-square test (Linear-by-Linear Association) to check the similarities and differences among the three sizes of organizations with respect to the investigated barriers. The same analysis technique was adopted by various researchers in different domains [11], [12], [44], [52]. To check for significant differences among the reported barriers with respect to the size of organizations, we developed the following hypotheses:
Null hypothesis (H0): There is no significant difference between the reported RE barriers concerning the organization's size.
The alternate hypothesis (H1): There is a significant difference between reported RE barriers concerning the size of organizations.
If p>0.05, then H0 is accepted; otherwise, H1 is accepted. The analysis of the organization size-based classification is presented in Table 9.
The results presented in Table 9 demonstrate that there is no significant difference in the reported barriers concerning the size of organizations. In this case, the null hypothesis (H0) is accepted, and the alternate hypothesis (H1) is rejected. This indicates that all the investigated barriers harm organizations of all three sizes in the context of GSD. The most common barriers across all sizes of organizations are BA7 (political factor across the overseas sites, 70%, 77%, and 75%), BA14 (environmental constraints at overseas sites, 65%, 77%, and 75%), BA17 (lack of training activities at distributed sites, 75%, 74%, and 71%), and BA19 (lack of risk assessment at distributed sites, 65%, 71%, and 86%) in small, medium, and large organizations, respectively.
Furthermore, BA2 (lack of economic maturity, 85%) is the highest reported barrier in small GSD organizations. This shows that smaller organizations face budgetary problems while executing the activities of the requirements engineering process in a geographically distributed environment. Dingsoyr and Smite [70] and Lai and Ali [65] also indicated that, due to budget problems in small organizations, the activities of the requirements engineering process are not performed accurately. According to the practitioners, BA9 (lack of familiarity with tools and techniques, 80%) was the second most significant barrier to the implementation of the requirements engineering process in a GSD environment. This indicates that the practitioners of small organizations are not familiar with the latest tools and techniques. Shameem et al. [62] indicated that, due to the lack of workshops and seminars in small organizations, the practitioners are not aware of the updated available tools and their right usage. According to the survey results, these are the main hurdles to the successful execution of requirements engineering activities in software development.
In the medium-scale GSD organizations category, BA15 (lack of common communication infrastructure, 90%) was the most significant reported barrier. The activities of the requirements engineering process are mainly concerned with communication, so in a geographically distributed environment a common communication infrastructure is important for frequent communication among distributed teams. Ramzan and Ikram [75] also highlighted the importance of common communication infrastructure in the geographically distributed development environment. The second highest cited barriers in a geographically distributed environment are BA4 (lack of standard and procedure of requirements engineering, 77%), BA7 (political factor across the overseas sites, 77%), BA12 (lack of trust, 77%), BA14 (environmental constraints at overseas sites, 77%), and BA16 (lack of client-vendor relationships in GSD development, 77%). These are the most significant barriers in medium-size organizations while employing requirements engineering activities.
Also, BA4 (lack of standard and procedure of requirements engineering, 89%) is reported as the most significant requirements engineering barrier in the large organization category. Due to the number of GSD sites in large organizations, the management of requirements engineering activities is very complex. However, standards and procedures provide the road map to handle requirements engineering activities in a geographically distributed environment. Through the literature survey, we noted that various other researchers also indicated the importance of requirements engineering standards to assess and improve requirements engineering activities in the GSD environment [26], [65]. BA10 (language barrier among distributed teams, 82%), BA15 (lack of common communication infrastructure, 82%), and BA20 (lack of face-to-face communication between overseas teams, 82%) are reported as the second most critical barriers to the execution of the requirements engineering process in a geographically distributed environment. Niazi et al. [9] emphasized that effective communication is the most important element of the requirements engineering process. BA10, BA15, and BA20 are all related to the communication activities of the requirements engineering process. This indicates that large software organizations face more communication problems while carrying out development activities in a geographically distributed environment. Jamaludin and Sahibuddin [41] also highlighted the importance of communication setup in large software organizations while conducting their activities in a geographically distributed environment.
Also, we mapped the reported barriers to the three sizes (small, medium, and large) of GSD organizations. The mapping procedure is based on the percentage analysis conducted for each reported barrier (Table 9) [52], [62]. For example, BA1 is reported by 55% of respondents in the small, 65% in the medium, and 71% in the large organization category. This indicates that BA1 has its highest occurrence (71%) in the large organization category, so BA1 is assigned to the large organization category (Figure 9). Similarly, all the reported barriers are mapped according to their frequency of occurrence across the organization size categories (Figure 9).

F. CATEGORIZATION OF BARRIERS BASED ON RESPONDENTS' EXPERIENCE LEVEL
Furthermore, we followed the study conducted by Khan and Niazi [76] and categorized the requirements engineering barriers based on the survey respondents' experience, as presented in Appendix-B. We collected a total of 79 complete responses from the participants. We classified the experts into three categories by following the criteria discussed by Khan and Niazi [76]: junior-level experts with 1-5 years of experience, intermediate-level experts with 6-10 years of experience, and senior-level experts with more than ten years of experience. By following these criteria, Appendix-B shows that out of the 79 respondents, 34 were classified as junior-level, 26 as intermediate-level, and 19 as senior-level experts. Besides, the chi-square test (Linear-by-Linear Association) was used to check for significant differences between the experts' levels concerning the reported requirements engineering barriers. The results of the expert-based categorization are presented in Table 10, and we developed the following hypotheses to check for significant differences among the reported barriers concerning the experts' levels.
Null hypothesis (H0): There is no significant difference between the reported barriers concerning the experts' levels.
The alternate hypothesis (H1): There is a significant difference between the reported barriers concerning the experts' levels.
The results indicated that there are more similarities than differences between the reported barriers concerning the experts' levels. Hence, based on the chi-square test results (Table 10), the null hypothesis (H0) is accepted, and the alternate hypothesis (H1) is rejected. According to the analysis of the results, experts at all levels equally agreed that the reported barriers harm the requirements engineering process in the context of GSD.
According to the results (Table 10), BA4 (lack of standard and procedure of requirements engineering) is the most significant requirements engineering barrier according to all categories of experts. Standards and procedures provide the roadmap to practitioners for the successful execution of RE process activities in the GSD environment. Khan and Niazi [76] highlighted that standards and procedures assist the practitioners in assessing and improving the requirements engineering activities. Beecham et al. [24] indicated that standards and procedures are helpful for the elicitation and management of pure requirements according to the stakeholders' expectations. Niazi et al. [9] and Ramzan and Ikram [75] also underlined the importance of standards and procedures for the requirements engineering phase.
According to the practitioners of the junior category, BA7 (political factor across the overseas sites, 91%) is the second most significant barrier to the successful execution of requirements engineering processes in GSD.
Requirements collection is purely based on communication with the staff of the client organizations. Due to political victimization issues, staff members hesitate to provide pure requirements, which is a big problem for requirements engineering teams. Ramasubbu [17] and Shafiq et al. [5] also highlighted the harmful effects of political factors on the requirements engineering process, especially at overseas sites.
BA15 (lack of common communication infrastructure, 88%) is the second most significant barrier in the intermediate experts' category. Most of the requirements engineering activities demand a rich communication and coordination environment. At overseas sites, the development activities are carried out in different countries, so different communication tools are used, which badly affects smooth communication.
An important observation in senior-level experts' category is that BA15 (lack of common communication infrastructure, 100%) is the only barrier that doesn't have any negative or neutral response. All the senior level experts agreed that the common communication infrastructure is significant for the successful implementation of requirements engineering activities in the GSD environment.
Furthermore, we mapped the reported barriers to the three levels of experts (junior, intermediate, and senior) based on the percentage of occurrence (Table 10). The same mapping method was adopted by various other researchers in other software engineering domains [52], [62]. The barriers are allotted based on their significance to each experts' level. For example, BA1 is reported by 85% of the junior, 73% of the intermediate, and 74% of the senior experts (Table 10). This percentage of occurrence indicates that BA1 is most significant to the junior experts' category; hence, BA1 is allotted to the junior experts' category (Figure 10). By adopting the same criteria, all the other reported barriers are also mapped to the junior, intermediate, and senior experts' categories (Figure 10).
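The percentage-based mapping used for organization types, organization sizes, and experts' levels amounts to assigning each barrier to the category in which it is most frequently reported. A minimal sketch is given below, using the BA1 percentages quoted in the text; the function name is hypothetical, and the tie-breaking behavior (first listed category wins) is an assumption, since the handling of ties is not described.

```python
def assign_category(percentages):
    """Return the category with the highest reported percentage.

    Ties are resolved in favor of the first listed category (assumption).
    """
    return max(percentages, key=percentages.get)


# BA1 percentages as quoted in the text for each classification.
ba1 = {
    "organization type": {"client": 74, "vendor": 79},
    "organization size": {"small": 55, "medium": 65, "large": 71},
    "experience level": {"junior": 85, "intermediate": 73, "senior": 74},
}

mapping = {context: assign_category(p) for context, p in ba1.items()}
for context, category in mapping.items():
    print(f"BA1 -> {category} ({context})")
```

For BA1 this reproduces the allocations described above: vendor organizations, large organizations, and junior experts.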

G. MAPPING OF INVESTIGATED BARRIERS INTO SIX KNOWLEDGE AREAS OF PROCESS IMPROVEMENT
Ramasubbu [17] classified the success factors of software process improvement into six different categories. Similar studies were conducted by Khan et al. [52] and Shameem et al. [62] to categorize the investigated factors into six knowledge areas, i.e., project administration, coordination, software methodology, human resources management, knowledge integration, and technology factors. By following the same concept, we categorized the investigated requirements engineering barriers into six core categories.
All the authors of this study participated in the mapping process, and the identified 20 requirements engineering barriers were mapped according to the authors' understanding of the barriers' effects (Figure 11). According to the mapping results, ''coordination'' is the most significant knowledge area of investigated barriers. The mapping of the identified barriers has both industrial and research implications. The categorization of the barriers provides a theoretical framework that helps researchers and practitioners focus on the most significant areas of barriers to the requirements engineering process in the GSD domain. It will also help requirements collection teams develop useful policies and strategies to address requirements engineering challenges in GSD. Practitioners can consider the barriers relevant to their designation, and researchers can use the framework to identify the most significant category of the investigated barriers with respect to their research interest.

V. RESEARCH SUMMARY
The objective of this research work is to investigate the barriers to the requirements engineering process in the domain of GSD. The results of this study make available a body of knowledge for researchers and practitioners, which is helpful for the successful implementation of requirements engineering activities at overseas sites. The investigated barriers indicate the key areas that overseas teams need to address for the successful execution of requirements engineering activities in the geographically distributed development environment. The broader purpose of this work is to develop a software requirements engineering maturity model (SRE-MM) in the domain of global software development. This study is an initial step towards the development of SRE-MM, i.e., the identification of the barriers to the requirements engineering process in the GSD environment. The research questions addressed in detail are provided in Table 11.

VI. IMPLICATION OF THE STUDY
The study provides a state-of-the-art overview of requirements engineering barriers in the GSD environment. This study provides a framework of requirements engineering barriers that presents the key categories of the barriers and can serve as knowledge for researchers and practitioners working on requirements engineering processes in GSD. This framework will assist global software development firms in paying more attention to the barriers in their specific categories.
Moreover, this research work provides a deep understanding of the investigated barriers with respect to organization types (client and vendor), organization size (small, medium, and large), and experts' level (junior, intermediate, and senior). The reported barriers can assist practitioners in considering the most relevant barriers concerning the organization type, size, and experts' level. In summary, this study provides a detailed overview of the available requirements engineering literature in the context of GSD, which has not been done before. Finally, this study contributes to the development of a software requirements engineering maturity model (SRE-MM), which will assist GSD organizations in assessing and improving their requirements engineering program effectively.

VII. THREATS TO VALIDITY
In this study, we used a literature review approach to investigate the barriers to the requirements engineering process. Some related studies might have been missed during the literature collection process. However, as in other systematic review studies, this is not a systematic omission [12], [21], [67], [68]. Besides, in the online survey study, we received only 79 complete responses, which may be considered a small sample size. However, with reference to existing empirical studies [11], [44], [52], [77], [78], the sample of the present study is sufficient to justify the results of the empirical study. Similarly, an informal method was adopted to categorize the investigated barriers into six knowledge areas. This may be a threat to the validity of the barrier categorization process. However, we found that most researchers in other domains of software engineering adopted the same process to categorize the identified factors into different categories [11], [44], [52], [62].

VIII. FUTURE WORK
The basic motive of this study is to develop a software requirements engineering maturity model (SRE-MM) in the context of GSD. The proposed SRE-MM is based on the existing maturity models of other software engineering domains [44]-[47]. The proposed structure of the SRE-MM is presented in Figure 12. The maturity levels of the SRE-MM are based on critical barriers (CBs) and critical success factors (CSFs). The present study contributes only to the initial part of the SRE-MM, i.e., the barriers.
In the future, we plan to conduct a systematic literature review and an empirical study to investigate additional barriers to the requirements engineering process in the GSD domain. We will also conduct a study to identify the success factors and best practices of the requirements engineering process, which are needed to address the components of the proposed SRE-MM. We also plan to conduct an interview study with experts to validate the categorization process of RE barriers (Figure 11). We believe that the proposed model will be useful to assess and manage the activities of the requirements engineering process in the GSD environment.

IX. CONCLUSION
The increasing adoption of the GSD paradigm motivated us to investigate the barriers to the requirements engineering process. Requirements engineering is an initial phase of the software development life cycle and therefore requires careful consideration in order to produce quality software. Using a literature review approach, a total of 20 barriers were identified. To validate the findings of the literature survey, an empirical study (questionnaire survey) was conducted. We received a total of 79 complete survey responses, and the respondents agreed that the barriers identified in the systematic mapping study negatively affect the requirements engineering process in the GSD environment. Moreover, during the analysis of the survey results, we received five additional barriers from real-world practitioners, including inexperienced requirements engineering staff, personality clashes, the extra workload on requirements engineering practitioners, and lack of feedback from overseas sites.
Furthermore, we categorized the identified barriers based on client and vendor GSD organizations. The results (Table 8) demonstrated that there are more similarities than differences in the reported barriers between the two organization types. We found a significant difference for only two barriers, i.e., "lack of common communication infrastructure" and "lack of training activities at distributed sites."
Similarly, we categorized the investigated barriers based on organization size (small, medium, and large). The results (Table 9) demonstrated that there is no significant difference for any of the reported barriers with respect to organization size. We observed that medium and large organizations experience more similar barriers, whereas small organizations experience somewhat different barriers compared with medium and large organizations.
We have also categorized the investigated barriers with respect to the experts' experience level (junior, intermediate, and senior). The results indicated that there is no significant difference in the identified barriers across the experts' levels. Nevertheless, we observed that experts at the different levels experience somewhat different barriers compared with each other.
Finally, we categorized the identified requirements engineering barriers into six different knowledge areas. The results (Figure 11) demonstrated that most of the identified barriers are associated with the coordination knowledge area, which makes coordination the most significant knowledge area of the investigated barriers.
We believe that the findings of the current study will help address the challenges faced by GSD organizations in the requirements engineering process.