Robotic Process Automation: A Scientific and Industrial Systematic Mapping Study

The automation of robotic processes has attracted growing interest in recent years. However, most of the literature describes only the theoretical foundations of RPA or industrial results after implementing RPA in specific scenarios, especially in finance and outsourcing. This paper presents a systematic mapping study with the aim of analyzing the current state of the art of RPA and identifying existing gaps in both the scientific and the industrial literature. Firstly, this study presents an in-depth analysis of the 54 primary studies which formally describe the current state of the art of RPA. These primary studies were selected as a result of the conducting phase of the systematic review. Secondly, considering the RPA study performed by Forrester, this paper reviews 14 of the main commercial RPA tools, based on a classification framework defined by 48 functionalities, and evaluates the coverage of each of them. The study concludes that certain phases of the RPA lifecycle are already well covered in the market. However, the Analysis phase is not covered by most tools. The lack of automation in this phase is mainly reflected in the absence of technological solutions for identifying the best candidate processes of an organization to be automated. Finally, some future directions and challenges are presented.


I. INTRODUCTION
Although the term ''Robotic Process Automation'' (RPA) suggests robots doing human tasks, it is in fact a software solution. In the context of RPA, a ''robot'' corresponds to a software program. For business processes, the term RPA means the technological extrapolation of a human worker, whose objective is to tackle structured and repetitive tasks (very common in ERP systems or productivity tools) quickly and profitably [31], [88], [98]. It is possible to say that ''RPA aims to replace people by automation done in an outside-in manner. This differs from the classical inside-out approach to improve information systems'' [93]. Adopting RPA implies a low level of intrusiveness since, according to the Institute for Robotic Process Automation and Artificial Intelligence (IRPA-AI) [46], this technology is not part of the information technology infrastructure of a company, but rather sits on top of it [30].
The associate editor coordinating the review of this manuscript and approving it for publication was Tai-hoon Kim.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
With regard to cost, Capgemini [16] suggests that an RPA software license may cost between 1/3 and 1/5 of the price of a full-time employee. In addition, Lacity and Willcocks [59] argue that a robot can perform structured tasks equivalent to two or five humans. In any case, the use of RPA by companies provides the following advantages [98]:
• RPA is easy to configure, so developers do not need programming skills.
• RPA software is not invasive: it builds on existing systems, without the need to create, replace or develop expensive platforms.
• RPA is secure for the company: it is a robust platform designed to meet the IT requirements of the company in terms of security, scalability, auditability and change management.
Considering the great variety of research present in the literature, such as [4], [6], [13], [14], [43], [44], [52], [58], [60], [62], [72], [74], [95], [97], [99], there is a clear tendency for companies in different sectors to begin including RPA software in their processes in order to (1) leverage the advantages that RPA provides with the aim of reducing costs and (2) improve production. Although the benefits in cost savings are significant [6], not all business processes are suitable for RPA. Fung [31] suggests that the business process tasks where RPA may be applied should meet the following criteria: they require a low level of knowledge, are executed at high frequency, query different systems and applications, are standardized with a low level of exceptions to control, and are prone to ending in error because of human mistakes. Considering these criteria, the best candidates for implementing RPA are companies whose business is based on back-office areas [33], [77].
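The suitability criteria attributed to Fung [31] can be read as a simple checklist. The following is a minimal sketch under that assumption; the `TaskProfile` type, its field names and the all-criteria-must-hold rule are hypothetical illustrations, not part of the cited work.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskProfile:
    """Hypothetical description of a business-process task (illustrative only)."""
    name: str
    low_knowledge: bool    # requires a low level of knowledge
    high_frequency: bool   # executed at high frequency
    multi_system: bool     # queries different systems and applications
    standardized: bool     # standardized, few exceptions to control
    error_prone: bool      # manual execution tends to end in human error


def unmet_criteria(task: TaskProfile) -> List[str]:
    """Return the names of the criteria that the task does not satisfy."""
    checks = {
        "low_knowledge": task.low_knowledge,
        "high_frequency": task.high_frequency,
        "multi_system": task.multi_system,
        "standardized": task.standardized,
        "error_prone": task.error_prone,
    }
    return [name for name, ok in checks.items() if not ok]


def is_rpa_candidate(task: TaskProfile) -> bool:
    """Assumption: a task is a candidate only if every criterion holds."""
    return not unmet_criteria(task)
```

In practice an organization might weight the criteria rather than require all of them; the strict rule is used here only to keep the sketch short.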
As mentioned above, there are several scientific proposals in which an implementation of RPA is presented for a specific domain. However, to the best of our knowledge, RPA appears to be used more in industrial than in scientific contexts. In this sense, opening a discussion about the differences and similarities between RPA and similar technologies, and formally classifying what is being investigated in relation to this technology, is of vital importance for the community to grow and open new research lines.
This study addresses the need to know the state of the art of the RPA solutions offered by the literature. More precisely, it deals with RPA by focusing on two parallel (but complementary) work lines: (i) the global improvement of back-office processes with a Lean Management approach, and (ii) the automation of some manually performed activities, especially in the back-office context. Both are intensive in low-skilled labor and offer no replicability. Therefore, this study gives the reader a clear idea of several issues: (1) specific knowledge about what RPA is, (2) knowledge of the scientific solutions that propose RPA and (3) the ability to assess each of these solutions based on a classification framework.
To this end, this paper presents a Systematic Mapping Study (SMS) [78] that aims at answering the following general Research Question (RQ): What approaches and tools have been proposed in both literature and industry to support companies in adopting RPA?
The SMS method is a specific form of Systematic Literature Review (SLR) [53] with a broader aim, whose results provide researchers with a global view of a specific topic. Furthermore, it reveals a set of research needs and trends in the field. SMSs are typically used as a starting point for further work with a higher level of rigor.
It is important to point out that in this work, two different scopes are presented: scientific and industrial. Bearing this perspective in mind, it will be possible to identify whether the developments being carried out in the industry are aligned with the works proposed by the researchers.
The method has been executed in two iterations (cf. Fig. 1). In the first, a review is carried out on a percentage of the articles found, with the aim of strengthening the research line; in the second, after reviewing and refining the process, it is executed for the rest of the articles and any new ones. Considering the length of the document and to improve its readability, the results of this SMS are grouped into the three main blocks that the SLR method defines: planning, conducting and reporting, for both the scientific and industrial scopes.
The rest of the paper is organized as follows. Section II introduces the background and context of RPA. Section III describes the entire process carried out in both the scientific and industrial scopes. Section IV details possible threats to the validity of the method executed. Section V summarizes the closest related work found in the literature. Section VI presents the main conclusions of this study and, finally, Section VII outlines strategic views regarding future research lines in the field of RPA.

II. BACKGROUND
One of the most current, concise and complete definitions of Robotic Process Automation (RPA) is the one provided by the IRPA-AI Institute. It defines RPA as ''the application of technology allowing employees in a company to configure computer software or a 'robot' to capture and interpret existing applications for processing a transaction, manipulating data, triggering responses and communicating with other digital systems'' [47].
The underlying idea of the previous definition is that any workflow can be automated using a software robot when the process is definable and repeatable and is executed by a human based on rules. In this context, applying RPA in a company improves the productivity of business processes where human performance is decisive and repetitive. However, it is important to mention that an RPA technological solution is not part of the organization's information systems; rather, RPA sits at a higher technological level.
Building on this layered architecture, some authors relate the RPA concept to the BPM (Business Process Management) strategy as a mechanism to improve the competitiveness of the organization and its productivity [29]. These concepts are usually related by integrating the lifecycles of both strategies. A lifecycle is a systematic process for building any artifact within any domain (e.g., software) that ensures the quality and correctness of the artifact built. In the software domain, the process lifecycle aims to produce high-quality process-oriented software that meets customer expectations.
BPM has been a well-known management strategy for several decades. In fact, it has been implemented in numerous environments and applied by different user profiles [40], [81], which has led many authors to propose different lifecycle perspectives for carrying out this management [38], [41], [94]. These perspectives are oriented towards process management, but when the domain is restricted to RPA, the lifecycle aims to systematically deploy procedures to automate manual business processes following customer specifications.
In any case, RPA could be considered part of the set of process management strategies. In fact, RPA could be considered a process-oriented optimization and management strategy with a clear multidisciplinary nature, because it involves multiple stakeholders (Subject Matter Experts, SME; Business Analysts, BA; Software Developers, SD; etc.) at different moments, which can be organized in a lifecycle for applying RPA techniques. For example, in general, once an organization determines that it needs to automate a business process, the BA should work with the SME to document the process. To apply RPA techniques, it is necessary to have all the details on which application(s) are being used, where the end user clicks, business rules, logic, proper exception-handling information and what data the end user enters. Later, this information is provided to the RPA developers, who work with the BA while developing the automated process. The BA should coordinate the delivery of test data from the business to the developer as the development nears completion. Once the developer has finished developing the process and has processed the test data, the process should be prepared for a production release. After testing is completed, the BA and SD should meet with the organization to demonstrate the automated process and ensure that it meets the business's needs. Finally, after the automated process is deployed in a production environment, it is handed off to a support team, which should monitor it and manage changes, among other aspects.
The previous example of applying RPA techniques reveals a lifecycle based on the classic idea of Deming's cycle [23], also known as the PDCA (plan-do-check-act or plan-do-check-adjust) cycle. Deming's cycle is an iterative four-step management method used in business for the control and continuous improvement of processes and products. In this context, it is possible to identify some papers in the scientific literature that propose different perspectives on the stages of an RPA lifecycle. For instance, Flechsig et al. [29] study the combination of RPA with the BPM strategy [94] as a mechanism to optimize processes. To this end, the authors propose a methodology that combines the classic BPM lifecycle [26] and RPA. They first propose analyzing the process and, if it is suitable for RPA, carrying out development, testing, release, execution and control phases.
In this context, it is possible to identify and group different ideas and proposals to allow continuous improvement using a complete RPA lifecycle. This lifecycle is used to compare and categorize the primary studies identified in this SMS. The RPA lifecycle has the following phases:
• Analysis Phase. This phase consists of analyzing and determining the viability of automating a certain process by means of a detailed analysis of the effort involved in automating it, considering the execution characteristics of the process itself.
• Design Phase. The process design phase begins for those processes that have passed the previous feasibility analysis. The purpose of this phase is to detail the set of actions, data flow, activities, etc., that must be implemented in the RPA process.
• Construction Phase. This phase consists of implementing each of the automatable parts of each process identified in the design phase.
• Deployment Phase. The robots obtained as a result of the construction phase need an environment in which to be executed, just as a human operator needs an environment in which to perform their work. This environment, in the context of RPA, usually corresponds to a computer that has an installation of one or more information systems. Each robot must be executed in its own execution environment, since the replacement of a human operator by software is direct.
• Control and Monitoring Phase. Once the robots are deployed in their respective execution environments, this phase is responsible for controlling and monitoring the performance of each robot. In this phase, the execution of the robots is launched, stopped in case of serious errors, and monitored until they have finished their work.
• Evaluation and Performance Phase. The last phase of the process consists of the evaluation of the robots' performance.
Finally, it is worth mentioning that promising research on RPA has been published in the last decade in different business environments (for instance, the studies compiled in this systematic review) and that a promising future can be glimpsed today. Devarajan [24] argues that the application of RPA in companies will grow strongly because of the growth in unstructured data, repetitive business tasks and the evolution of new business processes, among other factors. RPA's future is geared towards the significant improvement of the quality, operational scalability and productivity of employees through integration with cognitive technologies (such as Artificial Intelligence or Machine Learning), integration with structured, unstructured and semi-structured data, natural language processing capabilities to enhance human interaction, and the ability to adapt to an extensive list of scenarios that depend on business rules.

III. SMS EXECUTION
The following sections describe the process conducted for both the scientific and industrial scopes, focusing mainly on the final iteration of the process. As explained above, in this iteration the systematic process is carried out completely and exhaustively.

A. PLANNING
To understand the existing research proposals for the automation of robotic processes, it is necessary to formulate some research questions (RQ). The RQs guide this study and are clearly focused on the treated topic. In addition, their answers will synthesize multiple results so that a unified vision of them can be obtained. Table 1 lists the proposed RQs for both scientific (SRQ) and industrial (IRQ) scopes.
To define the search phase, two main concepts have been defined: (1) the digital libraries and general search engines on which the searches of the scientific studies are carried out, and (2) the keywords that help to build the queries for each library.
In this SMS, six digital libraries (i.e., Scopus, IEEE Xplore, SpringerLink, Web of Science (WOS), ACM Digital Library and Google Scholar) and five general search engines (i.e., Google, Bing, Yahoo!, Ask and AOL Search) have been selected. Table 2 defines the keywords that have been used to construct the queries to be executed in digital libraries and search engines. In this table, a series of synonyms is included for each keyword so that the combination of each of the main words or their synonyms guides the construction of the different queries. The keywords were defined after performing an initial search of studies. Keywords are defined based on three main criteria: (1) a noun that is the main search term, (2) an attribute that complements the noun, and finally (3) the action to perform using this noun and attribute. In this case, since the objective of this study is so clear, ''process'' is taken as the noun, ''robotic'' as the attribute, and ''automation'' as the action.
Regarding the scientific scope, during an initial test period with the queries and the libraries, noisy data were noticed in the results of some queries, since they returned no literature related to the context of RPA. Therefore, the subset of queries that produced the most relevant results was selected. The final list of queries used to perform the searches is: ''robotic process automation'', ''repetitive operational tasks'', ''service delivery automation'' and ''robot process outsourcing''. Due to the limitations of certain libraries, it was necessary to design specific search strings for each library and manipulate the search results. The searches were carried out in the title and the abstract of the documents, except in those databases that lack such functionality. In such cases, the searches were performed in the full text. Table 3 shows the specific queries used for each database.
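The query construction described above can be sketched as follows. This is only an illustration: the synonym lists passed to `candidate_queries` and the `field_template` wrapper are hypothetical placeholders, since the actual synonyms and library-specific strings are those given in Tables 2 and 3.

```python
from itertools import product
from typing import Iterable, List

# The four final queries reported in the text.
FINAL_QUERIES = [
    "robotic process automation",
    "repetitive operational tasks",
    "service delivery automation",
    "robot process outsourcing",
]


def candidate_queries(attributes: Iterable[str],
                      nouns: Iterable[str],
                      actions: Iterable[str]) -> List[str]:
    """Combine every attribute, noun and action (or synonym) into a phrase."""
    return [" ".join(parts) for parts in product(attributes, nouns, actions)]


def build_search_string(phrases: Iterable[str],
                        field_template: str = "{expr}") -> str:
    """Join quoted phrases with OR; field_template is a hypothetical wrapper
    for a library-specific field restriction (e.g. title and abstract)."""
    expr = " OR ".join(f'"{phrase}"' for phrase in phrases)
    return field_template.format(expr=expr)
```

For example, `build_search_string(FINAL_QUERIES)` yields one OR-joined expression that would then be adapted to the syntax of each library.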
Regarding the industrial scope, Forrester's comparative study ''Robotic Process Automation, Q1 2017'' [62] was considered as the starting point. Forrester ''is one of the most influential research and advisory firms in the world'' [62] whose purpose is to develop strategies that promote business growth. To do this, it applies rigorous and objective methodologies to evaluate the technological advancement and innovation of software tools in different business areas. Concerning the current work, the Forrester report evaluates a series of criteria across different suppliers that provide RPA support. Specifically, this report analyzes the 12 most important tools that currently exist in the worldwide market so that any entity or company can decide which tools best suit its needs.
To set up the quality assurance criteria that corroborate the scientific rigor of the study, the following indexes were taken as reference: (1) the ''Journal Citation Reports (JCR)'' [67], (2) the ''Computing Research and Education Association of Australasia (CORE)'' [15] and, finally, (3) the ranking of relevant conferences for the Computer Science Society of Spain (SCIE) [86], which advises the use of the ranking prepared by the Italian associations GII and GRIN [36]. Although this work focuses on this kind of indexed study, the recommendations of Wieringa et al. [96] have also been considered. These authors point out that a category of proposals would contain papers with no empirical evidence and grey literature.
In addition to these quality assurance criteria, which also serve as inclusion and exclusion criteria, the following criteria are defined for the inclusion or exclusion of a publication:
• C1: the year of publication must be 2012 or later, i.e., the year of the oldest relevant publication found in the previous searches.
• C2: the classification of the publication must be ''Computer Science'' or ''Information Systems''.
• C3: it must be related to the automation of robotic processes, since the construction of the search strings may return publications that are not directly related to this field.
Finally, some recommendations from subject-matter experts in RPA have also been considered, so that any recommended studies not found by the searches are included anyway.
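Criteria C1 to C3 can be expressed as an ordered filter. The sketch below assumes a hypothetical `Record` structure; in particular, C3 is a manual relevance judgment in the study, so it is represented here only as a pre-assigned flag.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Tuple

ALLOWED_CATEGORIES = {"Computer Science", "Information Systems"}


@dataclass
class Record:
    """Hypothetical metadata for one retrieved publication."""
    title: str
    year: int
    category: str
    rpa_related: bool  # outcome of the manual C3 relevance check


def first_failed_criterion(rec: Record) -> Optional[str]:
    """Return the first criterion the record fails, or None if it passes all."""
    if rec.year < 2012:
        return "C1"
    if rec.category not in ALLOWED_CATEGORIES:
        return "C2"
    if not rec.rpa_related:
        return "C3"
    return None


def screen(records: List[Record]) -> Tuple[List[Record], Counter]:
    """Split records into those kept and a tally of exclusions per criterion."""
    kept, excluded = [], Counter()
    for rec in records:
        failed = first_failed_criterion(rec)
        if failed is None:
            kept.append(rec)
        else:
            excluded[failed] += 1
    return kept, excluded
```

Applying the criteria in a fixed order means each excluded record is counted once, under the first criterion it fails.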

B. CONDUCTING
Regarding the scientific scope, for each digital library, the search strings, the metadata (i.e., title, author, year of publication, etc.) and the abstracts of the resulting documents were stored. First, 64 of the 650 documents were eliminated as duplicates. Then, another 81 documents were eliminated because they were published before 2012 (i.e., criterion C1). Thereafter, the inclusion/exclusion criteria were applied to the remaining 505 documents: 286 were eliminated since they were not classified in the computer science or information systems category (i.e., criterion C2). The last filter checked for a close relationship with the automation of robotic processes (i.e., criterion C3); 110 documents were discarded, leaving 109 candidates. Finally, when merging the results of the different libraries, 54 studies were found to be duplicated and were eliminated, leaving a total of 54 primary studies whose full text was read.
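The selection funnel can be traced with a few lines of arithmetic. The sketch below uses only the figures reported in the paragraph above; the step labels are paraphrases. Taking those figures at face value, the intermediate totals of 505 and 109 are reproduced, while the last subtraction gives 55 rather than the 54 primary studies reported. The sketch leaves that difference visible and does not attempt to resolve it.

```python
from typing import List, Tuple

INITIAL = 650
STEPS = [
    ("duplicates", 64),
    ("C1: published before 2012", 81),
    ("C2: outside computer science / information systems", 286),
    ("C3: not related to RPA", 110),
    ("duplicates when merging libraries", 54),
]


def funnel(initial: int, steps: List[Tuple[str, int]]) -> List[Tuple[str, int, int]]:
    """Return (label, removed, remaining) after each elimination step."""
    remaining = initial
    trace = []
    for label, removed in steps:
        remaining -= removed
        trace.append((label, removed, remaining))
    return trace


if __name__ == "__main__":
    for label, removed, remaining in funnel(INITIAL, STEPS):
        print(f"{label:<52} -{removed:>3} -> {remaining}")
```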
The selection of primary studies was carried out by one of the authors of this study. To corroborate that they were correctly selected, another author chose a random 30% of the results and applied the same criteria to them. The doubts that arose during the selection of studies were resolved among all authors. Table 14 shows the 54 primary studies that were selected, grouped by the following categories: journals, conferences and grey literature. Fig. 2 illustrates the complete process of selecting the primary studies. Once the primary studies were selected, the keywording-using-abstracts activity was carried out to generate the scientific classification scheme used to analyze them. As a result, a set of characteristics (cf. Table 5) was selected to answer each of the RQs formulated in the planning phase.
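The cross-check by a second author can be sketched as drawing a reproducible 30% random sample and comparing decisions. The function names, the fixed seed and the agreement measure below are illustrative assumptions; the text does not state how the sample was drawn or how agreement was quantified.

```python
import random
from typing import Dict, List, Sequence


def validation_sample(study_ids: Sequence[int],
                      fraction: float = 0.30,
                      seed: int = 0) -> List[int]:
    """Draw a reproducible random sample covering `fraction` of the studies."""
    rng = random.Random(seed)  # fixed seed so the sample can be re-created
    k = max(1, round(len(study_ids) * fraction))
    return sorted(rng.sample(list(study_ids), k))


def agreement_rate(first: Dict[int, bool],
                   second: Dict[int, bool],
                   sample: Sequence[int]) -> float:
    """Share of sampled studies on which both reviewers made the same decision."""
    agreed = sum(1 for sid in sample if first[sid] == second[sid])
    return agreed / len(sample)
```

Disagreements found in the sample would then be discussed among all authors, as the text describes.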
Regarding the industrial scope, and due to the great activity in the area of RPA in recent years, it was necessary to apply certain criteria when selecting tools, such as the maturity of the tool and its actual use in industry. In this sense, the selected tools were those that (1) were detected during the review process of the scientific scope mentioned above and, in addition, (2) were identified in the comparative study conducted by Forrester [62]. Table 6 shows the final 14 tools that were selected to be classified.
Considering the large number of possible criteria to be evaluated for each tool in the industrial scope, this study focused on a specific set of functionalities that characterizes each of the phases of the RPA lifecycle. For this purpose, the RPA lifecycle presented in Section II has been considered. In addition, to define the industrial classification scheme that classifies each of the selected tools, the focus was put on the second IRQ shown in Table 1. The resulting industrial classification scheme is shown in Tables 7 to 13 (to facilitate reading, the features or functionalities have been grouped into categories). To objectively evaluate each characteristic of the industrial classification scheme, the range of possible values was defined. The following values were considered for each characteristic, with the exception of the first characteristic of each table since it refers to the type of distribution:
• Full support: the tool clearly offers the analyzed functionality.
• Partial support: the functionality which is offered has limitations or cannot be verified through the accessible material.
• No support: the tool does not offer any kind of support to such functionality.

C. REPORTING
This section reports the data obtained once all the primary studies and all the selected tools have been classified. Regarding the scientific scope, the report is organized by research question:
• SRQ1: The first research question looks for the methods, techniques and/or tools that have been investigated for the automation of robotic processes, or RPA. According to the classification framework, the results obtained (cf. Fig. 3) show that the largest share (53.57% of the total studies) proposes a theoretical study as a solution to support RPA. In turn, a large number of studies (24 of 54, i.e., 42.86% of the total studies) propose a software platform. Finally, only 3.57% of the studies propose hardware components.
• SRQ2: The second research question aims to determine whether the research works are mainly practical or theoretical, and to identify opportunities for future research work. According to the classification framework, the results obtained (cf. Fig. 4) show that the majority of the primary studies (55.36% of the total studies) present a validation of the proposal, while 41.07% do not. Regarding the validation, it is interesting to observe the context where it takes place, i.e., an academic context, with an experimental validation to obtain results and see how the proposal works, or an industrial context, where the proposals are validated against real case studies (cf. Fig. 5). In this sense, the results show that most of the validated studies (21 of 31) present an academic validation, compared with 10 of 31 that present an industrial validation.
• SRQ3: The third research question identifies the nature of the methods, techniques and tools found for RPA, and assesses their status (cf. Fig. 6). In this sense, the lowest share corresponds to CRM, with only 1.79% of the studies, followed by BPMS and Sensors, both with 3.57%. Next come frameworks, with 5.36% of the studies. As expected, one of the highest scores corresponds to those studies that propose the use of libraries and software robots, with 32.14%. Solutions based on libraries represent 19.64%. Finally, it is important to highlight that the majority of the studies (57.14%) are classified in the Others category, which covers methods and models, among others. This is because most of the studies found do not propose new solutions, but only theoretical studies.
• SRQ4: The fourth research question aims to uncover the main points of interest of the research and the areas that have been less investigated. In this sense, two classifications have been made. The first (cf. Fig. 7) considers whether the studies deal with Front-office or Back-office issues, testing or scientific approaches. Back-office clearly leads (35.71% of the studies), in contrast to Front-office (10.71% of the studies). It is remarkable that some studies do not deal with any of these aspects. Scientific approaches, such as methods or models, represent 21.43%. Finally, testing takes last place with 5.36% of the total studies. The second classification considers the context of the primary studies (cf. Fig. 8). Two main contexts are addressed in 42.86% of the studies: BPO and Finance. Moreover, the unclassified category represents 17.86%. This percentage is understandable since there are 10 purely academic studies that could not be classified into specific categories. The last big group, with 7.14% of the primary studies, comprises those based on health. Public administration, motoring, standardization, insurance and telephony represent 3.57% of the total. Finally, several contexts appear in only one primary study (1.79%): tourism, eCommerce, facial recognition, quality, military and the wireless industry.
There are other relevant results that are not directly related to a research question. On the one hand, Fig. 9 depicts the publication trend in topics related to RPA. This figure clearly shows the increasing interest in this topic within the scientific community. On the other hand, Fig. 10 summarizes the papers grouped by digital library. Scopus ranks the highest since it indexes the majority of digital libraries. Google Scholar, ACM, SpringerLink, IEEE Xplore and Web of Science follow in that order.
In summary, in light of the results, research on RPA shows a clearly growing trend, which reveals the strong interest it is awakening in the scientific community. As stated, the main works deal with theoretical studies or software proposals, but they are focused on specific environments, which shows that there are still challenges in transferring the results to industry. In addition, most of the papers present both industrial and academic validations, which suggests a high degree of alignment between industry and research in the field of RPA, since many works deal directly with companies' solutions, mainly in the back-office. Furthermore, in this mature field of RPA, there is a lack of proposals that discuss and build on RPA and its processes themselves rather than on its application to existing solutions. This seems to be due to the strong industrial protection (e.g., patents) that RPA companies exercise over their ideas.
Regarding the industrial scope, since the main existing tools in the RPA context (IRQ1) and the characteristics or functionalities of the lifecycle of the processes in RPA (IRQ2) are already covered in Section III-B, the following paragraphs go into the details of the results obtained for IRQ3.
After applying the classification framework to the tools, the weights 1, 0.5 and 0 were assigned to the answers ''Full support'', ''Partial support'' and ''No support'' respectively. Fig. 11 therefore serves as an indicator of the extent of support that current RPA tools provide for the different lifecycle stages. Each column represents the mean support that each tool offers for each phase of the lifecycle. In particular, it can be observed that the Control and Monitoring and the Evaluation and Performance phases are practically covered by all the tools. However, a clear deficiency is revealed in the rest of the phases, especially in the Analysis phase, whose support is below 20%.
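The weighting scheme described above amounts to averaging numeric scores per lifecycle phase. A minimal sketch follows, assuming ratings are stored as lists of labels per phase; the label strings, function names and any example ratings are hypothetical and are not data from the study.

```python
from statistics import mean
from typing import Dict, List

# Weights stated in the text: full = 1, partial = 0.5, no support = 0.
WEIGHTS = {"full": 1.0, "partial": 0.5, "none": 0.0}


def phase_coverage(ratings: List[str]) -> float:
    """Mean weighted support over the functionalities of one phase."""
    return mean(WEIGHTS[r] for r in ratings)


def coverage_by_phase(tool_ratings: Dict[str, List[str]]) -> Dict[str, float]:
    """Coverage of one tool for every phase it was rated on."""
    return {phase: phase_coverage(r) for phase, r in tool_ratings.items()}


def mean_coverage(tools: Dict[str, Dict[str, List[str]]], phase: str) -> float:
    """Average coverage of a phase across all tools."""
    return mean(phase_coverage(ratings[phase]) for ratings in tools.values())
```

With this convention, a tool rated `["none", "none", "partial", "none", "partial"]` on a five-functionality phase would obtain a coverage of 0.2, i.e., 20%.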
Going deeper into each phase, Fig. 12 shows the average degree to which each characteristic is covered in the Analysis phase. In addition, the average line (18%) is included to highlight those functionalities that fall below the average and are thus candidates for improvement. In this case, a particularly low average degree is observed, revealing one of the main shortcomings of the RPA tools, i.e., the Analysis phase. Within this phase, the least supported functionalities (i.e., FA1, FA2 and FA3) are those related to support for the analysis of the processes, although the values observed for obtaining predictions (i.e., FA4 and FA5) are not very high either.
Within the Design phase, a moderate coverage level can be observed (cf. Fig. 13). Below the average are some features such as user monitoring in real time (i.e., FD4) or the reproduction of videos for the identification of activities (i.e., FD5). However, all of these are above 50%, which indicates that they are covered by most of the tools. In the Construction phase, most of the functionalities are covered above 50% (cf. Fig. 14). Only the versioning of robots (i.e., FC2) appears as an unusual feature among the analyzed tools.
Fig. 15 reveals the high degree to which the functionalities of the Deployment phase are covered. The lowest one is related to the continuous monitoring of the availability of environments prior to the deployment of robots (i.e., FDS4), although this functionality is observed in nearly 70% of the tools. The last two phases are the most covered by the tools, with results above 90% (cf. Fig. 16 and Fig. 17). The only gap is the integration with external BI tools for visualization (i.e., FE2), which nonetheless appears in 75% of the analyzed tools.
As observed in the previous figures, not every functionality appears in every tool and, in some cases, it is only partially supported. Fig. 18 summarizes all the tools, indicating the degree to which they cover the 48 analyzed functionalities: completely (green), partially (yellow) or not at all (red). As shown, the analyzed RPA tools have a high degree of complete or partial coverage, although one group clearly offers greater functionality (i.e., Blue Prism, WorkFusion and Automation Anywhere).
For the sake of clarity, the following analysis divides the tools into 3 groups according to the functionalities they offer. The 5 tools in the first group (cf. Fig. 19) show greater deficiencies than the other tools. These deficiencies are mainly present in the Analysis and Design phases. It is especially relevant that, in this group, only Kofax and Softomotive offer support in the Analysis phase, and only marginally (i.e., lower than 50%). As mentioned, the Control and Monitoring and the Evaluation and Performance phases are practically covered by all these tools.
The second group of tools (cf. Fig. 20) incorporates improvements in the Design, Construction and Deployment phases. This is not the case for the Analysis phase, where only the NICE tool reaches a degree of coverage of 40%. Finally, the last group (cf. Fig. 21) is the only one with a greater degree of coverage in the Analysis phase. Specifically, AssistEdge shows levels greater than 60%. Regarding the above data and in order to answer IRQ3, the general coverage that the RPA tools provide to the analyzed functionalities is acceptable. However, when the phases of the lifecycle are analyzed separately, there is a high deficiency in the Analysis phase, where almost no tool offers support, few offer partial support, and none offers complete support.

IV. THREATS TO VALIDITY
Considering the recommendations presented in some works such as the ones proposed by Wohlin et al. [100], Kitchenham and Brereton [53], and Petersen et al. [78], this section presents some threats to validity of the present research.
Publication bias refers to the elements that may make the research tendentious or unrepresentative. In this sense, it is important to consider that, while conducting a study, researchers may tend to emphasize the positive results of their own approach, which means that the experimental results may not be completely transparent. To avoid this bias, some of the authors' colleagues, experts in the field, have reviewed the work thoroughly in order to ensure that the criteria described in the planning phase are met.
The definition of the RQs (cf. Table 1) may be another threat to validity. For the context of this research, the SMS was conducted as generally as possible in terms of publications and dates. In this way, the study was carried out as completely as possible, since it does not favor certain publications.
The quality of the primary studies selection may be influenced by the search strings (cf. Table 3), since these studies are obtained by executing these strings in the different digital libraries. An incorrect definition of the keywords that make up the search strings may result in a search that is not broad enough. To mitigate this threat, the SMS was carried out in two iterations: the first to obtain preliminary results, and the second to execute the complete study after refining the activities defined in the first iteration, including the definition of keywords.
The threat to the reliability of the synthesis and results of the data is mitigated, as far as possible, by the definition of scientific and industrial classification schemes (cf. Table 5 and Tables 7, 8, 9, 10, 11, 12, and 13). These schemes are also considered a contribution of the present work. Both classification schemes may be perceived by a reader as subjective rather than objective. To mitigate this threat, each of the four authors of the paper produced their own classification scheme, and the schemes were then pooled. In this way, an attempt was made to bring together as many of the discovered characteristics as possible. Finally, as mentioned above, this threat is also addressed by the execution of the study in two iterations, which allowed an even more exhaustive evaluation.

V. RELATED WORK
In this section, some specific reviews related to RPA found in the literature are described.
The work [22] aims to determine the current status of RPA and the areas in which it may be applied. Surveys are carried out to investigate the problem and determine the tasks that can be automated. As a result, it is concluded that RPA can be used for some specific scenarios such as the billing process and the maintenance of supplier data. Nonetheless, it concludes that the quality of the databases is a current barrier to the application and use of RPA. Frank [30] presents a theoretical review of RPA. It introduces five degrees of automation for RPA: (1) Manual Execution; (2) Scripting; (3) Orchestration; (4) Autonomic; (5) Cognitive. However, no case study and no systematic way of executing the method are shown in this review.
In contrast, Lacity [57] reports on the current status of the adoption of RPA in industry. The authors conclude that the implementation of RPA has been slow, at least until the years 2014 and 2015. It is indicated that the research community must identify the business problems that RPA can solve. Finally, they propose that researchers should determine what people do best, how the processes can be redesigned, and where and how RPA can best be implemented. In this context, Ansari et al. [5] perform a comparative study on technical aspects of the RPA tools present in industry. In addition, the authors present a set of advantages and disadvantages of the technology and describe how RPA is being used by business organizations in different sectors, such as banking, hospitals or education, among others.
More broadly, [28] reviews the literature on 4 disruptive technologies (i.e., artificial intelligence, robotics, networking and advanced manufacturing) from 3 points of view (i.e., academic journals, professional experiences and government publications) and their potential impact on industry and the corporate world. It leaves open the debate about who will be the winners and the losers in the future of the industries, but asserts that their impact will be greater than that of the Industrial Revolution. This work introduces the concept of RDA (Robotic Desktop Automation), which differs from RPA in that it automates many more front-office functions.
Gami et al. [32] present a review with the aim of highlighting the importance of RPA and providing content on future research lines of this technology. Specifically, the authors focus on two concepts that are believed to be an extension of Robotic Process Automation: Artificial Intelligence (AI) and Smart Process Automation (SPA). Gotthardt et al. [35] describe the current state and challenges of RPA and AI in the context of accounting and auditing. To illustrate the current applications that exist in the market, the authors describe two case studies. Finally, in the discussion, a table showing some business and automation risks caused by the use of RPA and AI, such as data leakage and privacy, cyber threats or automation strategy and governance, is presented and analyzed. These studies do not follow a systematic process for executing the literature review.
The work closest to the present proposal is a recent study carried out by Ivančić et al. [49], where an SLR on RPA is performed following the recommendations of Boel and Cecez-Kecmanovic [11] and Kitchenham and Brereton [53]. However, that study focuses only on the scientific scope. In addition, its search only covered the period between 2016 and 2018, a bias that constitutes a significant threat to the validity of that study.
To sum up, the literature presented differs from this study mostly in the following points: (1) the scope of execution: while the related work usually focuses on either the scientific or the industrial scope, this study focuses on both, evaluating research papers as well as frameworks and tools of the current market; (2) the objective of the study: while the related work aims to answer specific questions (e.g., the state and progress of research on RPA and how it is defined), this paper aims to offer a more general view of RPA, focusing on what has been researched and used, and on the nature and objective of the discovered proposals; (3) the number of primary studies reviewed: while the related work presents a low number of primary studies, this paper presents a fairly acceptable number of primary studies and tools reviewed.

VI. CONCLUSION
The objective of this research is to offer a systematic review of both the academic literature and the available market solutions in the RPA field.
For the academic scope, this work has been carried out following widely accepted processes in the field of research, thus granting high scientific rigor to the results obtained. For this, 54 scientific papers obtained from well-known bibliographic sources have been analyzed. Results showed that: (1) there is a high interest of the scientific community in this area and (2) there is an increasing tendency regarding publications related to RPA. This is evidenced by the growing volume of scientific papers that are published year by year since 2012. In particular, the scientific production in the last year has almost doubled the scientific production of 2018. However, most of these papers have a relative scientific interest since many of them only describe theoretical foundations on RPA, and others describe industrial results or experiences of having implemented RPA in specific scenarios.
Taking as a reference the results obtained after the analysis of the primary studies, it can be observed that the contexts of application most used to carry out the validation of the proposals found are: BPO, Financial and Health.
One of the most relevant facts that this review has revealed is that none of the considered papers proposes or discusses functionalities in RPA platforms. This could be motivated by industrial protection or patents on these functionalities or platforms. Nonetheless, this cannot be confirmed, since no information has been found on related patents in the field of RPA.
In turn, a review of the industrial field has been carried out. To do this, first, the main market solutions in RPA have been identified (i.e., ActiveBatch, Automation Anywhere, Blue Prism, UiPath, WorkFusion RPA Express, WorkFusion SPA, NICE, Pega, Leo Platform, AssistEdge, Redwood, Kofax, Contextor and Softomotive).
Second, using the results of the scientific scope, the main functionalities that the RPA platforms must offer were detected. The 48 detected functionalities were grouped into the following 6 phases of the RPA lifecycle: Analysis, Design, Construction, Deployment, Control and Monitoring, and Evaluation and Performance.
And third, each of the 14 solutions has been evaluated to find which of the 48 functionalities were covered. Results of this industrial review showed, on the one hand, that there are several phases of the RPA lifecycle that are clearly solved in the market, e.g., Control and Monitoring, and Evaluation and Performance, where the average support of the tools is above 80%. On the other hand, however, the Analysis phase is neglected in most of the platforms. Note that in this phase, among other things, the viability of the RPA project is studied, the benefits of such robotization are foreseen and support is given to the understanding of the process to be analyzed, which is necessary to make a correct design. In particular, the average support of the Analysis phase in the existing platforms is below 20%. These functionalities are only partially covered by some of the major solutions on the market such as NICE, AssistEdge or Kofax. That is the main gap that has been revealed in the industrial review.
Considering the aforementioned results in both the scientific and the industrial context, the following can be concluded regarding the solutions available in the market: the majority of software products for RPA fully cover the Deployment, Control and Monitoring, and Evaluation and Performance phases, and only a few of them partially cover the Analysis phase. This lack of presence in the Analysis phase represents a great technological gap in the sector, since none of the products offers the support needed to manage the entire RPA lifecycle.

VII. FUTURE RESEARCH LINES
The research presented in this systematic review raises a number of interesting research lines that must be considered by the community.
Nowadays, one of the most researched topics in the community is how to make software more and more intelligent. In this sense, the application of Artificial Intelligence (AI) to RPA is presented as a very interesting challenge in different fields of application, e.g., executing unstructured rather than only structured tasks. The application of AI or concepts such as Data Mining or Machine Learning would help RPA not to rely on strict rule-based methods.
In the Analysis phase of the RPA lifecycle, interfaces could be provided to manage several candidate processes simultaneously, such as creating, modifying, deleting and/or searching for several processes at once in order to evaluate, subsequently, whether they are suitable for automation. The entry of RPA tools into the Analysis phase would offer an exceptional added value to the product, given that existing proposals have not been able to integrate it until now and that analysis is one of the most important phases in a software project. Achieving substantial improvements in the execution of the analysis process when an RPA implementation project is being carried out means reducing the cost of the whole project in terms of both money and time. Therefore, automating this phase would benefit the current RPA offering by incorporating additional interfaces that allow the characteristics of each process to be automated to be documented in detail, including information such as objectives, metrics, deliverables, hypotheses, team members, scope, stakeholders, customers, input data, output data and customer requirements.
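As an illustration of what such an interface might store, the following minimal Python sketch models a candidate process with the documentation fields listed above, together with a small in-memory registry supporting the add, modify, delete and search operations. All class, method and field names are hypothetical; they do not correspond to any existing RPA tool or to a design proposed in the reviewed literature.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CandidateProcess:
    """Documentation record for one process that might be automated (hypothetical)."""
    name: str
    scope: str = ""
    objectives: List[str] = field(default_factory=list)
    metrics: List[str] = field(default_factory=list)
    deliverables: List[str] = field(default_factory=list)
    hypotheses: List[str] = field(default_factory=list)
    team_members: List[str] = field(default_factory=list)
    stakeholders: List[str] = field(default_factory=list)
    customers: List[str] = field(default_factory=list)
    input_data: List[str] = field(default_factory=list)
    output_data: List[str] = field(default_factory=list)
    customer_requirements: List[str] = field(default_factory=list)


class CandidateRegistry:
    """In-memory store of candidate processes, keyed by process name."""

    def __init__(self) -> None:
        self._items: Dict[str, CandidateProcess] = {}

    def add(self, process: CandidateProcess) -> None:
        self._items[process.name] = process

    def modify(self, name: str, **changes) -> None:
        # Update only the given fields of an existing record.
        for key, value in changes.items():
            setattr(self._items[name], key, value)

    def delete(self, name: str) -> None:
        self._items.pop(name, None)

    def search(self, keyword: str) -> List[CandidateProcess]:
        # Case-insensitive match on the name or the scope description.
        needle = keyword.lower()
        return [p for p in self._items.values()
                if needle in p.name.lower() or needle in p.scope.lower()]


registry = CandidateRegistry()
registry.add(CandidateProcess(name="Invoice entry", scope="accounts payable"))
registry.add(CandidateProcess(name="Supplier data update", scope="procurement"))
registry.modify("Invoice entry", metrics=["handling time per invoice"])
print([p.name for p in registry.search("invoice")])  # ['Invoice entry']
```

A real tool would additionally need persistence and some means of scoring each candidate for its suitability for automation; both are outside the scope of this sketch.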
Another important research line is related to software testing. In traditional software development methodologies, a testing environment is used before deployment to the production environment. However, such an environment is rarely offered in RPA, which entails a high risk for the Deployment phase. When running an automated process, it is fundamental to ensure that there are no errors during the execution of the robots in production environments. However, only a few proposals covering this topic have been found in the literature.
RPA appears to be designed to improve organizational performance and reduce human resource costs when performing repetitive tasks. However, it would also be interesting to measure how the application of RPA in a company affects levels of competence, development, research, etc., in order to verify that the cost of applying and maintaining RPA is lower than the savings obtained.
Last but not least, it is necessary to investigate what impact the application of RPA has on the company's employees, and consequently on the organizations themselves, so that an optimum balance can be found.