The Use of Chatbots in Digital Business Transformation: A Systematic Literature Review

Research on chatbots has gained momentum over the past few years. Academics and practitioners investigate how these tools for communicating with customers or internal teams can be improved in terms of their performance, acceptance, and deployment. Although a plethora of recent studies is available, not all of them deal with the digital business transformation implications of chatbots. The main aim of the research presented in this paper was to conduct a systematic literature review of high-quality journal research papers in order to summarise the current state of research on chatbots, identify their role in digital business transformation, and suggest areas warranting further attention. A total of 74 papers were included in the research. Topical (focus and applications), methodological (methods used, sample size, sample type, and countries studied) and bibliometric (publication outlet, citations, and Altmetric Attention Score) aspects are evaluated and described. Scholars and practitioners can use the results to identify topics, areas, and applications that are intensely discussed in the literature and require further attention, select a methodology for their research that is well established in the field or is emerging, identify the most influential publications not to be missed in their research, or identify publication outlets for publishing their research on chatbots.

chatbots are developed as Machine Learning (ML) or AI driven chatbots [12], but the advantage of deploying AI driven chatbots is that they give the impression of being intelligent as they get smarter with increased data and user interactions [13].
Chatbots can be defined as 'software that accepts natural language as input and generates natural language as output, engaging in a conversation' [14]. Another definition accentuates their attempted human-like character: 'Chatbots are interactive virtual characters whose mission is to assist people in high-profile environments' [15]. Apart from engaging in written conversations (text-based chatbots), chatbots also have the ability to mimic human speech (voice-based chatbots) to improve user experience and cultivate customer loyalty [15], [16].
Chatbots can be found on websites, social media or instant messaging apps [17], [18]. They can be deployed within an organisation to assist with various services and processes such as internal support systems, IT Service Management (ITSM), learning or human resources management (HRM) [15], [19]-[24].
For external communication, standalone chatbots can also represent an alternative to branded websites [25]. They have been deployed to provide services in many areas such as customer relationship management (CRM), customer service or sales and marketing [26]-[30]. Chatbots are used to make product or service recommendations regarding shopping, financial or health-related decisions [25], [31]-[34]. Researchers are, amongst other things, investigating how to build better social bots for interaction in business or commercial environments, how to improve services with chatbots, which factors affect user perceptions of chatbots, and how to encourage repeated use of chatbots [35]-[42].

II. OBJECTIVES OF THE STUDY
Researchers have been examining the various uses of chatbots, the factors affecting their acceptance by users, and the creation of new algorithms and frameworks for chatbot deployment to increase their efficiency. The number of studies on chatbots has increased significantly over the past few years, which can make it difficult for researchers to navigate the space and identify areas that need further attention. The aim of this paper is to fill this gap and provide a comprehensive overview of academic studies on chatbots.
Although many papers have been written that focus purely on the development of chatbots, our research recognises the need for interdisciplinary research and therefore focuses on papers that identify clear business implications of chatbot use and development, both inside an organisation (internal environment), and targeted at various external stakeholders, mainly customers.
The paper provides an overview of relevant research in high-quality journal research papers, in order to summarise the current state of research on business implications of chatbots and identify the research gap that requires further attention. The paper aims to answer the following research questions:
RQ1: What are the focus areas and applications of the existing research on chatbots?
RQ2: Which methodologies have been used in the current research and what are the characteristics of the samples used?
RQ3: Which journals publish most of the research from this field and which publications are the most influential?
RQ4: What are the potential directions for future research in this area?

III. METHODOLOGY
A systematic literature review (SLR) was selected as the best method to achieve the defined objectives [43]. The process of identification and analysis of relevant papers for the purpose of this SLR consisted of three steps: i) Initial database search; ii) Title and abstract screening; iii) Detailed full-text analysis. These steps are described below.

A. INITIAL DATABASE SEARCH (IDENTIFICATION)
The Web of Science database was selected as the source of papers for this SLR. To list possibly matching papers, the following search query was entered into the new (Beta) interface of Web of Science:
chatbot* (Title) or chat bot* (Title) or chatterbot* (Title) or chatter bot* (Title)
The results were refined to include only articles by setting the Document Types filter to 'Articles'. In this step, 298 papers were identified. This step of the SLR was completed on 27 April 2021.
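For readers who want to re-screen an exported record list locally, the title query above can be approximated with a regular expression. This is only a sketch: the record fields `title` and `doc_type` are hypothetical names for an exported list, and treating the trailing wildcard as "any further word characters" is an assumption about the database's matching rules, not a description of how Web of Science evaluates the query.

```python
import re

# The four title terms from the search query. The trailing wildcard is
# modelled as "zero or more further word characters" (an assumption).
TITLE_PATTERN = re.compile(
    r"\b(?:chatbot|chat bot|chatterbot|chatter bot)\w*",
    re.IGNORECASE,
)


def matches_title_query(title: str) -> bool:
    """Return True if the title contains any of the four search terms."""
    return TITLE_PATTERN.search(title) is not None


def filter_articles(records):
    """Keep records typed as 'Article' whose title matches the query.

    `records` is a list of dicts with hypothetical keys 'title' and
    'doc_type'; real export formats will differ.
    """
    return [
        record
        for record in records
        if record.get("doc_type") == "Article"
        and matches_title_query(record.get("title", ""))
    ]
```

A local filter of this kind is useful mainly as a sanity check on an export; the count of 298 papers reported above comes from the database search itself.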

B. TITLE AND ABSTRACT SCREENING (SCREENING)
In the second step, the appropriateness of the papers for this SLR was determined by reviewing their title and abstract. Only papers with direct business implications were included. The following exclusion criteria were applied: i) Specific application to an unrelated industry such as health care, disaster management, forensics, or hospitality (restaurants); ii) COVID-19 and religion perspectives. The title and abstract screening resulted in 158 selected papers. This step of the SLR was completed by 10 May 2021.

C. DETAILED FULL TEXT ANALYSIS (ELIGIBILITY)
For each of the 158 publications, full texts were obtained, read, and analysed.
The exclusion criteria applied at this stage were: i) Paper type: although the filter in WoS was set to show only journal papers, WoS returned a few other document types, e.g. book chapters, which therefore needed to be excluded at this stage; ii) Content: papers with purely programming/technical perspectives, such as algorithm improvements, and papers with other overly narrow implications, such as pedagogy, psychology, or humour; iii) Language: papers written in languages other than English; iv) Quality: papers with missing or insufficient methodology or literature review, or with other major deficiencies; and v) Full text unavailable: no full text could be obtained.
The decision to exclude papers written from a purely programming/technical perspective was based on the following assumption: although these papers could produce very interesting results, e.g. the ability to build chatbots based on smaller data sets or to make chatbots more human without drastically increasing resource requirements, they have limited application beyond the IT/programming field.
The decision on which papers to keep or exclude was made through consensus between the authors. Based on the consensus of the first two authors, records for exclusion were identified in the screening and eligibility phases. The third author performed quality control and served as a mediator in case a dispute resolution was required.
The flow of information through the different phases of this SLR is depicted in the PRISMA flow diagram (Figure 1). To eliminate potential bias, key characteristics of the SLR were determined by the authors prior to initiating phase 1. This included the definition of research questions. Involving the third author in the discussions regarding the research framework also helped reduce the risk of bias, as he had not specialised in chatbot research before.
The research sample included 74 papers from 54 different journals. Twelve of these journals published more than one paper. The most popular journals were Computers in Human Behavior (6 papers), International Journal of Advanced Computer Science and Applications (4 papers), Electronic Markets (3 papers), and Journal of Business Research (3 papers). Journals that published more than one paper are shown in Table 1.
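The record counts reported for the three phases (298 identified, 158 retained after screening, 74 included) imply how many records were dropped at each transition. The short sketch below derives those numbers; only the three totals come from the text, and the breakdown by individual exclusion criterion is not reported here, so it is not modelled.

```python
# Record counts reported in the text for each phase of the SLR.
PHASES = [
    ("identification", 298),
    ("screening", 158),
    ("eligibility", 74),
]


def exclusions(phases):
    """Return (phase, records_excluded) for each transition between phases."""
    return [
        (later_name, earlier_count - later_count)
        for (_, earlier_count), (later_name, later_count) in zip(phases, phases[1:])
    ]


for phase, removed in exclusions(PHASES):
    print(f"{phase}: {removed} records excluded")
```

Under these totals, 140 records were excluded during title and abstract screening and a further 84 during full-text analysis.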
A protocol developed by Hao [43] was adopted for the purpose of this SLR and used to collect and evaluate data about i) research focus and design (research type and terminology used); ii) methodology (data collection, sample size, sample type, and countries studied); and iii) bibliometric aspects (publication outlet and citations). A similar methodology was used in a paper written by Sepasgozar et al. [44] about the systems developed and technologies used for smart homes, in which they i) reviewed relevant papers published between 2010 and 2019, within databases such as Scopus, ii) analysed the papers in terms of bibliography and content to identify more related systems, practices, and contributors, iii) used a systematic review method to identify and select the relevant papers and iv) reviewed these relevant papers for their content by means of coding.
To assess the research impact, citations and Altmetrics were used. Citations are the traditional way of determining the influence of academic work [45]. Google Scholar was used to determine the total number of citations. The evidence shows that Google Scholar is still the most comprehensive source of citations, outperforming both traditional (Web of Science, SCOPUS) and new (Microsoft Academic, Dimensions, and the OpenCitations Index of CrossRef open DOI-to-DOI citations) sources of citations [46]. After gathering the total number of citations for each publication on Google Scholar, the annual average number of citations was calculated for every publication.
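The annual average can be sketched as follows. The text does not state how the number of years was counted, so the convention used here (the publication year counts as the first year, with 2021 as the collection year) is an assumption, and the sample records in the usage note are invented purely for illustration.

```python
COLLECTION_YEAR = 2021  # citation counts were gathered in 2021


def citations_per_year(total_citations, publication_year,
                       collection_year=COLLECTION_YEAR):
    """Average citations per year since publication.

    Assumption: the publication year counts as year one, so a paper
    published in the collection year is divided by 1.
    """
    years = collection_year - publication_year + 1
    if years < 1:
        raise ValueError("publication year lies after the collection year")
    return total_citations / years


def rank_by_citations_per_year(papers):
    """Sort paper records (dicts with 'citations' and 'year') in descending order."""
    return sorted(
        papers,
        key=lambda paper: citations_per_year(paper["citations"], paper["year"]),
        reverse=True,
    )
```

For example, with invented records `{"id": "A", "citations": 120, "year": 2019}` and `{"id": "B", "citations": 90, "year": 2021}`, paper B ranks first (90 per year against 40), which shows why a per-year figure can surface recent papers that a raw citation count would bury.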
The Altmetric Attention Score was used as a metric to complement the citation analysis, thereby providing additional insights into the research impact and reach. Altmetrics are metrics and qualitative data that include (but are not limited to) peer reviews on Faculty of 1000, citations on Wikipedia and in public policy documents, discussions on research blogs, mainstream media coverage, bookmarks on reference managers like Mendeley, and mentions on social networks such as Twitter [47]. The Altmetric Attention Score is an automatically calculated, weighted count of all the attention a research output has received, based on three factors: volume, sources, and authors [48]. Both the number of citations and the Altmetric Attention Score data were collected on 25 May 2021.

IV. FINDINGS
The findings part of the paper is organised to provide data for answering the research questions. Firstly, we present the analysis of focus areas and applications of the research, followed by details on methodologies used. Lastly, the most influential publications are identified by means of citations analysis enhanced by Altmetric data.

A. RESEARCH FOCUS AND APPLICATIONS
The focus was identified for every paper in the research sample. If a paper covered more than one area, the most dominant area was selected. The highest number of studies focuses on user perceptions of chatbots and their acceptance by users (16 papers), followed by communication (8 papers), the use of chatbots for customer service (7 papers), performance of and satisfaction with chatbots (7 papers), and learning (6 papers). The focus areas, along with the references, are shown in Table 2.
In Table 3, various applications of chatbots are identified. The main application accentuated in each paper was used to map references to the applications in the table. In total, 41 papers could be mapped; the research in the other papers had more general applications that were not restricted to a certain area. One paper [19] identifies multiple applications of chatbots and was therefore not added to one particular category.

B. METHODOLOGIES USED
The most frequently used research methods within the sample of papers are experiment (26 papers) and questionnaire (16 papers), followed by development/prototyping (10 papers). Details on the methods used, along with the references, are listed in Table 4.
In research studies where participants were involved, the sample size ranged from 4 [49] to 6,255 [50]. In most cases, the structure of the research sample was diverse. Students were used as participants in 11 papers [11], [31], [32], [51]-[58]. For many research studies, the participants resided in various countries, or the details of their residence were not disclosed. For 26 studies, the country of focus was disclosed. Details are offered in Table 5.

C. JOURNALS AND RESEARCH IMPACT
In Table 1, the journals that published two or more papers from the SLR are displayed. Table 6 shows the 20 most cited papers, ranked in descending order by citations per year.
In Table 7, papers with the highest Altmetric Attention Scores are presented.

V. DISCUSSION
In this section we indicate how the first three research questions are addressed.

A. RESEARCH FOCUS AND APPLICATIONS
The first research question (RQ1) refers to the identification of the research focus and applications of the research on chatbots. Various focus areas have been identified (24 in total), as listed in Table 2. The focus of the research studies mostly relates to perceptions and acceptance of chatbots (16 studies). Researchers investigated parameters and features that make chatbots more (or less) accepted by users and where their usage ultimately resulted in their higher (or lower) acceptance. Examples include studies about chatbot gender perceptions [59], attitudes towards warm versus competent chatbots [32], and discomfort when using chatbots, comparing reactions to a simple and an animated avatar chatbot [60]. The studies that are centred around communication (eight studies) focus on analysing chatbot communication from various perspectives, comparing various means of communication, the use of emojis [11] or properties that make chatbots more human (anthropomorphism) [27], [52], [61]. Customer service was at the centre of seven studies. The researchers investigated improving customer service via effective chatbots [62], extracted feelings from chatbot data [63], developed a chatbot with advanced learning skills [64] or identified the factors affecting satisfaction with customer service [65]. Other popular topics included performance/satisfaction (seven studies), learning (six studies), and development/deployment (five studies).
Chatbots can play a role in the digital transformation of many areas of a business. Identifying applications of chatbot deployment aims to determine in which processes and environments, whether internal or external, chatbots can and should be deployed. As per Table 3, human resources was the area featured in the highest number of studies (eight studies), followed by e-commerce and learning management systems (LMS), each the focus of six studies. The impact of social presence and enjoyment of mobile messenger chatbots on consumers' purchase intentions [78], customer purchasing behaviour and trust in chatbots [79], and the usefulness of chatbots for shopping [80] were some of the phenomena investigated in the e-commerce application area. In reference to LMS, researchers looked, for example, at using voice messages in learning with chatbots [81] or suggesting the best e-learning content to the user, including multimedia [82].
Researchers also investigated how customer service and customer experience can benefit from chatbot deployment (five studies). Next, the use of chatbots in financial services and insurance was investigated (five studies), for example which three factors affect customer satisfaction with chatbots in the banking industry [83] or the use of chatbots in insurance [84]. A chatbot that recognises user perceptions via connected cameras, useful when conducting presentations [85], is an example from the sales application (five studies). Two studies examined marketing applications of chatbots, and one study each focused on the remaining four applications: CRM, internal support/ITSM, innovation management, and multiple touchpoints.

B. METHODOLOGIES USED
This section answers RQ2: 'Which methodologies have been used in the current research and what are the characteristics of the samples used?' As Table 4 reveals, experiment is the most frequently used method to examine chatbots and their business implications: 26 studies used experiment as their main method, followed by questionnaire (16 papers). Some of the least utilised methods include patent analysis [86], content analysis [29], conceptual framework creation [20], secondary research [12], [23], [24], [87], and case study [22], [67], [88], [89].
The research samples consisted of diverse types of participants. Students were the most frequent participants in research with people involved. A significant number of studies included participants from various backgrounds, and they were often recruited via a crowdsourcing marketplace, such as Amazon Mechanical Turk. The sample sizes with participants ranged from 4 to 6,255.
For most studies (48 out of 74), the country of focus was either not disclosed or the participants were from various regions and countries. Of the 26 studies focusing on one country, the USA was the most studied country (7 papers), followed by Germany (3 papers) and South Korea (3 papers). Canada, India, and the Netherlands were each investigated in two studies. Other studies included participants from China, Great Britain, Greece, Italy, Japan, Romania, and Turkey (one study each).
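The country figures in this paragraph can be cross-checked against the sample size of 74. The sketch below simply tallies the per-country counts stated above; it assumes each single-country study is counted under exactly one country.

```python
from collections import Counter

TOTAL_PAPERS = 74

# Per-country study counts as stated in the text.
COUNTRY_COUNTS = Counter({
    "USA": 7,
    "Germany": 3,
    "South Korea": 3,
    "Canada": 2,
    "India": 2,
    "Netherlands": 2,
    "China": 1,
    "Great Britain": 1,
    "Greece": 1,
    "Italy": 1,
    "Japan": 1,
    "Romania": 1,
    "Turkey": 1,
})

single_country_studies = sum(COUNTRY_COUNTS.values())
undisclosed_or_multi_country = TOTAL_PAPERS - single_country_studies

print(single_country_studies, undisclosed_or_multi_country)
```

The tally gives 26 single-country studies and 48 remaining studies, matching the split reported in the text.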

C. JOURNALS AND RESEARCH IMPACT
To answer the third research question (RQ3), we analysed which journals publish most of the research on chatbots and their business implications. There were 12 journals that published more than one paper from this SLR. Computers in Human Behavior (six studies) and International Journal of Advanced Computer Science and Applications (four studies) were most popular.
Identifying the most influential publications was the core of RQ3. The 20 papers with the most citations per year are listed in Table 6. Five papers have more than 40 citations per annum [51], [53], [55], [60], [61]. The total number of citations varied significantly between studies. Three studies had more than 200 citations in the Google Scholar database [22], [51], [53], and a further six papers had more than 100 citations [50], [55], [60], [61], [66], [71]. A total of 24 papers were cited between 10 and 100 times, and 13 papers had no citations. Table 7 also shows which papers are currently actively discussed in the online space. The papers with the highest Altmetric Attention Score are listed, with 13 of them featuring a score above 10.

VI. CONTRIBUTION OF THE STUDY
Identifying areas needing future research attention was at the centre of the fourth research question (RQ4) and represents one of the contributions of this study. We identified topics and applications that warrant further research. When choosing a focus area for their research, academics and practitioners could turn their attention to areas that are topical but have not been investigated thoroughly. These include, for example, the use of chatbots for innovation, surveys, purchasing, stress management, news distribution or security. If researchers want to focus on a current application of chatbots that is relevant and has not been the focal point of many previous studies, they can investigate the use of chatbots in marketing, CRM, internal support/ITSM, innovation management or multiple touchpoints (how chatbots can help integrate touchpoints or help serve customers across more of them).
Another contribution of our study is a comprehensive overview of methods used in the field of chatbot research. Researchers can now understand which methods dominate the research field of chatbots and their business implications. Experiment and questionnaire were found to be the most often used methodologies: one or the other was used in more than half of the studies (56.7 percent). These findings have two possible implications. If researchers want to use a method that is standardised and widely accepted in the field, they can use one of the most popular methodologies. Another option is to choose a methodology that has not been used in a large number of studies, thereby enriching the field not only by providing results from a different sector, perspective or application, but also by developing a methodological application that has not received much attention. Patent analysis, content analysis, conceptualisation, secondary research, and case studies represent such opportunities.
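The share quoted above follows directly from the method counts reported in the findings (26 experiment papers and 16 questionnaire papers out of 74). The sketch below reproduces the arithmetic; it assumes each paper is counted under a single main method, so the two counts can be added without double counting.

```python
TOTAL_PAPERS = 74

# Main-method counts reported in the findings.
METHOD_COUNTS = {
    "experiment": 26,
    "questionnaire": 16,
    "development/prototyping": 10,
}


def share(count, total=TOTAL_PAPERS):
    """Percentage of the sample represented by `count` papers."""
    return 100 * count / total


combined = METHOD_COUNTS["experiment"] + METHOD_COUNTS["questionnaire"]
print(f"{combined} of {TOTAL_PAPERS} papers = {share(combined):.2f} percent")
```

The two methods together account for 42 of the 74 papers, or roughly 56.76 percent, consistent with the figure given in the text.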
By identifying the journals that have published most of the research on chatbots and their business implications, we help researchers choose a publication outlet. The journals identified in Table 1 published more than one paper on this topic, so a quality paper submitted to them may have a better chance of being considered for publication. Identifying the most cited studies also contributes to the current knowledge in the research field. Based on this overview, researchers can ensure that they read the most impactful papers that have been published. There are nine studies with more than 100 citations, three of which are cited more than 200 times. We also created an overview of citations per year, as this takes the time factor into account and helps reveal papers that have had a very strong impact over a shorter time. By including the Altmetric Attention Score in the impact analysis, we also enrich the theory and methodology of conducting systematic literature reviews, which have mostly relied on traditional citation analysis. It is mostly newer studies (2019-2021) that have high Altmetric Attention Scores, and these values do not necessarily correlate with the most cited studies. Thus, the overview in Table 7 helps researchers identify studies that are currently being discussed. These 10 studies with the highest Altmetric Attention Score [30], [50], [58], [60], [61], [72], [90]-[92] should not be overlooked when conducting a study on chatbots that also covers their business implications.

VII. LIMITATIONS
Our study has certain limitations. Firstly, although an effort was made to search for relevant keywords in resources in different databases, the search will never be entirely exhaustive, as some relevant studies may have been omitted due to the filtering process that was adopted. Only papers that specifically used the 'chatbot' terminology were included, while papers that used related terminology (e.g. conversational agents) were not included in the research sample. Secondly, only journal papers were included in the research, while conference papers, books, book chapters, monographs, dissertations, and other potentially relevant studies and reports were omitted. Thirdly, only studies where the business implications of chatbots were clearly articulated or could be directly derived were included. Although the selection process was made as objective as possible, some papers with implications for business that were not clearly articulated might have been excluded.
ANDREJ MIKLOSIK has brought his extensive experience from IT project management and consulting into academia. He is currently an Associate Professor with the Department of Marketing, Faculty of Management, Comenius University in Bratislava. He is also the holder of industry certifications, including ITIL, PRINCE2, CISA, CISM, CRISC, and CGEIT. He has authored more than 180 publications, including numerous monographs and university textbooks focused on IS in business and marketing, digital marketing, IT project management, and knowledge management. Serving the community, he is a reviewer for several IT, marketing, and management journals.
NINA EVANS received the bachelor's degree in computer science, the master's degree in information technology, and the M.B.A. and Ph.D. degrees. She has worked at various higher education institutions as a lecturer, an industry liaison manager, an associate head of the school, the head of the department, and the vice dean. She is currently an Associate Professor with the UniSA STEM, University of South Australia. Her research interests include information and knowledge management, managing the business-IT interface, social networks, and ICT innovation.
ATHAR MAHMOOD AHMED QURESHI received the degree (Hons.) in computer sciences, the master's degree in ICT management, and the Ph.D. degree. He is currently a teaching academic at UniSA Business. He is also a Certified Knowledge Manager. His approach to development in both teaching and research is through effective strategy and active learning and absorption of knowledge. As the chief investigator, he is working with industry partners and leading an interdisciplinary team investigating the strategic consequences of digital transformation. He has been awarded the RTIS Grant for projects such as Digital Business Transformation.
VOLUME 9, 2021