Microtasking Activities in Crowdsourced Software Development: A Systematic Literature Review

With the adoption of crowdsourcing as a problem-solving approach, the software industry has progressed considerably in recent years. Crowdsourcing harnesses distributed human intelligence to solve complex problems in software development, machine learning, linguistics, medicine, interpretation and other fields of study. Different models of crowdsourcing have been used depending on the nature of the required outcome of the task, with varying levels of success to date. Microtasking is one of the lucrative models of crowdsourcing; it decomposes complex tasks into short, self-contained microtasks which can be performed in a few minutes. Although a considerable number of studies have explored the kinds of microtasks, existing research falls short in covering technical as well as non-technical tasks and in categorizing the relevant microtasks. Thus, the aim of this research is to understand the context of microtask-related crowdsourcing and to explore the microtasks related to crowdsourced software development which exist in the literature. A systematic literature review is conducted to identify the microtasking activities, and an expert review is conducted to validate the identified activities and their categories. The final publication sample comprises 42 research articles, and the reviews of 4 experts are used for validation. A total of 72 microtasking activities are found, along with 11 categories. After validation, the researchers arrived at a list of 61 unique microtasking activities. This paper contributes to the software industry by providing a list of microtasks along with their categories, which will be useful for researchers, microtasking platforms and their clients.


I. INTRODUCTION
Microtasking is regarded as one of the remunerative models of crowdsourcing, which utilizes the shared cognitive efforts of an online crowd [1], [2]. This model involves the decomposition of large and complex tasks into a number of simple, short and self-contained units (generally known as microtasks) [3], [4]. Microtasking involves the shared effort of a large number of remote workers (generally known as the crowd) who participate in solving clearly defined and independent tasks; it reduces the expense of geographical participation and the need for crowd workers to travel, thus saving time and cost [5], [6], [7]. Different online platforms have been developed to provide services to clients as well as crowd workers in terms of microjobs (another term for microtasks), in order to reduce unemployment, especially in developing communities [8]. These platforms support a variety of tasks and offer their users different incentives in terms of remuneration, social recognition, bonuses and e-gifts [9], [10]. A few platforms support specific tasks (Quicktate and iDictate for call auditing, and Topcoder for programming), while most platforms offer their clients a variety of tasks related to designing, programming and development, testing and quality assurance, interpretation and analysis, and content writing [11], [12]. Figure 1 presents the decomposition of a task into multiple microtasks. On a microtasking platform, a logo design task is requested by a client. Depending on the managerial policies of the platform, a copilot (an experienced individual paid by the platform to perform the task) decomposes the task into multiple microtasks, i.e., sketching the design element for the required logo, selecting appropriate colors and fonts for the logo, and suitably positioning the design element along with the text to achieve the final outcome.
Accordingly, microtasking supports the accomplishment of substantial digital tasks by decomposing complex tasks into a number of microtasks which can be performed by the diverse remote micro-workers available on microtasking platforms [4]. Published microtasks can be technical (programming and development) as well as non-technical in nature [21]. Only a few noteworthy studies [18], [22], [23] have investigated the different kinds and examples of microtasks which exist on the web. However, their findings did not cover all the possible and existing microtasking areas. As a consequence, without adequate knowledge of what types of microtasks can be generated from a complex task, clients may suffer from late completion of projects, and microtasking platforms from assigning tasks to inappropriate workers. Thus, it is essential to investigate the microtasks related to crowdsourced software development. This research poses the following research question: RQ: What kinds of microtasking activities related to crowdsourced software development are presented in the research literature?
The goal of the research question is to understand the context of microtask-related crowdsourcing and to identify the microtasking activities which exist in crowdsourced software development. Moreover, the aim of this research is to produce a list of the microtasking activities, along with their categories, which exist in the literature. The rest of the paper is organized as follows. The definition of crowdsourcing, its usage in software engineering, the models of crowdsourcing, and the models, usage and platforms of microtasking are explained in Section II. The methodology adopted to answer the research question, together with the guideline followed, is presented in Section III. Findings of the research are presented in Section IV. Validation of the findings is presented in Section V. Limitations and future directions are explained in Sections VI and VII respectively. Finally, the conclusion is presented in Section VIII.

II. BACKGROUND
The utilization of crowdsourcing has become a new paradigm for solving complex problems [24]. It uses an outsourcing model which involves the participation of all stakeholders through a platform [25]. The term crowdsourcing was first used by Jeff Howe in 2006, defined as "the act of a company or institution taking a function once performed by employees and outsourcing it to an undefined (and generally large) network of people in the form of an open call. This can take the form of peer-production (when the job is performed collaboratively), but is also often undertaken by sole individuals" [26]. With the persistent development of computer applications and web-based platforms, academia and the IT industry have been utilizing stakeholders' cognitive capabilities, which has made crowdsourcing a dominant approach for the development of complex projects [22]. It is being used for different purposes, e.g., information exchange and data transcription, product design and development, testing of products, creation of taxonomies, crowdfunding, consensus, designing of biomolecules and software development [27], [28]. Software crowdsourcing is a rapidly growing problem-solving approach which utilizes the metacognitive efforts of the online crowd [24]. It eases the software development life cycle (SDLC) by decomposing the tasks and then distributing them to the best available crowd. It has been widely used in various applications, e.g., Youtube, Wikipedia, Linux, reCAPTCHA, GoogleEarth and Yahoo Answers [28], [29]. Encyclopedia is another example of a crowdsourced application, which was developed by 70,000 participants and supports 290 languages with 35 million articles [30]. Different crowdsourcing models are available, which can be selected on the basis of requirements, i.e., the number of stakeholders required to accomplish a specific project, the best available platform for the project, and how the open call method will be used [1].
The literature has revealed four models of crowdsourcing, namely the peer production, competitions, investments and microtasking models [30]. Peer production is one of the mature models of crowdsourcing, in which collaborators (crowd workers of this model) contribute to the project to gain experience and knowledge rather than any financial reward [31]. Open-source software such as Rails, Linux, Apache and Firefox are the best-known examples of the peer production model, for which programmers from around the world develop and update the latest versions [32].
The competitions model is related to the conventional method of outsourcing, in which clients post the required project on crowdsourcing platforms and contestants (crowd workers of this model) compete on it. A copilot (an experienced individual paid by the platform for the accomplishment of the task) decomposes the project into multiple tasks known as competitions. Every contestant provides their best solution according to their expertise, and the provider of the best solution, as selected by the copilot, gets paid [33]. This model is suitable when high-quality and diverse results are required by the client. Different crowdsourcing platforms implement the competitions model, e.g., Topcoder, 99designs, testbirds and uTests, to crowdsource development, designing, usability and system testing related tasks respectively [9], [12], [34]. The investments model is similar to crowdfunding, in which crowd workers (fundraisers, mostly entrepreneurs) collect funds by using crowdsourcing platforms which allow them to access the market directly [35]. Investors who contribute to the funds take financial risks to support the development of a software project and anticipate reimbursement [36]. Various platforms, e.g., kiva, sandawe, fundable and kickstarter, implement the investments model and provide interaction between fundraisers and investors [37].
The microtasking model is related to the decomposition of a complex task into a number of short, autonomous tasks requiring less skill, i.e., microtasks [6]. It supports the practice of distributed human computation by decomposing a macrotask (generally complex in nature) into self-contained short tasks which require less cognitive effort as well as less time [38]. In software engineering, microtasks are often known as microservices, i.e., the decomposition of a complex web-based task into a number of short and independent tasks [14]. Microtasking in crowdsourced software engineering can be achieved by two methods, i.e., the traditional method and the behavior-driven development (BDD) approach [3]. In the traditional method, each crowd worker performs a unique task, e.g., one individual writes the test cases for all the behaviors of the system, and/or another individual implements the testing process for all the behaviors. It requires continuous communication between the crowd workers to accomplish the task and to ensure consistency. In contrast, in the BDD approach a single crowd worker accomplishes the whole task for a behavior: the individual is responsible for writing the test cases for the behavior, and for implementing and debugging them [14]. Microtasking can be achieved by implementing either of its two models. Selection of the model depends on the expected results of the task, the nature of the problem, the required skillset of the crowd workers, managerial challenges and monetary reward [39]. The first model is related to the accomplishment of non-sequential, independent and atomic units of tasks which require limited skills, less execution time and less cognitive effort, and are hence paid less [40]. Samasource is a platform which implements this model of microtasking, providing its users with services related to image tagging and color and image identification.
The second model is related to the accomplishment of sequential, interdependent and interactive tasks which are performed by multiple crowd workers. Tasks under this model require a special skillset, probably longer execution time and greater cognitive effort. The literature has revealed that independent tasks are well defined, well mapped-out and well structured, whereas interdependent tasks require great collaborative effort and are probably ill-structured and not well defined [39].
With the persistent utilization of microtasking in recent years, different microtasking platforms have been developed to serve their users. A few platforms specialize in specific niches, e.g., Quicktate and iDictate only provide call auditing related services, TryMYUI provides user-interface related microtask services, and SurveyJunkie provides survey and sentiment analysis related microtask services [8]. Other platforms, e.g., My little job, click worker, field agent, swag bucks, rapid workers, ySense, prolific, PartTimeClicks, microworker and remotasks, offer their users diverse services related to data manipulation, research, testing and quality assurance, graphic designing, tagging and labelling [41].

III. RESEARCH METHOD
The authors followed the Systematic Literature Review (SLR) methodology to identify the microtasking activities which exist in the literature. To do so, the SLR guidelines by B. Kitchenham [42] were followed, as this is the detailed approach to conducting an SLR in software engineering [43]. This SLR involves a comprehensive review of studies which are related to microtasks and microtask-related crowdsourcing. The literature review was conducted on four databases (Science Direct, IEEE Xplore, Springer and ACM Digital Library) using the same search string. The steps followed in the SLR are detailed below.

A. SEARCH STRING FORMATION
The first phase of the search was the formation of the search string. The following steps were taken:
➢ The authors derived the major terms from the RQ. The major terms are 1) microtasking, 2) activities and 3) crowdsourced software development.
➢ Synonyms and alternative terms were identified for the major terms. Microtasking: (microtask, small task, simple task, short task, decomposed task, microtasking, independent task, micro-task), Activities: (types, kinds, tasks, actions), Crowdsourced software development: (crowdsourcing, software crowdsourcing, software outsourcing, crowdsourced development, crowdsourced software, crowdsourced computing).
➢ The authors used wildcards in search terms, where required.
➢ The authors used Boolean operators (OR, AND), where required, to combine the terms.
➢ After applying the search strategy, the final search string was formulated as follows: ("microtasking" OR "microtask" OR "small task" OR "simple task" OR "short task" OR "decomposed task" OR "independent task" OR "micro-task") AND ("activities" OR "types" OR "kinds" OR "tasks" OR "actions") AND ("crowdsourced software development" OR "crowdsourcing" OR "software crowdsourcing" OR "software outsourcing" OR "crowdsourced development" OR "crowdsourced software" OR "crowdsourced computing").
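The construction of the search string can be illustrated with a short sketch: synonyms within each major term are joined with OR, and the three groups are then joined with AND. The function and variable names below are illustrative assumptions and are not part of the original protocol; the terms themselves are copied from the final search string.

```python
# Synonym groups, in the order used in the final search string.
TERM_GROUPS = [
    ["microtasking", "microtask", "small task", "simple task", "short task",
     "decomposed task", "independent task", "micro-task"],
    ["activities", "types", "kinds", "tasks", "actions"],
    ["crowdsourced software development", "crowdsourcing",
     "software crowdsourcing", "software outsourcing",
     "crowdsourced development", "crowdsourced software",
     "crowdsourced computing"],
]


def build_search_string(groups):
    """OR the quoted synonyms inside each group, then AND the groups."""
    clauses = []
    for terms in groups:
        quoted = " OR ".join(f'"{term}"' for term in terms)
        clauses.append(f"({quoted})")
    return " AND ".join(clauses)


search_string = build_search_string(TERM_GROUPS)
print(search_string)
```

Keeping the synonyms in a list per major term makes it straightforward to add or remove a term and regenerate the same string for every database.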

B. PAPER SELECTION
The paper selection procedure was performed in three steps. In the first step, a total of 967 papers were found: 197 from Science Direct, 381 from IEEE Xplore, 98 from Springer and 291 from ACM Digital Library. Inclusion criteria were applied to these preliminary papers as follows:
➢ Papers were included if they addressed microtasks in general, microtasking activities in crowdsourced software development, or microtasks which exist in software development.
➢ Papers were included on the basis of the presence of the required keywords in the title or keywords of the article.
After applying the inclusion criteria, 288 papers were selected. In the second step, exclusion criteria were applied on the basis of the following parameters:
➢ Papers which only gave information about conference proceedings, or only contained a table of contents, were excluded.
➢ Papers whose title was in English but whose remaining content or full text was in another language were excluded.
➢ Papers which were repeated across data sources were excluded.

[Figure: paper selection flow. Preliminary papers, inclusion criteria applied; Step 2: exclusion criteria applied on papers included; Step 3: quality assessment applied on papers included; results in 42 final papers.]
A total of 77 papers were included after applying the exclusion criteria. In the third step, a quality assessment procedure was carried out to assess whether the required outcomes (microtasking activities) are presented in each paper. To do so, the 77 papers were randomly allocated, along with the quality assessment checklist by Kitchenham (shown in Table 1), to postgraduate students acting as researchers, who answered the checklist questions for the papers they read. Each member was provided with 7 papers; hence the papers were distributed among 11 respondents. The scoring scale was: Yes = 1, Partially = 0.5, No = 0. For each paper, the scores of the checklist questions were accumulated. Papers whose accumulated values ranged between 0.5 and 1 were selected for final review. Of the 77 papers, 35 had an accumulated score below 0.5; hence the remaining 42 papers were selected to find the microtasking activities.
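The scoring rule can be sketched as follows. Because the accumulated values are reported to range between 0.5 and 1, this sketch assumes that the per-question scores are averaged over the checklist; the text does not state the aggregation explicitly. The paper IDs, the four-question checklist and the answers are hypothetical and serve only to illustrate the rule.

```python
# Scoring scale from the text: Yes = 1, Partially = 0.5, No = 0.
SCORE = {"yes": 1.0, "partially": 0.5, "no": 0.0}
THRESHOLD = 0.5  # papers scoring between 0.5 and 1 are kept


def quality_score(answers):
    """Mean checklist score; averaging is an assumption of this sketch."""
    return sum(SCORE[answer] for answer in answers) / len(answers)


def select_papers(assessments):
    """Return the IDs of papers whose score reaches the threshold."""
    return [paper_id for paper_id, answers in assessments.items()
            if quality_score(answers) >= THRESHOLD]


# Hypothetical answers for three papers on a four-question checklist.
assessments = {
    "P1": ["yes", "yes", "partially", "no"],        # 2.5 / 4 = 0.625
    "P2": ["no", "partially", "no", "no"],          # 0.5 / 4 = 0.125
    "P3": ["yes", "partially", "partially", "no"],  # 2.0 / 4 = 0.5
}
selected = select_papers(assessments)
print(selected)

# Arithmetic reported in the text.
assert 11 * 7 == 77   # 11 respondents with 7 papers each
assert 77 - 35 == 42  # papers remaining after quality assessment
```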

C. INFORMATION EXTRACTION
Data from each selected paper were extracted on the basis of: data source (database), title, publication type (journal, conference, book chapter, thesis), conference/journal/book/thesis name, publication year, authors' names, methodology applied in the paper, and microtasking activities. The data extraction form is shown in Table 2.
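As an illustration, the extraction fields listed above could be represented as a simple record type. The class name, field names and the `problems` check are assumptions made for this sketch, and the example record contains placeholder values rather than data from any reviewed paper.

```python
from dataclasses import dataclass, field
from typing import List

# Allowed values taken from the text of Section III.
DATA_SOURCES = {"Science Direct", "IEEE Xplore", "Springer", "ACM Digital Library"}
PUBLICATION_TYPES = {"journal", "conference", "book chapter", "thesis"}


@dataclass
class ExtractionRecord:
    data_source: str
    title: str
    publication_type: str
    venue_name: str          # conference/journal/book/thesis name
    publication_year: int
    authors: List[str]
    methodology: str
    microtasking_activities: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """List simple consistency issues found in the record."""
        issues = []
        if self.data_source not in DATA_SOURCES:
            issues.append(f"unknown data source: {self.data_source}")
        if self.publication_type not in PUBLICATION_TYPES:
            issues.append(f"unknown publication type: {self.publication_type}")
        if not self.microtasking_activities:
            issues.append("no microtasking activities recorded")
        return issues


# Placeholder record; every value here is illustrative only.
record = ExtractionRecord(
    data_source="IEEE Xplore",
    title="<paper title>",
    publication_type="conference",
    venue_name="<conference name>",
    publication_year=2017,
    authors=["<author name>"],
    methodology="<methodology>",
    microtasking_activities=["Data tagging", "Spam detection"],
)
print(record.problems())
```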

D. DATA SYNTHESIS
The authors extracted the data from each selected paper on the basis of the fields listed in Table 2. The results obtained from each paper, in the form of microtasking activities, are discussed in Section IV.

IV. RESULTS
This section presents the results. Each paper is given a unique ID, as shown in the Appendix. A total of 72 microtasking activities were found through the systematic literature review. Based on their execution process and nature, related microtasks were grouped into categories. Generic names were given to those categories; e.g., each microtask related to the matching or verification of a product is placed under the category 'Verification and Validation'. Table 3 shows the identified microtasks along with their respective categories, together with a brief description of each category to convey the functionality performed by its microtasks.

These tasks are related to the identification and finding of information, for example, finding the name of the conference or journal in which an article was published, or filtering data according to given requirements.
[61], [62], [14], [23], [67], [72], [

Information finding
These microtasks are related to general information, e.g., find information about an organization which is situated in another country, or identify authentic and unpaid Facebook pages which provide services related to bitcoin.

Data Filtration
These microtasks involve filtering specific data or information from a large body of available data, e.g., make a list of kids-related e-commerce websites which do not support payment by PayPal.

Data synthesize
These microtasks are related to the grouping of different modules of data in order to turn them into specific information.

Content verification
Any microtask which is related to the verification of content, e.g., verify the errors corrected in the code, or check whether a particular website provides the required relevant information.

Spam detection
These microtasks involve verifying that a spam filter is working correctly, e.g., send unsolicited or virus-infected email from another account and check whether the spam filter prevents those emails from reaching an inbox.

Data matching
These microtasks require the verification and matching of data against given or prescribed information. For example, review the clients' comments to check whether most of the freelancers in a specific niche are providing exactly the services described in their offer list (gig for fiverr).

Data tagging
Such microtasks require the crowd workers to assign suitable terms to a particular product or service by associating a piece of information with the relevant product or service. For example, 'give a suitable tagline for a given product to attract the audience of amazon'.

Product comparison
Microtasks which are related to the comparison of different products, e.g., compare the given products on the basis of their names, brand names, quantity, ingredients, etc.

Content creation
Data categorization
These microtasks are related to the categorization of similar entities or features in order to create content. For example, categorize the relevant features of the given product and give a suitable name to each category.

Data classification
Microtasks which refer to the classification of data, entities and elements on the basis of predetermined criteria. For example, from the given products and services, select the most appropriate service and product for each mentioned category.

Data enhancement
These microtasks involve reviewing content, or a piece of content, and then adding or removing material according to the worker's expertise, to make the content more appropriate and to check whether it conveys in-depth knowledge. For example, 'this site would like to explore your area; write something interesting about your area'.

Data selection
These microtasks refer to the selection of words/terms, images, audio or video for a specific topic in order to create content. For example, from the given media (images and video), select the media which are related to 'child labor'.

Gathering of terms for taxonomy creation
These microtasks involve gathering terms and words for the development of a taxonomy. For example, gather all the words and terms which are relevant to 'jurisprudence'.

Dataset's module creation
These microtasks are related to the addition of entities or data in the rows and columns of a dataset in case of missing data. Experts are required to accomplish these tasks. For example, add the missing values in the given dataset and create an extra row as a small module of the dataset.

Label an image
Microtasks which involve the description of a given product or service. For example, give a description of the given product in terms of its usability, reliability and customer feedback.

Pasting the data
Such microtasks involve pasting the given data at the appropriate position in a document or site, according to the arrangement or hierarchy of the content. For example, place the given two paragraphs at a suitable position in the given document.

Data mapping
Microtasks which require the crowd workers to map the data fields from one site or database to another, in order to merge different databases or sites into their master copy and to manage it for content creation.

Addition of annotations
These microtasks require the crowd workers to post comments on a product, service or platform according to their experience. For example, 'we are creating content; comment with your ideas and experience regarding parasailing, to let the clients know about the service we provide'.

Listing of data
Microtasks which involve making lists from given data. For example, from the given document, make a list of conferences which publish articles related to e-commerce.

Organizing the data
Microtasks which are related to organizing the data according to the flow of the content, the hierarchy of thoughts, the linkage of the content, the categories and the content under the categories. For example, organize the given document according to the mentioned criteria.

Restructure the data into standardized reports
Such microtasks require the crowd workers to organize the given content into the required formatting and convert it into the given standardized document. For example, 'a template has been attached; convert the given content into the required standard by checking the formatting of the content'.

Media transcription
Such microtasks involve the translation of audio or video from one language into another. For example, transcribe the given Chinese audio into English audio.

Data translation
These microtasks require the crowd workers to translate text into other required languages. For example, translate the given piece of information into English.

Human Optical Recognition tasks
Such microtasks are related to human computation regarding optical recognition. For example, 'type what you see in the given captchas', or 'transcribe the given scanned image into an editable text file by using any OCR (optical character recognition) software'.

Digitizing local-language documents
These microtasks require the crowd workers to transform information which a computer cannot process. For example, 'convert the given handwritten text into digital form', or 'transcribe the following analog audio recordings into digital form'.

Interpretation & Analysis
Sentiment analysis
Such microtasks involve the opinions and feelings of individuals regarding a product, service or platform on the basis of their experience. For example, do you like the 'product hunting' task on Amazon?

Content moderation
These microtasks refer to the moderation of content, i.e., checking whether the given content complies with the terms and conditions, or whether it is inappropriate or violates the guidelines. Content moderators ensure that nothing offensive or irrational reaches a specific site.

Data Analysis and interpretation
These microtasks are related to the personal thinking of individuals regarding a product, service, platform or comments. These tasks depend on the perception level, intelligence, expertise and experience of the crowd. For example, interpret the given graph according to your observation.

Interpretation of visual data
These microtasks are related to the interpretation of an image or video. For example, 'describe in a few words the duties performed by the father shown in the video', or 'describe the gestures shown by the kid in the given image'.

Checking and listing of websites
These microtasks involve checking whether websites fulfil specific criteria. For example, identify the e-commerce websites which deal with kids' toys in Pakistan and list them in order of delivery speed.

6. Surveys
Content feedback
Microtasks which involve collecting the feedback of individuals on a product, service or online platform. For example, give your valuable feedback to improve our site.

Conduct an interview
Microtasks which involve interviewing the crowd workers about a product, service, platform or a specific day, e.g., Mother's Day, Labor Day, etc.

7. Content access
Promotion e.g., webpages
These microtasks require the crowd workers to promote content as well as to access the promoted content, e.g., by clicking on the ads. For example, 'click the link given below to view the details', or 'you can find information relevant to your interest by clicking on the links below'.

Copying of the data
Microtasks which require the crowd workers to copy the data by simply accessing the content, for use in future tasks. For example, 'the following content is not copyrighted; you can save the data if it is of interest to you'.

Content access
Microtasks which require expert crowd workers to access the content by using content access software. These types of tasks are related to database management systems, inventory control systems or data warehouses.

Capture the photos
These microtasks require the crowd workers to access the data, product labels or database tables by simply capturing images of them.

Sharing of data with different sites
These microtasks involve sharing data (which can be in any format, e.g., image, audio, video, text, or a link to your social media or freelancing account) with other users and sites, to let viewers access the content you provide.

Watch an online video
These microtasks usually require the crowd workers to spend time watching a given online video, e.g., 'click on the given link to watch the animated video of the human nervous system for further understanding'.

Quality assessment & Testing
Debugging of program
Such microtasks involve the crowd workers in ensuring the quality of a program, e.g., 'from a given piece of code, identify the errors and remove them'.

Test a line of code
Such microtasks involve the crowd workers in ensuring the quality of a line of code. For example, 'from a given line of code, identify the error(s), if any, and remove them'.

Debugging of UI
These microtasks require the crowd workers to assess the quality of a user interface, e.g., 'check if the color scheme, font face, font size, positioning of images with respect to text, white spacing and alignment are according to the design brief'.

Implement a unit test
Such microtasks involve the crowd workers in ensuring the quality of code by implementing a unit test.

Algorithmic debugging
These microtasks involve the crowd workers in ensuring the quality of an algorithm, e.g., 'debug the given algorithm; if any error(s) exist, correct them'.

Delta debugging
These microtasks usually require the workers to ensure the quality of a program or piece of code by using a given automated debugging tool.

Identify, test, implement and debug the behaviors in code
These microtasks require the crowd workers to identify and remove errors according to the required programming behavior, implement the program with the amendments, and debug it again.

Locate known faults in code fragments
In these microtasks, chunks of code which need debugging are provided to the crowd workers, who are asked to identify and remove the errors to ensure the quality of the given code fragments.

Review of function behavior
Such microtasks involve the crowd workers in assessing the behavior of a function used in a program, in order to check whether the program performs the functionality it was intended to provide.

Implementing part of a function
Microtasks which require the crowd workers to implement part(s) of a function to assess the quality of the code. For example, 'implement the required part of a function in the given code'.

Adding pseudo-code
Such microtasks involve the addition of pseudo-code and comments to the code, to help reviewers understand the functionality performed by the code. For example, 'write the pseudo-code for the given requirement and then call the function to check the functionality'.

Human computation
These microtasks usually involve human computation in order to develop a system (which can be public) that checks whether the user is a robot. For example, 'mark the images in which birds are seen' or 'identify the images which show a green traffic signal'.

Identification
Identification of main decision points
Microtasks which usually involve the identification of the main decision points from a set of requirements. For example, 'a business plan document is given; you are required to identify the main decision points from it'.

Identification of alternative solution
These microtasks involve finding alternative solutions to a given problem. For example, a problem related to designing a trifold flyer is discussed in the given document along with one solution; you are required to provide alternative solutions with a comparatively low budget.

Identification of missing values in the dataset
Microtasks which refer to the identification of any missing data or information in the given dataset or the dataset's brief. These microtasks can be successfully performed by experts in the relevant fields. For example, a dataset related to the patients of a hospital is attached in the document; you are required to fill in the missing cells, if any.

VI. LIMITATIONS
This research is not without limitations. One limitation relates to the selection of digital libraries used to identify the microtasking activities. The researchers selected four databases; however, the authors may have missed microtasking activities reported in studies which these databases do not cover. Besides, four experts validated the findings of the study; however, the experts may have missed duplications or naming inconsistencies among the microtasking activities, or overlooked some of the activities. Another limitation relates to the selection of keywords used to create the search string for the SLR. It is possible that the selected keywords and search string are not well formulated with respect to the field of software engineering, especially microtask-related crowdsourcing. Thus, the search might have produced results which do not truly reflect the essence of the study.

VII. FUTURE FOCUS
Most software development now takes place through the utilization of distributed human cognitive effort, and experts are required to decompose complex tasks into multiple microtasks and to distribute them. In this regard, future studies can examine the pros and cons of automated and manual task decomposition systems. Moreover, a generic model which can decompose all types of tasks into microtasks can be developed in future. Further research can investigate what effects an automated task decomposition system, if developed, would have on microtask-related crowdsourcing and ultimately on crowdsourced software engineering.
Future research can also explore the remaining databases to identify the microtasking activities and categories which remain uncovered in this study. Besides, validation of the identified microtasking activities can be performed by using other methods. Moreover, different experiments on crowdsourcing platforms can be performed by using the identified microtasking activities.

VIII. CONCLUSION
The focus of this research was to understand the context of microtask-related crowdsourcing and to highlight the microtasking activities related to crowdsourced software development which exist in the literature. The comprehensive findings of the research will help researchers, microtasking platforms and their clients in selecting the right crowd worker to perform a specific task. As the possible microtasking activities are presented under each category, the findings will help microtasking platforms to scrutinize the expertise of crowd workers by giving them multiple tasks.