Data Collection in Sensor-Cloud: A Systematic Literature Review

The integration of cloud computing and Wireless Sensor Networks (WSNs) to create Sensor-cloud helps in extending the data processing and storage capabilities of WSNs. Because WSNs have weak communication ability, how to collect and upload sensory data to the cloud in limited time has become an issue in Sensor-cloud. In the last decade, with increasing interest from researchers, a considerable number of research works have been conducted and published in the domain. The main objective of this study is to systematically review the current research on data collection in Sensor-cloud. The study also aims at identifying, categorizing, and synthesizing important studies in the field. Accordingly, an evidence-based methodology is utilized, through which 43 relevant studies were identified and retrieved to answer the formulated research questions. The systematic methodology offers a methodical and rigorous study selection and evaluation process that is repeatable and precise. The result shows that research on data collection in Sensor-cloud is relatively consistent, with stable output in the last five years. Ten types of proposed contribution were identified, with System, Framework, and Algorithm being the most used by the selected studies. In conclusion, key research challenges and future research directions were identified and discussed for researchers to propose effective solutions to the existing challenges. Although research on data collection in Sensor-cloud has gained some traction in recent years, the works in the domain are not sufficient, and concrete proposals are needed to improve data collection.


I. INTRODUCTION
RECENTLY, Wireless Sensor Networks (WSNs) have been deployed in many applications, such as forest fire detection [1], agriculture [2], health monitoring [3], and so on. WSNs used for these applications normally generate a vast amount of data that needs to be collected and processed in a minimal time period with relatively low delay. However, sensors are known to have a limited battery with limited computing and storage capability to support huge data transmission and processing. This constraint frequently leads to a short network lifetime. As a solution, the data processing and storage abilities of WSNs can be extended using cloud computing [4]. With cloud computing, WSN performance can be improved in terms of service quality, computation latency, energy consumption, and so on. The integration of WSNs and cloud computing is termed Sensor-cloud. The last 10 years have seen quite a number of works on data collection in Sensor-cloud that propose different solutions to enhance the efficiency and effectiveness of data collection. In recent years, many surveys and review papers were published on the collection of sensory data from sensor devices in the research domain (see Section II). In a study by Khan et al., the authors presented a taxonomy of numerous data collection schemes that used sink mobility [5]. The authors identified some unresolved issues in the field of study. Waghmare and Chatur conducted a survey on energy-efficient data collection and routing algorithms in WSNs. The current issues and limitations of the algorithms studied were also discussed [6]. In another study by Yetgin et al., the authors reviewed the current studies in WSNs, including their design constraints, applications, and lifetime estimation models [7]. However, to the best of our knowledge, a systematic literature review (SLR) on data collection in Sensor-cloud is non-existent in this research domain. Therefore, this SLR tries to fill the research gap by the identification, categorization, and synthesisation of important works in the field of study. An evidence-based systematic methodology is utilized in this paper to ensure that significant and important studies on data collection in Sensor-cloud in the past 10 years (2011-2020) are identified and retrieved. The methodology has a systematic selection and evaluation process with a detailed and repeatable study selection process. This paper further presents results based on the overall demographics and characteristics of the selected studies, their contributions (with regards to data collection in Sensor-cloud), the evaluation mechanisms they utilised, and the performance measures they utilised. The main contributions of this study are as follows:
• The conduct of a broad systematic review on data collection in Sensor-cloud.
• The analysis and synthesisation of current studies in the field of study.
• Identification of the existing research challenges in the field of study and highlighting the areas that need attention from researchers.
The remainder of this study is organized as follows. The related work is presented in Section II. Section III articulates the research method used. Section IV presents the results with respect to the defined research questions (see Section III-A). Section V outlines the general discussion of the SLR study. Lastly, the conclusion is presented in Section VI. The paper is organized as illustrated in Figure 1. Abbreviations used in this paper are defined in Table 11.

II. RELATED WORK
This section highlights and discusses the existing review and survey studies in the field of study. Highlighting these studies will aid in articulating and solidifying the need for this SLR to be conducted.
Francesco et al. conducted a survey on data collection in WSNs [8]. The survey gives a comprehensive taxonomy of WSN architectures with the role of mobile elements. The authors also outline the data collection process and highlight the existing issues and challenges. Wankhade and Chavhan conducted a review on data collection methods. The authors further present a comparative study of various data collection methods and sink node data collection methods [9]. In another study by Nair and Jose, the authors conducted a survey on data collection methods and routing algorithms in WSNs. The authors also outline the existing issues and challenges in the field of study [10].
Another study by Khan et al. produced a taxonomy of distinct data collection schemes that use sink mobility [5]. The authors identified some unresolved issues in the field of study. Waghmare and Chatur conducted a survey on energy-efficient data collection and routing algorithms in WSNs. The current issues and limitations of the algorithms studied were also discussed [6]. In another study by Yetgin et al., the authors reviewed the current studies on WSNs, covering facets such as WSN design constraints, applications, and lifetime estimation models [7]. A mini review was conducted by Ali et al. on data collection in smart communities using sensor cloud [11].
Lastly, other works such as [12] [13] also surveyed data collection in WSNs. However, based on the review and survey papers discussed, there are no systematic studies in the field of study. Therefore, this study's objective is to fill this gap. Table 1 lists all the review and survey studies with their limitations.

III. RESEARCH METHOD
In conducting an SLR, a researcher must identify, evaluate, interpret, and report the research associated with a research domain of interest [14] [15] [16]. In this study, evidence-based searching and study selection procedures were adopted with the aim of improving transparency. To conduct an SLR, a search plan that is transparent, fair, and unbiased has to be followed, and the plan has to guarantee the breadth of the search [17] [18]. To date, to the best of our knowledge, there is no SLR study that rigorously reviews and analyses the current research on data collection in Sensor-cloud (see Section II). Therefore, the aim of this study is to fill this research gap. To do so, we conduct an SLR by utilizing Kitchenham's methodology [19]. The systematic review procedure combines several stages that have to be completed in a disciplined manner: the development of a review protocol, conducting the systematic review, analysis of the results, results reporting, results visualization, and finally discussion of the research findings.

A. RESEARCH QUESTIONS
The general objective of this paper is to gain insight into studies on data collection in Sensor-cloud. Hence, to have a comprehensive view of this research domain, the SLR formulated four significant research questions (RQs). These RQs will help in categorizing and understanding the existing research in this domain and further identify the limitations and future research directions in the area of study. The four formulated RQs are presented below.

In Table 2, the five electronic databases used in this study are highlighted. We considered these databases to be the prime data sources for retrieving any possibly relevant studies. On the other hand, Google Scholar was excluded because its search results lack precision and overlap with results from the other data sources. Hence, the important studies that are in Google Scholar are already retrieved from the other sources.

C. SEARCH TERMS
To successfully search for important studies, search terms are vital. In a study by Keele [14], the author recommended the Population, Intervention, Comparison, and Outcome (PICO) perspectives. These perspectives have been widely utilized by many SLRs and systematic mapping studies [20]-[22]. However, in this study, following the general foundation of the PICO structure, we constructed a generic search string to keep the search stable across many databases. Thus, the generic search string outlined below serves as a guide for conducting the search in the data sources (Table 2).
Generic: (Sensor cloud AND Data collection)

In this stage (study selection process), the main aim is to effectively identify studies that are significant to the objectives of our SLR study. In Figure 2, the study selection procedure (SSP) of this study is presented. The study selection process has three stages, and each stage was accomplished through an in-depth consensus meeting between the researchers to ensure high confidence and the least bias in the study selection process. If a particular study appears in multiple sources, we only take one copy into consideration, following our search order. We initially found 3569 studies through our search. The search results of the different searchers (who are all the authors) were integrated. The authors also carried out a preliminary screening of the 3569 studies collected. This screening was based on each study's title, abstract, and conclusion. For each study screened, two researchers evaluated it to decide whether the study would be included. For a study that was judged otherwise (that is, the study should be excluded), further discussion was carried out by the two researchers who conducted the evaluation until an agreement was established. The aim of this screening was primarily to remove studies that were clearly not relevant, were duplicates, or did not work on data collection in Sensor-cloud.
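The duplicate-handling and two-reviewer screening rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source names in `SEARCH_ORDER`, the title-based matching in `normalise`, and the `include` helper are hypothetical and are not taken from the review protocol; the actual databases are those listed in Table 2.

```python
from typing import Dict, List

# Hypothetical source names; the real databases are listed in Table 2.
SEARCH_ORDER = ["source_a", "source_b", "source_c"]


def normalise(title: str) -> str:
    """Lower-case a title and collapse whitespace so trivial variants match."""
    return " ".join(title.lower().split())


def deduplicate(hits: Dict[str, List[str]], order: List[str]) -> List[dict]:
    """Keep one record per title, preferring the source searched first."""
    seen = set()
    kept = []
    for source in order:
        for title in hits.get(source, []):
            key = normalise(title)
            if key not in seen:
                seen.add(key)
                kept.append({"title": title, "source": source})
    return kept


def include(vote_a: bool, vote_b: bool, consensus: bool) -> bool:
    """Two reviewers screen each study; if they disagree, the outcome of
    their consensus discussion decides (an assumed simplification)."""
    return vote_a if vote_a == vote_b else consensus
```

Matching on a normalised title is the simplest possible duplicate test; a real screening workflow would more likely compare DOIs or full bibliographic records.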

E. INCLUSION AND EXCLUSION CRITERIA
To answer the defined RQs in this SLR, we formulated and used well-articulated inclusion (IC) and exclusion (EC) criteria to help in choosing relevant studies from the data sources. The criteria were applied to all the studies collected in the different stages of the SSP (see Figure 2). We further set the search period from January 2011 to August 2020 (10 years) to make sure that only the latest studies were included. Moreover, we also included early cited studies, as long as the full study text was available. In Tables 3 and 4, we outline the IC and EC criteria used in this SLR, respectively. These criteria were utilized in the second and third stages of the SSP (see Figure 2). In the second stage, the IC and EC criteria were applied based on the studies' titles, abstracts, and conclusions. Thus, 210 out of 456 studies were selected in the second stage. In the third stage, to improve the confidence in study coverage, we applied a snowballing procedure to the 210 full-text studies examined. Both backward and forward snowballing were conducted. To conduct backward snowballing, the researchers searched through each study's reference list and removed studies that did not meet the criteria of this study.
For forward snowballing, the researchers examined each study that cites the study being considered. In this study, we decided the inclusion and exclusion of a study based on the IC and EC criteria in Table 3 and Table 4, respectively, and the quality attributes outlined in Section III-F. Both sets of criteria were applied concurrently to the full texts of all the 210 studies. Lastly, 43 studies were finally selected for this study.

Quality assessment (QA) is critical in every SLR. QA of the studies was conducted in the third stage of the SSP. The inclusion and exclusion criteria, together with the QA criteria, were applied to the studies retrieved in the second stage of the SSP. The 210 studies were examined in the third stage, where each study was assessed by the researchers to remove bias. To evaluate the quality of the selected articles, we designed a questionnaire. The questionnaire was inspired by earlier systematic studies [23] [21]. As a result, a scale of 1-4 served as the final quality score for a particular article.
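As an illustration of the snowballing step described above, one round of backward and forward snowballing can be expressed over a citation map. The following is a minimal sketch, assuming the citation data is available as a dictionary; the function name `snowball`, the `references` structure, and the `meets_criteria` callback (standing in for the IC, EC, and QA checks) are hypothetical.

```python
from typing import Callable, Dict, Iterable, Set


def snowball(
    seeds: Iterable[str],
    references: Dict[str, Set[str]],
    meets_criteria: Callable[[str], bool],
) -> Set[str]:
    """One round of backward and forward snowballing over a citation map.

    references[p] is the set of studies that p cites (its reference list).
    """
    seeds = set(seeds)
    # Backward: studies found in the reference lists of the seed studies.
    backward = set()
    for paper in seeds:
        backward |= references.get(paper, set())
    # Forward: studies whose reference lists contain at least one seed study.
    forward = {p for p, cited in references.items() if cited & seeds}
    # Drop the seeds themselves, then apply the selection criteria.
    candidates = (backward | forward) - seeds
    return {p for p in candidates if meets_criteria(p)}
```

In practice the round would be repeated on the newly found studies until no further study passes the criteria.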

G. DATA EXTRACTION
After the second stage of the SSP, the selected articles were analysed by the review team. Each article's full text was analysed by at least two researchers. As a result, vital information was extracted to a data extraction form. The form was composed of a list of key items. These items are as follows.
• Title

The results with respect to the RQs of this study are presented in this section.

A. RQ1: WHAT ARE THE SELECTED STUDIES' DEMOGRAPHICS AND CHARACTERISTICS?
From the 210 studies that were examined based on all the defined criteria, 167 studies were removed while 43 were finally selected for this study. We critically analysed the 43 selected studies in order to answer all the RQs presented in Section III-A. In Table 5, all the selected studies are outlined in detail.

1) Publication over time
In Figure 3, we present the total number of studies published per year of publication (2011-2020). In the last 10 years, researchers have given a considerable and growing amount of attention to the field of study. We observed that 2011 was the least active year, with no studies published. However, throughout the years, we have seen an increased interest from researchers, particularly from 2016 to 2020. This can be explained by the build-up that occurred from 2012 to 2015, when a stable number of studies were published, with 11 key studies published in those years. In these years (2012-2015), key works were published, such as S14, S20, S21, S28, S39, and S40, that serve as the foundation for new and veteran researchers to contribute to this new and interesting research field. The reader will also observe that 2017 and 2019 have the most studies published in comparison to the rest of the years, with seven studies each. This could be explained by the fact that some of the most popular high-ranked journals and conferences, such as Transactions on Industrial Informatics and Conference on Wireless Sensor Networks, produced some studies in those years. For 2020, a conclusion cannot be drawn because our search ended in August 2020 (Section III). Hence, the year has to end for us to know the total number of studies published. In general, despite a slow start in the early years (2011-2015), the research activity in the field of study continues to gain momentum with stable growth, mainly in the last 5 years (2016 to 2020).

2) Publication Channel and Quality Scores
In Table 5, we list the publication channel, publication year, and citation count for each study. Five different publication channels were identified: Journal, Conference, Symposium, Workshop, and Magazine. We observed that most of the studies were published in Conferences, with 19 studies (44.19% of the selected studies), followed by 14 studies (32.56%) published in Journals, 6 studies (13.95%) in Symposiums, 3 studies (6.98%) in Workshops, and lastly, 1 study (2.33%) in a Magazine (further presented in Figure 4). With this, the general quality of the selected studies is relative, because only 32.56% of the selected studies were published in Journals. Even though this is not a bad number, more quality Journal publications are needed to improve the quality of research in the research domain. We also examined the selected studies for quality based on our quality criteria in Section III-F. In Table 5, we present the quality score for each study. The results of the quality analysis demonstrate that all studies score more than 1. Also, only four studies score 2, which are S8, S9, S13, and S30. Ten studies score 4 (S1, S2, S3, S4, S5, S11, S26, S34, S42, and S43) and ten studies score 3.5 (S6, S12, S23, S24, S25, S31, S33, S35, S36, and S40).
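The channel shares quoted above follow directly from the per-channel counts and the total of 43 selected studies. A short sketch of the arithmetic is given below; the variable names are illustrative, and the counts are the ones stated in the text.

```python
# Number of selected studies per publication channel, as stated in the text.
counts = {
    "Conference": 19,
    "Journal": 14,
    "Symposium": 6,
    "Workshop": 3,
    "Magazine": 1,
}

total = sum(counts.values())  # should equal the 43 selected studies

# Share of each channel as a percentage, rounded to two decimal places.
shares = {channel: round(100 * n / total, 2) for channel, n in counts.items()}

for channel, share in shares.items():
    print(f"{channel}: {counts[channel]} studies ({share}%)")
```

Recomputing the shares this way is also a quick consistency check: the counts must sum to 43, and a single study out of 43 rounds to 2.33%.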

3) Publication Source
With respect to the publication sources, Table 7 classifies all the studies based on their publication sources. This classification will aid in finding the publication sources that produced more studies in the field of study over the last decade. Additionally, we also present the publisher of each publication source. In total, 37 sources that published the selected studies were identified. Transactions on Industrial Informatics, Internet of Things Journal, Conference on Wireless Sensor Networks, and Global Communication Conference were the top contributors with 2 publications each. We also found that most of the studies published in the top

Figure 4 presents the publication channels. From the figure, one can see that the majority of the selected studies were published in Conferences with 19 studies, followed by Journals, Symposiums, Workshops, and Magazine with 14, 6, 3, and 1, respectively.

4) Citation Impact
In Figure 5, the number of citations of all the selected studies is given. The citation count of each individual study was retrieved from Google Scholar; therefore, the citation count can change at any time. Overall, from our selected studies, we identified 3 studies that have more than 100 citations. These studies are [25], [60], [61]. We further found 7 studies with 30 or more citations. These studies are [27], [40], [46], [47], [54], [56], [59]. Generally, the overall number of citations for the selected studies is 1039, and the average number of citations per paper is 24.16.

In answering this RQ, we look at the contributions that are proposed by the selected studies in this SLR. Based on our analysis, we have identified 10 contributions, which are System with 27.91% of the studies, followed by Framework 23.26%, Algorithm 11.63%, Model 9.30%, Protocol 6.98%, Approach 6.98%, Investigation 4.65%, Method 4.65%, Architecture 2.33%, and Topology 2.33%. In this section, the studies with their respective contributions are discussed in detail.

We observed that 12 studies have proposed Systems for data collection in Sensor-cloud. In a study by Zhang et al., an agriculture irrigation system was proposed through the use of sensor-cloud technology in the agricultural sector [2]. The proposed system aids in the collection and efficient processing of sensing data in agriculture irrigation. The result shows the performance of the proposed system in terms of energy consumption. In another study by Pansare and Bajad, a new system is proposed to help in detecting errors in large sensor data during transmission [31]. The system shows some promise. Ward and Barker introduced a scalable distributed data collection cloud system [33]. The proposed system helps in collecting sensor data to a cloud system. The experimental result reveals some improvement. In a study by Li et al., a cloud-based data streaming system named WaggleDB was proposed [35]. The system is proposed to address the challenges of data collection, data availability, efficiency, and so on, that exist in cloud data infrastructure. The result shows some improvement. Charalampidis et al. introduced a fog-enabled IoT system utilized for sensory data collection. The experimental result shows some promise, with the system reducing energy consumption [36]. In another study by Soultanopoulos et al., the authors presented a system implementation and IoT service architecture for a gateway service running on smart devices [39]. The system is built to help in the processing of sensor data prior to their transfer to the cloud. The result indicates that the proposed system supports fast data collection with real-time communication. In a study by Wu et al., the authors proposed a system named Concinnity. The proposed system takes sensor data from the source to the destination via a cloud data repository [48]. A case study was conducted by the authors. The result shows some progress in terms of data anomaly detection. A remote health system was introduced by Stojanovic et al. [50]. The proposed system uses sensor fusion, which allows the processing and examination of IoT data from sensor devices. The result shows that the system increased accuracy. Gesvindr et al. proposed a system for collecting sensor data from smart homes by utilizing the TapHome solution [51]. The result shows some promise. Min proposed a multi-network data acquisition system that is based on a cloud platform with real-time update of sensory data [53]. In a study by Wang et al., a system was proposed.
The proposed system integrates blockchain technology that regards each mobile database as a block [58]. The proposed system aids in data collection and analysis. The result shows some promise. Maiti et al. proposed a data collection system that supports the storage of sensor data to the cloud [62].
The analysis shows some promise. Furthermore, we observed that 10 studies proposed a Framework. A three-layer framework was proposed by Wang et al. The framework uses multiple mobile sinks with a fog structure [25]. The aim of the proposed framework is to break the bottleneck of data collection from WSNs to the cloud. The framework was compared with various existing traditional solutions. The experimental result reveals that the framework can help in the improvement of throughput and the reduction of transmission delay. Mao et al. introduced a framework for a multi-cloud environment named parallel cloud data possession checking scheme [26]. The proposed framework uses a homomorphic verification tag that is generated by a Paillier cryptosystem to support unlimited query challenges with support for error localization and data correction. The result of the evaluation demonstrates the security and efficiency of the proposed scheme. Dash et al. investigated the key design issues and current challenges for sensor-cloud [27]. In addressing the identified design issues, the authors introduced a framework that integrates sensor-cloud with sensor networks. In a study by Ghanavati et al., a cloud-based wireless body area networks (WBANs) framework was proposed [38]. The framework is tailored toward real-time health monitoring of patients. The main objective of the proposed framework is to combine both mobile technology and cloud computing to provide services for patients. Based on a case study conducted, the result shows some promise. Liang et al. proposed a reliable trust computing mechanism (RTCM) [43]. The framework helps in enhancing the reliability and efficiency of data transfer to the cloud. The result shows some promise. A framework named efficient privacy-preserving-based data collection and analysis (P2DCA) for Internet of Medical Things (IoMT) applications was proposed by Usman et al. [45]. The proposed framework aims to protect against privacy issues when collecting data to the cloud. The result demonstrates that the proposed framework is better than the current schemes. Bhuiyan et al. proposed a cloud-enabled remote structural health monitoring (cSHM) framework for remote structural health event detection [55]. The proposed framework helps in facilitating secure sensor data collection on the cloud. The experimental result reveals that the proposed framework performs very well in terms of data protection. An event-driven data collection framework in sensor-cloud was proposed by Bhunia et al. [63]. The framework utilizes fuzzy logic to ensure efficient data collection. The result shows some promise.
Enzo et al. proposed a novel paradigm coined the fog of everything (FoE) paradigm. The proposed paradigm integrates fog computing and the internet of everything. The result shows a good outcome [60]. In another study by Enzo et al., the authors outlined the major challenges in conducting real-time energy-efficient management of resources at mobile devices and internet-connected data centres [61].
An Algorithm was proposed by five of the selected studies, which amounts to 11.63% of the selected studies. Traub et al. proposed an algorithm that schedules reads across a large number of sensors based on the data demands [30]. The aim of the algorithm is to enhance data transfer from sensor nodes to sensor-cloud. The experimental result shows that data transmission effectiveness was improved. Because uploading sensed data to the cloud within a short time has turned into a bottleneck of sensor systems, Li et al. proposed the utilization of multiple mobile sinks to aid in uploading data from WSNs to the cloud [29]. The authors further designed a new algorithm to schedule the multiple mobile sinks. Based on simulations conducted, the results demonstrate that the proposed algorithm performs very well with respect to a decrease in data upload latency and minimal energy consumption. Argyriou proposed an algorithm to maximize data delivery to the cloud for post-processing for each sensor in a WSN [42]. Based on the simulation, the result demonstrates the algorithm's performance with regards to raw sensor data collection to the cloud. With the upload of sensor data to the cloud within a short time becoming a bottleneck, Wang et al. proposed the utilization of multiple mobile sinks to help in data collection [46]. Furthermore, to reduce data delivery latency, the authors proposed a time adaptive schedule algorithm (TASA) for data collection through multiple mobile sinks. The result demonstrates that the proposed algorithm can gather data from WSNs to the cloud with limited latency and minimal energy consumption, which makes the sensor-cloud sustainable. Tao et al. proposed a secure data collection algorithm named secure data with the goal of addressing security concerns during data transfer [47]. The simulation result reveals that the proposed algorithm is useful when applied for security protection.
From the studies selected, four studies proposed a Model for data collection in sensor-cloud. In a study by Wang et al., a model was proposed [41]. The proposed model combines both a sensor gateway and a cloud gateway. The result shows that the model supports large data collection efficiently. In a study by Chen et al., a data collection scheme was proposed [52]. The proposed scheme protects the collected data from attackers while maintaining data correlation. The simulation result shows that the proposed scheme is very efficient for data collection to the cloud with strong privacy properties.
Lawson and Ramaswamy proposed a model for monitoring tradeoff, an architecture that changes based on data quality, and customer data stream best matching cloud service. The authors concluded that their system will perform better.
With respect to Protocol proposals, three of the selected studies proposed a Protocol. Wang et al. proposed a new scheme named energy-efficient and anonymous data collection. The proposed scheme is specifically for mobile edge networks (MENs). The aim of the scheme is to strike a balance between data privacy and energy consumption, where the private information of sensors is concealed in the course of communication [24]. The simulation result shows that the proposed scheme is better than existing schemes with respect to lifetime and energy consumption. A data transfer protocol was proposed by El Mougy and El-kerdany. The protocol was built on some principles of TCP to tackle the issue of data collection from Bluetooth low energy (BLE) sensors to the cloud [37]. Based on a simulation conducted, the result shows that the proposed protocol allows for reliable data transfer and also reduces energy consumption. In a study by Wang et al., the authors proposed a bidirectional prediction-based underwater data collection protocol [64].
The proposed protocol uses mobile edge elements for data collection from end to cloud. The result shows that the cost of data collection was reduced and bandwidth utilization increased.
With respect to Approach proposals, three of the selected studies proposed an Approach. Gejibo et al. investigated the challenges that are related to remote mobile data collection to a central cloud and further proposed an approach that can provide solutions to data protection, sharing, and recovery [34]. The authors conclude that the underlying challenges have to be further investigated. In a study by Nakagawa et al., the authors proposed an approach named m-cloud [49]. The approach aids in collecting sensor data using cloud resources for IoT data. The result indicates some progress. Wang et al. proposed a comprehensive trustworthy data collection approach (CTDC) for sensor-cloud systems [59]. Based on an extensive simulation, the result shows that CTDC improved performance in data collection.
Next, Investigation and Method were each proposed by two of the selected studies. A study by Yang et al. focuses on the data curation problems of IoT big-sensing-data processing on the cloud [56]. The authors highlight the current trends and future research directions for big-sensing-data processing. Abdul Rahman et al. conducted a series of experiments that measure the energy consumption of two IoT sensor nodes that are transferring data to the cloud [32]. The experimental result will be useful in comparing IoT sensor node implementations in both wired and wireless scenarios. The result shows that the wireless connection consumes extra power in comparison to the wired connection. Wang et al. proposed a data cleaning method that is based on a mobile edge node during data collection to sensor-cloud [54]. The experimental result demonstrates that the proposed method enhanced the efficiency of data cleaning with enhanced data integrity and reliability. Also, the proposed method further decreases the energy consumption of the industrial sensor-cloud system (SCS). In another study by Wang and Wang, to address the bandwidth and real-time issues of collecting large-scale data from IoT devices to a central cloud, the authors introduced a new data collection method that uses deep learning technology [28]. Based on an experiment, the proposed method performs effectively.
Architecture and Topology are the least proposed, with one study each. Piyare et al. proposed an extensive architecture that helps in integrating WSNs with the cloud [40]. Based on an experiment conducted, the result shows some promise. Mihai et al. proposed a three-layer topology for smart data monitoring and processing. The aim of the topology is to lessen the sending of raw data to the cloud and hence to improve the ratio between useful information and noise [44]. The simulation result shows some improvement. Table 8 presents the list of the identified contributions with respect to the studies that proposed them. In Figure 7, the proposed contributions are presented with respect to the year of publication (the x-axis represents the number of studies and the y-axis represents the year of publication).

C. RQ3: WHAT ARE THE EXISTING EVALUATION MECHANISMS USED BY THE SELECTED STUDIES?
To understand how the selected studies were evaluated, we outline and classify the evaluation mechanisms they used and categorize the studies accordingly. Table 9 presents the studies with respect to the evaluation mechanism they utilized. In total, we identify six evaluation mechanisms. These include Experiment with 16 studies, followed by Simulation with 13 studies. Our results show that Experiment is the most used mechanism in the field of study, with 37.21% of the selected studies utilizing it (as shown in Table 9), followed by Simulation and Theoretical analysis with 30.23% and 13.95% of the studies, respectively. Furthermore, we found that 6 of the 16 studies that used Experiment contribute Systems as their proposals (S12, S14, S15, S18, S29, S30), while 3 of the 13 studies that used Simulation contribute Algorithms as their proposals (S8, S25, and S26). Hence, these are the most common contribution types among the most popular evaluation mechanisms.
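The reported shares can be reproduced from the study counts. The following is a minimal sketch, not part of the original analysis: the Experiment and Simulation counts come from the text, while the Theoretical analysis count of 6 is an assumption inferred from the reported 13.95% share of 43 studies. The names `share` and `mechanism_counts` are illustrative only.

```python
# Total number of selected studies, as stated in the review.
TOTAL_STUDIES = 43

# Counts per evaluation mechanism. Experiment and Simulation are stated
# in the text; Theoretical analysis is ASSUMED (inferred from 13.95%).
mechanism_counts = {
    "Experiment": 16,
    "Simulation": 13,
    "Theoretical analysis": 6,  # assumption, not stated explicitly
}


def share(count: int, total: int = TOTAL_STUDIES) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


shares = {name: share(n) for name, n in mechanism_counts.items()}

# Combined share of the two most used mechanisms.
combined = share(mechanism_counts["Experiment"] + mechanism_counts["Simulation"])

for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
print(f"Experiment + Simulation: {combined:.2f}%")
```

Under these counts, the two most used mechanisms together cover 29 of the 43 studies, which matches the cumulative share reported in the discussion.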

D. RQ4: WHAT ARE THE PERFORMANCE MEASURES USED TO EVALUATE THE SELECTED STUDIES?
In answering this RQ, we identified 36 (83.72%) of the selected studies that used performance measures for evaluation, while seven studies did not use any performance measure, as shown in Table 10 (S6, S10, S28, S32, S35, S39, S41). Various and diverse performance measures were identified from the selected studies, and most of the studies used a combination of more than one measure. Despite this diversity, one of the most common performance measures is Energy consumption. We observed that 16 of the 43 selected studies used energy consumption for their evaluation (S1, S2, S5, S8, S11, S15, S16, S17, S19, S25, S26, S33, S36, S38, S42, and S43). This amounts to 37.21% of the selected studies in this research domain. The dominance of a measure such as energy consumption is attributable to the research directions of these studies, which focus on data collection from sensor devices to sensor-cloud. When dealing with data collection in WSNs, energy is always a major concern. Hence, the domain is predominantly driven by solution proposals, where a researcher normally has to propose a new system, model, framework, algorithm, and so on (as shown in Table 8) to help in the collection of data in sensor-cloud. Table 10 highlights the performance measures used by the selected studies.
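Because the counts, study identifiers, and percentages are reported together, a short consistency check is useful. The sketch below is illustrative and not part of the original analysis: the set names are hypothetical, the seven studies without a performance measure are taken from the text above, and the energy consumption identifiers are the 16 listed in the research findings (Section V-A).

```python
# Total number of selected studies, as stated in the review.
TOTAL_STUDIES = 43

# Studies reported as using no performance measure.
no_measure = {"S6", "S10", "S28", "S32", "S35", "S39", "S41"}

# Studies reported as using energy consumption (list from Section V-A).
energy = {
    "S1", "S2", "S5", "S8", "S11", "S15", "S16", "S17",
    "S19", "S25", "S26", "S33", "S36", "S38", "S42", "S43",
}

# Studies that used at least one performance measure.
with_measure = TOTAL_STUDIES - len(no_measure)

# A study cannot both use energy consumption and use no measure at all.
overlap = energy & no_measure

pct_with_measure = round(100 * with_measure / TOTAL_STUDIES, 2)
pct_energy = round(100 * len(energy) / TOTAL_STUDIES, 2)

print(f"With a performance measure: {with_measure} ({pct_with_measure:.2f}%)")
print(f"Energy consumption: {len(energy)} ({pct_energy:.2f}%)")
print(f"Overlap between the two sets: {sorted(overlap)}")
```

A check of this kind makes it easy to spot a mismatch between a stated count and the identifiers listed alongside it.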

V. DISCUSSION
In this article, we conducted an SLR on data collection in sensor-cloud. Data collection in sensor-cloud has gained substantial attention from researchers in the past 10 years. Recently, the collection of sensory data from sensor devices to the cloud has become a crucial issue in the research domain. In this section, the results related to the RQs are summarized and discussed through the presentation of the research findings, research challenges, and future work directions.

A. RESEARCH FINDINGS
The key objective of this SLR is to examine the current works in the area of study. To do that, 43 studies were selected for analysis based on the methodology adopted in Section III. These selected studies were analysed in depth and synthesized to address the RQs outlined in Table 2. The main findings of this SLR are presented as follows.
Based on our analysis of the demographics of the selected studies, we observed a stable and consistent output of publications in the past 5 years. The results show that 2011 was the least active year, with no studies published. Furthermore, we have seen increased interest from researchers, particularly from 2016 to 2020. This can be explained by the build-up that occurred from 2012 to 2015, when a stable number of studies was published, with 11 key studies in those years. Key works published in these years, such as S14, S20, S21, S28, S39, and S40, serve as the foundation for new and veteran researchers to contribute to this new and interesting research field. We found that most of the studies were published in Conferences, with 44.19% of the selected studies, which makes it the most used publication channel among all the identified publication channels. This bears on the general quality of the selected studies, because only 32.56% of them were published in Journals. Although this is not a bad share, more high-quality Journal publications are needed to improve the quality of research in the domain. With respect to the quality of the selected studies, the results demonstrate that 23.25% of the studies have a total quality score of 4 (the highest score), while another 23.25% have a quality score of 3.5. This indicates that the selected studies have some quality.
With respect to contributions proposed in the field of study, we identified 10 key contributions. Of these, three were proposed most often by researchers: System, Framework, and Algorithm with 27.95%, 23.25%, and 11.63%, respectively.
In answering RQ3, we found that six evaluation mechanisms were utilized by the selected studies, including Experiment, Simulation, Theoretical analysis, Hybrid, and Case study. Experiments were utilized by 37.21% of the selected studies, followed by Simulation with 30.23%. These are the most used mechanisms; cumulatively, they were utilized by 67.44% of the selected studies. Furthermore, we found that 6 of the 16 studies that used Experiment contribute Systems as their proposals (S12, S14, S15, S18, S29, S30), while 3 of the 13 studies that used Simulation contribute Algorithms as their proposals (S8, S25, and S26).
Performance measures are key when it comes to measuring and evaluating a proposal's effectiveness with respect to data collection. In this study, 36 of the 43 selected studies were found to have used performance measures for evaluation. We found that one of the most common is the Energy consumption performance measure. We observed that 16 of the 43 selected studies used energy consumption for their evaluation (S1, S2, S5, S8, S11, S15, S16, S17, S19, S25, S26, S33, S36, S38, S42, and S43). This amounts to 37.21% of the selected studies in this research domain. The dominance of a measure such as energy consumption is attributable to the research directions of these studies, which focus on data collection from sensor devices to sensor-cloud. When dealing with data collection in WSNs, energy is always a major concern.

B. CHALLENGES AND DIRECTION FOR FUTURE WORK
A comprehensive review of the selected studies was conducted in this study. The findings of this SLR will allow researchers to know the existing contributions on data collection in Sensor-cloud. The study will also help researchers to know the evaluation mechanisms and performance measures utilised by the selected studies. Accordingly, the challenges identified within the scope of this study are highlighted in this section, together with directions for future work in this research domain.
From Figure 3, we have seen that the research output in this domain has been relatively stable over the last 5 years. However, despite the stability, the output is quite low: no publication year has more than eight studies. This is a cause for concern given how important the research area is. Hence, we urge the research community to be more active. With 32.56% of the selected studies published in Journals, the general quality of the selected studies is perceived to be poor based on the research team's consensus. Hence, we encourage both new and veteran researchers to publish more papers in Journals, because Journal publications are generally of higher quality and have more depth. Evaluation mechanisms such as Case study and Hybrid, the latter newly categorized in this study, have received less attention from the research community. The combination of more than one evaluation mechanism (Hybrid) is essential for the rigorous evaluation of a given proposal. Hence, in future works, more evaluations should be conducted with Case study and Hybrid mechanisms. Lastly, performance measures are key when it comes to measuring and evaluating a proposal's effectiveness with respect to data collection. We observed that Energy consumption has dominated the field, with many of the studies utilizing it (see Table 10). However, more diverse performance measures should be utilized and taken into consideration for a more rigorous evaluation. Hence, measures such as latency, data transmission, and delay should be used more in future works.
In addition, data collection in Sensor-cloud falls under the Fourth Industrial Revolution [65], where Internet of Things (IoT) technologies such as WSNs are combined with cloud computing to provide a real-time interface between the physical and virtual worlds. The research domain of this paper falls within this realm. However, due to the limitations of Industry 4.0 (the Fourth Industrial Revolution), various industries are looking ahead to Industry 5.0 (the Fifth Industrial Revolution), where sensory data can be collected autonomously with a strong Artificial Intelligence (AI) presence. Hence, we encourage the research community in this domain to explore the utilization of Industry 5.0 in future research works. These explorations will help in moving the research area forward.

C. THREAT TO VALIDITY
The limitations of this review have to be considered in any overall analysis of the results gained from this SLR. The key threats to the validity of this SLR are twofold: the incompleteness of the study search and bias in study selection. Both threats are discussed in this section.
Firstly, with respect to the incompleteness of the study search, key studies can be missed in the process of retrieving the studies, which can affect the general completeness of the search. To alleviate this threat and make sure that all significant and prospective studies are taken into consideration, a general search was done on the selected databases (see Table 2). These data sources index a large number of Journals, Workshops, Conferences, and Symposiums in this domain. Furthermore, backward and forward reference searches were performed on the selected studies to make sure that significant studies were captured.
Secondly, with respect to the study selection process, we formulated clear and precise IC/EC criteria in order to decrease researcher bias. Each researcher can have a different view of the IC/EC criteria, so the study selection results of individual researchers are likely to differ. To alleviate this bias, we conducted a pilot selection to make sure that the researchers agreed on the general meaning of the criteria. The possible mismanagement of duplicate studies is an additional threat, which might have marginally changed our results. In total, 32 potential duplicates were found and thoroughly assessed to find out whether they were the same study. In addition, the final decision to select a study was taken by the two researchers who performed the search process. Any disagreement that arose between the two researchers was resolved through discussion until a firm agreement was established. Furthermore, the other researchers checked the final selected studies. For this study, only peer-reviewed papers were included. Nonetheless, it is likely that we have missed some vital non-peer-reviewed studies in this domain.

VI. CONCLUSION
In this study, an SLR was conducted that presents a 10-year (2011-2020) summary of the current literature on data collection in Sensor-cloud. Out of the 3569 papers retrieved from the initial search, 210 papers were selected based on rigorous analysis, of which 43 studies were selected based on the defined IC and EC criteria.
Our findings show that the research in this domain is relatively new, with a moderate and stable number of studies published in the last 5 years. Of the selected studies, 44.19% were published in Conferences, followed by Journals, Symposiums, Workshops, and Magazines with 32.56%, 13.95%, 6.98%, and 2.32%, respectively. With respect to quality assessment, the results demonstrate that 23.25% of the studies have a total quality score of 4, which is the highest score set in this study. However, the majority of the studies (53.50%) have a quality score of less than 3.5. With respect to the publication source, we identified four sources that are more noticeable: Transactions on Industrial Informatics, Internet of Things Journal, Conference on Wireless Sensor Networks, and Global Communication Conference, with 2 publications each. Furthermore, our analysis shows that there are 10 main contributions: System with 27.95% of the studies, followed by Framework (23.25%), Algorithm (11.63%), Model (9.30%), Protocol (6.98%), Approach (6.98%), Investigation (4.65%), Method (4.65%), Architecture (2.33%), and Topology (2.33%). We observed that System and Framework are the most proposed contributions in the field of study. On evaluation mechanisms, we found that six evaluation mechanisms were utilized by the selected studies, including Experiment, Simulation, Theoretical analysis, Hybrid, and Case study. Experiments were utilized by 37.21% of the selected studies, followed by Simulation with 30.23%. These are the most used mechanisms; cumulatively, they were utilized by 67.44% of the selected studies. With respect to performance measures, 36 of the 43 selected studies were found to have used performance measures for evaluation. We found that one of the most common is the Energy consumption performance measure. We observed that 16 of the 43 selected studies used energy consumption for their evaluation.
Finally, our research shows that there is substantial interest from researchers in the research domain, considering the consistency in publication in the last 5 years. With this consistency, we expect more proposals to be contributed in the years to come. Moreover, researchers should take the research challenges and future research directions presented in Section V-B into consideration to help in tackling the identified challenges.

Figure 1: Overview of Research Work

Figure 3: Number of articles published per year

Figure 6: Top 7 countries with the most studies

Figure 7: Analysis of the proposed contributions with respect to years of publication

Table 1: Existing review and survey papers
[7] 2017 A survey of network lifetime maximization techniques in wireless sensor networks • Not focused on data collection in sensor cloud • Not a systematic literature review
[12] 2017 A survey on multipath routing protocols for QoS assurances in real-time wireless multimedia sensor networks • Not focused on data collection in sensor cloud • Not a systematic literature review
[11] 2018 Data collection in smart communities using sensor cloud: recent advances, taxonomy, and future research directions • Not focused on data collection in sensor cloud • Not a systematic literature review

Table 2: Electronic databases

Table 4: Exclusion Criteria
Number  Exclusion Criteria
EC1  Studies that are not written in the English language
EC2  Studies that are not associated with the RQs
EC3  Gray articles; for instance, articles without key information, like publication date/type, issue numbers, and volume were excluded
EC4  Duplicate articles (the latest study is included in a situation where multiple studies on the same theme are available; the rest are excluded)
EC5  Studies with unclear results and findings

Table 5: Overview of selected studies

Table 6: Quality evaluation of the selected studies

Table 7: Publication Source

Table 8: Proposed contributions in the field of study

The model works in a way where the data retrieved from WSNs is processed separately by algorithms on edge servers from privacy computing [4]. As highlighted by the authors, the benefits of the proposed model are twofold: the model helps in preserving data privacy, and it is implemented by different storage methods. The proposed model was validated through a rigorous experiment and theoretical analysis and has shown some promise. To deal with constant and long-duration monitoring and collection of data from sensors, Grace and Sumalatha introduced a model for sensor-cloud coined senud controller

Table 9: Evaluation mechanisms used by the selected studies

Table 10: Performance measures utilized by the selected studies

Even with all these actions to enhance the general completeness of the study search, this paper can still suffer from selection bias. This is because other libraries, such as EI Compendex, Taylor & Francis, Emerald Insight, and CiteSeerX, were not taken into consideration.

Table 11: Definitions of all acronyms mentioned in the paper

VII. APPENDIX
Table 12 presents information on the selected studies, including the authors' names, institutions, and countries.