NFV Data Centers: A Systematic Review

Recently, the widespread adoption of Network Function Virtualization (NFV) has driven a significant shift in communication networks and cloud-based services. NFV increases network flexibility and simplifies function allocation, easing the implementation of new network services. The benefits of optimizing resource allocation in NFV data centers include the automation of operational processes, reduced operational effort, and lower capital costs. Despite being a hot topic, NFV data centers are still in their early stages. In this sense, some aspects of current NFV implementations require further investigation, such as performance evaluation and management. This paper presents an in-depth analysis of NFV applications in data centers. To that end, a Systematic Literature Review (SLR) was conducted to study and synthesize the available works on the NFV state of the art. In total, 1,408 papers were analyzed and filtered, yielding 65 relevant studies. This paper documents the relevance of these 65 papers and provides an in-depth insight into the primary open research questions concerning NFV architecture, highlighting in particular the need to consolidate the integration of NFV in data centers by solving problems related to performance, dependability, resource allocation, cost, management, and resource interoperability. Based on the conducted SLR, several research gaps and open challenges have been identified, allowing the authors to offer new propositions and suggest future research avenues. Finally, the obtained results show that stronger collaboration between academia and industry is needed to conduct quality research and advance practical NFV implementations.


I. INTRODUCTION
Data center infrastructures are specialized, value-added services that provide scalable data processing and storage capabilities for organizations of any size, such as Amazon, Google, Facebook, and Yahoo. In recent years, services based on the softwarization and virtualization paradigms, such as Network Functions Virtualization (NFV), Software Defined Networking (SDN), and Cloud Computing, have emerged as a solution for data center architectures, providing flexibility by implementing a software-based network over a physical infrastructure and allowing OPerational EXpenditure (OPEX)/CApital EXpenditure (CAPEX) reductions [1]. These applications have gained great popularity, especially NFV ones, as demonstrated in different studies, for instance, the survey conducted by IHS Markit,1 which projected increasing use of NFV until 2020, reaching an investment of $11.6 billion.

The associate editor coordinating the review of this manuscript and approving it for publication was Ting Wang.
NFV is a recent virtualization paradigm that hosts network functions typically embedded in dedicated network devices, such as firewalls, load balancers, and Deep Packet Inspection (DPI), in Virtual Machines (VMs) or containers [1]. In particular, deploying NFV in data centers transfers network functions to Commercial-Off-The-Shelf (COTS) hardware (e.g., standard x86 servers) rather than specialized proprietary devices. In this way, hardware standardization allows service providers to reduce costs while gaining economic flexibility. SDN is a network architecture that decouples the data plane from the control (logical) plane. The former consists of forwarding devices, such as switches and routers (either physical or virtual), while the latter makes decisions on how to handle network traffic. In particular, the control plane is logically centralized in a software entity called the SDN controller [2]. Cloud Computing is a widely used network service; it provides users with access to critical data of any scale, at any time, anywhere. Although NFV, as well as SDN, can be used individually, significant advantages are achieved when using them together [3]. The advantages are particularly noticeable in the case of cloud computing applications, where NFV provides automated provisioning, and SDN allows centralized control by enabling network administrators to manage network resources from a central console.
Several literature reviews exploring NFV applications, such as the ones in [3], [4], and [5], have been published. In particular, [3] studies integrated NFV/SDN architectures; it synthesizes the most relevant architectural projects in the literature and identifies areas for further improvement. In [4], a general approach to VNF design is provided, whereas in [5], VNFs are studied in terms of flexibility. Despite the mentioned efforts, to the best of the authors' knowledge, none of them has specifically addressed NFV applications in data centers.
Moreover, the previous studies identified in the literature have neither covered the phases, techniques, and essential topics of applying NFV in data centers nor provided a broad view of the challenges. As mentioned above, NFV applications use general-purpose, commodity hardware components to deliver software-based services rather than hardware-based ones. In this way, NFV data centers improve network management capabilities, providing support for firewall and load balancing functions and enabling provisioning, network configuration, bandwidth allocation, automated operations, monitoring, security, and control policies [1]. The implementation of NFV applications results in a new NFV data center environment with additional resource allocation requirements, for example, raising availability levels, ensuring fault tolerance, and reducing costs. Consequently, to achieve a higher quality of service (QoS) and adapt to the growing demand of users, further review of computer network mechanisms is required. Researchers have long discussed the resource allocation problem in computer networks and cloud computing [3], [4], but less attention has been paid to resource allocation for virtualized functions in NFV data centers [6]. To bridge this research gap, we conducted a Systematic Literature Review (SLR) to synthesize the available knowledge concerning NFV applications in data centers. Besides providing a deep insight into them, we place particular focus on some of the main NFV aspects that still need further investigation, such as performance evaluation, management activities, and resource allocation. To the best of our knowledge, this is the first SLR specifically focusing on NFV applications in data centers.
The main goals of this SLR are the evaluation and analysis of the advances and current challenges in the NFV data center field as reported by both scholars and industry. Thus, we intend to identify current research gaps and suggest directions for future research. The SLR addresses the main challenges for NFV in a bottom-up manner. We discuss different NFV application scenarios and present a complete description of the research problem of NFV data centers. Also, we summarize the most relevant and accessible methods, metrics, and techniques used for VNF placement, scheduling, migration, load balancing, scaling, and diagnostics. To do so, we use a search string capable of guaranteeing in-depth results for the researched area. The contributions of this paper are as follows:
• Provide and organize a comprehensive overview of existing NFV data center research, and discuss state-of-the-art research efforts to address the main challenges in the field.
• Devise a novel classification of NFV data centers in terms of technologies, assessment methods, metrics, techniques, tools, and research in both industry and academia:
-Technologies used with NFV applications in data centers
-Network Assessment and Measurement Analysis Methods
-Network Assessment Metrics
-Techniques and Tools
-Research in the Academic and Industry Context
• Identify crucial challenges in NFV applications in data centers, highlighting different inherited challenges, concerns, and perspectives.
• Identify the main barriers and challenges of NFV applications in data centers, as well as the lessons learned identified in the reviewed papers.
• Discuss common observations in a detailed section regarding learned lessons.
• Provide an online solution for browsing the results of this study, made available at GitHub.2
• Highlight the realization of NFV data center challenges and identify the open research challenges and issues, including opportunities for relevant future research.

The remainder of the paper is organized as follows. Section II presents the technical background. Section III introduces and explains the methodology followed in conducting the SLR. Sections IV and V describe the pre-processing and classification of the SLR data. Section VI presents the main challenges in the field of NFV data centers. In particular, Subsection VI-A addresses the inherent challenges, whereas Subsection VI-B presents the main concerns and perspectives. Section VIII discusses opportunities for relevant future research directions based on the conducted SLR, highlighting the most critical opportunities and open challenges. Section IX discusses the main implications of the obtained results for academia and industry. Section X discusses the main threats to the validity of the conducted SLR. Finally, Section XI provides the concluding remarks.
• The NFVI contains all the hardware and software required for virtualizing environments through a hypervisor. The hypervisor controls the VMs, for example, their instantiation and dynamic migration.
• OSS/BSS is in charge of accommodating the carrier's processes; it manages the legacy systems that must be integrated with the NFV environment.

Since VNFs can reside in data centers, network nodes, or even near end-user installations, they are supported by virtualization in combination with many hardware features and leveraged by cloud computing foundations, bringing new ways in which network operators can create, deploy, and manage their services. SDN, NFV, as well as cloud computing, introduce new means for efficient and flexible utilization of data center infrastructures through a software-centric service paradigm. Figure 3 depicts the use of general-purpose machines in an NFV data center, which allows virtualizing several different network functions, for example, Packet Inspection (PI), firewalls, and load balancers, among others. In particular, using NFV in data centers has the advantage of allowing the instantiation of VNFs based on data centers' on-demand networks, therefore providing flexible VM resource allocation. In practice, VNFs run on a large set of servers acting as hosts, where each host can run one or more VMs. Each VM requires a data path to the external network, that is, a path between the network interface and the VM. In this process, the following NFV requirements should be fulfilled:
• Physical and virtual network management.
• Migration and provisioning through different environments.
• Optimization of resource allocation.
• Ensuring elasticity so that the service agreements can be fulfilled.
• Providing the same level of attendance to virtual and physical network functions.

The Service Function Chain (SFC) is defined by a logical and ordered sequence of VNFs called the Service Function Path (SFP), where each chain has an identifier. The SFC requires a qualification process to steer traffic through the service chain. The qualification process is followed by the differentiation of traffic routing/flows through the set of Service Functions (SFs). Thus, the flows of different services can be isolated while simultaneously traversing the VNFs belonging to the same SFC [12]. Therefore, a Service Function Forwarder (SFF) must exist at each node of the NFVI to provide virtual links to hosted VNFs. The Internet Engineering Task Force (IETF) is developing protocols that implement SFC more efficiently. The IETF and ETSI also work together to define the Network Service Header (NSH), the SFC encapsulation required to support the SFC architecture [12]. SFFs create the SFP that the flow must traverse. The encapsulation then allows the selection of SFPs and, if necessary, the sharing of metadata information. SFC is of paramount importance to the realization of network slicing [13], which consists in granting isolated and QoS-aware resource provisioning to different tenants over the same infrastructure.
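The SFP abstraction can be sketched as a small data structure: an identifier plus an ordered list of VNFs that each flow must traverse. The chain identifier, VNF names, and packet fields below are hypothetical, and real SFC deployments rely on NSH encapsulation rather than Python dictionaries; this is only an illustrative model.

```python
from dataclasses import dataclass, field

@dataclass
class ServiceFunctionChain:
    """Minimal SFC model: a chain identifier plus an ordered
    Service Function Path (SFP) of VNF names."""
    chain_id: str
    sfp: list = field(default_factory=list)

    def forward(self, packet: dict) -> dict:
        # Emulate SFF behavior: tag the packet with the chain id
        # and record each VNF it traverses along the ordered path.
        packet.setdefault("sfc_id", self.chain_id)
        packet.setdefault("traversed", [])
        for vnf in self.sfp:
            packet["traversed"].append(vnf)
        return packet

# A hypothetical protection chain built from VNFs named in the text.
protection = ServiceFunctionChain(
    chain_id="sfc-42",
    sfp=["firewall", "dpi", "virus_scanner"],
)
pkt = protection.forward({"src": "10.0.0.1", "dst": "10.0.0.2"})
print(pkt["traversed"])  # ['firewall', 'dpi', 'virus_scanner']
```

Because each packet carries the chain identifier, flows of different services stay logically isolated even while traversing the same infrastructure.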

III. SLR PLANNING
Researchers in the field agree that fulfilling the above-listed requirements is a challenging and complicated task. Since NFV applications in data centers are in their early stages, there is still much research to be conducted in this line. In this section, we present an SLR towards providing an in-depth insight into the current scenario of NFV data centers, its most concerning issues, its main open challenges, and the most exciting opportunities for directing relevant future research. In this paper, the SLR is conducted based on the widely adopted guidance provided in [14]. The following subsections introduce and describe each of the followed research steps, as suggested in [14].

A. RESEARCH QUESTIONS
The formulation of the research question is mandatory in order to set the basis for the entire literature review process. According to [14], to conduct a successful review, this question should be direct and concise. The main research question addressed in this paper is as follows: (RQ1) What are the current concepts and challenges related to NFV applications in data center architectures?

B. SELECTION OF SEARCH STRATEGIES
According to [14], the first step after formulating the research questions is to select the search strategies in terms of data sources, search periods, and search keywords. In this paper, the database sources recommended in [15] for computer network research publications (listed below) have been used as primary sources (web search engines): ACM (dl.acm.org), Wiley (wiley.com), Scopus (scopus.com), IEEE (ieeexplore.ieee.org), Springer (link.springer.com), Elsevier (sciencedirect.com), Google Scholar (scholar.google.com), and Taylor&Francis (taylorandfrancis.com).
The conducted search includes papers published between 01 January 2014 and 30 November 2019. We set the start of the search period at 2014 since the earliest published research on NFV in data centers appeared that year. Once the search period was established, we defined the search keywords based on the research topic and the formulated research questions. The identification of the keywords followed the search string creation guidelines introduced in [14]. The purpose of this step is to create naming terms and synonyms by deriving major terms from the research questions and identifying keywords in the most relevant papers. In particular, the search includes the logical operators OR and AND to link the main keywords with their abbreviations and synonyms. Then, after several tests, we selected the search string capable of returning the largest number of relevant articles. The general form of this search string is as follows: (''DC'' OR ''Data Center'') AND (''NFV'' OR ''Network Function Virtualization'').
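The OR-within/AND-between composition of the search string can be illustrated as follows; the helper function is only a sketch of the rule described above, not a tool used in the SLR.

```python
def build_search_string(term_groups):
    """Combine synonym groups: OR inside each group,
    AND between groups, as in the SLR keyword guidelines."""
    clauses = []
    for group in term_groups:
        joined = " OR ".join(f'"{term}"' for term in group)
        clauses.append(f"({joined})")
    return " AND ".join(clauses)

query = build_search_string([
    ["DC", "Data Center"],
    ["NFV", "Network Function Virtualization"],
])
print(query)
# ("DC" OR "Data Center") AND ("NFV" OR "Network Function Virtualization")
```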

C. SELECTION OF STUDIES
The complete search string was applied to the previously identified database sources, considering the title, abstract, and keywords. A priori, all studies in English obtained from the web search engines were selected as primary studies. Then, to organize and limit the number of studies retrieved by the search string, the same refinement and filter options were used for all database sources. In this way, we favored the inclusion of studies related to the research topic, avoiding the addition of other types of articles. In a subsequent step, these studies went through a further selection and evaluation process. This additional process consisted of three different steps, with an article included for processing in a given step only if it was approved in the previous one. The stages of this selection and evaluation process are depicted in Figure 4 and described as follows: • Step 1-Automatic Search: In this step, an automatic search of the primary studies was performed based on the selected search string. The resulting papers that proposed a solution for NFV data centers were included in the following steps.
• Step 2-Exclusion: In this step, we evaluated the content of the previously selected studies (Step 1) to identify suitable ones. To do so, we read the title and abstract of each paper, including a paper in the next step only if it proposed a solution for NFV data centers.
• Step 3-Quality Screening: In this step, the studies selected in Step 2 passed through a quality screening. A study was excluded if it did not meet the following Quality Criteria (QC):
-(QC1): Is there a clear statement of the research goals (i.e., target environments and problems to solve)?
-(QC2): Is the architecture design well detailed? In other words, is it possible to identify the used tools, the NFV framework design, and its elements?
-(QC3): Do the conducted experiments allow us to evaluate the main ideas presented in the study?

Each of the above-described criteria has three possible responses: Yes, Partly, or No. ''Yes'' answers count for 1 (one) point, ''Partly'' for 0.5 points, and ''No'' for 0 (zero) points. To be accepted, a paper must obtain a score greater than or equal to 2; mathematically, QC1 + QC2 + QC3 ≥ 2. Three different researchers conducted the described search process individually. Once all had completed their searches, they reviewed the lists to exclude duplicates and resolve discrepancies. Also, once the described process was finalized, statistical measurements for SLR quality were applied to minimize the influence of random selection by the researchers [16]. In particular, we applied Cohen's Kappa (κ) coefficient to measure the agreement level between the selections, while using the Landis and Koch methodology to test the reliability of the review criteria. A Kappa value above 0.70 identifies a significant level of agreement, indicating that the researchers' selection processes were not ambiguous. In our selection process, the calculated Kappa coefficient was 0.861, indicating a high level of agreement.
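The agreement measurement can be sketched with a plain Cohen's kappa computation: observed agreement corrected for the agreement expected by chance. The two rating vectors below are hypothetical include/exclude screening decisions, not the actual SLR data.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters over the same items:
    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed
    agreement and p_e is chance agreement."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    p_e = sum((counts_a[l] / n) * (counts_b[l] / n) for l in labels)
    return (p_o - p_e) / (1 - p_e)

# Two hypothetical screeners deciding include (1) / exclude (0).
a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
print(round(cohens_kappa(a, b), 3))  # 0.8, substantial agreement
```

On the Landis and Koch scale, values in this range indicate substantial agreement, which is why the reported 0.861 supports the reliability of the selection process.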
Following the process described above, the results obtained are as follows. The automatic search process (Step 1) returned 1,408 primary studies. From them, we identified and removed 733 duplicates. Then, the researchers reviewed the remaining studies based on their titles and abstracts (Step 2), resulting in a total of 275 studies. Finally, after Step 3, the total number of relevant studies was 65. All the results and discussions presented in this paper are based on these 65 studies. For the sake of clarity, a unique identification code (IDx) is assigned to each of the relevant studies. Figure 5 identifies the venues where the selected studies were published. From Figure 5, it is evident that researchers prefer scientific conferences for publishing their studies (43 papers), followed by journals (18 papers), indicating that the academic community has widely discussed NFV applications in data centers. Figure 6 illustrates the time distribution of the selected studies, showing the increasing interest of the academic community in the subject. Specifically, 19 studies were published between 2014 and 2015, 21 between 2016 and 2017, and 25 between 2018 and mid-2019. This further confirms the increasing interest in researching NFV applications in data centers. Finally, although the number of publications in the NFV data center field increases year after year, it is evident that the topic requires further research.
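The selection funnel reported above can be restated as simple arithmetic over the stage counts:

```python
# Selection funnel as reported in the SLR.
retrieved = 1408            # Step 1: automatic search
duplicates = 733            # removed before screening
after_dedup = retrieved - duplicates
after_title_abstract = 275  # Step 2: title/abstract screening
relevant = 65               # Step 3: quality screening

for stage, n in [("retrieved", retrieved),
                 ("after de-duplication", after_dedup),
                 ("after title/abstract", after_title_abstract),
                 ("relevant", relevant)]:
    print(f"{stage}: {n}")
# Fewer than 5% of the retrieved papers survive all three steps.
print(round(relevant / retrieved * 100, 1))
```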

IV. SLR DATA PREPROCESS
In this Section, the data gathered from the 65 selected studies is pre-processed so that it can then be classified and analyzed.

A. QUALITY ASSESSMENT
To classify the 65 selected studies according to their quality, we applied a quality assessment process based on the criteria introduced in [17]. Table 1 identifies the nine quality criteria. As can be seen in Table 1, we further grouped the quality criteria into four categories, viz., Reporting quality, Rigor, Credibility, and Relevance. We evaluated each category in terms of a Likert scale, a method widely used in SLRs, with four possible responses: −2 (totally disagree), −1 (partially disagree), 1 (partially agree), and 2 (fully agree). Accordingly, papers achieving the maximum value of eight are those most relevant to this paper. To obtain the final quality scale, we divided the Likert sum by two and added one, i.e., (LikertSum/2) + 1.
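A minimal sketch of the category scoring follows; the conversion (LikertSum/2) + 1 is taken verbatim from the text, and the example ratings are hypothetical.

```python
def quality_score(category_ratings):
    """Map the four per-category Likert ratings (Reporting quality,
    Rigor, Credibility, Relevance; each in {-2, -1, 1, 2}) to a
    single score using the conversion stated in the text:
    (likert_sum / 2) + 1."""
    allowed = {-2, -1, 1, 2}
    assert all(r in allowed for r in category_ratings)
    return sum(category_ratings) / 2 + 1

print(quality_score([2, 2, 1, 2]))  # 4.5
print(quality_score([2, 2, 2, 2]))  # 5.0 (maximum Likert sum of 8)
```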
Each study that underwent the quality assessment received a score from zero to ten. Figure 7 shows the quality analysis results. As shown in Figure 7, none of the studies obtained the maximum score. Also, 41 papers reached the average quality value (dashed line in Figure 7), while 24 were below average. Besides, most of these 24 studies also obtained a low evaluation in the ''Rigor'' category. Finally, it is essential to highlight that some studies, such as [ID38], did not present a clear correlation with the research objectives, tending to explain processes and results rather than the methodology. The lack of clarity of these studies makes it challenging to replicate their research and achieve the same or similar results.

B. DATA EXTRACTION
Three independent researchers organized the extracted data in datasheets. The researchers compared the data and measured the percentage of agreement between their selections. As a result, the researchers identified any inconsistencies in terms of relevance to the topic and methodology. Table 2 shows the adopted data extraction form, based on the one proposed in [17].

V. SLR DATA CLASSIFICATION
In this section, we analyze and classify the collected (as described in Section III) and pre-processed (as described in Section IV) SLR data.

A. IDENTIFICATION OF THE RESEARCH PROBLEMS IN NFV APPLICATIONS IN DATA CENTERS
In this section, we analyze the relationship between the identified NFV data center problems and the various solutions proposed in the selected studies. Figure 8 presents a Sankey diagram showing the flow of studies from the identified research problems (left side of the chart) to the proposed solutions (right side) in terms of (a) metrics, (b) methods, and (c) techniques, respectively. The thickness of each link in Figure 8 encodes the amount of flow from a research problem to its solution, providing quantitative information about their relationship. Each diagram represents a directed and weighted graph whose weight function satisfies flow conservation: the sum of the input weights of each node equals the sum of its output weights. The diagram can be analyzed in detail using the flow-tracking feature, which allows us to track individual flows through the chart. In this way, Figure 8 enables the exploration of complex flow scenarios, helping us identify the different research gaps, as well as the potential solutions proposed in the literature. In particular, according to the data collected from the 65 selected studies, the actual landscape of problem streams in NFV applications in data centers is quite diverse and challenging. In this context, it is crucial to carefully analyze Figure 8 to answer the following research questions: • What is the relationship between the metrics (a), methods (b), and techniques (c)?
• What are the research gaps?

Based on the information provided in Figure 8, we can identify the main research problems associated with NFV data centers and classify them into different categories. In this paper, the identified categories of NFV data center problems are: placement, scaling, diagnostics, migration, mapping, network design, scheduling, security, and load balancing (see VI-B.2). Therefore, the 65 selected papers were classified into the above-named research problem categories. For clarity, each paper was assigned a unique identification code (IDx) and then classified into a problem category with a reference to the paper. Table 3 shows the identified NFV data center problem categories and the corresponding relevant papers.
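The flow-conservation property of the Sankey diagram can be checked mechanically, as sketched below. The edge weights are hypothetical and do not correspond to the actual counts from Figure 8.

```python
def check_flow_conservation(edges, sources, sinks):
    """Verify the Sankey property described above: for every
    intermediate node, total inbound weight equals total outbound
    weight. `edges` maps (src, dst) -> weight."""
    inflow, outflow = {}, {}
    for (src, dst), w in edges.items():
        outflow[src] = outflow.get(src, 0) + w
        inflow[dst] = inflow.get(dst, 0) + w
    nodes = set(inflow) | set(outflow)
    for node in nodes - set(sources) - set(sinks):
        if inflow.get(node, 0) != outflow.get(node, 0):
            return False
    return True

# Hypothetical problem -> method -> metric flows (paper counts).
edges = {
    ("placement", "optimization"): 10,
    ("scaling", "optimization"): 4,
    ("optimization", "performance"): 12,
    ("optimization", "energy"): 2,
}
ok = check_flow_conservation(edges,
                             sources={"placement", "scaling"},
                             sinks={"performance", "energy"})
print(ok)  # True: 14 units enter and leave the "optimization" node
```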

B. CLASSIFICATION OF NFV APPROACHES FOR DATA CENTER APPLICATIONS
In this section, we classify the selected studies, as identified in Subsection V-A, in terms of the approaches proposed to address NFV data center problems. In particular, to provide insight into the research questions listed in Section III-A, the following subsections classify the adopted technologies (Subsection V-B.1), the applied network topologies (Subsection V-B.2), the proposed network assessment and measurement analysis methods (Subsection V-B.3), the evaluation metrics (Subsection V-B.4), the tools and techniques (Subsection V-B.5), and the context, either academic or industrial, in which the research was conducted (Subsection V-B.6). Here, it is essential to highlight that, unfortunately, it was not possible to classify all the selected studies in terms of the categories mentioned above: in some cases, the characterization of the applied methods was not adequate, or the research design was incomplete or lacked analysis details. Figure 9 shows complementary technologies used for integrating NFV in data centers. Figure 9 shows that 20% of the relevant published studies address both Cloud- and SDN-based technologies. The analysis of the 65 selected studies shows that, although there has not always been evident cooperation between the technologies in the same article, there is a relationship between the two. In recent years, there has been an increasing trend of applying NFV together with SDN and Cloud technologies. When combining SDN and NFV, SDN provides flexible and automated connectivity between VNFs, while NFV can use SDN as part of an SFC. In this sense, since SDN and NFV are complementary technologies allowing more dynamic traffic management in the network, their cooperation provides an evolutionary scenario for current data center networks [82], [84].
When combining Cloud and NFV, several researchers indicate that applying NFV principles in the Cloud context can have a significant influence on the carrier Cloud, helping to reduce the Total Cost of Ownership (TCO) [19], [65].

3) NETWORK ASSESSMENT AND MEASUREMENT ANALYSIS METHODS
According to [18], [85], network assessment methods encompass performance, dependability, and sustainability. In this context, evaluation methods such as Markov chains, benchmarks, or optimization algorithms are usually adopted. Figure 10 identifies the network assessment methods used in the selected studies. In particular, 75% of the studies employed optimization algorithms, while 13% did not explicitly state the methods they were based on. Two studies ([20], [67]) relied on Markov chains, while the authors in [29] used analytic models with Integer Linear Programming (ILP).
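As a toy counterpart to these optimization-based methods, the sketch below solves a tiny VNF placement instance by exhaustive search. The demands, capacities, and per-unit costs are invented for illustration; the studies surveyed here use ILP solvers or heuristics at realistic scale.

```python
from itertools import product

def place_vnfs(vnfs, servers, cost):
    """Exact (exhaustive) VNF placement: try every assignment of
    VNFs to servers, respect CPU capacity, minimize total cost."""
    best, best_cost = None, float("inf")
    names = list(vnfs)
    for assign in product(servers, repeat=len(names)):
        load = {s: 0 for s in servers}
        for vnf, srv in zip(names, assign):
            load[srv] += vnfs[vnf]          # accumulate CPU demand
        if any(load[s] > servers[s] for s in servers):
            continue                        # capacity violated
        total = sum(cost[srv] * vnfs[vnf]
                    for vnf, srv in zip(names, assign))
        if total < best_cost:
            best, best_cost = dict(zip(names, assign)), total
    return best, best_cost

vnfs = {"firewall": 2, "dpi": 3, "lb": 1}   # CPU units demanded
servers = {"s1": 4, "s2": 4}                # CPU capacities
cost = {"s1": 1.0, "s2": 2.0}               # per-CPU-unit cost
placement, total = place_vnfs(vnfs, servers, cost)
print(placement, total)  # {'firewall': 's2', 'dpi': 's1', 'lb': 's1'} 8.0
```

Exhaustive enumeration is exponential in the number of VNFs, which is precisely why the surveyed studies turn to ILP formulations and (meta)heuristics.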

4) NETWORK ASSESSMENT METRICS
Network evaluation metrics provide an understanding of the data center's architecture and system. In general, the utilization of a computational resource is determined based on metrics covering VNF performance across placement, scaling, diagnostics, migration, mapping, network design, scheduling, security, and load balancing, such as throughput, capacity planning, delay, Round Trip Time (RTT), queue size, and packet loss rate. The evaluation may also include the analysis of reliability, availability, and energy-saving performance at multiple layers of the protocol stack. Figure 11 shows the frequency of the assessment metrics in the selected studies.
From Figure 11, it is clear that 84% of the selected studies are performance-related, 10% address energy, and 6% dependability. Based on the results shown in Figure 11, we conclude that researchers in the field have paid significant attention to performance-related research, while there remains a research gap regarding the study of dependability metrics.

5) TECHNIQUES AND TOOLS
As suggested in [18], two criteria were considered to identify the main techniques used in the selected papers. On the one hand, we analyzed the explanation of the main techniques given by each study's authors. On the other hand, we recorded our own perception of the study under consideration, evaluating to what extent we agreed with the authors regarding the employed techniques. Our perception agrees with the authors: by creating and improving these systems, performance, dependability, and energy efficiency measures are generally increased. The adoption and diversification of technologies and tools heighten the importance of robust experimentation in this area. Figure 12 presents the identified techniques in relation to the number of papers in which they were employed.
As can be seen from Figure 12, most of the studies conduct their research using simulation techniques. The most common are exact, heuristic, and metaheuristic models developed to specify the VNF service chain. We did not find any similar work relying on game-theoretic techniques; however, in a broad sense, Game Theory has been applied to the resource allocation of energy-aware NFV in some studies [86] and to a distributed service chain composition solution [87]. Also, in some studies, such as [ID07] and [ID13], the technique was not clearly introduced, whereas for [ID17] and [ID20], we identified more than one technique. The main tools used to perform the simulation studies include CPLEX. Table 4 classifies the papers by technique.
In this context, both industrial researchers and academics have concurrently been researching the architectural components of networks. All this raises the question of how a virtual network function can be developed and deployed for use in a data center scenario. For example, Cisco, AT&T, and the Cloud4NFV platform have already developed and run their own proprietary NFV solutions integrated with cloud data centers. On the other hand, there are open-source projects with several options for SDN and NFV development. Examples of popular NFV development libraries include ClickOS and NetBricks. Figure 13 shows the proportion of studies using each of the mentioned techniques. Figure 14 shows the context in which the research was developed, including joint academia-industry research partnerships.
From Figure 14, 46 of the 65 selected studies were conducted in a purely academic context (academic research only, without industry partnerships), while four were undertaken purely in industry. Finally, 15 studies, representing 23% of the total, were conducted as academia-industry partnerships. The results acknowledge that there is still little interaction between the two sectors in the field of computer networks, highlighting the need for greater collaboration to solve the current research problems.

VI. IDENTIFIED CHALLENGES OF NFV APPLICATIONS IN DATA CENTERS
In this section, we introduce and discuss the identified challenges from the analysis of the selected studies. In particular, in Subsection VI-A, we discuss the inherent difficulties of the NFV application in data centers, while in Subsection VI-B, we discuss the main concerns and perspectives in the field.

A. INHERITED CHALLENGES
The efficient allocation and management of network resources are long-standing issues in the field of computer networks [6]. In the particular case of virtualized networks, most of the proposed approaches focus on embedding virtual resources into the physical infrastructure [7], [35]. In this context, particular attention must be paid to the risks inherent in the physical infrastructure, such as node and link failures. Since the network components of the physical layer are prone to failure, risks must be considered in both the virtualized and the physical infrastructures. The implementation of NFV therefore requires more efficient mechanisms to improve resource allocation and ensure fault tolerance. One of the main challenges consists of properly handling NFV performance as a function of workload variation [88]. In this sense, especially in cloud computing, fault-tolerant elasticity of service provisioning is the primary concern. One possible solution is to split VMs across different nodes or clusters, which helps to solve the data center management problem and to reduce power consumption: by running virtualized services together on physical servers, the processing capacity of these devices can be exploited as close to its totality as possible. It is essential to understand the capacity of the resources and the types of services provided by the cloud data centers to avoid wasting available resource capacity while reducing energy consumption.
NFV data centers are more dynamic and flexible than traditional ones and provide services based on software rather than on hardware. In this context, resource allocation strategies should be implemented to increase performance levels and availability. An ordered chain of VNFs, usually called an SFC, is typically accomplished by installing routing entries into switches within data centers. For example, in a ''Network Protection'' service, the essential service functions are firewalls, DPI, and virus scanner. The most common SFC is a linear set of service functions between endpoints. In more complex cases, service functions divide network streams into different paths. Such service chains, modeled as directed graphs, are called VNF Forwarding Graphs (VNF-FG) [89].
Several questions arise when deploying SFCs in data centers. First, it is essential to describe the functional and non-functional properties of a VNF. Thus, it is necessary to accurately describe what the service is and how it works, taking into account the service interfaces and the technical constraints on their use. The second question regards the deployment of SFCs, taking into account the ordering constraints. Although service functions are usually independent, many of them have strict requirements. Therefore, a consistent way of enforcing the ordering of service functions in a service chain is required. This is particularly crucial since the deployment of network services is associated with the topology of the data center. Also, delivery constraints may impact the service in terms of service optimization. Solving this problem helps to determine on which physical nodes the VNFs must be placed (Placement) and when they are executed (Scheduling). Figure 15 shows a set of challenges for SFC operation in the data center according to the mentioned questions.
The five steps in the implementation of an SFC in a network are Service Chain Description (SC-D), Service Chain Composition (SC-C), VNF-FG Placement (VNF-P), Virtual Link Mapping, and VNF Scheduling (VNF-S).
• The first step builds on popular methods for describing services on the Internet, such as the Unified Service Description Language (USDL), which describes Web and Internet services from commercial, operational, or technical perspectives. Although significant work has been published on describing services in the data center, the available service description languages are not easily adapted to elastic service environments, where upgrading service chains is a challenging task.
• The second step consists of efficiently concatenating the VNFs to create a network service that satisfies the applicant's objectives.
• The third step assigns the VNFs to the appropriate physical machines in the network according to pre-defined requirements. Each VNF is defined with a type, including computing, storage, or networking, and is then assigned to the appropriate physical node. In practice, the orchestrator, relying on the VNF Managers and the VIM, assesses all the conditions for assigning VNFs to physical resources. Most of the existing approaches in the literature focus on this third step.
• The fourth step aims to allow VNFs to share physical machines (PMs) to minimize the total run-time of network services. It determines how to map each virtual link belonging to the same SFC to routing paths on the physical network and how to allocate bandwidth along those paths.
• The fifth step, the scheduling of VNFs, must find the time slots in which activities are processed under certain resource constraints and precedence relations between these activities. This step is responsible for finding the corresponding slots for the VNFs belonging to the same SFC that compose the different network services to be executed on a given set of servers. It is a very complicated task, and further research in this direction is necessary. Finally, since NFV applications in data centers are still in their early stages of development, many aspects of such applications need further research, including performance, dependability, resource allocation, cost, management, and resource interoperability [6], [56], [67], [70], [90], [91].
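As a toy illustration of the composition and placement steps above, the following Python sketch checks the ordering constraints of a ''Network Protection'' chain and places it on physical machines with a first-fit heuristic. The VNF names, CPU demands, and server capacities are invented for the example and do not come from the surveyed studies.

```python
from dataclasses import dataclass

@dataclass
class VNF:
    name: str
    cpu: int                      # required CPU cores (illustrative)

# Ordering constraints: (a, b) means VNF a must precede VNF b in the chain.
PRECEDENCE = {("firewall", "dpi"), ("dpi", "virus_scanner")}

def valid_chain(chain):
    """Check that a composed chain respects all precedence constraints."""
    pos = {v.name: i for i, v in enumerate(chain)}
    return all(pos[a] < pos[b] for a, b in PRECEDENCE
               if a in pos and b in pos)

def greedy_place(chain, servers):
    """First-fit placement: each VNF goes to the first server with spare CPU."""
    placement, free = {}, dict(servers)
    for vnf in chain:
        host = next((s for s, cap in free.items() if cap >= vnf.cpu), None)
        if host is None:
            raise RuntimeError(f"no capacity for {vnf.name}")
        free[host] -= vnf.cpu
        placement[vnf.name] = host
    return placement

chain = [VNF("firewall", 2), VNF("dpi", 4), VNF("virus_scanner", 2)]
assert valid_chain(chain)
print(greedy_place(chain, {"pm1": 4, "pm2": 6}))
```

Real orchestrators must additionally respect anti-affinity rules, link bandwidth, and SLA constraints, which is precisely what makes the placement and scheduling steps hard.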
Although the selected studies propose different approaches to address the above-listed challenges, the particular focus is on resource allocation and performance evaluation. Besides, all the authors agree that adopting a clear and well-designed methodology is crucial to succeeding in both resource allocation (for example, deployment cost, delay, and throughput) and performance evaluation. In Subsection VI-B, we provide an in-depth analysis of the main concerns, as well as the main perspectives, in the field of NFV applications in data centers.

B. CONCERNS AND PERSPECTIVES
One of the main challenges for the deployment of NFV in data centers is to achieve an efficient allocation of the resources of the required services in the network infrastructures. Resource allocation aims to optimize NFV in the data center, using algorithms to determine the best positioning of the VNF as well as the VNF scheduling, among other requirements. Resource allocation is primarily investigated in the literature of computer networks, either in Cloud computing or in the SDN contexts [49], [54], [92], [93]. Nevertheless, researchers in the field agree that it is still an open research issue [6]. Subsections VI-B.1 and VI-B.2 present the main concerns and perspectives regarding resource allocation problems.

1) RESOURCE ALLOCATION CONCERNS
As previously stated, NFV data centers use standardized hardware; in this way, NFV data centers are more dynamic and flexible, thereby reducing CAPEX, OPEX, and energy consumption. According to the analysis of the selected studies in the previously discussed SLR, the following aspects of NFV data centers require improvement:
• Availability: Availability is the ability of a system to perform a defined function at a given instant of time. NFV applications require high availability in data centers, in the sense that data centers should be able to provide the required resources within a short time frame.
• Maintainability: Maintainability is the ability of the network to perform repairs and modifications. In general, network maintenance should be performed at the early fault detection before a serious problem occurs. Maintainability is tightly related to availability; the faster maintenance occurs, the more likely it is to have high availability.
• Reliability: Reliability is the likelihood that service continues to operate correctly, even when a failure occurs. Network infrastructure must be reliable to avoid downtime.
• Response Time: The response time, which is a crucial aspect of NFV, refers to the propagation time between the virtual host and the physical one. One of the most widely used strategies to reduce the response time consists of dividing network traffic among several servers. However, this distribution makes performance analysis more difficult.
• Energy Saving: Reducing energy consumption is crucial in any modern computing system. NFV contributes to reducing power consumption due to its virtualized characterization. One of the main challenges in this regard is to exploit further the benefits of virtualization technologies towards improving energy efficiency in data centers.
• Fault tolerance: The fault tolerance concept refers to reducing the probability of a network device failure and, in particular, to minimizing the impact of a problem so that the system continues to operate even when problems occur. Fault tolerance practices are developed based on preventive and predictive models. In this line, recommended alternatives include reliability thresholds, redundancy, and self-protection mechanisms, for example, firewalls that filter traffic against anomalies and attack signatures.
• Cost: From a financial point of view, cost can also become a complex aspect of VNF deployment. A single VNF operation can imply costs for multiple service or resource providers, such as network operators, software manufacturers, and cloud owners. The cost of data transmission can be extremely high.
• Delay (end-to-end): The end-to-end delay is the time a packet takes to travel from its origin to its destination. It encompasses the following types of delay: transmission, processing, queueing, and propagation.
• Throughput: Throughput represents the amount of work processed in a given time in a virtualized environment. Packet flow is an even more critical factor due to the hypervisor layer and containers, which may belong to different virtual networks. In addition to the challenges posed by virtualization, there is also the impact of SFC: many VNFs will be composed of several VMs, which may or may not reside on the same server.
• Performability: Performability combines performance and availability. This joint view permits a sufficient understanding and analysis of performance degradation scenarios. For example, if an overloaded host suffers performance degradation, this could be an adverse effect of VM migrations, shutdowns, or VM failures; a performability model can explain such degradation by showing that the system keeps operating, albeit with failures, over time.
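Several of the concerns above, notably availability and performability, reduce to simple expected-value computations once state probabilities are known. The following sketch, with invented MTTF/MTTR figures and state rewards, illustrates how a performability measure combines availability with per-state throughput degradation.

```python
# Illustrative performability calculation: steady-state availability from
# MTTF/MTTR, combined with a throughput "reward" per operational state.
# All figures are invented for the example.

def availability(mttf_h, mttr_h):
    """Classic steady-state availability: MTTF / (MTTF + MTTR)."""
    return mttf_h / (mttf_h + mttr_h)

# States: fully up, degraded (e.g. during VM migration), down.
# Probabilities must sum to 1; rewards are fractions of nominal throughput.
states = [
    ("up",       0.97, 1.0),
    ("degraded", 0.02, 0.6),
    ("down",     0.01, 0.0),
]

A = availability(mttf_h=2000, mttr_h=4)            # binary up/down view
performability = sum(p * r for _, p, r in states)  # expected reward

print(f"availability   = {A:.5f}")
print(f"performability = {performability:.3f}")
```

The point of the performability view is visible in the numbers: a system can be "available" most of the time yet deliver noticeably less than nominal throughput because of degraded states that a pure availability metric ignores.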

2) RESOURCE ALLOCATION PERSPECTIVES
Based on the analysis of the selected studies, we have identified several open challenges that require further investigation towards achieving efficient resource allocation for NFV applications in data centers, thereby improving their performance in terms of the above-listed concerns: availability, maintainability, reliability, response time, throughput, energy saving, and fault tolerance. The main identified perspectives are listed as follows:
• Migration: Migration in data centers allows the transfer of a VM from one physical device to another. In some cases, migration can lead to the unavailability of the service. The causes include service misconfiguration, lack of hardware compatibility, presence of undesired dedicated hardware, lack of network access, or inadequate computing resources [57].
• Scalability: Scalability is a crucial aspect for the success of NFV applications in data centers. Scalability refers to the data center's capability to handle an increasing amount of work while maintaining availability, reliability, and performance as the amount of traffic increases. In this sense, scalability can also help to handle unexpected peak traffic loads. Ensuring scalability can help to avoid failures, performance degradation, or violation of Service Level Agreement (SLA) requirements caused by over or under-provisioning [94]-[96].
• Elasticity: Data centers should have enough elasticity to adjust and adapt to different types of workloads. In this regard, the data center should be prepared to handle an increase, as well as a decrease in workload. During a decrease in workload, the resources are assigned to other processes [33], [54], [94], [97].
• Placement: Placement refers to identifying the best location (position) for VNFs. Researchers agree that, given the highly dynamic nature of VNFs, placement is quite a complicated task. In general, optimization methods are used to solve placement problems. Although the literature has proposed different placement approaches, further research is needed to achieve better positioning results [47], [48], [92]. More formally, a cloud management infrastructure decides which VNFs should be co-located on the physical server. The problem can naturally be divided into sub-problems: (i) a Placement problem, determining the location of the VNFs belonging to the same SFC; and (ii) a Mapping problem, determining how to map each virtual link belonging to the same SFC to routing paths on the physical network and how to allocate bandwidth along those paths.
• Slice Allocation and Deployment: Network slicing has profound implications for resource management. By instantiating a slice, an operator needs to allocate computational and communication resources to their VNFs. In some cases, these resources can be dedicated, making them inaccessible to other slices [13]. On the other hand, smart allocation algorithms can dynamically allocate resources to slices based on the changing time demands of tenants. Although these types of algorithms have the advantage of allowing modifications to the shared resources assigned to each tenant, they also introduce additional complexity to the system. Thus, requiring further research for the development of algorithms capable of efficiently and dynamically allocating resources to slices.
• Mapping: Network mapping determines the best way to allocate virtualized network resources to a physical substrate network. In particular, for each request to create a VNF, mapping is responsible for assigning virtual nodes to physical nodes and virtual edges to one or more physical paths [66], [98].
• Scheduling: In general, the available physical and network resources are limited, making efficient resource scheduling, in terms of optimizing the VNF distribution process according to SLA requirements, one of the most challenging issues for a successful NFV implementation [78], [81], [99].
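As a concrete, deliberately simplified illustration of the scheduling perspective, the sketch below assigns the VNFs of one SFC to time slots on a set of servers while enforcing chain precedence. Server names and durations are assumptions; real schedulers must additionally handle SLA requirements and contention across many SFCs.

```python
# Minimal list-scheduling sketch for the VNF scheduling step: each VNF of
# an SFC runs on some server for a given duration, and a VNF may start
# only after its predecessor in the chain finishes.

def schedule_sfc(sfc, servers):
    """Return {vnf: (server, start, end)} using an earliest-finish choice."""
    free_at = {s: 0 for s in servers}   # next free time slot per server
    ready = 0                           # finish time of the chain so far
    plan = {}
    for vnf, duration in sfc:
        # pick the server that lets this VNF start (and thus finish) earliest
        best = min(servers, key=lambda s: max(free_at[s], ready))
        start = max(free_at[best], ready)
        end = start + duration
        free_at[best] = end
        ready = end                     # enforce chain precedence
        plan[vnf] = (best, start, end)
    return plan

sfc = [("firewall", 2), ("dpi", 5), ("nat", 1)]
print(schedule_sfc(sfc, ["pm1", "pm2"]))
```

Because the chain is strictly sequential here, the servers never run two of its VNFs in parallel; the interesting (and NP-hard) cases arise when many SFCs compete for the same slots.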

VII. LESSONS LEARNED
This section summarizes our findings from the analysis of the SLR conducted as described in Sections III, IV, and V, and the research challenges, concerns, and perspectives identified in Sections VI-A and VI-B, respectively. The research findings from the SLR conducted in this paper are listed below:
• Improving the cost-benefit ratio of VNF-SFC in the data center increases network availability and reduces the complexity of the VNF placement problem.
• Reducing network traffic and wastage of computing resources increases network performance and reduces the complexity of the VNF placement problem.
• Using VIM in a multi-service architecture improves network and storage performance. The use of VIM also reduces the cross-layer design problem in NFVI architecture.
• Improving the use of network resources and reducing energy consumption reduce the complexity of the placement problem in the NFV-MANO architecture.
• Improving the network flow control, through role assignment policies for new communication scenarios, increases the network performance and reduces the resource allocation problem in NFV-MANO architecture.
• Automating the VNF scheduling increases availability and reduces performance degradation.
• Improving resource allocation by using fail-safe preventive measures, such as affinity models, minimizes the impact on physical hardware lifetime.
• Improving evaluation in the NFV data centers with performability measures is essential to derive optimal design parameters. Depending on the required NFV distribution performance, the parameters include system size, subsystem size, and component status performance rates, among others.
• Finally, specific components, such as processors, memory modules, disks, and network interface cards, can bound the performance of a VNF implementation. Thus, a possible solution is an automated search based on a reinforcement learning model that takes these components into account to determine the optimal deployment size for a given workload in real time.

VIII. FUTURE RESEARCH DIRECTIONS: OPPORTUNITIES AND OPEN CHALLENGES
The SLR presented in this paper has highlighted several research gaps and open challenges related to the benefits of NFV applications in data centers. As can be seen from the discussion presented in Section VI, where the most critical issues are listed, the most serious and concerning ones are those related to the constantly growing demand for Internet traffic. Also, the unique characteristics of SFCs require revisiting the existing solutions in the literature. In particular, in the case of SFCs, new challenges appear because they have to deal with the implementation of multi-vendor service functions in geographically distributed data centers. Also, the strict ordering of service functions is generally defined dynamically by traffic flows. These considerations impose constraints, such as dynamic routing and traffic, on the solutions already proposed in the literature. In this section, we perform an in-depth analysis of the identified research gaps to better understand the main barriers to applying NFV in data centers, as well as future research directions towards proposing new and innovative solutions.

A. ELASTICITY AND SCALABILITY
As already stated, scalability is an essential feature for applying NFV in data centers towards delivering on-demand resources to the end-user. Nevertheless, it is challenging to know when and how to scale resources to meet such requirements. In such a dynamic cloud environment, achieving the required elasticity becomes a research issue, since it is difficult to predict user demand. In this line, although different prediction algorithms for on-demand applications exist, we propose further research towards achieving the high level of elasticity required in data centers.
In the SLR conducted in this paper, 4 out of the 65 selected studies address scalability issues, viz., [52](ID22), [53](ID26), [54](ID34), and [55](ID45). According to [52](ID22), most of the scalability solutions available in the literature are heuristic-based approaches. In fact, in [53](ID26), [54](ID34), and [55](ID45), the authors proposed different heuristic approaches for scaling NFV in data centers, showing that such approaches are well-suited for addressing scalability issues since they provide optimized solutions quickly. The authors of [52](ID22) disagree with [53](ID26), [54](ID34), and [55](ID45), arguing that heuristic approaches lack theoretical grounding, and present a scaling approach based on ILP. Providing further solutions, some researchers proposed using deep learning algorithms to address scalability [100], [101]. Nevertheless, the time-space trade-off can pose significant challenges to the use of such algorithms in the context of NFV-MANO applications, leading to serious scalability issues. Such issues are likely to appear when using deep learning solutions in the context of large network traffic control systems like the Internet. For instance, in the case of NFV, deep learning not only requires a highly parallel packet processing method but also significantly ample storage. Using a high-speed Storage Area Network (SAN) can be a viable approach, providing access to the relevant data set for deep learning system training while moving unneeded data sets to backup file storage.
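To make the heuristic-scaling idea concrete, the following minimal sketch implements a threshold-based scale-out/scale-in policy of the general kind the heuristic studies propose; the thresholds, per-instance capacity, and instance bounds are invented for illustration.

```python
# Toy threshold-based scaling policy: scale VNF instances out or in based
# on observed load, with bounds to avoid over- and under-provisioning.
# All numbers (capacity, thresholds, limits) are illustrative assumptions.

def scale_decision(load_rps, instances, cap_per_instance=1000,
                   high=0.8, low=0.3, min_i=1, max_i=10):
    """Return the new instance count for the observed load (requests/s)."""
    util = load_rps / (instances * cap_per_instance)
    if util > high and instances < max_i:
        return instances + 1          # scale out
    if util < low and instances > min_i:
        return instances - 1          # scale in
    return instances                  # hold

assert scale_decision(2500, 3) == 4   # ~83% utilized -> scale out
assert scale_decision(500, 3) == 2    # ~17% utilized -> scale in
assert scale_decision(1500, 3) == 3   # 50% utilized -> hold
print("policy ok")
```

Predictive approaches replace the observed `load_rps` with a forecast, which is exactly where the prediction-algorithm research mentioned above comes in.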

B. VNF PLACEMENT
Network functions can be strategically positioned according to their role. This analysis usually relies on optimization methods to define the optimal location of the VNFs. In particular, the positioning of functions influences the network conditions in a data center. Also, since functions are independent of the node location, they can become part of the placement process. For each function, the placement strategy must consider capacity constraints and topological distances between customers and nodes. These functions allow the separation of evaluation and data planning concerns. Analyzing the behavior of a service network infrastructure requires a high-level optimization algorithm.
Most of the 65 selected studies analyzed in the SLR address the placement problem, confirming that it is a crucial aspect of NFV data center success. In particular, 33 studies address placement, as depicted in Table 2. Based on their analysis, we identified many challenges associated with VNF placement: (i) considering the location of the NFV while planning the NFV architecture of the data center [24](ID10), [26](ID12); (ii) defining VNF positioning strategies in a geographically distributed data center [29](ID17), [32](ID20); (iii) using minimization and maximization functions to provide dynamic positioning criteria for virtual nodes [20](ID02); (iv) investigating the direct impact of placing a virtual router on the behavior of the entire network system [21](ID05); (v) investigating backup strategies and power-consumption reduction in placement problems using, for instance, a real-time multi-objective algorithm [34](ID24); (vi) simultaneously optimizing the placement of VNFs on servers and traffic load paths [31](ID19), [51](ID65); and (vii) defining the necessary placement of VNFs, minimizing network usage and network operating costs, without violating SLA requirements [23](ID08). Addressing such optimization issues implies dealing with restrictions that increase the cost-effectiveness or performance requirements of NFV applications, the main constraints being associated with maximizing the desired factors and reducing the unwanted ones. In this line, it is necessary to propose and develop new algorithms that analyze strategies to optimize the internal power consumption of data centers and minimize performance degradation while maximizing the utility of NFV systems, as well as to design new heterogeneous network architectures for optimization through NFV.
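To illustrate the optimization framing of the placement problem, the sketch below exhaustively searches for the cost-minimal assignment of a tiny VNF set to servers under CPU capacity constraints. All demands, capacities, and costs are invented; realistic instances require ILP solvers or the heuristics discussed above, since the search space grows exponentially.

```python
# Brute-force optimal VNF placement on a toy instance: minimize total
# cost (cores used x cost per core) subject to per-server CPU capacity.
from itertools import product

vnfs = {"fw": 2, "dpi": 4, "lb": 1}              # CPU demand per VNF
servers = {"pm1": (4, 1.0), "pm2": (8, 1.5)}     # (capacity, cost/core)

def cost(assign):
    """Total cost of an assignment, or None if a capacity is exceeded."""
    used = {s: 0 for s in servers}
    for v, s in assign.items():
        used[s] += vnfs[v]
    if any(used[s] > servers[s][0] for s in servers):
        return None                               # infeasible
    return sum(used[s] * servers[s][1] for s in servers)

def objective(assign):
    c = cost(assign)
    return c if c is not None else float("inf")

best = min(
    (dict(zip(vnfs, combo)) for combo in product(servers, repeat=len(vnfs))),
    key=objective,
)
print(best, cost(best))
```

Even this tiny instance shows a non-obvious optimum: packing the largest VNF onto the cheap server and the small ones onto the expensive one beats the intuitive "fill the cheap server first" heuristic.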

C. NFV SECURITY ANALYSIS
Predictive and prescriptive analyses of big data have the potential to protect networks and systems against cyberattacks. Based on the considerable amount of data, security service providers can easily predict the behavior of the user/client and prescribe corrective actions. VNFs of intrusion detection systems are most effective when instantiated near an attack source or in a data center specialized in traffic monitoring. Therefore, in this flexible data center environment, a chain of network functions can have components instantiated in the domains of different carriers. In a virtualized scenario, effective communication between end-points is achieved by connecting large data centers, which can provide, on demand, end-to-end services tailored to each application through SFC. By attacking a single virtualized function, an attacker can cause the entire service chain to fail.
SLR studies addressing NFV security issues include [61](ID32), [82](ID42), and [48](ID56). Accordingly, most of the security software solutions available in the literature only address the identification of abnormal traffic patterns as potential risks to networks. The solutions proposed in the literature must be revisited to determine whether they are applicable in NFV scenarios. For instance, one of the critical challenges is ensuring security during the VNF configuration, because elasticity control must pass through several functional blocks, such as the NFV Orchestrator, the VNF Managers, and the VIM. Deep learning has widely demonstrated its massive impact on cybersecurity applications; nevertheless, these approaches still need to be implemented in intrusion detection systems. In this line, how to enable deep learning applications to detect potential threats at user scale needs to be further explored.
The selected studies [61](ID32), [82](ID42), and [48](ID56) state that efficient coordination of NFV and SDN through a Security Service Chain (SSC) is a promising approach to address security threats, for example, insider attacks that cannot be observed by traditional security devices at the network boundary. In [61](ID32), a VNF placement approach is proposed for network security monitoring in data centers, overseeing communications between VMs and between VMs and external hosts, while [82](ID42) proposes an ILP approach to determine the composition of an SSC that optimizes resource allocation without violating security and resource requirements. From the analysis of the studies in [61](ID32), [82](ID42), and [48](ID56), we can infer that there is a current trend in NFV security towards applications based on the SSC concept.

D. FAULT TOLERANT VNF SCHEDULING
The scheduling problem, which is NP-hard in nature, has been widely discussed by researchers in the field. The following SLR studies address fault-tolerant scheduling issues: [75](ID14), [76](ID37), [77](ID39), [78](ID46), and [79](ID47). Traditionally, approaches such as heuristics and meta-heuristics have been proposed to address fault-tolerant scheduling [6]. Also, in recent years, several fault-tolerance mechanisms, based on results and parameters such as SLA violations, number of migrations, CPU utilization, number of PMs, and power consumption, have been proposed in the literature [6]. Nevertheless, there are still several open scheduling challenges that require further consideration. In general, a useful criterion for VNF scheduling on PMs is the one aiming at an optimal trade-off between power consumption and availability, considering fault-tolerance aspects such as prevention and prediction [75](ID14). In this sense, the PM scheduling should be implemented in such a way that it guarantees agreement between the QoS parameters and the SLA [75](ID14), [77](ID39). For instance, in [77](ID39), dynamic VNF embedding and scheduling algorithms, taking into account resource requirements and QoS parameters, are proposed based on a Mixed ILP (MILP) approach. Nevertheless, some QoS parameters, such as jitter and packet delivery rate, have not been extensively considered [6]. Besides, within the context of SFC applications, additional constraints appear; for instance, different paths may belong to different service providers, and therefore each path has different service quality requirements. Moreover, to provide QoS, the VNF must monitor the real-time performance of the network function. Addressing these additional constraints therefore requires revisiting existing solutions in the literature [77](ID39).

E. ENERGY EFFICIENCY
The current trend in the energy efficiency field is to save as much energy as possible, as well as to harvest energy from natural resources to reduce operating costs. As suggested by the SLR, studies addressing energy issues focus on new approaches based on energy-aware and environmentally friendly SFC solutions ([75](ID14), [34](ID24), [58](ID60), [73](ID62), [74](ID64)). In this context, these studies agree that it is crucial to take energy aspects into account when designing optimal SFC schemes, in particular by considering the node power consumption rate as a critical metric when designing algorithms to optimize the chaining, placement [34](ID24), and scheduling [75](ID14) of service functions.
In particular, a critical factor regarding energy efficiency is the high energy consumption of cooling devices [73](ID62). On the one hand, the data center's steam-absorption cooling systems can utilize waste heat; on the other hand, data center cooling consumes a significant amount of energy. Also, server power densities are increasing with the use of stacked and multi-core server designs, which further increases cooling costs [73](ID62). In this context, reducing the energy consumption and costs of cooling is crucial to improving the energy efficiency of data centers. In fact, in recent years, data center cooling has been one of the principal areas where new energy efficiency solutions are being introduced [73](ID62). In particular, researchers agree that free cooling techniques based on steam absorption can improve the Power Usage Effectiveness (PUE) value by neutralizing cooling costs. In this line, a recommended practice is to place Cloud Data Centers (CDC) in areas where free cooling resources are available; in this way, low-temperature regions could also reuse the heat generated by the data center for heating installations [73](ID62).
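The effect of cheaper cooling on PUE can be illustrated with the defining ratio, total facility energy over IT energy. The figures below are invented solely to show the computation.

```python
# Illustrative PUE (Power Usage Effectiveness) calculation showing how
# reducing cooling energy, e.g. via free cooling, improves PUE.
# PUE = total facility energy / IT equipment energy; numbers are invented.

def pue(it_kw, cooling_kw, other_kw):
    return (it_kw + cooling_kw + other_kw) / it_kw

baseline = pue(it_kw=1000, cooling_kw=600, other_kw=100)      # conventional
free_cooling = pue(it_kw=1000, cooling_kw=150, other_kw=100)  # free cooling

print(f"baseline PUE:     {baseline:.2f}")      # 1.70
print(f"free-cooling PUE: {free_cooling:.2f}")  # 1.25
```

A PUE of 1.0 would mean all facility energy reaches the IT equipment, which is why cooling is the natural first target for efficiency improvements.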
Finally, it is essential to highlight that energy-efficient solutions are feasible not only when designing a new data center but also when updating one, which is simpler when dealing with dynamic systems offering dynamic control possibilities. The additional advantage of allowing the system to predict power overload events is highlighted in the SLR studies addressing energy issues ([75](ID14), [34](ID24), [58](ID60), [73](ID62), [74](ID64)). We have observed that previous works did not use predictive methods, although it is possible to develop predictive control methods for safe power oversubscription instead of reactive control methods, such as the one proposed in [34](ID24), which detects power overload events and reduces their duration through power limitation. If successful, a predictive control method would prevent power overload events by avoiding transient power spikes in the data center power infrastructure, since such spikes compromise data center reliability. Finally, a significant challenge for analyzing tenants' energy consumption is the availability of a multi-tenant data center.

F. CAPACITY PLANNING
Data centers must ensure proper capacity planning towards achieving a solid Return on Investment (ROI). Efficient capacity planning includes the analysis of VNF traffic before execution; in particular, avoiding VNF failure requires an effective data storage capacity plan. VNF or hardware failures, which are quite common in a large data center, can lead to intermittent system downtime and subsequent breaches of SLAs. In this context, the SLA should set the QoS parameters to ensure backup, recovery, storage, and availability, improving user satisfaction and attracting future customers. In recent years, researchers have highlighted the importance of applying NFV capacity planning with Cost-Benefit Prediction (CBP) to decide the optimal number of PMs that minimizes the total cost of ownership for a given SLA requirement. Also, researchers have discussed whether it is more cost-effective to use cheaper and less reliable PMs, or more expensive and more reliable PMs, to ensure the same availability characteristics. In such cases, the optimization problems can be solved using stochastic search algorithms, such as simulated annealing. Also, such prediction models can be used as cloud management tools. The SLR conducted in this paper found several studies related to network design, such as [67](ID03), [68](ID09), [69](ID29), [70](ID30), [71](ID38), [72](ID51), [73](ID62), and [74](ID64). Nevertheless, none of them directly addressed the capacity planning aspect, making it an open research challenge. Besides, capacity planning can use the characterization of VNF performance to identify the most appropriate types of resources to be allocated to a VNF at deployment time. As observed in [102], given a resource configuration, one can estimate the VNF capacity, determine the optimal resource configuration, evaluate different operational system virtualization/hardware alternatives, and compute the system overhead.
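The cheap-versus-reliable PM question above can be sketched as a toy cost-benefit computation: assuming independent failures and 1-of-n redundancy, find the smallest number of PMs of each type that meets an availability SLA and compare total prices. Availabilities, prices, and the SLA target are invented for the example.

```python
# Toy capacity-planning comparison: reach a target availability either
# with many cheap, less reliable PMs or with fewer expensive, more
# reliable ones. Assumes independent failures and 1-of-n redundancy
# (system is up if at least one PM is up); all figures are invented.

def pms_needed(a_single, sla):
    """Smallest n such that 1 - (1 - a_single)^n >= sla."""
    n, avail = 1, a_single
    while avail < sla:
        n += 1
        avail = 1 - (1 - a_single) ** n
    return n

SLA = 0.99999  # "five nines" target
options = {"cheap": (0.99, 2000), "reliable": (0.999, 7000)}  # (avail, $)

for name, (a, price) in options.items():
    n = pms_needed(a, SLA)
    print(f"{name:8s}: {n} PMs, total ${n * price}")
```

Under these invented numbers, three cheap PMs meet the SLA for less than two reliable ones, which is the kind of trade-off the CBP models discussed above explore, usually over much richer cost structures.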

G. VNF RECONFIGURATION
One of the main challenges in a virtual network environment consists of incorporating virtual network requests into a physical network with finite resources. In particular, due to the dynamic nature of such requests, admitting all requests on the physical network is usually a difficult task. A possible solution to this issue consists of coupling the requests to admission control policies towards admitting as many requests as possible without significant performance degradation. In this context, it is crucial to develop optimization algorithms capable of dealing with a demand for resources that changes over time, without having full knowledge of such demand. In particular, the reconfiguration of virtual nodes and links, by remapping and redirection, can help to improve network performance [103]. In this case, it is essential to take into account that each reconfiguration has a cost: virtual node migration leads to an increased amount of traffic, while redirection requires configuring new paths and tearing down existing ones. Finally, under certain network conditions, such as deterioration, it may be necessary to explore new policies to trigger reconfiguration strategies [103]. In the SLR, only one study addressing VNF reconfiguration has been found ([72](ID51)), suggesting that further research needs to be undertaken in this direction. In particular, [72](ID51) proposed the construction and reconfiguration of multiple Virtual Data Centers (VDCs) and VNFs on top of the data plane.

H. DYNAMIC RESOURCE ALLOCATION
Studies addressing dynamic resource allocation include [50](ID21), [35](ID25), [82](ID42), and [77](ID39). According to these papers, the large-scale commercial application of data center networks requires the dynamic provision of services. In particular, network providers should be able to dynamically allocate and scale resources as demand grows. Static chaining approaches do not consider the possibility of re-composing, re-mapping, and re-scheduling previously composed, mapped, and scheduled service requests. Nevertheless, in real-world applications, service requests arrive at the network dynamically, at unknown times and with unknown durations. In this context, where the service requests may change over time, chaining algorithms must solve the online problem as each service request arrives, i.e., in real time. In this case, re-composition, re-mapping, and re-scheduling may be required, involving new challenges [50](ID21), [77](ID39). Based on the analysis of [50](ID21), [35](ID25), [82](ID42), and [77](ID39), the authors place special focus on: • VNF resource management for real-time networks over virtualized infrastructure.
• Deployment of stand-alone VNFs for distribution across data centers.
• Consideration of QoS to determine the scale or allocation of resources required for network traffic demand.
• Evaluation of the execution time of positioning migration strategies.
• Evaluation of the time needed to reconfigure the deployment of some services.
In addition, when SFC is involved, complexity increases, because service deliveries involve function chains [50](ID21). The chaining can be built manually or dynamically through a user interface. Several challenges arise due to the multiple mappings, including: • The creation of algorithms that allow restrictions on reserved resources (network interfaces, memory, location, and CPU cycles).
• The design of an algorithm that creates multiple Service Function Path (SFP) instances for each SFC. In this case, the algorithm should provide dynamic SFPs, avoiding cases of runtime congestion.
• Develop models capable of minimizing traffic to reduce the number of Points of Delivery (PoDs) used in each SFC. These models should consider scalability and simplify the capacity planning of data center resources among the containers.
• Develop an optimal resource utilization algorithm.
• Reduce network delays due to congestion.
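The online chaining problem described above can be illustrated with a minimal admission-control sketch. This is a hypothetical toy (node names, capacities, and the greedy load-balancing rule are all assumptions, not an algorithm from the cited studies): as each service request arrives, its VNF chain is mapped greedily, and the request is rejected, with the partial placement rolled back, if any VNF cannot be placed.

```python
# Hypothetical sketch of online SFC admission control: greedily map each
# VNF of an arriving chain onto the node with the most spare CPU; reject
# the whole chain (and roll back) if any VNF cannot be placed.

def admit_chain(chain_demands, capacities):
    """chain_demands: CPU demand per VNF in the chain, in order.
    capacities: dict node -> remaining CPU; mutated only on success.
    Returns the node assigned to each VNF, or None if the chain is rejected."""
    snapshot = dict(capacities)   # kept so rejection can roll back
    placement = []
    for demand in chain_demands:
        # Greedy load balancing: pick the node with the most spare capacity.
        node = max(capacities, key=capacities.get)
        if capacities[node] < demand:
            capacities.clear()
            capacities.update(snapshot)  # undo the partial placement
            return None
        capacities[node] -= demand
        placement.append(node)
    return placement

nodes = {"pod-a": 8, "pod-b": 6}
print(admit_chain([4, 4, 3], nodes))   # admitted, spread across pods
print(admit_chain([6], nodes))         # rejected: no node has 6 CPU left
```

A production algorithm would add exactly the concerns listed above: multi-resource constraints (memory, location, CPU cycles), congestion-aware SFP selection, and periodic re-mapping of already admitted chains.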

I. PERFORMANCE AND DEPENDABILITY ANALYSIS
To offer highly available cloud computing services, data centers comprise a huge number of servers, which in turn consist of a vast number of carefully designed (but failure-prone) hardware components, such as processors, memory modules, disks, and network interface cards. According to the SLR studies addressing performance and dependability issues ([24](ID10), [60](ID21), [76](ID37), and [63](ID43)), at such a large scale the probability of failure multiplies and can easily lead to performance degradation. In this context, the reliability and performance of cloud services are two of the main concerns of researchers in the field. In [104], server repair/failure rates are characterized to understand the reliability of hardware for large cloud computing infrastructures. As a result, cloud service reliability modeling is separated into two parts, namely, reliability modeling of the request stage (failures such as overflow and timeout) and reliability modeling of the execution stage (failures such as lack of data resources, lack of computing resources, software failure, database failure, hardware failure, and network failure). The analysis conducted in [104] demonstrates how a large-scale system complicates cloud reliability modeling. In this line, researchers have to address the following issues: • Reducing the cost of repairing failures for a given PM utilization rate.
• The trade-off between a longer Mean Time To Failure (MTTF) and a faster Mean Time To Repair (MTTR) with respect to system availability.
• The trade-off between the cost of SLA availability and operational costs, including repair, replacement, and power costs.
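The MTTF/MTTR trade-off in the bullets above follows directly from the classic steady-state availability formula A = MTTF / (MTTF + MTTR). The sketch below (illustrative figures, not from the surveyed studies) shows why a cheaper PM that fails twice as often can still match the availability of a more reliable one if it is repaired twice as fast:

```python
# Steady-state availability from MTTF and MTTR (hours). Figures are
# illustrative assumptions used only to show the MTTF-vs-MTTR trade-off.

def availability(mttf_h, mttr_h):
    """Classic steady-state availability A = MTTF / (MTTF + MTTR)."""
    return mttf_h / (mttf_h + mttr_h)

reliable_pm = availability(mttf_h=20_000, mttr_h=24)  # fails rarely
cheap_pm = availability(mttf_h=10_000, mttr_h=12)     # fails twice as often,
                                                      # repaired twice as fast
print(f"{reliable_pm:.6f} {cheap_pm:.6f}")  # the two are identical
```

Since only the MTTR/MTTF ratio matters for steady-state availability, the cost comparison between cheap and reliable PMs reduces to which combination of purchase price and repair cost achieves that ratio more economically.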
Availability management is another concerning task in the field. According to [24](ID10), [60](ID21), [76](ID37), and [63](ID43), the challenges are multiple; those of most concern are as follows: • Data centers containing VNFs include connected devices such as servers, switches, and links, each with multiple layers (e.g., software and hardware), and can have multiple topologies.
• Device failures are quite common, and the failure and repair processes of each device in a data center can have heterogeneous probability distributions.
• Device failures can have different influences on service availability, for example, a VNF vs. a core switch.
• The availability of an SFC request may vary depending on its location in the data center.
• Redundancy may need to be provisioned to meet the need for availability [105].
• Determining the optimal amount of backup resources in a dynamic data center environment is essential for reducing resource consumption.
• Predicting the impact of heat variations helps avoid the effects of aging, which lead to a degradation in the performance and reliability of a network device, either virtual or physical, thus limiting its expected lifetime [106].
• When applying an aging-aware design in NFV data centers, a reliable prediction of the effects of aging is needed to optimize the system's end-of-life performance.
Finally, in order to address the above-listed challenging issues, it is necessary to develop prediction algorithms capable of quickly estimating the availability of SFC requests [24](ID10), [60](ID21), as well as of allocating the minimum amount of resources needed to meet different availability requirements [24](ID10), [60](ID21), [76](ID37), [63](ID43), in any data center topology with heterogeneous fault and repair processes for each device.
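A quick SFC-availability estimate of the kind mentioned above can be sketched as a series system of VNF stages, with redundancy modeled as parallel replicas within a stage. This is a simplified independence-based model with assumed per-VNF availabilities, not the estimation method of the cited studies:

```python
# Hypothetical sketch: quick availability estimation for an SFC request.
# The chain is a series system of VNF stages; redundancy at a stage is
# modeled as parallel replicas. Per-VNF availabilities are assumed values,
# and failures are assumed independent.

def stage_availability(vnf_avail, replicas):
    """A stage with `replicas` parallel instances fails only if all fail."""
    return 1 - (1 - vnf_avail) ** replicas

def sfc_availability(stages):
    """stages: list of (vnf_availability, replica_count) per chain position.
    Series system: the chain is up only if every stage is up."""
    avail = 1.0
    for vnf_avail, replicas in stages:
        avail *= stage_availability(vnf_avail, replicas)
    return avail

# Firewall, NAT, and DPI stages; duplicating only the weakest stage (DPI)
# lifts the estimated chain availability from about 0.974 to about 0.994.
base = sfc_availability([(0.995, 1), (0.999, 1), (0.98, 1)])
redundant = sfc_availability([(0.995, 1), (0.999, 1), (0.98, 2)])
print(f"{base:.4f} -> {redundant:.4f}")
```

Because the model is a closed-form product, it evaluates in microseconds, which is what makes it usable inside an online placement loop; the open challenge noted in the SLR is extending such estimates to heterogeneous, correlated fault and repair processes.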
J. NFV RESILIENCY
Failure resiliency can be achieved by using an automated mechanism to recompose and reschedule failed service chains after a failure. To ensure service continuity, this process should have no impact on other chains. In this way, by providing resiliency in NFV and SFC, load balancing can be co-located with Service Function Forwarders (SFFs) to give robustness to path failures [107]. The literature proposes a variety of methods for protecting network services from failure. Among them, traditional redundancy mechanisms, such as the Virtual Router Redundancy Protocol (VRRP), are among the most popular. [108] proposes service recovery based on temporarily reassigning the tasks of a failed VNF to another VNF. In [109], availability-aware SFC mapping is studied, introducing an online approximation algorithm that provides the necessary availability within polynomial time while minimizing the consumption of physical resources. In the same line, several papers have focused on resilient virtual network embedding, such as [110], [111]; for use within the context of SFC, they should be revisited.
We believe that, as the technology extends to more domains, the nature and severity of potential failures need further clarification by integrating multiple levels of redundancy. Traditional cloud architectures are not adequate, as they incur a high communication cost to transport the large data streams generated by IoT devices in latency-sensitive communication applications, such as security-critical applications, military applications, and emergency medical systems, among others. To address this problem, the concepts of fog [112] and edge computing [113] have emerged. The primary goals of fog and edge computing are to distribute processing and functions to IoT devices, to network equipment such as routers at the network edge, or to micro data centers at the radio access networks, thus bringing computing and storage closer to the end-user. One of fog computing's main characteristics is the creation of a federated layer, that is, a virtual layer that aims to efficiently bring together the layers of the system, the providers, and the consumers of information, thereby trying to minimize the large volume of communication on the Internet. Also, the interaction between an application and data replication across the architecture requires additional study. This includes the data source and the data transmission, which can be upstream (from fog to clouds), downstream (from fog to sensors), or embedded internally in the fog node. The mentioned issues make the fog fault-tolerant replication model more complex than its counterpart in traditional cloud and critical systems. In this context, controlling the propagation of errors through the fog structure, and ensuring failure recovery and the reintegration of nodes, are the most concerning problems.
Selected SLR studies investigate different methods of fault detection, fault tolerance, fault prevention, and fault diagnosis. Nevertheless, it is necessary to develop methods within fog environments that allow faulty components to recover and reintegrate into system operation, which can prevent system failures or shutdowns caused by redundancy and scalability problems under rapid change. Also, fog nodes should be able to provide services for a large number of heterogeneous devices in different application areas. Studies related to fog/edge applications were identified in the SLR, viz., [36](ID27), [69](ID29), and [80](ID59), but none focused on the resiliency aspects of such applications. In this context, it is clear that further research needs to be conducted in this direction, in particular on predictive fault-tolerance mechanisms for deploying NFV in distributed edge networks, since these networks have unique characteristics that differ from those of other networks.

IX. RESEARCH IMPLICATIONS FOR ACADEMIA AND INDUSTRY
The application of NFV in data center architectures is a new technology that has been the subject of increasing interest in the academic and industry communities. In this section, we provide some implications of our findings based on the SLR presented in this paper. We also discuss the limitations of the conducted research and suggest future research directions in the context of the academic (Subsection IX-A) and industry (Subsection IX-B) communities.

A. IMPACT ON ACADEMIA
From the analysis of the SLR presented in this paper, it can be seen that researchers have made significant advances in the conceptual work on NFV data centers. In particular, several relevant studies highlight the advances made in theoretical and practical issues concerning NFV deployment. Specifically, 40% of the papers presented by the academic community are devoted to solving new problems. Here, it is essential to highlight that no work has proposed a complete solution; instead, the available works offer solutions to issues related to specific aspects of the NFV architecture. Also, a little over 40% of the studies are of lower quality, while 15% do not provide research design definitions. This shows that the quality of the available work in the literature is still relatively limited; there is a need to improve the quality of the methodological explanations towards allowing the replication or reproduction of the conducted research. In some cases, the purpose of the proposed techniques, metrics, and methods is not clear, and the developed approaches do not achieve the desired effects regarding the integration of NFV into data center architectures. Therefore, it is essential to highlight that the lack of detail regarding the research methodology, and the inadequate explanation of the methods, tools, techniques, and metrics employed, make it impossible for researchers to replicate the research experiments or to have benchmark results available.

B. IMPACT ON INDUSTRY PRACTICE
Empirical studies are necessary to raise our confidence in the efficacy of the proposed solutions to practical problems. In addition, it is crucial to conduct careful empirical studies, following a well-designed and explained methodology. In this way, it is possible to obtain benchmark approaches and results, hence, allowing researchers to replicate studies and compare results. As already discussed, further empirical research needs to address several problems that are relevant and challenging for the industry.
The high-tech NFV industry requires a different approach to problem-solving and design research focused on data centers. Proposed NFV solutions should reduce the adverse effects of data center integration while providing increased performance, higher availability, and reduced energy consumption. There is keen interest in the telecommunications industry in the use of NFV in data center architectures. Nevertheless, severe technical challenges are inherent to rapid technological advancement. In particular, research has been conducted in different industry scenarios, for example, telecom operator networks, data center networks, multi-data centers, LTE Mobile Core Gateways, Mobile Core Networks, Network-enabled Cloud (NeC), and Network Function Center (NFC). Besides, although most of the research focuses on the NFV architecture (MANO, VNF, and NFVI), some studies focus on performance evaluation. We recommend using evaluation techniques, such as modeling or simulation, together with optimization algorithms to deal with the open challenges in the industry. Finally, the presented SLR has revealed that companies have not researched the field of OSS/BSS. In this line, OSS/BSS remains an open research challenge.
Based on the results of the SLR presented in this paper, we can make the following suggestions for industry practices: • Study the already available work in the literature, identifying challenges and possible opportunities.
• Characterize the problems based on their complex nature and their many factors and levels (technical methods and metrics), to understand the possible and best solutions related to each problem.
• Interpret the results of empirical studies in terms of the research problem and research design.
• Conduct practical tests to confirm or refute the proposed approaches. In this way, significant results can be obtained for the industry, and future research directions can be identified.
Finally, the analysis of the selected SLR studies suggests that the NFV data center field is still not mature. In particular, no researcher associated with a company or university was found leading scientific research with more than two publications. In addition, only a few studies (18%) were conducted by authors associated with industry. Although industry-academia research increased to 30% of published papers in 2016, the number of papers by each author remains small. The complex problem of bringing the two sectors closer together presents itself as a challenge for academic research and industry. In this line, specific awards may encourage the co-authoring of articles between academia and industry.
We expect this discussion to inspire more researchers from the academic and industry communities to become involved in the research of NFV data centers. In the first place, we recommend conducting an evaluation of the academic and industry contexts towards providing further tools for quality research in the NFV data center field. Specifically, we encourage further university-company interaction, with particular attention to simplifying the proposed approaches towards allowing the generalization of the obtained results to other scenarios. Moreover, we recommend broadening computer network research to address aspects that have not been thoroughly investigated, such as performance, dependability, and sustainability. In this sense, this paper has presented directions and impacts for future research in computer networks and cloud computing. We have identified research gaps for NFV data centers and research opportunities regarding the development of new approaches, techniques, methods, and metrics for addressing crucial NFV aspects, such as resource allocation, performance, scheduling, fault tolerance, dependability, sustainability, scalability, availability, reliability, NFV placement, and migration.

X. THREATS TO VALIDITY
In this section, we discuss the limitations of the presented SLR. In particular, we have identified the following threats to validity: • Research questions: The research questions we formulated cannot cover all aspects of NFV applied in data centers. Nevertheless, we considered as comprehensive a search string as possible to cover the researched topic. Even so, we cannot guarantee that all relevant studies were selected during the search process. We mitigated this threat by reviewing the references included in the relevant studies.
• Influences of the researcher: To minimize the potential influences of each particular researcher, we analyzed the SLR sources in conjunction with a doctoral advisor. In particular, the chosen advisor has experience in conducting SLRs, surveys, and mapping studies in computer networks, which further reduces potential biases and threats. During the SLR process, we analyzed the data together with the research team in each of the SLR phases. Finally, to minimize publication bias, we used the snowballing search method (backward and forward).
• Data extraction: -The studies were classified based on the authors' judgment. However, some studies could have been misclassified. As already mentioned, to mitigate this threat, two researchers performed the classification, with the intention of reconciling the extraction and analysis of the data to reduce inconsistencies and misalignment. -To avoid threats to the identification of primary studies, a systematic manual search analyzed as many papers as possible related to NFV applied to data centers. In particular, it was decided to use the manual search technique because, according to [114], the use of snowballing helps to detect other potential studies. Also, snowballing has the advantage of identifying relevant articles with less noise. Finally, reading titles and abstracts in the first stage helped the researchers discard papers; we believe that if a discarded article were indeed relevant, it would be identified during the subsequent snowballing process. -There were problems in identifying the fields to which each study belonged, mostly due to the lack of information in the original research. In these cases, we sent emails to the authors asking for the missing information.

XI. FINAL CONSIDERATIONS
This paper presented an SLR regarding NFV applied to data center architectures. The research team found 1,408 primary studies through an automatic search applied to different well-known databases, such as IEEE Xplore, ACM Digital Library, SpringerLink, ScienceDirect, and Scopus. After a careful selection process, 65 relevant papers were filtered. Analyzing the quality of the selected studies, we found that 40% of the papers were above the quality average. The obtained results show that, although NFV data centers have gained great popularity among the academic community, the research area is still in its early stages. As such, there exist several research gaps, bringing new research challenges and opportunities for the academic as well as the industry communities. The SLR presented in this paper shows that it is necessary to consolidate the integration of NFV in data centers by solving different problems related to NFV applications. In this regard, the conducted SLR has uncovered several gaps and open challenges related to the benefits of NFV applications in data centers. In particular, it identified dynamic resource allocation and performance evaluation as the most challenging aspects of NFV data centers. In this line, the most concerning issues are related to availability, maintainability, reliability, response time, throughput, energy efficiency, and fault tolerance. To successfully address such issues, further solutions need to be proposed to improve the migration, mapping, scheduling, placement, elasticity, and scalability aspects. In this line, we have provided valuable propositions and identified significant avenues for future research towards revisiting already proposed approaches in the literature, as well as developing new ones to offer innovative solutions for NFV applications in data centers.
In this paper, we also discussed the implications of the results for academia and industry communities, showing that further collaboration between them should be encouraged. In this line, the new knowledge may guide new research and influence the operation of computer networks. We recommend research collaboration between academia and industry to identify precise conceptual and operational definitions required for integrating findings and generating consistent knowledge about the NFV data center. These definitions are the basis for future research and the impact on computer networks. Finally, we expect this discussion to inspire more researchers from academia and industry communities to become involved in the study of NFV applications in data centers.