Optimized Energy Aware 5G Network Function Virtualization

In this paper, network function virtualization (NFV) is identified as a promising key technology that can contribute to energy-efficiency improvement in 5G networks. An optical network supported architecture is proposed and investigated in this work to provide the wired infrastructure needed in 5G networks and to support NFV towards energy-efficient 5G networks. In this architecture, the mobile core network functions as well as the baseband function are virtualized and provided as VMs. The impact of the total number of active users in the network, the backhaul/fronthaul configurations, and the VM inter-traffic is investigated. A mixed integer linear programming (MILP) optimization model is developed with the objective of minimizing the total power consumption by optimizing the VM locations and VM servers' utilization. The MILP model results show that virtualization can result in up to 38% (average 34%) energy saving. The results also reveal how the total number of active users affects the optimal distribution of the baseband VMs, whilst the core network VMs distribution is affected mainly by the inter-traffic between the VMs. For real-time implementation, two heuristics are developed: an Energy Efficient NFV without CNVMs inter-traffic (EENFVnoITr) heuristic and an Energy Efficient NFV with CNVMs inter-traffic (EENFVwithITr) heuristic; both produce results comparable to the optimal MILP results.


I. INTRODUCTION
According to the Cisco Visual Networking Index, mobile data traffic will grow sevenfold between 2016 and 2021, at a Compound Annual Growth Rate (CAGR) of 46%, reaching 48.3 exabytes per month by 2021 [2]. This growth is driven by a number of factors such as the enormous number of connected devices and the development of data-greedy applications [3]. With such a tremendous amount of data traffic, a revolutionary mobile network architecture is needed. Such a network (5G) will contain a mix of multiple access technologies supported by a significant amount of new spectrum to provide different services to a massive number of different types of users (e.g. IoT, personal, industrial) at high data rates, at any time, with potentially less than 1 ms latency [4]. 5G networks are expected to be operational by 2020, when a huge number of devices and applications will use them [5]. Users, applications, and devices of different kinds and purposes need to send and access data from distributed and centralized servers and databases using public and/or private networks and clouds. To support these requirements, 5G mobile networks have to possess intelligence, flexible traffic management, and adaptive bandwidth assignment, and at the forefront of these traits is energy efficiency. Information and Communication Technology (ICT), including services and devices, is responsible for about 8% of the total world energy consumption [6] and contributes about 2% of the global carbon emissions [7]. It is estimated that, if current trends continue, ICT energy consumption will reach about 14% of the total worldwide consumption by 2020 [6]. There have also been various efforts from researchers to reduce the power consumption of 5G networks. For instance, the authors in [8] focused on the power consumption of base stations.

(The authors are with the School of Electronic and Electrical Engineering, University of Leeds, Leeds LS2 9JT, U.K.)
They proposed a time-triggered sleep mode for future base stations in order to reduce their power consumption. The authors in [9] investigated the base station computation power and compared it to the transmission power. They concluded that the base station computation power will play an important role in 5G energy efficiency. The authors of [10] developed an analytical model to address the planning and dimensioning of 5G Cloud RAN (C-RAN) and compared it to the traditional RAN. They showed that C-RAN can improve 5G energy efficiency. The research carried out in [11] focused on offloading the network traffic to the mobile edge to improve the energy efficiency of 5G mobile networks. The authors developed an offloading mechanism for mobile edge computing in 5G where both file transmission and task computation were considered. Virtualization has been proposed as an enabler for the optimum use of network resources, scalability, and agility. In [12] the authors stated that NFV is the most important recent advance in mobile networks, and that among its key benefits is the agile provisioning of mobile functions on demand. The fact that it is now possible to separate the functions from their underlying hardware, transform them into software-based mobile functions, and provide them on demand presents opportunities for optimizing the physical resources and improving the network energy efficiency. In this paper, network function virtualization is identified as a promising key technology that can contribute to the energy-efficiency improvement in 5G networks. In addition, an optical network architecture is proposed and investigated in this paper to provide the wired infrastructure needed in 5G networks, and to support NFV and content caching. In the literature, NFV was investigated either in the mobile core network [13][14][15] or in the radio access network [16][17][18], and mostly using pooling of resources, such as the work in [19,20].
In contrast, virtualization in this paper is not limited to a certain part of the mobile network, but is applied in both the mobile core network and the radio access network. Moreover, it is not confined to pooling the network resources, but is concerned with decoupling the mobile functions from their hardware and considers converting these functions into software-based functions that can be placed optimally. A mixed integer linear programming model and real-time heuristics are developed in this paper with the goal of improving the energy efficiency of 5G mobile networks.

II. NFV IN 5G NETWORKS
According to the third generation partnership project (3GPP), the evolved packet core (EPC) is an important step change [21]. There are four main functions in the EPC [22,23], illustrated in Fig. 1: the packet data network gateway (PGW), the serving gateway (SGW), the mobility management entity (MME), and the policy and charging rules function (PCRF).
The work in this paper extends our work in [24,25] to include a number of factors such as the total number of active users in the network during the day, the backhaul and fronthaul configuration, and the workload required for baseband processing. It introduces an optical-based framework for energy-efficient NFV deployment in 5G networks and provides full MILP details and associated heuristics. In this framework, the functions of the four entities of the mobile core network are virtualized and provided as one virtual machine, which is dubbed the "core network virtual machine" (CNVM). For the radio access side, the BBU and RRU are split, and the function of the BBU is virtualized and provisioned as a "BBU virtual machine" (BBUVM). Consequently, the wireless access network of the mobile system encompasses only the RRUs that remain after the RRU-BBU decoupling. The RRU is referred to here as an "RRH" (as in a number of studies [26][27][28]) after it is separated from the BBU. The traffic from a CNVM to an RRH is compelled to pass through BBUVMs for baseband processing, as in Fig. 2. Moreover, the capabilities of Passive Optical Networks (PONs) are leveraged as an energy-efficient broadband access network to connect the IP over WDM core network to the RRH nodes and to represent the wired access network of our proposed system. Fig. 3 shows three locations that can accommodate virtual machines (VMs) of either type (BBUVMs or CNVMs): the optical network unit (ONU), the optical line terminal (OLT), and the IP over WDM nodes. For simplicity, the nodes where the hosting servers are accommodated are referred to as "hosting nodes".
The hosting nodes (ONU, OLT, and IP over WDM nodes) might host one VM or more than one VM of the same or different types, bringing forth the creation of small clouds, or "cloudlets". Therefore, the proposed architecture provides an agile allotment of services and processes through flexible distribution of VMs over the optical network (PON and IP over WDM network), which is one of the main concerns of this work in minimizing the total power consumption. Based on this architecture, a MILP formulation has been developed with the overall aim of minimizing power consumption.

III. AMOUNT OF BASEBAND PROCESSING WORKLOAD
This section illustrates the configuration of the fronthaul and backhaul used in the proposed network, so that the ratio of the backhaul to the fronthaul data rate can be calculated. The fronthaul is the network segment that connects the remote radio head (RRH) to the baseband unit (BBU) [29], whilst the network segment that connects the BBU to the mobile core network (CN) is called the "backhaul" [30]. The internal interface of the fronthaul is defined as a result of the digitization of the radio signal according to a number of specifications. The well-known and most used specification among radio access network (RAN) vendors is the Common Public Radio Interface (CPRI) specification [31], which is implemented using digital radio over fiber (D-RoF) techniques. On the other hand, the backhaul interface leverages Ethernet networks as they are the most cost-effective networks for transporting the backhaul IP packets [32,33]. In order to adequately determine the data rate in each network segment (backhaul and fronthaul), we start with the physical layer of the current mobile network, the Long-Term Evolution (LTE) network. The LTE network uses single-carrier frequency-division multiple access (SC-FDMA) in the uplink (UL), whilst orthogonal frequency-division multiple access (OFDMA) is used in the downlink (DL) [34]. In both techniques, the transmitted data are turbo coded and modulated using one of the following modulation formats: QPSK, 16QAM, or 64QAM, with 15 kHz subcarrier spacing [35]. A generic frame is defined in LTE which has a 10 ms duration and 10 equal-sized subframes. Each subframe is divided into two slots of 0.5 ms duration [36]. Depending on the cyclic prefix (CP) used, slots in OFDMA have either 7 symbols for normal CP or 6 symbols for extended CP [37]. Fig. 4 illustrates an LTE downlink frame with normal CP. In the LTE frame, a resource element (RE) is the smallest modulation structure, occupying one 15 kHz subcarrier for one symbol [38].
Resource elements are grouped into a physical resource block (PRB), which has dimensions of 12 consecutive subcarriers by one slot (6 or 7 symbols). Therefore, one PRB has a bandwidth of 180 kHz (12 × 15 kHz). Different transmission bandwidths use different numbers of physical resource blocks (PRBs) per time slot (0.5 ms), as defined by 3GPP [39] and illustrated in Fig. 5. It is worth mentioning that for each transmit antenna there is one resource grid (50 PRBs for 10 MHz); therefore in 2 × 2 MIMO the previous data rate is doubled (100.8 Mbps) [41]. The transmission of user-plane data is achieved in the form of in-phase and quadrature (IQ) components that are sent via one CPRI physical link, where each IQ data flow represents the data of one carrier for one antenna, called an Antenna-Carrier (AxC) [42]. A number of parameters affect the data carried by an AxC [41]: Sampling frequency, which is calculated as the subcarrier bandwidth (15 kHz) times the FFT size. The FFT size is chosen to be the smallest power of 2 that is greater than the ratio of the radio signal bandwidth to the subcarrier bandwidth. For instance, if the radio bandwidth is 10 MHz, the FFT size is the smallest power of 2 greater than 666.67 (10 MHz / 15 kHz), which is 1024 (2^10). In this case the sampling frequency is 15 kHz × 1024 = 15.36 MHz. Using the same approach, the sampling frequency of a 20 MHz radio bandwidth system is 30.72 MHz. IQ sample width (M bits per sample): according to the CPRI specification, the IQ sample width supported by CPRI is between 4 and 20 bits per sample for each of I and Q in the uplink, and between 8 and 20 in the downlink [42]. For instance, with M = 15 bits per sample, one AxC contains 15 bits per sample for I and 15 bits per sample for Q, i.e., 30 (2 × M) bits per sample, transported in the sequence I0 Q0 I1 Q1 … I14 Q14. The IQ sample data rate can be calculated by multiplying the number of bits per sample by the sampling frequency.
For instance, for a radio bandwidth of 10 MHz (fs = 15.36 MHz) and 15-bit IQ samples (M = 15), the IQ data rate is 2 × 15 × 15.36 MHz = 460.8 Mbps. The CPRI data rate is designed around the Universal Mobile Telecommunications System (UMTS) chip rate [42], which is 3.84 Mchip/s [43,44]. Therefore, one basic CPRI frame is created every Tc = 260.416 ns (1/3.84 MHz), and this duration remains constant for all CPRI options and data rates. According to the CPRI specification [42], one basic CPRI frame consists of 16 words indexed W = 0…15, where the first word is reserved for control. The length of the frame word (T) depends on the CPRI line rate, as specified in [42]. Accordingly, the transmission of AxC data is expanded by a factor of 16/15 (15 payload words, 1 control and management word). In addition to the sampling rate fs calculated earlier, AxC data need to be line coded using either 8B/10B or 64B/66B. Putting these together, the CPRI data rate is the IQ sample rate (2 × M × fs) expanded by the 16/15 control-word overhead and the line-coding overhead (10/8 or 66/64). Finally, the ratio of the backhaul to the fronthaul data rate is calculated by dividing the backhaul data rate by the resulting CPRI data rate. Therefore, depending on coding, sampling, quantization, and other parameters, the baseband processing adds overheads to the backhaul traffic as it passes through the BBU.
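The chain of calculations above (peak LTE rate, sampling frequency, and CPRI line rate) can be sketched as follows. The helper names are ours; the backhaul-to-fronthaul ratio in (10) then follows by dividing the backhaul rate by the CPRI rate under the paper's own parameter choices.

```python
import math

SUBCARRIER_HZ = 15_000  # LTE subcarrier spacing

def peak_dl_rate_bps(n_prb=50, symbols_per_slot=7, bits_per_symbol=6):
    # bits in one antenna's resource grid per 0.5 ms slot (normal CP, 64QAM);
    # 2000 slots per second
    return n_prb * 12 * symbols_per_slot * bits_per_symbol * 2000

def sampling_frequency_hz(radio_bw_hz):
    # FFT size: smallest power of two above radio BW / subcarrier spacing
    fft = 2 ** math.ceil(math.log2(radio_bw_hz / SUBCARRIER_HZ))
    return SUBCARRIER_HZ * fft

def cpri_rate_bps(fs_hz, m_bits, coding="8b10b"):
    # per antenna-carrier (AxC): fs * 2M IQ bits, 16/15 control-word
    # expansion, then 8B/10B or 64B/66B line-coding overhead
    line = fs_hz * 2 * m_bits * 16 / 15
    return line * (10 / 8 if coding == "8b10b" else 66 / 64)

print(peak_dl_rate_bps() / 1e6)           # 50.4 Mbps; doubled to 100.8 for 2x2 MIMO
print(sampling_frequency_hz(10e6) / 1e6)  # 15.36 MHz (FFT size 1024)
print(sampling_frequency_hz(20e6) / 1e6)  # 30.72 MHz
print(cpri_rate_bps(15.36e6, 15) / 1e6)   # 614.4 Mbps (CPRI option-1 line rate)
```

The 614.4 Mbps result for M = 15 with 8B/10B coding matches the lowest standard CPRI line-rate option, which is a useful sanity check of the overhead factors.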
In this work the ratio (13.44%) calculated in (10) is used in our model, whilst the amount of workload in Giga Operations Per Second (GOPS) needed to process one user's traffic is obtained from relation (11), which is explained in [45], where wl is the baseband workload (in GOPS) needed to process one user's traffic, A is the number of antennas used, M is the number of modulation bits, C is the code rate, L is the number of MIMO layers, and R is the number of physical resource blocks allocated to the user.
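Relation (11) can be sketched as below. The closed form used here is one published version of the GOPS scaling law attributed to [45]; its coefficients are an assumption in this sketch and should be checked against [45] before reuse.

```python
def baseband_workload_gops(A, M, C, L, R):
    """Per-user baseband workload (assumed form of relation (11)):
        wl = (3A + A^2 + (M*C*L)/3) * R / 10
    A: antennas, M: modulation bits, C: code rate,
    L: MIMO layers, R: PRBs allocated to the user."""
    return (3 * A + A ** 2 + (M * C * L) / 3) * R / 10

# e.g. 2 antennas, 64QAM (M=6), code rate 1, 2 MIMO layers, 5 PRBs per user
print(baseband_workload_gops(A=2, M=6, C=1, L=2, R=5))
```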

IV. MILP MODEL
This section introduces the MILP model that has been developed to minimize the power consumption due to both processing by virtual machines (hosting servers) and the traffic flow through the network. As mentioned in the previous section, the MILP model considers an optical-based architecture with two types of VMs (BBUVMs and CNVMs) that can be accommodated in the ONU, OLT, and/or IP over WDM nodes, as in Fig. 6. The maximum numbers of VM-hosting servers considered were 1, 5, and 20 in the ONU, OLT, and IP over WDM nodes respectively, which is commensurate with each node's size, its potential location and hence space limitations, and the size of the exemplar network considered in the MILP. All VM-hosting servers were considered sleep-capable for the purpose of VM consolidation (bin packing). For a given request, the MILP model responds by selecting the optimum number of virtual machines and their locations so that the total power consumption is minimized.
The following indices, parameters, and variables are defined to represent the developed model: the traffic from hosting node h to an RRH node that traverses a given link in the network (Gb/s); the total traffic between a pair of nodes that traverses a given link (Gb/s); and the integer part of the total normalized workload at a node. The total power consumption is composed of: 1) the power consumption of the RRHs and ONUs; 2) the power consumption of the OLTs; 3) the power consumption of the IP over WDM network; and 4) the total power consumption of the VMs and hosting servers. The model objective is to minimize the total power consumption, subject to the following constraints: 1) Traffic from CNVM to BBUVM. 2) Traffic to RRH nodes. Constraint (12) represents the traffic from the CNVMs to the BBUVM in node h, scaled by a unitless quantity that represents the ratio of backhaul to fronthaul traffic. Note that this constraint allows a BBUVM to receive traffic from more than a single CNVM, which may occur, for example, in network slicing. Constraint (13) represents the traffic to the RRH nodes from all BBUVMs hosted in the hosting nodes. This enables an RRH to receive traffic from more than a single BBUVM (network slicing).
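The shape of the objective, choosing VM locations so that the sum of linear server power profiles is minimal, can be illustrated with a brute-force toy. The IP over WDM server figures (112 W idle, 365 W maximum, 368 GOPS) are the paper's (Section V); the ONU and OLT figures and the three 26-GOPS BBUVMs are invented for the sketch.

```python
from itertools import product

# Hypothetical hosting nodes: (idle W, max W, GOPS capacity).
# IPWDM values follow the paper; ONU/OLT values are illustrative only.
NODES = {"ONU": (15, 30, 60), "OLT": (50, 100, 200), "IPWDM": (112, 365, 368)}
WORKLOADS = [26, 26, 26]  # three BBUVMs of ~26 GOPS each

def placement_power(assignment):
    """Linear power model: idle + (max-idle)*utilization for nodes in use;
    sleep-capable servers with no VMs draw nothing."""
    load = {n: 0 for n in NODES}
    for wl, node in zip(WORKLOADS, assignment):
        load[node] += wl
    total = 0.0
    for n, (idle, peak, cap) in NODES.items():
        if load[n] == 0:
            continue             # server stays asleep
        if load[n] > cap:
            return float("inf")  # infeasible placement
        total += idle + (peak - idle) * load[n] / cap
    return total

best = min(product(NODES, repeat=len(WORKLOADS)), key=placement_power)
print(best, placement_power(best))
```

With these numbers the search consolidates all three VMs onto the OLT server, mirroring the bin-packing behaviour the MILP exhibits at low load.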
3) The served RRH nodes and the location of BBUVMs. Constraints (14) and (15) ensure that an RRH node is served by the BBUVM hosted at node h, as illustrated in Fig. 7. Constraints (16) and (17) determine the location of the BBUVMs; a large enough constant (big-M) ensures that the corresponding binary location variables are set to 1 when the traffic leaving node h is greater than zero. In constraint (16) there are two possibilities for the total traffic from h to the RRH nodes: it is either zero (no traffic from h to r) or greater than zero. When it is zero, the left-hand side of the inequality (the big-M constant times the traffic sum) is zero, and this sets the binary location variable to zero. When it is greater than zero, the left-hand side is much greater than 1 because of the large constant, and the binary variable may then be set to 1 or zero. Constraint (17) sets the second location variable in the same way. Table V illustrates the operation of constraints (16) and (17).
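The big-M logic that constraints (16) and (17) rely on can be checked directly. The pair below is the generic indicator linearization (variable names are ours, not the paper's symbols):

```python
BIG_M = 1e6   # "large enough" constant
EPS = 1e-9    # numerical tolerance

def indicator_holds(delta, traffic_sum):
    """Generic big-M indicator pair of the kind used in (16)-(17):
        traffic_sum <= BIG_M * delta        (traffic > 0 forces delta = 1)
        delta       <= BIG_M * traffic_sum  (traffic = 0 forces delta = 0)
    Returns True when the binary delta is consistent with the traffic."""
    return (traffic_sum <= BIG_M * delta + EPS
            and delta <= BIG_M * traffic_sum + EPS)

assert indicator_holds(1, 42.0)      # traffic present, delta = 1: consistent
assert not indicator_holds(0, 42.0)  # delta = 0 with traffic: first row violated
assert indicator_holds(0, 0.0)       # no traffic, delta = 0: consistent
assert not indicator_holds(1, 0.0)   # second row pins delta to 0 when traffic = 0
print("big-M indicator table verified")
```

This reproduces the truth table sketched in Table V: the binary location variable is 1 exactly when the node sources traffic.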

4) CNVM locations
Constraints (18) and (19) ensure that the BBUVMs at node h are served by the CNVMs hosted at the node. Constraints (20) and (21) determine the locations of the CNVMs by setting a binary variable to 1 if there is a CNVM hosted at the node, where the multiplying constant is a very small number. Fig. 8 illustrates the functions of constraints (20) and (21), whilst Table VI illustrates their operation. Constraints (22)-(24) ensure that the CNVMs communicate with each other if they are hosted at different nodes p and q; this is equivalent to a logical AND between the two hosting indicators. Fig. 9 illustrates the function of constraints (22)-(24).

5) Hosting any VM of any type

Constraints (25)-(27) determine whether hosting node h hosts any VM of any type (BBUVM or CNVM); this is equivalent to a logical OR over the two VM-type indicators at node h. Constraint (28) represents the traffic between the CNVMs at hosting nodes p and q. Constraint (29) represents the total traffic between any two hosting nodes caused by virtual machine communication. Constraint (30) represents the flow conservation of the total fronthaul traffic to the RRH nodes. Fig. 10 illustrates the principle of flow conservation and, for clarification purposes, it is applied to constraint (30). Constraint (31) represents the flow conservation of the total traffic between any two hosting nodes that might host virtual machines of any type (BBUVM or CNVM).

13) GPON link constraints
Constraint (32) represents the total BBU workload at any hosting node h. Constraint (33) calculates the total BBU and CNVM normalized workload at any hosting node. The workload is scaled and normalized relative to the server CPU workload and is separated into integer and fractional parts. Constraint (34) ensures that the total power consumption of hosting VMs does not exceed the maximum power consumption allocated to each host. Constraints (35)-(38) ensure that the total PON downlink traffic does not flow in the opposite direction.

14) Virtual link capacity of the IP over WDM network

Constraints (41) and (42) are the constraints of the physical links. Constraint (41) ensures that the total number of wavelength channels of the logical links traversing a physical link does not exceed the fiber capacity. Constraint (42) determines the number of wavelength channels in each physical link and ensures that it equals the total number of wavelength channels of the virtual links traversing that physical link. Constraint (43) determines the required number of aggregation ports in each IP over WDM router.
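A constraint-(41)-style capacity check can be sketched as below; the wavelengths-per-fibre figure is an assumption for illustration, not the paper's parameter.

```python
W_PER_FIBRE = 32  # assumed wavelengths per fibre (illustrative)

def link_feasible(routed_wavelengths, n_fibres):
    """Constraint-(41)-style check on one physical link: the wavelength
    channels of all virtual links routed over it must fit within the
    aggregate fibre capacity."""
    return sum(routed_wavelengths) <= n_fibres * W_PER_FIBRE

print(link_feasible([10, 12, 8], 1))  # 30 <= 32 -> True
print(link_feasible([20, 20], 1))     # 40 >  32 -> False
```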

V. MILP MODEL SETUP AND RESULTS
Five IP over WDM nodes are considered, constituting the optical backbone network of the proposed architecture. The distribution and topology of the IP over WDM nodes are built upon the NSFNET network described in [46][47][48][49][50][51]. Each IP over WDM node in turn is attached to two GPONs, with one OLT and two ONUs per GPON. Accordingly, the network topology has 10 OLTs and 20 ONUs. In addition, each ONU is connected to one RRH node, as shown in Fig. 11. Two GPONs per IP over WDM node are enough to investigate the VM response to demands and the power savings. To finalize the picture of the network topology, we have concentrated on the distribution of the hosting nodes and the way in which they are connected to each other; for this reason the GPON splitters are not shown. As alluded to earlier, two types of VMs are considered: BBUVMs, which realize the functions of the BBU, and CNVMs, which realize the functions of the mobile core network. The amount of workload needed by the BBUVMs is calculated in GOPS according to (11) [45], and based on the calculated workload, the hosting server CPU utilization due to hosting BBUVMs is determined. On the other hand, the total workload needed by the CNVMs is calculated based on the number of BBUVM groups in each hosting node, since we have allocated one CNVM to each group of BBUVMs in one hosting node. A single VM consumes around 18 W [52], and knowing the hosting server's maximum power consumption (365 W), idle power (112 W), and maximum workload (368 GOPS), the workload share Ψh of a single VM can be calculated as (18 × 368)/(365 − 112) ≈ 26 GOPS.
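The per-VM workload share Ψh quoted above follows directly from the linear server power model:

```python
def gops_per_vm(vm_power_w=18, max_gops=368, max_power_w=365, idle_power_w=112):
    """Workload share of one VM under a linear server power model:
    a VM drawing 18 W on a server whose dynamic power range is
    (365 - 112) W over 368 GOPS corresponds to 18*368/(365-112) GOPS.
    Default values are the paper's."""
    return vm_power_w * max_gops / (max_power_w - idle_power_w)

print(round(gops_per_vm()))  # ~26 GOPS, matching the paper
```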
We have investigated the effect of the inter-traffic between the CNVMs by considering a range of inter-traffic values relative to the total network traffic (0%, 1%, 5%, 10%, and 16% of the total traffic) flowing between CNVMs. Moving toward the access network, each RRH node is considered to serve a small cell that operates on a 10 MHz bandwidth with a maximum capacity of 10 users. Each user in the small cell is allocated 5 physical resource blocks (PRBs), as the users are assumed to request the same task from the network. Accordingly, the total downlink traffic to an RRH node depends on the total number of active users in the small cell. The input parameters of the developed MILP model are listed in Table VII. We have considered 17 time points across the day, from hour 0 to hour 24 in steps of 1.5 hours, using the average daily user-number profile shown in Fig. 12. The MILP results are compared with the case where there is no NFV deployment. In the "no virtualization" scenario, the BBU is located close to the RRH, where they are attached to each other, whilst the integrated platform ASR5000 is deployed to realize the mobile core network functionalities and is connected directly to the IP over WDM network. The ASR5000 maximum power consumption, idle power, and maximum capacity are 5760 W, 800 W, and 320 Gbps respectively [53], whilst the BBU maximum power consumption, idle power, and maximum capacity are 531 W, 51 W, and 9.8 Gbps respectively [54]. The results in Fig. 13 show the total power consumption of the case where no virtualization is deployed (standard model) as well as the cases where virtualization is deployed under different CNVMs inter-traffic levels, for different time slots in a day. Fig. 14 shows the total power consumption of the same scenarios versus the total number of active users in the network.
The virtualization model results in less power consumption than the no-virtualization (standard) model, as it optimizes the processing locations of the downlink traffic through optimum placement and consolidation of VMs. Compared to the other virtualization cases, virtualization without CNVMs inter-traffic saved a maximum of 38% (average 34%). This is because no power is consumed by CNVMs inter-traffic, as this traffic is zero. The total power saving decreases as the CNVMs inter-traffic increases, reaching its lowest value, 37% (average 32%), in the case of virtualization with 16% CNVMs inter-traffic.
Virtualization in the presence of CNVMs inter-traffic resulted in comparable values of total power consumption (and power saving) for all values of CNVMs inter-traffic greater than zero. The main reason is that the CNVMs inter-traffic produces a relatively small amount of power consumption compared to the power consumption induced by the fronthaul traffic and the hosting servers, as shown in Fig. 17. As the inter-traffic increases, the MILP model tends to eliminate its effect by consolidating the CNVMs in one place. Although virtualization has saved a maximum of 38% (without CNVMs inter-traffic) and 37% (with 16% CNVMs inter-traffic) of the total power consumption, it cannot provide such a level of power saving over the entire day. As the number of active users varies with the time of day (as in Fig. 12), the power saving achieved by virtualization varies accordingly. The results in Figs. 15 and 16 show that a high power saving is achieved when the total number of active users is around 20% (around 4 am to 8 am), while the lowest power saving is recorded at a high number of active users (during the daytime rush). At a small number of active users, the MILP model tends to consolidate all the VMs in the IP over WDM network to minimize the number of servers hosting VMs and so reduce the total power consumption.

Figs. 18 and 19 show the VM consolidation and distribution over the network at a low number of active users (13%) under CNVMs inter-traffic of 0% and 16%, respectively. At a low number of active users and 0% inter-traffic, the MILP model consolidates the VMs in the IP over WDM network. Since the total number of active users is low, the fronthaul traffic is relatively low, and consequently the power consumption induced by the fronthaul traffic is low compared to the hosting power consumption (server power). For this reason, the MILP model tends to pack BBUVMs into the IP over WDM network as much as possible to reduce the power consumed by the hosting servers. The MILP model also tends to host CNVMs close to the BBUVMs, as the inter-traffic between CNVMs is zero. Once the inter-traffic is greater than zero, the MILP model consolidates the CNVMs at one location, as in Fig. 19.
Figs. 20 and 21 show the VM consolidation and distribution over the network at a high number of active users (around 100%) under 0% and 16% CNVMs inter-traffic. When the number of active users is high, the amount of fronthaul traffic is high; for that reason the MILP model tends to distribute the BBUVMs at the closest centralized locations to the users, which are the OLTs, while the CNVMs inter-traffic has no effect on the distribution of the BBUVMs.
Hosting the BBUVMs in the OLTs when the number of users is high ensures shorter paths for the fronthaul traffic than hosting them in the IP over WDM network, and consequently the power consumed by this traffic is lower. For the CNVMs, the MILP model tends to distribute them close to the BBUVMs when there is no inter-traffic between them, as clearly seen in Fig. 20. In contrast, when the inter-traffic between CNVMs is greater than zero, the MILP model tends to centralize the CNVMs in the IP over WDM network to reduce the power consumption induced by the inter-traffic and the power of the hosting servers, as shown in Fig. 21.

A. Energy Efficient NFV with no CNVMs inter-traffic (EENFVnoITr) heuristic
The EENFVnoITr heuristic provides a real-time implementation of the MILP model without CNVMs inter-traffic. The pseudo-code of the heuristic is shown in Fig. 22. The network is modelled by sets of network elements NE and links L. The heuristic obtains the network topology G = (NE, L) and the physical topology of the IP over WDM network Gp = (N, Lp), where N is the set of IP over WDM nodes and Lp is the set of physical links. The total download request (fronthaul traffic) of each RRH node is calculated based on the total number of active users in each cell (RRH). The heuristic determines the amount of baseband workload needed to process each RRH download request. According to the baseband workload of each requested download and the available capacity of the hosting VM server, the EENFVnoITr heuristic chooses the closest place to accommodate a BBUVM in such a way that it serves as many RRH requests as possible. The EENFVnoITr heuristic may host a BBUVM in an OLT node if it has enough processing capacity to serve all the requests from the closest RRH nodes. In this way, the heuristic exploits bin-packing techniques to reduce the processing power consumption. The amount of fronthaul traffic delivered by each BBUVM determines the backhaul traffic flowing from each CNVM toward the BBUVMs. The EENFVnoITr heuristic determines the total amount of backhaul traffic that may flow from each IP over WDM node and sorts the nodes in descending order. The nodes at the top of the sorted list of IP over WDM nodes are the most highly recommended nodes to host CNVMs. In such a scenario, the EENFVnoITr heuristic ensures that less of the backhaul traffic flows in the IP over WDM network. The EENFVnoITr heuristic uses the sorted list to accommodate the CNVMs. Once the VMs are distributed and the logical traffic is routed, the EENFVnoITr heuristic obtains the physical graph Gp = (N, Lp) and determines the traffic in each network segment.
The IP over WDM network configuration, such as the number of fibers, router ports, and EDFAs, is then determined and the total power consumption is evaluated. The heuristic reduces the number of CNVM candidate locations by one, re-configures the IP over WDM network, and re-evaluates the power consumption to determine the best number and locations of CNVMs for minimum power consumption.
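The BBUVM packing step described above can be condensed into a first-fit sketch; the data structures and names are ours, simplified from the pseudo-code in Fig. 22 (which also covers routing and power evaluation, omitted here).

```python
def pack_bbuvms(rrh_workload_gops, hosts):
    """First-fit-by-proximity packing, in the spirit of EENFVnoITr:
    hosts is a list of (name, capacity_gops) ordered nearest-first;
    each RRH's baseband workload goes to the closest host with spare
    capacity, consolidating BBUVMs to reduce hosting power.
    Returns {rrh: host_name}."""
    placement = {}
    free = dict(hosts)  # remaining GOPS per host
    for rrh, wl in rrh_workload_gops.items():
        for name, _ in hosts:
            if free[name] >= wl:
                free[name] -= wl
                placement[rrh] = name
                break
        else:
            raise RuntimeError(f"no hosting capacity left for {rrh}")
    return placement

# Illustrative values only: two OLT hosts, one IP over WDM host
hosts = [("OLT1", 60), ("OLT2", 60), ("IPWDM1", 368)]
demand = {"RRH1": 26, "RRH2": 26, "RRH3": 26}
print(pack_bbuvms(demand, hosts))
```

With ~26 GOPS per BBUVM and 60 GOPS OLT hosts, the first two RRHs share OLT1 and the third spills to OLT2, mirroring the heuristic's consolidation behaviour.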

C. EENFVnoITr and EENFVwithITr heuristics results
In order to verify the results of the proposed MILP model, the network topology in Fig. 11 used for the MILP model is also used to evaluate the heuristics. All the parameters considered in the MILP model, such as the wireless bandwidth, the number of resource blocks per user, and the parameters in Table VII, are considered in the evaluation of both the EENFVnoITr and EENFVwithITr heuristics. The number of users allocated to each cell in the heuristics is the same as in the MILP model to ensure that the traffic requested by each RRH node is the same in all models. Fig. 24 compares the total power consumption of the MILP model with the EENFVnoITr heuristic at different times of the day when the CNVMs inter-traffic is not considered. It is clearly seen that there is a small difference in the total power consumption of the two models, and it varies over the day according to the total number of active users. The total power consumption of the MILP model is less than that of the EENFVnoITr heuristic, with a maximum of 9% (average 5%) drop in the total power consumption. This is mainly caused by the distribution of the CNVMs in the EENFVnoITr heuristic. As there is no traffic flowing between the CNVMs, EENFVnoITr accommodates them close to the BBUVMs wherever the VM servers have enough capacity. To accommodate the CNVMs, the heuristic sequentially examines the capacity of the VM servers in the OLT nodes that are close to the BBUVMs before investigating other servers in the IP over WDM network. Once the distance and capacity requirements of a VM server are met, the heuristic accommodates a CNVM in that server. This results in higher EENFVnoITr VM server power consumption compared with the MILP model, as clearly seen in Fig. 25, where the VM server power consumption of the MILP model and the EENFVnoITr heuristic are compared. The total network power consumption of both the EENFVnoITr heuristic and the MILP model is the same for most of the day. Fig.
26 shows the network power consumption of MILP model compared with EENFVnoITr heuristic. It shows that there is a small difference in the network power consumption between the two models during the time of the day when the total number of active users is low. This is driven by the approach of the MILP model where it tends to accommodate the CNVMs at the IP over WDM nodes rather than OLT at the time of the day where the total number of users is low. In contrast, the heuristic tends to accommodate the CNVMs wherever the VM server is close to the BBUVMs and it has enough capacity. Fig. 27 compares the total power consumption of EENFVwithITr with the MILP model when the CNVMs inter-traffic is 16% of the total backhaul traffic. It is clearly seen that there is a small difference in the total power consumption of the two models and this varies over the day according to the total number of active users. The total power consumption of the MILP model is less than the EENFVnoITr model with a maximum drop of 9.5% (average 5%) in the total power consumption. This is mainly driven by the distribution of both CNVMs and BBUVM over the network nodes. The MILP model tends to accommodate BBUVMs and CNVMs at the IP over WDM network during times of the day when there is a small number of active users.
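The sequential, proximity-first placement described above amounts to a first-fit policy over servers ordered by distance from the BBUVMs. A minimal sketch follows; the server and VM record fields are illustrative assumptions, not the paper's data structures.

```python
# Hedged sketch of the sequential CNVM placement used by the
# heuristics: examine VM servers in order of distance from the
# BBUVMs and place each CNVM on the first server with enough spare
# capacity. Field names ("node", "free_cpu", "cpu") are assumptions.

def place_cnvms(cnvms, servers, distance_to_bbuvm):
    """First-fit placement over servers sorted by BBUVM proximity."""
    placement = {}
    ordered = sorted(servers, key=lambda s: distance_to_bbuvm[s["node"]])
    for vm in cnvms:
        for srv in ordered:
            if srv["free_cpu"] >= vm["cpu"]:
                srv["free_cpu"] -= vm["cpu"]
                placement[vm["name"]] = srv["node"]
                break
    return placement
```

Because the first server meeting the distance and capacity requirements is always chosen, the heuristic can keep more VM servers partially loaded than the MILP model would, which is consistent with the higher VM server power consumption reported in Fig. 25.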
This causes more traffic from the BBUVMs and CNVMs to flow in the IP over WDM network, which increases the IP over WDM network power consumption, as shown in Fig. 28, which compares the IP over WDM network power consumption of the MILP model and the EENFVwithITr heuristic when the CNVMs inter-traffic is 16% of the total backhaul traffic. In contrast, the IP over WDM network power consumption of EENFVwithITr varies with the total number of active users during the day. The sequential examination by EENFVwithITr of the VM servers, their locations, and their available capacity increases the processing distribution of
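The IP over WDM power evaluation re-run at each heuristic iteration can be sketched as a sum over the main device classes. The per-device power figures below are placeholder assumptions, not the values from Table VII, and a fuller model would also account for transponders' reach, optical switches, and regenerators.

```python
# Illustrative sketch of the IP over WDM network power evaluation of
# the kind re-computed after each re-configuration step. The power
# figures p_port, p_transponder, and p_edfa (watts per device) are
# placeholder assumptions, not the paper's Table VII parameters.

def ip_over_wdm_power(router_ports, transponders, edfas,
                      p_port=825.0, p_transponder=167.0, p_edfa=55.0):
    """Total network power: router ports + transponders + EDFAs."""
    return (router_ports * p_port
            + transponders * p_transponder
            + edfas * p_edfa)
```

The heuristic keeps whichever CNVM configuration yields the lowest value of this sum, so moving VM traffic into the IP over WDM core (more ports and transponders) directly raises the evaluated network power, as Fig. 28 illustrates.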

VII. CONCLUSIONS
This paper has investigated network function virtualization in 5G mobile networks, considering the impact of the total number of active users in the network, the backhaul/fronthaul configurations, and the inter-traffic between VMs. A MILP optimization model was developed with the objective of minimizing the total power consumption by optimizing the VMs locations and the VM servers' utilization. The MILP model results were investigated under variation in the CNVMs traffic and in the total number of active users at different times of the day. The results show that virtualization can save up to 38% (average 34%) of the total power consumption; they also reveal how the total number of active users affects the BBUVMs distribution, while the CNVMs distribution is affected mainly by the inter-traffic between them. For real-time implementation, this paper has introduced two heuristics: Energy Efficient NFV without CNVMs inter-traffic (EENFVnoITr) and Energy Efficient NFV with CNVMs inter-traffic (EENFVwithITr). The results obtained with the heuristics were compared with the MILP model results; the comparison showed that the total power consumption of the heuristics is higher than that of the MILP optimization model by a maximum of 9% (average 5%).

From 2005 to 2009 he was a mobile core network senior engineer and a short message system, intelligent network, PSTN, and billing system team leader at ZTE Corporation for Telecommunication, Iraq branch. His current research interests include energy efficiency in optical and wireless networks, NFV, mobile networks, 5G networks, content caching, cloud computing, and the Internet of Things.
GreenTouch Wired, Core and Access Networks Working Group, an adviser to the Commonwealth Scholarship Commission, a member of the Royal Society International Joint Projects Panel, and a member of the Engineering and Physical Sciences Research Council (EPSRC) College. He has been awarded in excess of £22 million in grants to date from EPSRC, the EU, and industry, and has held prestigious fellowships funded by the Royal Society and by BT. He was an IEEE ComSoc Distinguished Lecturer (2013-2016).