Globally Optimal Resource Allocation and Time Scheduling in Downlink Cognitive CRAN Favoring Big Data Requests

This paper is concerned with a cognitive cloud radio access network (CRAN) with a special attention to efficient and reliable transmission of big data. In particular, the paper focuses on optimizing performance of secondary users (SUs) in the downlink. Existing approaches either try to maximize the number of accepted SUs or the sum data rate of accepted SUs. The first approach unfairly favors users with small data requests, whereas the second approach allocates most resources to users with better channel conditions. In contrast, this paper develops a novel approach that favors big data requests while simultaneously maintaining a certain degree of fairness among SUs. To this end, we first introduce a novel objective function that allows us to jointly optimize deadline-aware time scheduling, spectrum allocation, SU selection, and remote radio head (RRH) allocation for SUs. Second, we demonstrate that finding the global optimum solution for the formulated problem entails the enumeration of all colorful independent sets on a generalized interval graph which is known to be NP-hard. Third, we propose a dynamic programming (DP) approach which yields the global optimum solution at a reduced computational cost. Fourth, we analyze the complexity of the proposed DP approach and numerically compare its performance against existing baseline algorithms. Numerical and simulation results revealed that our solution favors big data users while incurring only a small degradation in the fairness index. Our proposed solution is practical for small-to-medium size networks. Furthermore, it offers a benchmark against which any new sub-optimal low-complexity algorithm can be compared to determine how far from the global optimum its performance is.


I. INTRODUCTION
TREMENDOUS increases in both the number of mobile devices and their data demand in the last few years [1] have led to many new challenges in the design of modern wireless communication networks. Efficient utilization of time, frequency, and spatial resources is imperative in providing efficient wireless communication of big data. The fifth-generation (5G) [2], [3] and sixth-generation cellular networks [4] are expected to support both licensed primary users (PUs) and unlicensed secondary users (SUs), as well as much higher data rates [5], [6]. The first commercially developed version of 5G technology is being designed based on specifications published by 3GPP [7], which put a special focus on unlicensed users and spectrum bands [8]. Cisco's annual internet report emphasizes that more than 10% of the wireless connections will be on 5G by 2023 [1]. Unsurprisingly, current mobile network operators, such as AT&T, Vodafone, and NTT DoCoMo, are evolving from 2G/3G/4G to 5G and 6G [9], [10].
VOLUME 4, 2016
The term "big data" is characterized by five big V features: Volume, Velocity, Variety, Veracity and Value [11], [12]. Here, Volume explains the size of data (in gigabytes) generated per time unit. Velocity explains how fast data is generated, and Variety refers to various data types which are generated. Veracity explains the accuracy and quality of the generated data. Lastly, Value refers to the priority or usefulness of data.
To serve the increasing number of mobile devices and handle big data, a massive number of micro/macro base stations and remote radio heads (RRHs) could be employed, which causes densification of wireless networks. On the other hand, the cloud radio access network (CRAN) [13] has been introduced as a revolutionary redesign of the centralized cellular architecture [14], which integrates existing radio access networks to achieve feasible, flexible and scalable solutions [15]. A CRAN consists of three basic components: RRHs, wireless/wired fronthaul links with low latency and high bandwidth [16], [17], and a central baseband processing unit (BBU) pool cloud. Using technologies such as software-defined networks [18], CRAN detaches the BBUs from traditional base stations and integrates a large number of them in the BBU pool. As a result, a CRAN enables cross-layer design, efficient cloud-based resource allocation, optimal central routing and controlling, and load equalization in the network. A use case of CRAN can be seen in resource sharing among several service providers. This helps to reduce the cost of establishing network infrastructure for each operator. In general, service providers rent resources from a network to support data transmission [19]. With such an arrangement, unlike conventional wireless networks that usually have large idle times, BBU pool cloud-based resource allocation can be designed to improve the overall efficiency for big data transmission. In the next subsection, we provide a brief overview of the existing literature on this topic and point to its limitations. Then, we enumerate the contributions of this work.

A. RELATED WORKS
Our proposed scheduling and resource allocation algorithm will answer the following three questions: (1) Which SUs should be scheduled, and when, for optimal utilization of available resources? (2) How can big data requests be handled and prioritized over small data demands while ensuring a certain degree of fairness? (3) How can efficient algorithms be developed that address these issues with both satisfactory performance and complexity? Focusing on these questions, we categorize related works into two parts: (1) works that investigate RRH and spectrum allocation, user selection, and scheduling, and (2) works related to big data transmission. A summary of prior art and the topics they address is provided in Table 1.

1) RRH and Spectrum Allocation, User Selection, and Scheduling
The authors in [20] consider constant transmit power during each radio resource block and assign users to these resource blocks and RRHs. The work done in [20] is then extended in [21] to perform joint user scheduling and power allocation by maximizing the overall CRAN data rate. Then, in [22], a low computational complexity graph-based approach is developed to construct a power allocation graph that is responsible for transmitted frame synchronization, power level control and user scheduling. Subsequently, in [23], a multiple CRAN framework is considered for a hybrid scheduling with a constraint that users are assigned to a single or multiple RRHs of a single cloud. In [24], the authors solve the joint user scheduling and beamforming problem to maximize the overall CRAN data rate. In a similar framework, the work in [25] presents a greedy algorithm for a coordinated interference aware scheduling for the downlink of a CRAN in order to maximize the overall CRAN throughput.
The main goal in [26]-[28] is to minimize the delay in the CRAN. In [26], a queuing model is established to minimize the mean response delay and power by using joint BBU allocation and load scheduling. A two-phase interference aware scheduling for a CRAN is investigated in [27]. First, all users are grouped into clusters based on estimated delay cost function and interference levels. Then, channels are matched to user groups to minimize the sum delay of the system. Under the total power constraint, the authors in [28] minimize the traffic delay in a hierarchical CRAN by using a joint virtual machine scheduling and RRH allocation.
In [19] and [29], the overall throughput of a CRAN is maximized via resource allocation. Resource sharing in a CRAN with fronthaul constraints is studied in [19], where service providers rent radio resources to provide services. A threshold-based version is used to both manage interference between RRHs, and provide minimum resource requirements. A multi-scale global and local channel allocation and user association mechanism is performed at different time scales. Classification and machine learning are examined in [29], which led to a conclusion that a fairer scheduling should be done in a distributed manner.
RRH selection and resource allocation are studied in [30]-[33]. In [30], a traffic-aware RRH clustering algorithm is introduced for efficiently improving the QoS of a CRAN. Then, an optimal spectrum assignment algorithm is designed for clusters. In [31], RRH selection for coordinated multipoint transmission and resource allocation are jointly optimized to improve users' capacity performance in a CRAN. Furthermore, the optimal resource allocation is solved under fixed RRH selection. In [32], MISO downlink multi-cast is studied for a CRAN where an energy-aware joint scheduling and RRH selection for coordinated beamforming are investigated. The goal is to increase CRAN performance by considering interference management and energy efficiency.
Various works on CRANs' uplink and downlink resource sharing have been done in [33]-[36]. Joint resource assignment and interference cancellation in full-duplex CRAN are studied in [33] with the goal of optimizing the downlink capacity under transmit power constraints while guaranteeing QoS of the uplink. The work in [34] addresses joint downlink and uplink resource sharing in orthogonal frequency-division multiple access (OFDMA) CRAN. In order to maximize the overall throughput, cooperative RRH selection, scheduling, and power assignment with maximum power constraints are performed. The work in [35] designs a joint RRH association and user scheduling for OFDMA-based CRAN in both uplink and downlink.
The problems of user selection in CRANs are studied in [36]-[38]. In particular, the authors in [36] minimize the power consumption of full duplex CRANs by jointly optimizing the transmit powers of users, RRH set, beamforming vector and compression ratio of the fronthaul. To reduce the computational complexity, a two-level algorithm is employed. First, a user selection algorithm, which is based on minimizing the square error between the minimum SINR that satisfies QoS requirements and the achievable SINR, is performed to find the largest cluster of users that can be served. Second, power optimization is carried out. The work in [37] applies a generalized Stackelberg game to the problem of overall data rate maximization by joint distributed power allocation and centralized user association in heterogeneous CRANs with guaranteed QoS for users. A two-level approach is developed in [38] to guarantee the minimum data rate for users in the downlink of an ultra-dense CRAN. The coverage probability of a CRAN is maximized by joint frequency allocation and user clustering with a satisfactory computational complexity. In the first level, there is a new binary user clustering. This clustering reduces the complexity of spectrum allocation and guarantees a minimum transmission data rate for users. Then, a new graph-based algorithm is proposed for serving clusters by considering a relationship among them without extra calculations.
Scheduling of computing resources is considered in [39] and [40]. In [39], empirical information of computation energy is used to propose a model for energy consumption in CRANs. Based on this model, power and bandwidth are allocated to all users to meet their QoS. Then, the number of active processing units in the BBU pool is optimized to minimize energy consumption. In [40], the authors present a unified framework to improve performance by using joint CRAN resource scheduling and allocation of computational resources. To achieve this, the authors formulate and solve a stochastic problem for resource scheduling and handling variable length requests.
Deadline requirements are studied for data transmission over CRANs in [41] and [42]. In [41], a complexity-efficient resource scheduler is explored to increase the CRAN throughput and to meet the deadline requirements for sub-frames of RRHs. In [42], power-efficient RRH allocation and processor scheduling are studied for CRANs in which the minimum SINR, different transmission times and deadlines required for each user are guaranteed by the scheduling algorithm. Besides these works on CRANs, the earliest deadline first scheduling [43] is the basis for many other scheduling algorithms developed in recent papers [44], [45]. In [46], the authors implement a small-cell scheduler for a CRAN to improve its capacity, energy and spectral efficiency in 3D indoor environments.
While joint scheduling and resource allocation have been considered in the aforementioned references, none of them pays attention to efficient communication of big data. Next, we will provide a brief review of references that have focused on big data requests.

2) Big Data Transmission
The general interplay between big data and communication networks, as well as some open problems in the field, are elaborated in [6], [47]. The work in [48] utilizes channel state information statistics to obtain the distribution of energy consumed in big data transmissions. In [49], big data transmission requirements in the context of internet of things (IoT) are investigated in terms of expected delay, data length, link capacity and load. The authors introduce a scheduling and routing algorithm to provide a lower transmission time for big data and to improve the end user's experience. In [50], a scheduling policy is developed for real time video delivery as a specific case of big data transmission. However, their strategy is simply to selectively transmit those videos with fewer packets. Many-to-many wireless big data delivery is addressed in [51] by construction of a group communication structure. The authors of [52] consider security-aware resource allocation, e.g., processing resources needed to perform encryption for mobile social big data.
Most works mentioned above do not pose a specific optimization problem and just provide heuristic algorithms that can handle big data. On the other hand, the optimization problems formulated in [50] and [53] strive to maximize the number of accepted requests, which inherently favors smaller-sized data requests. Furthermore, these two references solve the formulated optimization problems using heuristic suboptimal algorithms. In contrast, we introduce in this paper a new objective function that can ensure a certain degree of fairness while favoring big data transmissions. We also propose an algorithm that can find the globally optimal solution for the formulated joint scheduling and resource allocation problem.

TABLE 1. Summary of prior art and the topics they address.

Topic | Related works
User scheduling/clustering | [20]-[25], [27], [29], [35]-[38]
Power allocation | [21], [22], [26], [34], [36], [37], [39]
BBU allocation | [26], [28], [39], [40]
Data transmission time | [26]-[28]
Spectrum allocation | [19], [27], [30], [38], [39]
RRH scheduling/clustering | [28], [30]-[36]
Throughput | [19], [29], [33], [34], [46]
Up and down link | [33]-[36]
Deadline based scheduling | [41]-[45]
Big data transmission | [6], [47]-[53]

B. CONTRIBUTIONS
In this paper, we consider a cognitive CRAN serving both PUs and SUs. Our first objective is to develop an efficient joint resource allocation algorithm in time, frequency (spectrum) and spatial dimensions for SUs. Second, we propose a mechanism to favor "big data" users as much as possible (to an extent determined by the network planner), while maintaining a certain degree of fairness in the network. To the best of our knowledge, there are no published works that consider these two aspects simultaneously. In this context, the contributions of our work are as follows:
• We introduce a new objective function that ensures a certain degree of fairness among SUs while favoring big data requests. The objective parameters can be selected to trade off between these two requirements. The allocated resources are time, spectrum and the RRHs to connect to.
• Since the computational complexity of both exhaustive search and the branch-and-bound method is prohibitive, we propose a novel dynamic programming (DP) approach to reduce the complexity as much as possible while reaching the global optimum.
• We establish a connection between the formulated resource allocation problem and the problem of finding colorful independent sets on a graph [54]. As a result, we show how to apply advanced methods that exist for the colorful independent set problem to solve our resource allocation problem with even lower complexity than the DP approach.
• Performance of the proposed DP resource allocation is numerically compared to that of two well-known heuristic resource scheduling algorithms and the superiority of the proposed method is thoroughly demonstrated under various criteria.

C. PAPER ORGANIZATION AND NOTATIONS
The remainder of this paper is organized as follows. Section II introduces the system model under consideration and describes various parameters. Section III formulates the optimization problem of interest. Section IV establishes the connection between our resource allocation/scheduling problem and the problem of finding colorful independent sets on a graph. Furthermore, our proposed DP approach is also developed in this section. The computational complexity of exhaustive search and that of the proposed algorithm are analyzed in Section V. Simulation results are presented in Section VI, and the conclusions are drawn in Section VII. We use calligraphic letters to represent various sets. Bold capital (non-capital) letters are used to denote matrices (vectors). Scalars are represented by non-bold and non-calligraphic letters.

II. SYSTEM MODEL
Fig. 1 illustrates the proposed CRAN system model, which is composed of a macro cell overlaid with several small cells to serve PUs and a set $\mathcal{U} := \{1, 2, \dots, U\}$ of $U$ randomly distributed SUs, respectively. We focus on data transmission to SUs in the downlink mode. Time is divided into equal-length time slots of duration $\Delta t$, whose value generally depends on the channel bandwidth. For instance, in the IEEE 802.11 family of standards $\Delta t = 9\ \mu\text{s}$ [55], whereas the values for $\Delta t$ in 5G are recommended in [56]. Time slot $t$ is the slot covering the time interval $[(t-1)\Delta t, t\Delta t)$. Both PUs and SUs utilize the same time-slot structure and are thus mutually synchronized.
Small cells include a set of $R$ RRHs (e.g., micro cell and pico cell RRHs), which are distributed in the service area at fixed locations. Each RRH can simultaneously serve at most $u_{\max}$ SUs. RRHs are connected to the BBU pool via high speed and low latency fronthaul links [17]. All RRHs and SUs have single antennas. Extension to the cases where each RRH and/or SU has multiple antennas shall be explored in future research. We assume that the BBU pool has perfect knowledge of large-scale and shadow fading parameters for all the SU-RRH links. However, it only knows the statistics of small-scale fading. In order to decode transmitted messages accurately, RRHs should have perfect knowledge of the channels connecting those SUs that are assigned to them. While accurate small-scale CSI values can be obtained via training at the beginning of each coherence time, it is out of the scope of this work as we only exploit small-scale fading statistics in our design.
The vacant spectrum bands for SUs are arranged in the spectrum pool [57] and this spectrum pool is divided into units of equal bandwidth $\Delta f$. Although techniques such as non-orthogonal multiple access (NOMA) and/or beamforming may be used to simultaneously assign a vacant spectrum unit to more than one SU, in this work we focus on the orthogonal association problem, where only one SU or PU can access a particular spectrum unit during each time slot. Each SU can use at most $s_{\max}$ spectrum units.
The (possibly multiple) service providers collect all the data requests that are sent by the SUs through RRHs, and then forward these requests to the BBU pool for resource allocation and scheduling. Our design aims to determine which spectrum units and RRHs should be assigned to each SU, and at what times. The objective is to satisfy QoS requirements for as many SUs as possible, while providing relative priority for big data transmission requests. Our design performs batch scheduling, which means that all requests over a certain time epoch are first collected and then jointly scheduled.

A. RESOURCE ALLOCATION AND SCHEDULING PARAMETERS
The resource allocation time epoch, $T$, is the number of time slots over which transmission requests from SUs are jointly served. During each resource allocation time epoch, the SUs that want to receive data send their requests to the service provider via a separate control channel. Upon joint resource allocation by the BBU pool, these SUs will be served in the next resource allocation epoch. For simplicity, we consider only one such epoch, i.e., a $T$ time-slot epoch, as the proposed procedure can be repeated for future resource allocation time epochs. Let $\mathcal{S} := \{1, 2, \dots, S\}$ denote the set of all spectrum units and let $\mathcal{S}^t$ denote its subset of vacant spectrum units at time slot $t$. Since PUs' spectrum activities are time dependent, $\mathcal{S}^t$ is a time-dependent set.
The SUs request various types of data. Thus, we set $L_u \times L$ to be the size of data requested by SU $u$, where $L_u$ is a positive integer that represents the number of data frames, and $L$ is the standard frame size (e.g., about 1500 bytes for Ethernet II and IEEE 802.3, or 2304 bytes for WLAN, and it could be larger for extended versions [55]). Some previous studies consider that service providers rent some resources from the CRAN entity [19]. Then, service providers decide by themselves the resources required by each of the SUs. Thus, a service provider should collect information about the size and QoS of the data demanded by each of the SUs, as well as CSI of the rented resources in order to perform resource allocation and scheduling for SUs. However, in practice, a service provider has no access to full information of both the CRAN and characteristics of the data (size, type, etc.) requested by the SUs subscribing to other service providers. This may lead to an unfair and suboptimal resource allocation in the network, especially when users are distributed non-uniformly in a service area [29]. In this work, by exploiting advantages of cloud processing of the BBU pool, we assign the duty of resource allocation and scheduling to the BBU pool. To this end, service providers communicate to the BBU pool the data requested by SU $u$, which is the quadruplet $(L_u \times L, T_u^s, T_u^w, \gamma_u)$. In this quadruplet, $T_u^s$ (in terms of the number of time slots) denotes the earliest time that user $u$ can begin receiving service. $T_u^s$ is usually set to zero as users can begin receiving service right upon request. However, it may be non-zero when user $u$ has a slow processor and a full buffer from past data it has received. Similarly, $T_u^w$ denotes the maximum satisfactory waiting time (in terms of the number of time slots) before the $u$th SU begins to receive data, and $\gamma_u$ is the minimum SNR required by SU $u$.
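The request quadruplet described above can be sketched as a small record type. This is a minimal illustration only: the class name `Request`, its field names, and the helper methods are hypothetical and are not part of the paper's notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """One SU request (L_u x L, T_u^s, T_u^w, gamma_u); all names are illustrative."""
    user: int            # SU index u
    num_frames: int      # L_u, number of data frames (positive integer)
    frame_bits: int      # L, standard frame size, expressed here in bits
    earliest_start: int  # T_u^s, earliest service start, in time slots
    max_wait: int        # T_u^w, maximum satisfactory waiting time, in time slots
    min_snr: float       # gamma_u, minimum required SNR (linear scale assumed)

    def __post_init__(self):
        if self.num_frames < 1:
            raise ValueError("L_u must be a positive integer")
        if not 0 <= self.earliest_start <= self.max_wait:
            raise ValueError("need 0 <= T_u^s <= T_u^w")

    @property
    def size_bits(self) -> int:
        """Total requested data volume L_u x L."""
        return self.num_frames * self.frame_bits

    def allowed_waits(self) -> range:
        """Waiting times (in slots) the scheduler may choose for this request."""
        return range(self.earliest_start, self.max_wait + 1)
```

For example, `Request(1, 100, 1500 * 8, 0, 3, 10.0)` would describe a 100-frame request of Ethernet-sized frames that may start after waiting between 0 and 3 slots.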
The service provider determines the pair $(T_u^w, \gamma_u)$ based on the type of the data requested by SU $u$. The BBU pool receives the quadruplet $(L_u \times L, T_u^s, T_u^w, \gamma_u)$ for all SUs and then performs joint scheduling using this information. Before proceeding to the next section, some definitions are given next.
• $\gamma_{r,u}$: This is the SNR seen by the receiver of SU $u$ when associated with RRH $r$. The data rate [bits/sec] of the wireless communication link between SU $u$ and the associated RRH $r$ depends on $\gamma_{r,u}$. We denote by $h_{r,u} \in \mathbb{C}$ the downlink channel coefficient of this link, which includes the effect of the RRH transmitter's antenna gain, the SU receiver's antenna gain, small-scale fading, large-scale fading, and shadowing. Therefore, $\gamma_{r,u}$ is given by (1), where $P_{r,u}$ is the transmit power of RRH $r$ to SU $u$, $\sigma^2$ denotes the power spectral density of background noise, and $\Gamma$ is the SNR gap, which represents the mismatch between the theoretical capacity at a specified SNR and the actual throughput that may be achieved in practice. It should be pointed out that (1) is independent of the spectrum band utilized, i.e., of $s \in \mathcal{S}$. This is because we assume a narrowband system, i.e., a system whose total bandwidth is smaller than the coherence bandwidth of the channel. Therefore, the system experiences frequency-flat fading and all spectrum bands observe the same channel gain¹. Since the BBU pool has access only to the statistics of small-scale fading, the ergodic rate is considered [58], [59].
Here, $\mathcal{R}_u^t$ and $\mathcal{S}_u^t$, respectively, represent the sets of RRHs and spectrum units assigned to user $u$ at time slot $t$, whereas $\mathcal{R}^T$ and $\mathcal{S}^T$ represent the $T$-fold Cartesian products of the sets $\mathcal{R}$ and $\mathcal{S}$, respectively. Thus, $\mathcal{R}_u$ and $\mathcal{S}_u$ denote the sets of RRHs and spectrum units allocated to SU $u$ over the whole resource allocation epoch $T$. Assuming maximum ratio transmission (MRT) of distributed downlink beamforming to every user, the transmission time for the data requested by SU $u$, denoted by $\tau_u(\mathcal{R}_u, \mathcal{S}_u)$, is obtained as an integer multiple of $\Delta t$ by using the ceiling function $\lceil x \rceil$, which maps $x$ to the smallest integer greater than or equal to $x$.
¹ Extension to wideband systems and frequency-selective fading is an interesting research direction.
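To make the role of the ceiling function concrete, the following sketch computes a transmission time in whole slots. The rate model is an assumption made purely for illustration and is not taken from the paper: the per-RRH SNRs are assumed to add under MRT, the SNR gap is assumed to be already absorbed into each SNR value, and each spectrum unit is assumed to contribute a Shannon-type rate of $\Delta f \log_2(1 + \text{SNR})$. The function name and parameters are hypothetical.

```python
import math


def transmission_slots(num_frames, frame_bits, snrs, num_spectrum_units,
                       delta_f_hz, delta_t_s):
    """Number of time slots needed to deliver num_frames * frame_bits bits.

    ASSUMPTIONS (illustrative, not the paper's rate expression):
      * snrs holds one linear SNR per assigned RRH and they add under MRT;
      * every spectrum unit carries delta_f_hz * log2(1 + combined SNR) bit/s.
    """
    combined_snr = sum(snrs)
    rate_bps = num_spectrum_units * delta_f_hz * math.log2(1.0 + combined_snr)
    if rate_bps <= 0:
        raise ValueError("no usable rate with the given resources")
    bits_per_slot = rate_bps * delta_t_s
    # The ceiling makes the transmission time an integer multiple of delta_t.
    return math.ceil(num_frames * frame_bits / bits_per_slot)
```

Under this assumed model, assigning more spectrum units or more RRHs raises the rate and therefore shortens the transmission time, which is the lever the scheduler uses when it favors large requests.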

III. PROBLEM FORMULATION
Before describing the objective function, we explain the temporal and resource constraints that should be satisfied by any feasible solution.

A. SCHEDULING AND RESOURCE ALLOCATION CONSTRAINTS
A service provider registers in the BBU pool the data requests of those SUs that arrived in the previous scheduling epoch, to be served in the next resource allocation epoch of length $T$. By scheduling, the BBU pool first decides on the acceptance or rejection of each request. If a request is accepted, SU $u$ may experience $t_u^w$ time slots of waiting time before it starts receiving data. For successful data transmission, $t_u^w$ should be in the range $[T_u^s, T_u^w]$. When requests of some SUs are rejected, these SUs repeat their requests in the next scheduling epoch. Below we provide definitions that are relevant to the problem formulation.
Definition 1: Each RRH $R_i$ maintains a service capacity of at most $u_{\max}$ users in every time slot. To model this constraint, we define $u_{\max}$ fictitious RRH units per real RRH and represent them by $\tilde{R}_{i,j}$. Here, $i$ denotes the real RRH to which $\tilde{R}_{i,j}$ refers and $j = 1, 2, \dots, u_{\max}$ denotes the fictitious RRH unit index. Every RRH unit may be assigned to only one SU in each time slot. The set of all RRH units in one epoch is denoted by $\tilde{\mathcal{R}} := \{\tilde{R}_{i,j}^t : i = 1, 2, \dots, R,\ j = 1, 2, \dots, u_{\max},\ t = 1, 2, \dots, T\}$. Furthermore, the set of RRH units assigned to user $u$ in time slot $t$ is represented by $\tilde{\mathcal{R}}_u^t$. Finally, the set of all RRH units assigned to user $u$ is given by $\tilde{\mathcal{R}}_u := \{\tilde{\mathcal{R}}_u^1, \tilde{\mathcal{R}}_u^2, \dots, \tilde{\mathcal{R}}_u^T\}$.
Definition 2: Let $I_u(t_u^w, \tilde{\mathcal{R}}_u, \mathcal{S}_u)$ denote the transmission time interval for data requested by SU $u$ with waiting time $t_u^w$ and $\mathcal{R}_u$, $\mathcal{S}_u$ as assigned resources. It is given as
$$I_u(t_u^w, \tilde{\mathcal{R}}_u, \mathcal{S}_u) := t_u^w + [0, \tau_u(\mathcal{R}_u, \mathcal{S}_u)].$$
Here, the scalar $t_u^w$ is added to every member of the set $[0, \tau_u(\mathcal{R}_u, \mathcal{S}_u)]$.
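The fictitious RRH units of Definition 1 can be enumerated directly. The sketch below is illustrative only; the triple representation `(i, j, t)` and both function names are hypothetical conveniences rather than the paper's notation.

```python
from itertools import product


def fictitious_rrh_units(num_rrhs, u_max, horizon):
    """All (i, j, t) triples: real RRH i, fictitious unit j, time slot t (1-based)."""
    return set(product(range(1, num_rrhs + 1),
                       range(1, u_max + 1),
                       range(1, horizon + 1)))


def units_of_rrh(units, i, t):
    """Fictitious units that belong to real RRH i in time slot t."""
    return {(ii, j, tt) for (ii, j, tt) in units if ii == i and tt == t}
```

Because each unit may serve only one SU per slot, the $u_{\max}$ units of a real RRH encode its per-slot service capacity without a separate counting constraint.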
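Definition 3: The set of all possible data transmission time intervals and resources for SU $u$ is denoted by $\mathcal{I}_u$; each of its elements is a tuple $(I_u, t_u^w, \tilde{\mathcal{R}}_u, \mathcal{S}_u)$.
Definition 4: The set of all possible transmission time intervals and resources for the data requests of $\mathcal{U}$ is $\bigcup_{u \in \mathcal{U}} \mathcal{I}_u$, which is the union of all $\mathcal{I}_u$s.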
Definition 5: Any set $\mathcal{I}$ that is a subset of all possible transmission time intervals and resources, i.e., $\mathcal{I} \subseteq \bigcup_{u \in \mathcal{U}} \mathcal{I}_u$, is a disjunctive set if it meets all of the following six resource and temporal constraints.
1) Each spectrum unit and each RRH unit is assigned to only a single SU in each time slot. (5)
2) The total number of spectrum units assigned to SUs is no larger than the spectrum pool capacity in each time slot. (6)
3) The total number of assigned spectrum units for each SU is limited by $s_{\max}$:
$$|\mathcal{S}_u^t| \leq s_{\max}, \quad \forall (t, u) \in (\{1, \dots, T\}, \mathcal{U}). \tag{7}$$
Note that, in general, $s_{\max}$ could be different for various requested data types.
4) In RRH assignment, the minimum required SNR is enforced. (8)
5) Every user is allocated either none or exactly one fictitious RRH unit of a real RRH:
$$\sum_{j=1}^{u_{\max}} \mathrm{Indicator}(\tilde{R}_{i,j}^t \in \tilde{\mathcal{R}}_u^t) \leq 1, \quad \forall t, \ \forall i = 1, \dots, R, \tag{9}$$
where we have used an indicator function that equals one if its logical argument is true and zero otherwise.
6) For all $(I_u, t_u^w, \tilde{\mathcal{R}}_u, \mathcal{S}_u) \in \mathcal{I}$, we have:
$$T_u^s \leq t_u^w \leq T_u^w. \tag{10}$$
If $\mathcal{I}$ is a disjunctive set, we write it as $\mathcal{I} \subseteq_D \bigcup_{u \in \mathcal{U}} \mathcal{I}_u$.
Constants $s_{\max}$ and $u_{\max}$ could be determined in an adaptive fashion based on the number of SUs, CSI, distribution of the SUs, and requested data types. Other constraints can also be added. For example, in Section VI (performance evaluation), to avoid starvation of certain users in the CRAN that could happen when some SUs consume all resources, we force each SU to occupy a minimum number of spectrum units.
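A feasibility check in the spirit of Definition 5 can be sketched as follows. Several simplifications are assumptions of this sketch, not statements from the paper: each SU keeps the same spectrum units and RRH units in every slot of its transmission, a transmission that waits $t_u^w$ slots occupies slots $t_u^w + 1, \dots, t_u^w + \tau_u$, constraint 2 is interpreted as "used spectrum units must be vacant in that slot", and the SNR constraint 4 is assumed to have been applied when the candidate assignments were generated. All names are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Assignment:
    user: int
    wait: int             # t_u^w, slots waited before transmission starts
    duration: int         # tau_u, transmission time in slots
    spectrum: frozenset   # spectrum unit indices, identical in every slot (simplification)
    rrh_units: frozenset  # (real RRH i, fictitious unit j) pairs

    def slots(self):
        return range(self.wait + 1, self.wait + self.duration + 1)


def is_disjunctive(assignments, vacant, s_max, windows, horizon):
    """vacant: slot -> set of vacant spectrum units; windows: user -> (T_s, T_w)."""
    users = [a.user for a in assignments]
    if len(set(users)) != len(users):            # each SU is scheduled at most once
        return False
    for a in assignments:
        t_s, t_w = windows[a.user]
        if not t_s <= a.wait <= t_w:             # constraint 6: waiting-time window
            return False
        if a.wait + a.duration > horizon:        # must finish within the epoch
            return False
        if len(a.spectrum) > s_max:              # constraint 3: at most s_max units
            return False
        real = [i for i, _ in a.rrh_units]
        if len(set(real)) != len(real):          # constraint 5: one unit per real RRH
            return False
        if any(not a.spectrum <= vacant[t] for t in a.slots()):
            return False                         # constraint 2 (as interpreted here)
    for a, b in combinations(assignments, 2):    # constraint 1: no shared resources
        if set(a.slots()) & set(b.slots()):
            if a.spectrum & b.spectrum or a.rrh_units & b.rrh_units:
                return False
    return True
```

Two assignments may overlap in time as long as they touch disjoint spectrum units and disjoint RRH units, which is exactly what allows several SUs to be served in parallel within one epoch.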

B. OPTIMUM SCHEDULING WITH BIG DATA PRIORITY
We define $\tau(\mathcal{I})$ as the maximum data transmission time among SUs that are accepted by $\mathcal{I}$; mathematically, it is written as (11). Note that the maximum operation in (11) is performed over accepted users only. At this point, the optimization problem in our proposed scheduling and resource allocation framework can be formally stated as the maximization in (12), which is carried out over all possible disjunctive sets. The objective function is defined in (13), where $\alpha_u$ represents the weight or priority assigned to user $u$.
As will be demonstrated later on, the proposed objective has the capability to simultaneously address both fairness and big data issues. By placing $\tau(\mathcal{I})$ in the denominator of the objective function, we favor small $\tau(\mathcal{I})$ values. This forces the maximum of the SUs' data transmission times to be as small as possible. Furthermore, a big volume of data leads to a greater data transmission time, hence $\tau(\mathcal{I})$ will be dominated by big data requests. By minimizing (11), the scheduling problem allocates more resources to big data requests. To further emphasize big data requests, the overall volume of transmitted data is placed in the numerator of the objective function, and $\alpha_u$ can be adjusted properly to reflect priorities for certain users. For example, $\alpha_u = L_u / \sum_{u \in \mathcal{U}} L_u$ assigns priorities directly proportional to $L_u$, which gives a higher priority to the request with the bigger data volume. The quantities $\alpha_u$ affect fairness in the CRAN as well. We will study the performance of the proposed algorithm for both $\alpha_u = 1$ and $\alpha_u = L_u / \sum_{u \in \mathcal{U}} L_u$. The BBU pool should jointly perform the selection of SUs, resource allocation and time scheduling to optimize (12), which is accomplished by finding the optimal $\mathcal{I}$, denoted by $\mathcal{I}^*$. Unfortunately, the size of the search space to exhaustively find the global optimum $\mathcal{I}^*$ is exponential in terms of the number of optimization variables. In the next section, we apply dynamic programming to reduce the computational burden as much as possible while guaranteeing to reach the globally optimum solution.
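The two weight choices, and the qualitative shape of the objective, can be illustrated with a short sketch. The function `objective` below is only one plausible reading of the verbal description (weighted transmitted volume in the numerator, $\tau(\mathcal{I})$ in the denominator); it is an assumption made for illustration and should not be taken as the exact form of (13). All names are hypothetical.

```python
def proportional_weights(frames):
    """alpha_u = L_u / sum of all L_u, for frames given as {user: L_u}."""
    total = sum(frames.values())
    return {u: n / total for u, n in frames.items()}


def objective(accepted, frames, tau, alpha):
    """ASSUMED form: sum of alpha_u * L_u over accepted SUs, divided by the
    largest transmission time tau_u among the accepted SUs."""
    if not accepted:
        return 0.0
    volume = sum(alpha[u] * frames[u] for u in accepted)
    return volume / max(tau[u] for u in accepted)


frames = {1: 10, 2: 30}              # L_u per user
tau = {1: 2, 2: 5}                   # transmission times in slots
uniform = {u: 1.0 for u in frames}   # alpha_u = 1
weighted = proportional_weights(frames)
```

With these toy numbers, the proportional weights give user 2 three times the weight of user 1, so accepting the larger request contributes far more to the numerator than accepting the smaller one.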

IV. DYNAMIC PROGRAMMING ALGORITHM
First, we describe a well-known problem which bears close resemblance to our optimization problem. Then, we point to the relevant references that have tackled this well-known problem. Building on these existing techniques, we develop a dynamic programming algorithm to solve our optimization problem. To clearly demonstrate how the proposed method works, a toy example is also presented.

A. SCHEDULING TASKS ON A SINGLE PROCESSOR
Suppose there are $n$ tasks which can be processed by a single processor. Each task can be completed only at certain time intervals. These intervals are given as part of the problem formulation. If two intervals for different tasks overlap, only one can be completed. The problem is to schedule these tasks so that a maximum number of tasks can be completed [60]. If a single completion interval is given per task, this problem can be solved optimally via greedy scheduling. However, it is well-known that if more than two possible processing intervals are given per task, this problem is NP-hard [54]. It has been shown that finding the global optimum solution for this problem amounts to determining the maximum colorful independent set on an interval graph. Below, we provide the definitions for these concepts. An interval graph has one vertex per interval and an edge between two vertices whenever their intervals overlap. An independent set is a set of vertices no two of which are joined by an edge. When every vertex carries a color, a set of vertices is colorful if no two of its vertices share the same color.
Returning to our task scheduling problem, two vertices are connected by an edge if their intervals overlap in time. Hence, we can simultaneously schedule those vertices that are not connected by an edge. If we assign the same color to all those vertices that correspond to possible intervals for processing a single task, then we want our set of scheduled intervals to be colorful. This is because each task needs to be processed only once. Hence, the problem of scheduling a maximum number of tasks amounts to finding the maximum colorful independent set of the corresponding interval graph. Although this problem is NP-hard, various advanced methods have been proposed to minimize the complexity of solving this problem (see [54], [61] for example). Here, we apply a simple dynamic programming approach to solve our problem. Yet, the methods in these references can be leveraged to obtain even more complexity-efficient optimal algorithms.
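The two regimes described above can be contrasted in a short sketch. The greedy routine is the classic earliest-finish-time rule for the case of one interval per task; the second routine is a plain exhaustive search over one optional interval per task, included only to illustrate what a maximum colorful independent set is, not as an efficient method. Intervals are assumed half-open, and all function names are hypothetical.

```python
from itertools import product


def overlaps(a, b):
    """Half-open intervals [start, end) overlap if each starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]


def greedy_single_interval(intervals):
    """One interval per task: repeatedly keep the interval that finishes first."""
    chosen = []
    for iv in sorted(intervals, key=lambda iv: iv[1]):
        if not chosen or iv[0] >= chosen[-1][1]:
            chosen.append(iv)
    return chosen


def max_colorful_independent_set(options):
    """options: task -> list of candidate intervals (the task is the color).
    Exhaustive search; exponential in the number of tasks."""
    tasks = list(options)
    best = {}
    for choice in product(*[[None] + list(options[t]) for t in tasks]):
        picked = {t: iv for t, iv in zip(tasks, choice) if iv is not None}
        ivs = list(picked.values())
        independent = all(not overlaps(ivs[i], ivs[j])
                          for i in range(len(ivs))
                          for j in range(i + 1, len(ivs)))
        if independent and len(picked) > len(best):
            best = picked
    return best
```

In the exhaustive routine, choosing at most one interval per task enforces colorfulness by construction, while the pairwise overlap test enforces independence.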

B. CRAN SCHEDULING AS A COLORFUL INDEPENDENT SET PROBLEM
To solve (12), we enumerate all possible disjunctive sets and then evaluate each one to obtain the set that maximizes our objective in (13). To this end, we first define a generalization of the interval graph.

Our equivalent interval graph $G$ is defined by the set of vertices $V$ and the set of edges $E$. Every $(I_u, t_u^w, R_u, S_u)$ represents a vertex in the graph. Two vertices $(I_u, t_u^w, R_u, S_u), (I_{u'}, t_{u'}^w, R_{u'}, S_{u'}) \in V$ are connected by an edge if they cannot be scheduled simultaneously. This amounts to an overlap in the time domain together with an overlap in either frequency or RRH unit resources, which defines the edge set $E$. Also, we define a coloring function $\varphi : V \to U$ such that $\varphi(I_u, t_u^w, R_u, S_u) = u$, so that every user is represented by a different color. Thus, any disjunctive set $I$ corresponds to an independent colorful subset of vertices in graph $G$ that can be scheduled simultaneously. Note that constraint (5) is captured by the edges between vertices, whereas constraints (6)-(8) are captured by the definition of the vertices. In contrast to the processor scheduling problem, which targets the maximum colorful independent set, our objective function in (13) is not necessarily maximized by the maximum colorful independent set. Hence, we have to enumerate all colorful independent sets of various sizes and evaluate them one by one to find the maximum. This task is carried out via dynamic programming.
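The graph construction described above can be sketched as follows. This is a simplified illustration under stated assumptions: the class `Vertex` and the helper names are hypothetical, and sharing any RRH is treated as a conflict, which corresponds to each RRH serving a single user at a time (the general RRH service limit is not modeled here).

```python
from dataclasses import dataclass
from itertools import combinations
from typing import FrozenSet, List, Sequence, Set, Tuple


@dataclass(frozen=True)
class Vertex:
    user: int                  # color of the vertex
    start: int                 # first time slot of the transmission
    end: int                   # last time slot (inclusive)
    rrhs: FrozenSet[int]       # allocated RRHs
    spectrum: FrozenSet[int]   # allocated spectrum units


def conflicts(a: Vertex, b: Vertex) -> bool:
    """Edge rule: overlap in time AND overlap in spectrum or RRHs."""
    time_overlap = a.start <= b.end and b.start <= a.end
    resource_overlap = bool(a.rrhs & b.rrhs) or bool(a.spectrum & b.spectrum)
    return time_overlap and resource_overlap


def build_edges(vertices: Sequence[Vertex]) -> Set[Tuple[int, int]]:
    """Return the edge set as index pairs (i, j) with i < j."""
    return {(i, j) for i, j in combinations(range(len(vertices)), 2)
            if conflicts(vertices[i], vertices[j])}


def is_colorful_independent(subset: List[int], vertices: Sequence[Vertex],
                            edges: Set[Tuple[int, int]]) -> bool:
    """Check that no color repeats and no two chosen vertices share an edge."""
    users = [vertices[i].user for i in subset]
    if len(users) != len(set(users)):
        return False
    return all((min(i, j), max(i, j)) not in edges
               for i, j in combinations(subset, 2))


# Hypothetical example with two users and four candidate schedules.
vertices = [
    Vertex(1, 1, 5, frozenset({1}), frozenset({1})),
    Vertex(1, 6, 10, frozenset({1}), frozenset({1})),
    Vertex(2, 4, 8, frozenset({2}), frozenset({1})),
    Vertex(2, 4, 8, frozenset({2}), frozenset({2})),
]
edges = build_edges(vertices)
```

Because two vertices that overlap in time may still be compatible when their resources are disjoint, this graph is more general than an ordinary interval graph.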

C. PROPOSED DP APPROACH
Let us define $A_\nu^t$ as the collection of all colorful independent sets of size $\nu$ whose time intervals end at or before $t$. Note that $\nu$ represents the number of users that are scheduled, and hence it is a non-negative integer between 0 and $U$. The variable $t$ is also a non-negative integer, taking a value between 0 and $T$. Based on the principle of optimality, the independent colorful sets of size $\nu$ may be calculated from the independent colorful sets of smaller size or with an earlier end time. This relationship is captured in the following theorem.

Theorem 1. Initially, set $A_0^t = \emptyset$ and $A_\nu^0 = \emptyset$. Then, $A_\nu^t$ can be determined iteratively: it consists of the sets in $A_\nu^{t-1}$ together with every set obtained by augmenting a set in $A_{\nu-1}^t$ with a vertex whose time interval ends at $t$, where the subscript $D$ in the update indicates that only disjunctive augmentations are accepted, i.e., colorfulness and independence are maintained.
Proof. We consider the following two cases separately: 1) Our generalized interval graph contains no vertex whose time interval ends in time slot $t$. In this case, $A_\nu^t = A_\nu^{t-1}$. 2) There exists a vertex whose time interval ends at $t$. This vertex can be added to any set in $A_{\nu-1}^t$ as long as the augmented set is still colorful and independent. These newly formed feasible schedules are added to those in $A_\nu^{t-1}$.
Algorithm 1
Input: $\{\langle L_u \times L, T_u^s, T_u^w, \gamma_u \rangle\ \forall u \in U\}$, $S^T$, $R^T$, and statistical CSI for RRHs-SUs.
Output: $A_\nu^T$, $\forall \nu \le U$: independent colorful sets.

Algorithm 2 (final steps)
Calculate $J(I)$
if $J(I) > \mathrm{Temp}$ then
    $I^* \leftarrow I$, $\mathrm{Temp} \leftarrow J(I)$
return $I^*$

Our proposed DP method in Theorem 1 is summarized as Algorithm 1. After all colorful independent sets are collected via Algorithm 1, Algorithm 2 evaluates the objective over all these sets and selects the one yielding the maximum value as the global optimum $I^*$. To further reduce complexity, the update in Theorem 1 need not be computed at every time slot, but only at those time slots in which a user's service is completed. This reduces the computational complexity, especially for big data requests that span many time slots. The computational complexities of Algorithm 1 and Algorithm 2 are analyzed in the next section. Before closing this section, we present a toy example to illustrate how Algorithm 1 works.
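The two-phase procedure described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the vertex model is reduced to a single pool of resource units, the function names are hypothetical, the objective is a stand-in for $J(I)$ (whose definition is given elsewhere in the paper), and the size-0 collection is seeded with the empty schedule so that sets of size one can be formed.

```python
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

# (user, start_slot, end_slot, resource_units): simplified, hypothetical vertex
Vertex = Tuple[int, int, int, FrozenSet[int]]


def compatible(a: Vertex, b: Vertex) -> bool:
    """True if a and b have different colors and can be scheduled together."""
    if a[0] == b[0]:
        return False
    time_overlap = a[1] <= b[2] and b[1] <= a[2]
    return not (time_overlap and (a[3] & b[3]))


def enumerate_sets(vertices: List[Vertex],
                   num_users: int) -> Dict[int, Set[FrozenSet[int]]]:
    """Phase 1: collect all colorful independent sets, grouped by size."""
    # Update only at slots where some candidate schedule ends.
    update_slots = sorted({v[2] for v in vertices})
    prev: Dict[int, Set[FrozenSet[int]]] = {0: {frozenset()}}
    for nu in range(1, num_users + 1):
        prev[nu] = set()
    for t in update_slots:
        ending = [i for i, v in enumerate(vertices) if v[2] == t]
        cur: Dict[int, Set[FrozenSet[int]]] = {0: {frozenset()}}
        for nu in range(1, num_users + 1):
            cur[nu] = set(prev[nu])          # sets that end before slot t
            for base in cur[nu - 1]:         # augment smaller sets
                for i in ending:
                    if i not in base and all(
                            compatible(vertices[i], vertices[j]) for j in base):
                        cur[nu].add(base | {i})
        prev = cur
    return prev


def select_best(vertices: List[Vertex],
                sets_by_size: Dict[int, Set[FrozenSet[int]]],
                objective: Callable[[List[Vertex]], float]
                ) -> Tuple[FrozenSet[int], float]:
    """Phase 2: evaluate the objective on every set and keep the maximum."""
    best, best_value = frozenset(), float("-inf")
    for sets in sets_by_size.values():
        for s in sets:
            value = objective([vertices[i] for i in sorted(s)])
            if value > best_value:
                best, best_value = s, value
    return best, best_value


# Hypothetical instance: two users, five candidate schedules.
vertices = [
    (1, 1, 5, frozenset({1})), (1, 6, 9, frozenset({1})),
    (2, 4, 8, frozenset({1})), (2, 4, 8, frozenset({2})),
    (2, 11, 12, frozenset({1})),
]


def occupied_units(schedule: List[Vertex]) -> float:
    """Stand-in objective: total slot-resource units that are served."""
    return float(sum((v[2] - v[1] + 1) * len(v[3]) for v in schedule))


sets_by_size = enumerate_sets(vertices, num_users=2)
best_set, best_value = select_best(vertices, sets_by_size, occupied_units)
```

Storing every set explicitly keeps the sketch close to the description, at the cost of memory that grows with the number of feasible schedules.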

D. A SIMPLE EXAMPLE
Consider a simple CRAN where the values of $U$, $S$, $R$, $s_{\max}$, and $u_{\max}$ are all set to two. The proposed scheduling interval begins at time slot one and ends at $T = 22$. Also, $T_1^s = 0$, $T_1^w = 1$, and $T_2^s = T_2^w = 2$. The spectrum availability for SUs over time, $S_s^t$, is given at the top of Table 2. We assume that the SNR constraint (8) is always met for any user-RRH pair. Since $u_{\max} = 2$ and there are exactly two users, the RRH service limit is always satisfied. Thus, it is optimal for both users to connect to both RRHs over the service time. Table 2 also shows all possible data transmission time intervals and resources for these two SUs, where $|I_1| = 6$ and $|I_2| = 3$. We use blue and red colors for SU 1 and SU 2, respectively. The corresponding colorful interval graph is presented in Fig. 2. In this graph, each vertex corresponds to one $(I_u, t_u^w, R_u, S_u)$ and is labeled by a number. As explained before, we only need to update the set $A_\nu^t$ at those time slots where a schedule ends. Therefore, we use 6 update slots, i.e., 5, 6, 9, 10, 18, and 21, instead of all 22 time slots. Table 3 illustrates $A_\nu^t$ for $\nu = 0, 1, 2$ and update slot $= 0, 1, \cdots, 6$. For example, $A_2^5$, which is the (2,5)th entry in the table, is formed according to the following steps: 1-The contents of $A_2^4$ in entry (2,4) are copied into $A_2^5$.
Ru max r ×max(T w u − T s u + 1) ). Overall, the complexity of Algorithm 1 is dominated by the last term. Algorithm 2 finds the optimal schedule among the independent colorful sets calculated by Algorithm 1, so its complexity is dominated by that of Algorithm 1. It can be observed that the complexity of the proposed DP is of order $O(U^2 \times T \times \max(T_u^w - T_u^s + 1))$ times the complexity of forming all possible candidates for a single $u$, given by $|I_u|$, if we assume $u_{\max}$ is close to one. In contrast, the complexity of exhaustive search equals the product of the complexities of forming schedules for each single user, i.e., $\prod_{u \in U} |I_u|$, which is larger by many orders of magnitude. Since the problem of interest is NP-hard, finding the global optimum solution entails exponential complexity. Nevertheless, the proposed DP substantially reduces this complexity compared with exhaustive search. As one of the main contributions of this work, reaching the global optimum becomes tractable for larger values of the system parameters than would be possible with exhaustive search. We note, however, that even more complexity-efficient algorithms could be found by leveraging prior works such as [54] and [61].

VI. SIMULATION RESULTS
The performance metrics for scheduling in CRAN include: (i) total data rate, (ii) maximum data transmission time, (iii) average waiting time, (iv) percentage of the scheduled SUs, and, most importantly, (v) percentage of the scheduled SUs that request big data. The normalized distance between the minimum and maximum requested data sizes is divided into 5 ranges, where each range is identified by its upper bound $\zeta$. We assume that the 5th range, which is identified by $\zeta = 1$ and contains the 20% largest requested data sizes, represents big data users. The Raj Jain fairness index [63] is used to measure the fairness of resource allocation among the scheduled SUs of each range.
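For reference, the Jain fairness index of $n$ non-negative values $x_1, \dots, x_n$ is $(\sum_i x_i)^2 / (n \sum_i x_i^2)$, which equals 1 when all values are equal and $1/n$ when a single value is non-zero. A minimal sketch is given below; the function name `jain_index` and the per-range percentages are hypothetical and are not results from this paper.

```python
from typing import Sequence


def jain_index(values: Sequence[float]) -> float:
    """Jain fairness index: (sum x)^2 / (n * sum x^2)."""
    total = sum(values)
    squares = sum(v * v for v in values)
    if squares == 0:
        raise ValueError("at least one value must be non-zero")
    return total * total / (len(values) * squares)


# Made-up percentages of scheduled SUs in each of the five size ranges.
per_range_scheduled = [0.90, 0.85, 0.80, 0.75, 0.70]
fairness = jain_index(per_range_scheduled)
```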
TABLE 2 (upper part). Spectrum availability $S_s^t$ for SUs over time.

t      0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
s = 1  0  1  1  1  1  1  1  1  1  1  1  0  0  0  0  1  1  1  1  1  1  1  0
s = 2  0  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  0  1  0

While the branch-and-bound method reaches the globally optimal solution, it does so at an exorbitant computational cost. Due to its impracticality, we did not compare the proposed algorithm with the branch-and-bound method. Instead, we considered two baseline algorithms for the cloud BBU based scheduling. The first is referred to as modified earliest deadline first scheduling (MEDFS), and the second is referred to as temporal resource demand-capacity ratio scheduling (TRDCS).

In MEDFS, we first allocate the RRH-spectrum resources to the SU with the earliest deadline [44], [45], [64]. Specifically, the nearest RRHs that meet the SNR constraint and the maximum possible number of spectrum units, $s_{\max}$, are allocated to that SU. Afterwards, we remove all other intervals of this SU and reject those SUs whose QoS requirements cannot be satisfied by the remaining CRAN resources. We repeat this process until all SUs are exhausted. When only one schedule per user is allowed, this algorithm reaches the global optimum. When more than two schedules per user exist, this algorithm accepts more than half of the number of users scheduled at the global optimum solution [65]. It is noteworthy that MEDFS strives to schedule as many users as possible, irrespective of their data types.

The idea behind TRDCS is to schedule users in those intervals where resources are abundant and the demand is low, which reduces the burden on heavily loaded intervals. To achieve this goal, a priority metric is defined for every possible schedule. This metric equals the ratio of the CRAN resource capacity in that interval to the overall demand of the SUs during the same interval. The overall demand is defined as the resource requirement of the investigated interval, excluding other intervals of the same SU, plus the demands of all intersecting parts of other SUs' intervals. For each other SU, only the interval with the maximum overlap is considered. We first accept the schedule with the maximum priority metric; then, we remove all other intervals of the selected SU and all intervals of other users that no longer meet the CRAN resource constraints. We iterate until all users are either scheduled or rejected.
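The TRDCS priority metric described above can be sketched as follows. This is a rough illustration under strong simplifying assumptions: resources are modeled as a single pool of units with a per-slot capacity, each candidate schedule requests a fixed number of units per slot, and all names (`Candidate`, `overlap_slots`, `priority`) and numbers are hypothetical.

```python
from typing import Dict, List, Tuple

# (user, start_slot, end_slot, units_per_slot): hypothetical candidate schedule
Candidate = Tuple[int, int, int, int]


def overlap_slots(a: Candidate, b: Candidate) -> int:
    """Number of time slots shared by two candidates (inclusive ends)."""
    return max(0, min(a[2], b[2]) - max(a[1], b[1]) + 1)


def priority(c: Candidate, candidates: List[Candidate],
             capacity: List[int]) -> float:
    """Capacity-to-demand ratio of candidate c over its own interval."""
    user, start, end, units = c
    cap = sum(capacity[start:end + 1])
    demand = units * (end - start + 1)
    # For every other user, keep only the candidate with the largest overlap.
    best: Dict[int, Tuple[int, int]] = {}  # user -> (overlap, overlapping demand)
    for other in candidates:
        if other[0] == user:
            continue
        ov = overlap_slots(c, other)
        if ov > best.get(other[0], (0, 0))[0]:
            best[other[0]] = (ov, ov * other[3])
    demand += sum(d for _, d in best.values())
    return cap / demand


# Capacity per time slot (index = slot) and three candidate schedules.
capacity = [0, 2, 2, 2, 2, 1, 1, 1, 1, 2, 2]
candidates = [(1, 1, 4, 1), (1, 5, 8, 1), (2, 3, 6, 1)]
scores = [priority(c, candidates, capacity) for c in candidates]
first_pick = max(range(len(candidates)), key=scores.__getitem__)
```

In a full TRDCS loop, the selected candidate would be accepted, the remaining candidates of the same user removed, and the metric recomputed on the reduced candidate list.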

A. SIMULATION SETUP
We consider a CRAN within a $1000 \times 1000$ m$^2$ area with multiple RRHs for serving SUs and a single BS for serving the PUs. The RRHs and SUs are uniformly and independently distributed within the square area. We assume that the SUs are either static or have low mobility, and therefore their positions remain almost constant during one scheduling epoch. As a result, large-scale fading and shadowing are almost constant in one scheduling epoch. The proposed DP algorithm does not require knowledge of the small-scale fading realizations; only their statistics are required. Once an RRH-user assignment is made by the algorithm, the exact small-scale fading can be obtained online via pilot-based methods to enable maximum ratio transmission beamforming by the RRHs. It is assumed that the capacity of the backhaul and fronthaul links is large enough to support the required data flow of all scheduled users simultaneously [66]. Infinite buffer size is assumed for both RRHs and users. SUs independently make data requests of size $L_u$ frames, which is uniformly distributed in the range $[2^8, 2^{16}]$. The maximum transmission power of each RRH is set to 33 dBm [62]. The background noise spectral density is $\sigma^2 = -168.60$ dBm/Hz [22]. Assuming an urban environment for the small cells, the distance-dependent path loss of the RRH-SU channels is given by $PL\,[\mathrm{dB}] = PL_\circ + 36.7 \log_{10} d_{r,u} - a_\circ$, where $d_{r,u} \ge 1.135$ m is the RRH-SU distance in meters. Furthermore, the operating frequency is 900 MHz, corresponding to a wavelength of $1/3$ m. The shadowing effect is modeled by a log-normal distribution with variance equal to 8 dB [67]. $PL_\circ$ is equal to 30.58 dB, and $a_\circ$ is a correction factor that accounts for the different antenna heights at RRHs and SUs.

The total bandwidth of the network is assumed to be 20 MHz, with a PU activity rate of 0.4 to 0.9 [68]. Correspondingly, the relative spectrum pool capacity, $\beta$, is in the range 0.1 to 0.6 of the total bandwidth. The free part of this bandwidth is partitioned into spectrum units with $\Delta f = 200$ kHz [69]. The spectrum units are occupied by PUs with an exponentially distributed dwell time with mean $10^3 \times \Delta t$. Note that $S^t$ can be formed by using this information. The values of the remaining simulation parameters are presented in Table 4.
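As a worked illustration of the link budget implied by these parameters, the sketch below computes the path loss and a mean SNR for one spectrum unit. It rests on assumptions that are not stated in the setup: the full RRH power is placed in the considered spectrum units, shadowing and small-scale fading are ignored, and the antenna correction $a_\circ$ is set to 0 dB because its value is not given here. The function names are hypothetical.

```python
import math

PL0_DB = 30.58               # reference path loss PL_o in dB
TX_POWER_DBM = 33.0          # maximum RRH transmission power
NOISE_DBM_PER_HZ = -168.60   # background noise spectral density
UNIT_BW_HZ = 200e3           # bandwidth of one spectrum unit


def path_loss_db(distance_m: float, antenna_correction_db: float = 0.0) -> float:
    """Distance-dependent path loss; valid for distances of at least 1.135 m."""
    if distance_m < 1.135:
        raise ValueError("model is defined for distances >= 1.135 m")
    return PL0_DB + 36.7 * math.log10(distance_m) - antenna_correction_db


def mean_snr_db(distance_m: float, units: int = 1,
                antenna_correction_db: float = 0.0) -> float:
    """Mean SNR in dB over `units` spectrum units (no shadowing or fading)."""
    noise_dbm = NOISE_DBM_PER_HZ + 10 * math.log10(units * UNIT_BW_HZ)
    received_dbm = TX_POWER_DBM - path_loss_db(distance_m, antenna_correction_db)
    return received_dbm - noise_dbm


loss_100m = path_loss_db(100.0)   # 30.58 + 36.7 * 2 = 103.98 dB
snr_100m = mean_snr_db(100.0)     # about 44.6 dB under the assumptions above
```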

B. SINGLE CRAN REALIZATION
An instance realization of the coverage area is shown in Fig. 3a for a CRAN with $U = 20$ and $R = 10$. To ensure a uniform spread of both RRHs and SUs across the square area, each pair of SUs is forced to maintain a minimum distance of $\sqrt{2} \times 1000/20$ m. Similarly, RRHs are forced to maintain a distance of at least $\sqrt{2} \times 1000/10$ m from each other. The minimum and maximum waiting times and the minimum and maximum possible data transmission times (corresponding to various schedules) for these SUs are shown in Fig. 3b when they make requests with the lengths shown in Fig. 3c. For illustration purposes, these times are scaled with different factors. The histogram of the number of possible transmission time intervals and resources (see (4)) is plotted versus SU indices in Fig. 3d.
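One simple way to generate such a realization is rejection sampling: draw uniform positions and discard any draw that violates the minimum distance. The sketch below assumes this method, which is not stated in the text; the function name `place_points` and the seeds are hypothetical.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, float]


def place_points(n: int, min_dist: float, side: float = 1000.0,
                 seed: int = 0, max_tries: int = 100_000) -> List[Point]:
    """Draw n uniform points in a square with a pairwise minimum distance."""
    rng = random.Random(seed)
    points: List[Point] = []
    tries = 0
    while len(points) < n:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place all points; relax min_dist")
        p = (rng.uniform(0.0, side), rng.uniform(0.0, side))
        # Keep the candidate only if it is far enough from all accepted points.
        if all(math.dist(p, q) >= min_dist for q in points):
            points.append(p)
    return points


su_min_dist = math.sqrt(2) * 1000 / 20    # about 70.7 m between SUs
rrh_min_dist = math.sqrt(2) * 1000 / 10   # about 141.4 m between RRHs
su_positions = place_points(20, su_min_dist, seed=0)
rrh_positions = place_points(10, rrh_min_dist, seed=1)
```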
It can be observed that some SUs have no scheduling option because of the constraints in (6) and (8). The possible minimum and maximum service times for these SUs are set to zero in Fig. 3b. They are therefore rejected right away and are not given as inputs to Algorithm 1. Fig. 4a shows the number of disjunctive subsets. Figs. 4b and 4c show the number of scheduled SUs and the total data rate, respectively. In this realization, the proposed DP approach performs better on both metrics.

C. MONTE CARLO SIMULATION RESULTS
The following simulation results are obtained by averaging over $10^4$ randomly generated CRAN realizations. The percentage of the scheduled SUs, the total data rate, the maximum transmission time, and the average waiting time for scheduled SUs in the CRAN are illustrated in Fig. 5 with respect to $\beta$, which represents the spectrum pool capacity. The results show that the proposed DP achieves a higher percentage of scheduled SUs and a higher total data rate across the different values of $\beta$. [...] experience lower waiting times as compared to the TRDCS algorithm. Higher values of $\alpha_u$ and lower values of $T_u^w$ could be used for SUs with real-time services to reduce their waiting times.

D. BIG DATA
Recall that the main motivation for proposing (12) is to favor and support big data transmission in the CRAN. In Fig. 6, the percentage of the scheduled SUs and the percentage of the allocated resources are shown with respect to $\zeta$, which classifies data sizes into one of five ranges. Curves are plotted for three cases: $\beta = 0.1$, 0.3, and 0.6. Fig. 6a shows that our proposed DP algorithm schedules more SUs with big data requests. Also, Fig. 6b demonstrates that our proposed approach allocates more resources to large data requests. This better performance is a direct consequence of using (11) in the objective function, and is achieved in addition to the higher data rate for the CRAN and the higher percentage of scheduled SUs (as studied in Subsection VI-C). Fig. 7 illustrates the results obtained by setting the priority weights of users to $\alpha_u = L_u / \sum_{u \in U} L_u$ to favor big data users even more. We observe that better performance for big data users is achieved. However, this improvement comes at the cost of reduced fairness among requests of different sizes. This is illustrated in Fig. 8, where the average Raj Jain fairness index of our proposed DP method is 96.10% and 82.17% for $\alpha_u = 1$ and $\alpha_u = L_u / \sum_{u \in U} L_u$, respectively. Overall, the proposed DP method achieves better fairness than the MEDFS method (with an average Raj Jain fairness index of 71.59%) and lower fairness than the TRDCS method (with an average Raj Jain fairness index of 97.82%).
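The size-proportional weights $\alpha_u = L_u / \sum_{u \in U} L_u$ can be computed directly from the request sizes. The short sketch below uses a hypothetical function name and made-up request sizes drawn from the range used in the simulation setup.

```python
from typing import List


def priority_weights(request_sizes: List[int]) -> List[float]:
    """Weights proportional to request size, normalized to sum to one."""
    total = sum(request_sizes)
    return [size / total for size in request_sizes]


# Made-up request sizes in frames, between 2^8 and 2^16.
sizes = [2**8, 2**10, 2**16]
weights = priority_weights(sizes)
```

With these weights, the largest request dominates the weighted objective, which matches the observed trade-off between favoring big data users and fairness.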

E. LONG TERM REALIZATION
A long-term realization of the CRAN is evaluated over 1 hour. We assume that SUs arrive at the CRAN according to a Poisson process with rate $\lambda_u$ [SUs/min], which is coupled with the data request rate. Also, we consider the rate $\lambda_s$ [spectrum units/min] for the Poisson process that models the arrival of spectrum units in the spectrum pool. To cover different conditions, SUs arrive at independent, uniformly distributed random locations in the CRAN.
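Arrivals of a Poisson process can be generated by accumulating exponentially distributed inter-arrival times. The following sketch shows this for SU arrivals over a one-hour horizon; the function name, the seed, and the chosen rate are illustrative assumptions.

```python
import random
from typing import List, Tuple


def poisson_arrival_times(rate_per_min: float, horizon_min: float,
                          seed: int = 0) -> List[float]:
    """Arrival times (in minutes) of a Poisson process on [0, horizon_min]."""
    rng = random.Random(seed)
    t = 0.0
    times: List[float] = []
    while True:
        # Inter-arrival times are exponential with mean 1 / rate.
        t += rng.expovariate(rate_per_min)
        if t > horizon_min:
            return times
        times.append(t)


# One hour of SU arrivals at 30 SUs/min, each at a uniform random location.
arrivals = poisson_arrival_times(rate_per_min=30.0, horizon_min=60.0)
loc_rng = random.Random(1)
locations: List[Tuple[float, float]] = [
    (loc_rng.uniform(0.0, 1000.0), loc_rng.uniform(0.0, 1000.0))
    for _ in arrivals
]
```

The same generator can be reused with rate $\lambda_s$ to model spectrum units arriving in the spectrum pool.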
Figs. 9a-c show the long-term performance of the different algorithms versus $\lambda_s$, with $\lambda_u = 30$ [SUs/min] and $\beta \ge 0.1$. Fig. 9a shows that our proposed DP algorithm achieves a higher percentage of scheduled SUs. When nonuniform priorities are incorporated in our approach, the percentage of scheduled SUs decreases, but it remains higher than those of the other two algorithms. Fig. 9b shows that, by using this priority flexibility, the BBU pool serves the highest percentage of SUs in the two upper deciles of data request sizes. Fig. 9c demonstrates that the BBU pool handles higher data flow rates using our proposed DP method.
Figs. 9d-f illustrate the long-term performance of the different algorithms versus $\lambda_u$, with $\lambda_s = 150$ [spectrum units/min]. In Fig. 9d, as $\lambda_u$ increases, all algorithms have more scheduling options. However, because of the fixed spectrum pool capacity, the percentage of scheduled SUs decreases with increasing $\lambda_u$ for all algorithms. Also, for $\lambda_u = 5$ and 10, the BBU pool has a large spectrum pool capacity, and hence the different algorithms produce the same results. Our proposed DP algorithm achieves a higher percentage of scheduled SUs for all values of $\lambda_u$. In Fig. 9e, the highest percentages of scheduled SUs in the two upper deciles of data request sizes are obtained, in decreasing order, by DP with $\alpha_u = L_u / \sum_{u \in U} L_u$, DP with $\alpha_u = 1$, TRDCS, and MEDFS. Finally, Fig. 9f shows that, by using the proposed DP algorithm, the BBU pool achieves higher data rates for all values of $\lambda_u$.

VII. CONCLUSIONS
We have formulated and solved a joint SU selection, RRH and spectrum allocation, and deadline-aware time scheduling problem for efficient downlink data transmission of SUs in a cognitive CRAN. The problem is formulated in such a way as to give a higher priority to big data requests. The problem was shown to be NP-hard. We then proposed a DP-based approach to reduce the complexity of finding the optimum solution. The proposed solution approach has two phases. In the first phase, all feasible solutions are obtained by Algorithm 1, and in the second phase, the optimum schedule is obtained by Algorithm 2. Simulation results demonstrate that significant improvements in total data rate, percentage of scheduled SUs, and big data transmission are achieved by the proposed algorithm in comparison with two baseline scheduling algorithms, referred to as MEDFS and TRDCS.

Iran University of Science and Technology, Tehran, Iran, in 2009 and 2012, respectively, where he is currently pursuing the Ph.D. degree with the School of Electrical Engineering. His current research interests include big data, information theory, channel coding, signal processing for communications, wireless networking, and cooperative communications. Mr. Bigdeli has served as a reviewer for several IEEE journals and major conferences.

BAHMAN ABOLHASSANI was born in Tehran, Iran. He received the B.S. degree from Iran University of Science and Technology (IUST), Tehran, and the M.S. and Ph.D. degrees from University of Saskatchewan, Saskatoon, SK, Canada, all in electrical engineering. He was an Instrumentation Engineer with the College of Water and Power Technology, Iranian Ministry of Energy, for three years. Then, he worked as a Communication System Engineer in a number of private and government companies. He joined the School of Electrical Engineering, IUST, where he is currently an Associate Professor. He served as the Dean of the School of Electrical Engineering and an Associate Dean for Research. He also served as a Sessional Lecturer at University of Saskatchewan. His research interests are in the fields of wireless communication systems, network planning, spread spectrum, cognitive radio networks, resource allocation, VANETs, and optimization of large systems.