SDN Assisted Codec, Path and Quality Selection for HTTP Adaptive Streaming

Adaptive streaming over HTTP has been the dominant video streaming technology for more than a decade. HTTP Adaptive Streaming (HAS) systems provide a framework that enables clients to adapt quality to network fluctuations during streaming and hence to optimize the perceived quality on the client side. Recently, network assistance has been integrated with HAS in order to improve underlying network conditions and to provide network-related information to the clients. The performance of HAS systems can be further enhanced if the characteristics of the streamed video are considered. In this paper, we propose a HAS system architecture where Software Defined Networking (SDN) technology is utilized to assist clients in selecting the most appropriate video codec and bitrate under the constraint of current network conditions, as well as to route the video packets over the appropriate paths. In the proposed architecture, layered video is used, where each additional layer improves the quality. The controller estimates the packet loss probability by taking video codec characteristics, the bitrates of the layers and network capacity into account. Based on these estimations, the controller selects the appropriate codec type and video quality for the clients and manages the network. Simulation results show that the performance of the video streaming architecture can be improved significantly when codec, quality and path selection are jointly considered and combined with the flexibility and advantages of SDN.


I. INTRODUCTION
Being one of the most popular application types used on the Internet, video streaming applications offer a wide range of usage scenarios, from live video streaming services to distribution of personal videos of users. Cisco's forecasting reports state that nearly half of all devices will be video capable in 2022 and that the percentage of video packets transferred over IP will have reached 82% by then [1]. While emerging network technologies such as 5G, Software Defined Networking (SDN) and Network Functions Virtualization (NFV) enable an infrastructure that provides high connectivity and low latency, the requirements of future multimedia applications have also been increasing. As well as maximizing the underlying bandwidth capacity, minimizing latency is very important for applications such as Augmented Reality/Virtual Reality (AR/VR) implementations or interactive multimedia systems.
The associate editor coordinating the review of this manuscript and approving it for publication was Chin-Feng Lai.
For almost a decade, HTTP Adaptive Streaming (HAS) has been a dominant technology for streaming video on the Internet. In HAS systems, the same content is encoded into more than one representation at different bitrates in order to enable smooth quality adaptation on the client side during streaming. While HTTP allows the use of web caches and relies on a reliable end-to-end transport infrastructure thanks to TCP, quality adaptation enables the selection of the optimal quality under the constraint of network conditions and client side parameters. In order to provide interoperability between HAS systems developed by different vendors, the Dynamic Adaptive Streaming over HTTP (DASH) standard was proposed by the MPEG working group [2]. DASH is codec and format agnostic and can be applied with any media format [3].
In HAS systems, the general approach to produce quality alternatives, i.e. representations, is to have non-layered encoded files by using a codec such as H.264 Advanced Video Coding (AVC). In this case, there is one encoded video file for each quality level. Another option for generating the alternative qualities of the same video file is to use a layered video codec such as Scalable Video Coding (SVC) or Multiple Description Coding (MDC). With layered coding, all quality levels can be obtained from a single encoded file. The use of layered video in a HAS system improves network bandwidth utilization and cache storage efficiency [4]. Although SVC has not been preferred in commercial systems until today due to its higher bitrate overhead compared to AVC, recent developments in video codec standards have paved the way for its use. The newest video codec standard, Versatile Video Coding (VVC/H.266), was finalized in July 2020 by the ITU-T VCEG and ISO/IEC MPEG groups. It is expected that layered multistream and scalability will be used in commercial systems thanks to H.266, which provides up to 50% bitrate saving over High Efficiency Video Coding (HEVC/H.265) [5]. SDN technology can be a good alternative for network operators that offer services to video streaming companies [6], [7]. Since it is possible to design application specific network solutions thanks to the decoupled data and control plane architecture of SDN, this technology can be used for increasing the performance of HAS applications. HAS clients have limited knowledge about network conditions, so they may suffer from video freezes or underutilization of available bandwidth and may not get the best possible video quality under the current conditions [3], [8], [9].
VOLUME 9, 2021 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Recently proposed approaches for enhancing the performance of video streaming applications have shifted toward network assistance, and SDN is one of the most preferred technologies in these systems [10], [11], [12].
The streaming paths used for transferring the video packets, quality adaptation techniques and video codec type jointly affect the performance of video streaming applications. Hence, these parameters should be carefully considered when designing a video streaming system architecture. In this study, we propose a video streaming system architecture that utilizes SDN technology to jointly determine video codec type, quality, and streaming paths. The advantages of different codecs according to their characteristics are considered in this study. In the proposed system, the SDN controller has some knowledge about HAS characteristics. It utilizes underlying network information and HAS-related knowledge to select the video codec type for the clients and the optimal number of layers that the clients should request. Although the SDN controller can collect network statistics in real time, providing a good level of performance is not an easy task due to the dynamic nature of HAS and network conditions, as well as different video codec characteristics. In this work, codec type, streaming paths, and optimal quality are selected by considering the latency at the application layer. For this purpose, the packet loss rate at the link layer is estimated, and codec and quality selections are made so that the delay of packets due to TCP retransmissions is minimized.
The SDN controller uses an optimization model to select video codec type, the number of layers, and streaming paths for each layer when a client joins the system. The controller also runs an event-based heuristic algorithm, which re-determines the number of layers and streaming paths with respect to the network fluctuations during the streaming session. Clients run a rate adaptation algorithm which interprets the recommendations sent by the controller, which are the outputs of the optimization and heuristic algorithms.
The contributions of this study can be listed as follows:
• We propose an approach that considers the characteristics of the video codecs and underlying network conditions. To the best of our knowledge, there has not been any previous study on the joint selection of the optimal number of video layers and streaming paths by taking into account video codec type and network conditions.
• We define formulas for estimating packet loss ratios at the link layer. These estimates take into account video layer dependencies and the effects of lost layer packets on other video layers.
• We utilize packet loss estimations for assigning the optimal number of video layers. For this purpose, we estimate the video packets that are expected to be delayed and select the layers so that the latency is minimized for live video streaming applications.
• We propose a new rate adaptation algorithm for HAS clients, which can interpret the SDN controller's recommendations and act accordingly. With the comparative performance results presented, we demonstrate the performance gain that can be achieved by (i) HAS aware network assistance and (ii) network assistance aware client implementation.
The rest of the paper is organized as follows: In Section II, background on SDN and HAS based systems with related works is given. The details of the proposed architecture, the heuristic algorithm, and the client implementation are presented in Section III. The comparative performance results are provided in Section IV. Finally, conclusions are given in Section V, which is followed by the reference list.

II. BACKGROUND AND RELATED WORKS
The conventional network architecture, which relies on a vertical design where data and control planes are bundled together, makes efficient network configuration a difficult and complex task. SDN technology, which separates control and data planes, overcomes this complexity and gives more agility to network functions [13]. It transforms the hardware and device-centric network architecture into a flexible, virtual, and programmable form that provides high agility and rapid innovation in network services. SDN is seen as a promising technology for future networks because its advantages and functionalities can meet the various demands and requirements of future Internet applications. The main functionalities of SDN rely on the communication between forwarding devices and the controller via an open standard interface. The OpenFlow protocol [14] is the most popular communication protocol for exchanging messages between the switches and the controller.

2) HTTP ADAPTIVE STREAMING CHARACTERISTICS
In the architecture of HAS applications, a video file is encoded at various bitrates, which produces encoded video files at various qualities. The encoded video files are partitioned into N chunks known as segments, with each segment carrying t seconds of the video. Encoded video files are called representations. A manifest file, which is called the Media Presentation Description (MPD), keeps the information about the media content, such as the bitrates of the representations and the URLs of the segments. The clients download this file at the beginning of the streaming session and use the information in it to request the selected segments.
Quality adaptation is provided by the selection of segments with different qualities over time. For this purpose, the client runs a rate adaptation algorithm which determines the quality of the segments to be requested. Rate adaptation algorithms can be classified into two main classes with respect to the criterion that is used for the selection of the quality: Throughput Based Adaptation (TBA) [15], [16], and Buffer Based Adaptation (BBA) [17]-[19]. Besides TBA and BBA, hybrid models [20], [21] have also been proposed. Recent studies in this area focus on adapting the playback speed jointly with the quality [22], [23].
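The two classes can be contrasted with a minimal Python sketch. It is illustrative only and does not reproduce any of the cited algorithms; the function names and the `safety`, `reservoir` and `cushion` parameters are hypothetical choices made for this example.

```python
def throughput_based(bitrates, measured_throughput, safety=0.9):
    """TBA-style choice: highest bitrate within a fraction of the measured throughput.

    `safety` is a hypothetical margin, not a value taken from the paper.
    """
    budget = measured_throughput * safety
    feasible = [b for b in sorted(bitrates) if b <= budget]
    # Fall back to the lowest representation when nothing fits the budget.
    return feasible[-1] if feasible else min(bitrates)


def buffer_based(bitrates, buffer_level, reservoir=5.0, cushion=20.0):
    """BBA-style choice: map buffer occupancy (seconds) onto the bitrate ladder.

    `reservoir` and `cushion` are hypothetical thresholds in seconds.
    """
    ladder = sorted(bitrates)
    if buffer_level <= reservoir:
        return ladder[0]
    if buffer_level >= reservoir + cushion:
        return ladder[-1]
    fraction = (buffer_level - reservoir) / cushion
    return ladder[int(fraction * (len(ladder) - 1))]
```

A throughput based client reacts to the measured download rate, whereas a buffer based client ignores the rate and reacts only to how much video is already buffered.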
In addition to architectures based on non-layered video codecs, several HAS architectures proposed in the literature use a scalable video codec such as H.264 SVC. SVC is an extension of the ISO/ITU advanced video coding (H.264/AVC) standard, which produces video layers by using spatial, temporal and Signal to Noise Ratio (SNR) scalability [24]. With scalable video coding, the video files encoded at different bitrates consist of a base layer and one or more enhancement layers. While the base layer has the lowest bitrate and the lowest quality, each additional enhancement layer increases the bitrate and consequently improves the quality of the encoded video. The base layer is the most important layer and should never be lost, because the base layer and all previous enhancement layers are required in order to decode an enhancement layer.
Like SVC, MDC encodes the video into layers, which are called descriptions [25], and each description received by the client improves the video quality. However, in contrast to SVC, the descriptions are self-decodable: each one carries the basic information needed to decode the video independently. This characteristic of MDC causes extra overhead compared to SVC, since each description contains redundant information so that it can be decoded without the other descriptions. Thus, MDC is more tolerant of loss than SVC, but SVC provides better compression.
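The difference in layer dependency can be made concrete with a small Python sketch. The function name `playable_layers` and the boolean-list representation are assumptions made for illustration.

```python
def playable_layers(received, codec):
    """Count the layers that can actually be decoded.

    `received` is a list of booleans; index 0 is the SVC base layer
    or the first MDC description.
    """
    if codec == "SVC":
        # An SVC layer is decodable only if every lower layer arrived.
        count = 0
        for arrived in received:
            if not arrived:
                break
            count += 1
        return count
    if codec == "MDC":
        # Every MDC description is decodable on its own.
        return sum(1 for arrived in received if arrived)
    raise ValueError(f"unknown codec: {codec}")
```

For the same reception pattern, for example the second of three layers missing, the SVC client can play one layer while the MDC client can play two.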
The advantages of layered video codecs have led some researchers to design streaming systems that use layered video over emerging network architectures such as SDN [26], NFV [27], 5G networks [28] and mobile edge computing [29], and to propose the use of layered video in future video applications such as AR/VR. The authors of MS-Stream [30] present an effective solution for enhancing the perceived quality for DASH clients by using multiple descriptions sent by multiple sources. They showed that MDC increased the performance of DASH applications. In [31], the authors presented a cost-effective DASH system utilizing MS-Stream. In AR/VR applications, when the users move their heads and change the viewport, the new viewport data should arrive quickly in order to prevent motion sickness. In [32], the authors propose to use layered video in a DASH based multicasting system in order to minimize latency in AR/VR applications. In the next section, we review the literature on layered video streaming over SDN.

B. HAS ARCHITECTURES BASED ON SDN AND SAND
HAS has a client-driven architecture, so clients have limited information about network conditions and about the behavior of other clients. Also, service providers do not have control over the clients' behavior. Therefore, they may not be able to guarantee a high level of service quality with client based adaptation. Since the performance of multimedia systems in terms of Quality of Experience (QoE) mostly relies on the underlying network conditions and on server characteristics such as availability and distance, there is a tendency to get assistance from the network to overcome the limitations of client based architectures. SDN is a good alternative to enhance the performance of video streaming applications and to provide network support to HAS systems.
There are many studies in the literature that propose video streaming services over SDN. In one group of studies, SDN is utilized to select suitable streaming paths for transferring video packets from the server to the clients, thanks to the flexibility of SDN in determining flow routes [33]-[36]. Network resource and bandwidth allocation strategies that consider parameters specific to HAS systems have also been studied widely [10], [37]-[39]. The path assignment and resource allocation approaches proposed in the literature do not consider any strategy related to layered video characteristics.
In the studies given above, there is no communication between the controller and the video streaming clients. In another group of studies that utilize SDN technology, several approaches have been proposed where the controller explicitly signals the clients to assist quality adaptation in order to increase the performance of video streaming applications [40]-[43]. These studies mostly focus on the quality selection process; none of them proposes a client rate adaptation algorithm that is aware of an SDN module or an SDN based architecture. Furthermore, layered video transmission is not addressed in any of them.
The MPEG group has been working on the Server and Network Assisted DASH (SAND) proposal [44]. In order to provide centralized control of quality adaptation for network and service providers, SAND introduces an architecture that offers asynchronous network-to-client and network-to-network communication. Network elements receive QoE metrics from the clients and return network feedback measurements, which can be used by clients in their adaptation algorithms. In the SAND architecture, there are DASH-Aware Network Elements (DANEs), which are the components that have knowledge of DASH characteristics. They can be used to optimize network resources for DASH video traffic in order to improve the user's QoE. SAND also defines control messages between a DANE and clients, and between multiple DANEs. SDN technology can be utilized in the SAND architecture [45]. DASH clients may directly select the quality recommended by an SDN controller [46], use bandwidth information sent by an SDN controller [47] or connect to virtualized DANEs that are managed by an SDN controller [27], [48]. None of these SAND related studies focuses on video codec characteristics.
Layered video streaming over SDN has been investigated in several studies in the literature. An OpenFlow based system for streaming SVC video between multiple servers and multiple clients is specified in [49]. Transferring the base layer and enhancement layers over different streaming paths by using UDP is proposed in [26], [50]-[54]. These studies focus on transferring SVC layers over selected streaming paths and determining the suitable number of layers by taking UDP characteristics into account. In [55], the authors focus on providing QoE fairness among HAS clients, where different SVC layers can be downloaded from different servers. In [56], the SDN controller suggests the appropriate SVC layer for DASH clients to request by using a machine learning approach that takes network conditions as input. Constructing a multicast tree for each SVC layer for multicasting DASH traffic over SDN is proposed in [58]. SDN technology can also be utilized to transfer MDC coded video over different paths. In [57], a video streaming architecture that addresses MDC coded video distribution over SDN is proposed. The layers of MDC are forwarded over different multicast trees constructed by the SDN controller. The authors propose to serve users in different Quality of Service (QoS) classes by adjusting the number of MDC layers sent to each service class [57]. In Table 1, we show the main differences of our work from the literature. In this study, we determine the optimum number of layers by considering the dependencies of layers that are sent over different paths. We also assign different codecs with respect to the current network conditions. These are the unique characteristics of our work.
Most of the prior works that implement a video streaming architecture over SDN focus on selecting the streaming paths or selecting the quality for the clients. None of them considers the video codec type when determining the streaming paths. Even in the studies that are related to layered video, the layer dependencies are not considered in the selection of the paths. In our previous work [59], we proposed an optimization model that selects the video codec type and the optimal number of layers for a client that newly joins the system, where the codec type and layer selections stay constant during streaming. In this work, we enhance our previous work [59] by adding a new module to the controller that determines the optimal number of video layers by considering network fluctuations. We also modify the metrics that are used for determining the optimal number of layers and the codec type. The overview and design details of the proposed video streaming system are given in the next section.

III. PATH, CODEC AND QUALITY SELECTION FOR LAYERED VIDEO STREAMING
The details of the proposed system designed for layered video streaming by using two types of layered video codecs, namely SVC and MDC, are given in this section. We present the system architecture overview before giving the details of the path and codec selection approach.

A. ARCHITECTURE OVERVIEW
The network architecture used in this study is given in Figure 1. In the architecture, the system consists of a DASH server, clients, and forwarding devices that perform packet functions such as dropping, modifying, and forwarding packets to a specific port or to the controller. Besides these actions, forwarding devices send link statistics to the controller periodically. The middle layer in the figure represents the Network Operating System (NOS) or controller. The controller consists of two groups of main function modules. While the basic functions are used for essential network services such as link and path management, the advanced functions are related to video codec type and optimal layer selection. In order to run these modules, the controller gathers network statistics from the forwarding devices and internal parameters from the clients, such as buffer level and requested video representation. This information collection module can be implemented as an external module that runs at a different network element, in a DANE, to enable large-scale deployment. In this case, the controller can communicate with the DANE through its northbound interface. Finally, the upper layer represents network service applications. Applications in this layer define the routing policies, which are ultimately translated to forwarding rules and sent via OpenFlow commands to the switches to program their behavior.
The proposed framework consists of a HAS application aware SDN controller, a DASH server which encodes the video files with SVC and MDC codecs to produce different representations, and DASH clients that can interpret the commands received from the controller and are capable of decoding SVC and MDC encoded videos. When a new client joins the network and establishes the TCP connection with the server, its request is handled by the first-hop switch.
Since there is no entry for a new client in the switch's routing table, the switch forwards the client request to the controller by sending a PACKET-IN message to learn the forwarding rule for that client. In the controller, the Host Manager module detects the newly joined client when it receives the PACKET-IN message. After that, the controller runs an algorithm that determines the video codec type, the optimal number of layers, and the streaming paths of each layer. It uses the current network conditions as the input to the algorithm. Based on this information, it calculates the likelihood of packet losses at the link layer by considering the bitrates of the video layers and the available bandwidth of the paths that the layers will be transferred over. It then determines the video codec related parameters, such as video codec type and the optimal number of video layers, by running an optimization model which aims to minimize packet losses at the link layer and maximize video bitrate. Note that we focus on the estimation of packet losses at the link layer, since packets are not lost at the application layer due to the reliable transfer mechanism of TCP. The controller signals the clients via a REST API to direct them according to the output of the algorithm. At the same time, it sends the flow route information to the switches via the OpenFlow protocol on the southbound API. The server sends each layer of the encoded videos via connections opened over different ports. Hence, the controller can determine different streaming paths for each video layer by using the port information defined for each layer on the server side. An example scenario that shows the dissemination of the video packets belonging to different layers of different codecs is given in Figure 2.
During the streaming session, the clients select the appropriate video quality by considering the observed network conditions and internal parameters. However, the SDN controller also helps the clients with the decisions on the quality. In the proposed architecture, the controller periodically measures the amount of traffic on each streaming path, and if the traffic pattern has changed, it runs a heuristic algorithm within its Optimal Layer and Path Selection module in order to determine the number of layers to be requested by the clients based on the current network conditions. The rate adaptation algorithm does not allow clients to request a higher quality than the one recommended by the controller. This prevents clients from requesting a higher video quality than the network can transmit, so severe quality degradation is proactively avoided. The details of the rate adaptation algorithm are given in Section III-D.
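The cap imposed by the controller can be sketched in a few lines of Python. This shows only the upper bound described above, not the full rate adaptation algorithm of Section III-D; the function name and the assumption that at least one layer is always requested are hypothetical.

```python
def layers_to_request(client_choice, controller_recommendation, max_layers):
    """Return the number of layers the client requests.

    The client's own estimate is never allowed to exceed the controller's
    recommendation. Requesting at least one layer (the base layer or first
    description) is an assumption of this sketch.
    """
    capped = min(client_choice, controller_recommendation, max_layers)
    return max(1, capped)
```

The client remains free to request fewer layers than recommended, for example when its own buffer is running low.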

B. ESTIMATION OF THE PACKET LOSS RATIO
The packet losses at the link layer affect the received video quality due to the latency introduced by TCP retransmissions. If a packet carrying enhancement layer data arrives later than its playout time, it is discarded at the application layer. Effectively, these packets are lost at the application layer although they are received by the clients. Different video codec types affect packet losses at the application layer differently because of the differences in their layered structures. Table 2 shows two scenarios where the same video is encoded with both SVC and MDC codecs. In the table, + indicates that the layer is received by the client in time, while − indicates that the layer is not received before its playout time, i.e. it cannot be played and is discarded by the client. As can be seen from the table, even if the same video layers are received by the client, the client plays the video with higher quality with the MDC codec. This is because, unlike MDC, in SVC, if a packet belonging to layer n is lost, packets belonging to layers higher than n cannot be decoded even if they arrive in time. When we examine the number of layers received by both client types, MDC seems more advantageous because the clients can get higher quality. However, MDC layers have a higher bitrate than SVC layers for the same quality level; hence MDC requires higher bandwidth.
The packet loss ratio depends on the available bandwidth of the streaming path, the number of video layers and the bitrate of each layer that will be transferred through that path, as well as the video codec type, as explained above. Suppose abw represents the available bandwidth of a path p and n denotes the number of video layers that are transferred through the path p. The available bandwidth of the path is the bandwidth that remains when the traffic of UDP and non-HAS TCP flows is subtracted from the original capacity. TCP fairness ensures that the bandwidth portion that can be used to transfer each layer is roughly equal to abw/n on the path p. The controller calculates the expected packet loss ratio of the transmitted video layer packets by using formula (1), where plr_L and bitrate_L represent the expected packet loss rate of the packets of layer L and the bitrate of video layer L, respectively.
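As an illustration of how these quantities interact, the following Python sketch assumes that the loss ratio of a layer is the fraction of its bitrate that exceeds its fair share abw/n. This specific expression is an assumption made for the example and is not claimed to be the exact form of (1); the function name is hypothetical.

```python
def expected_loss_ratio(bitrate, abw, n):
    """Estimate the loss ratio of one layer on a path (assumed form).

    bitrate: bitrate of the layer; abw: available bandwidth of the path;
    n: number of video layers sharing the path. All in the same unit.
    """
    if n <= 0 or bitrate <= 0:
        raise ValueError("n and bitrate must be positive")
    share = abw / n  # fair share per layer under TCP fairness
    # Assumption: only the traffic above the fair share is lost.
    return max(0.0, (bitrate - share) / bitrate)
```

Under this assumption, a layer whose bitrate is below its fair share is not expected to lose packets, and the estimate grows as more layers share the same path.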
The packet loss rate given in (1) indicates the loss probability of an arbitrary packet belonging to a TCP flow. The lost packets are retransmitted by TCP; hence the delay of the packets is increased. We refer to packets that arrive later than their playout time and are discarded at the application layer as the lost packets at this layer. When we consider the video packet losses at the application layer due to this retransmission delay, we should also consider the layer dependencies, especially for SVC coded video. If an SVC layer packet is lost, then the upper enhancement layers that depend on this lost packet cannot be decoded on the client side; hence they are also treated as lost packets. Therefore, in the calculation of packet loss probability, the codec type should also be considered. The packet loss probability also depends on the number of flows that share the same streaming paths and on the bitrates of the video layers, because packet losses are highly related to the capacities of the paths and the amount of traffic transferred over those paths. Let c_type represent the codec type and L the layer number, and let plr_{L,c_type} be the packet loss rate in each layer, calculated by considering layer dependencies; plr_{L,c_type} is given by (2). While (1) gives the expected packet loss ratio for a specific layer, (2) calculates the estimated total packet loss ratio for all layers affected by the lost packet. In our previous work, we provided a formula for estimating the packet loss ratio by considering the layer dependencies for the video layers transferred over the same paths [59]. In this work, we generalize this formula by also considering the layers of the same client that are sent over different paths. This is important because the losses of layers sent over different paths also affect each other. The Optimal Layer and Path Selection (OLAPS) algorithm considers dependencies between layers that are transferred over different paths. We give the details of this algorithm in Section III-C.
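The effect of layer dependencies on loss can be illustrated with a short Python sketch. It is not equation (2): it assumes that the per-layer loss ratios are independent probabilities, possibly obtained on different paths, and computes the probability that a given layer is unusable. The function name and this independence assumption are hypothetical.

```python
def dependent_loss(per_layer_plr, layer, codec):
    """Probability that `layer` (0-based) cannot be played (assumed model).

    per_layer_plr: list of per-layer loss ratios, one entry per layer,
    each possibly estimated on a different path.
    """
    if codec == "MDC":
        # Descriptions are self-decodable: only the layer's own loss matters.
        return per_layer_plr[layer]
    if codec == "SVC":
        # The layer is usable only if it and every lower layer survive.
        survive = 1.0
        for plr in per_layer_plr[: layer + 1]:
            survive *= 1.0 - plr
        return 1.0 - survive
    raise ValueError(f"unknown codec: {codec}")
```

With identical per-layer loss ratios, the SVC value is never smaller than the MDC value, which reflects the higher loss tolerance of MDC.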

C. THE SELECTION OF CODEC TYPE, LAYERS AND STREAMING PATHS
The controller selects the video codec type only at the beginning of the video streaming session, because changing the codec type during streaming would not be practical. In this initial stage, the controller also determines the optimal number of layers and the streaming paths by considering the selected video codec type and the current network conditions. The optimal number of layers and the streaming paths are re-determined by the controller during the streaming session, while the video codec type stays the same.

1) VIDEO CODEC SELECTION AT SESSION STARTUP
When a client joins the network and starts the video streaming application, the request sent by the client to establish a TCP connection is intercepted by the controller. The controller thereby detects that a new client has joined the system and triggers the Video Codec Selection module. The Video Codec Selection module is responsible for selecting the codec type and the optimal number of layers for the newly connected client by considering the current network conditions. The codec and the optimal number of layers are determined by an optimization algorithm. Before running the optimization model, the controller runs the Packet Loss Estimation algorithm (Algorithm 1) to calculate the estimated packet loss ratios, which are given as one of the inputs to the optimization model. As mentioned previously, packet losses affect SVC and MDC encoded video differently due to the characteristics of the codecs.
Algorithm 1 calculates the total estimated packet loss for each layer based on (2) for both video codec types. Paths are selected for the layers of both codecs, since the codec type has not yet been determined for the newly joined client at this stage. Note that the paths are selected virtually in order to calculate packet loss estimations. At the beginning, the algorithm initializes the path set for both codec types. In the for loop, the algorithm selects the path with the maximum available bandwidth for each layer with respect to the codec type. Each time a path is selected, the available bandwidth of the selected path is re-calculated by considering the bitrate of the video layer. The estimated packet loss is also calculated each time a path is selected for a layer. As the output, the total estimated packet loss ratios for each layer of both codec types are provided. The Video Codec Selection module runs an optimization model after obtaining the estimated packet loss ratios provided by Algorithm 1. The optimization model is given in (3)-(5).

Algorithm 1 Packet Loss Estimation Algorithm
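The steps described for Algorithm 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' listing: the per-layer loss expression is an assumed stand-in for the paper's formulas, layer dependencies are omitted for brevity, and the function name and data layout are hypothetical.

```python
def estimate_losses(paths_abw, layer_bitrates):
    """Virtually assign each layer to the path with maximum available bandwidth.

    paths_abw: {path_id: available bandwidth}
    layer_bitrates: {codec: [bitrate of layer 1, layer 2, ...]}
    Returns {codec: [(path_id, estimated loss ratio), ...]}.
    """
    result = {}
    for codec, bitrates in layer_bitrates.items():
        abw = dict(paths_abw)  # virtual copy: the real network state is untouched
        picks = []
        for bitrate in bitrates:
            path = max(abw, key=abw.get)  # path with maximum available bandwidth
            available = abw[path]
            # Assumed loss estimate: share of the layer that does not fit.
            loss = max(0.0, (bitrate - available) / bitrate)
            # Re-calculate the available bandwidth of the selected path.
            abw[path] = max(0.0, available - bitrate)
            picks.append((path, loss))
        result[codec] = picks
    return result
```

Because each codec starts from its own copy of the path state, the estimates for SVC and MDC do not influence each other, which matches the virtual path selection described above.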
The equations given in (3) and (4) represent the objectives of the model. The model aims to minimize the packet loss ratio while maximizing the video quality by increasing the number of layers. The constraint of the model is given in (5), which states that the number of layers, L, cannot exceed the maximum number of layers, L_max. Although it is a multi-objective optimization model, the search space is small, as encoded video files usually contain between 3 and 7 layers. Hence, the controller can search all the solutions in the space exhaustively and find the optimal solution within a very limited amount of time.
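The exhaustive search over codec types and layer counts can be illustrated with a minimal sketch. It assumes that the per-layer loss estimates of Algorithm 1 are already available as lists, and that the two objectives are combined lexicographically (more layers first, then lower loss) under a loss tolerance. The function name, the `max_loss` parameter and this way of combining the objectives are assumptions made for illustration and are not taken from the model in (3)-(5).

```python
from typing import Dict, List, Tuple


def select_codec_and_layers(est_loss: Dict[str, List[float]],
                            max_loss: float) -> Tuple[str, int]:
    """Exhaustively search all (codec, L) pairs.

    est_loss[codec][k] is the estimated loss ratio when k + 1 layers of
    that codec are streamed. Assumption: prefer more layers, then lower
    loss, among candidates whose loss stays within max_loss.
    """
    best = None
    for codec, losses in est_loss.items():
        # L = k + 1 never exceeds L_max because the list has L_max entries.
        for k, loss in enumerate(losses):
            if loss > max_loss:
                continue
            candidate = (k + 1, -loss, codec)
            if best is None or candidate[:2] > best[:2]:
                best = candidate
    if best is None:
        # Nothing is feasible: fall back to one layer with the least loss.
        codec = min(est_loss, key=lambda c: est_loss[c][0])
        return codec, 1
    return best[2], best[0]


# Hypothetical loss estimates for a video with three layers per codec.
estimates = {"SVC": [0.01, 0.03, 0.09], "MDC": [0.02, 0.04, 0.05]}
choice = select_codec_and_layers(estimates, max_loss=0.06)
```

With at most seven layers and two codecs, the loop visits at most fourteen candidates, which is why an exhaustive search is affordable here.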
The Packet Loss Estimation algorithm runs in order to determine the video codec type and the number of layers at the beginning of the streaming session. The selected codec type and quality are signaled to the client and the client starts requesting segments based on this initial configuration. In the next section, the OLAPS algorithm that determines the quality and the paths by using the packet loss ratio estimation formula is given.

2) LAYER AND PATH SELECTION DURING STREAMING
As explained in the previous section, when a client starts the video streaming application, it requests the segments of the video encoded with the codec type recommended by the controller. The controller also determines the optimal number of layers at the beginning of the video streaming session. However, since network conditions are dynamic due to cross traffic, fluctuations in available bandwidth may cause quality switches at the client side. Hence, the number of layers determined at the beginning of the session may no longer be optimal. In order to cope with network dynamics and to keep the number of video layers optimal, the controller should renew the path assignments and layer selections. The problem of assigning paths to a set of base and enhancement layers so that each client can get the highest possible number of layers is an instance of a generalized bin packing problem [60]. Therefore, the problem is NP-hard. Hence, we develop OLAPS, a heuristic algorithm, to solve the assignment problem.
In order to detect changes in network conditions, the controller periodically checks the traffic amount on the paths and runs the OLAPS algorithm when network conditions change considerably. The controller measures the available bandwidth of each path and triggers the OLAPS module if the change in the measured value exceeds a predefined threshold. This threshold can be set by the network operator. The bitrate of the minimum representation is a good candidate for this threshold, since changes in traffic amount larger than this bitrate may cause considerable quality changes in the received video.
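The trigger condition can be sketched as follows. The sketch assumes that the controller keeps the previous and current available bandwidth per path in dictionaries and that the threshold is the bitrate of the minimum representation. The function name, the path identifiers and the numeric values are hypothetical.

```python
from typing import Dict


def should_run_olaps(prev_abw: Dict[str, float],
                     curr_abw: Dict[str, float],
                     threshold_mbps: float) -> bool:
    """True if any path's available bandwidth changed by more than the threshold.

    Paths without a previous measurement are treated as unchanged.
    """
    return any(abs(abw - prev_abw.get(path, abw)) > threshold_mbps
               for path, abw in curr_abw.items())


# Hypothetical measurements in Mbps; the threshold stands for the bitrate
# of the minimum representation.
previous = {"p1": 8.0, "p2": 10.0}
small_change = should_run_olaps(previous, {"p1": 7.5, "p2": 10.0}, 1.0)
large_change = should_run_olaps(previous, {"p1": 6.5, "p2": 10.0}, 1.0)
```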
The purpose of the OLAPS algorithm is to determine the number of video layers sent to each client in the system and the streaming path for each layer. The OLAPS algorithm determines these by taking into account the available bandwidth of the end-to-end paths, the bitrates of the video layers, the layer dependencies, and the total number of layers transferred through each path. In (2), the estimated packet loss for a layer L that is transferred over a path p is given by considering the packet loss ratio of path p and the codec type. However, the packets of a layer can also become useless because of lost packets of its underlying layers. Therefore, the layers that are sent over different paths should also be considered in the calculation of the estimated packet loss ratio for each layer. Figure 3 shows a scenario that explains why the dependencies between layers transferred over different paths must be taken into account in the packet loss calculations. In the figure, L_i, L_i+1, and L_j represent layer i and layer i + 1 of the video sent to client 1, and layer j of the video sent to client 2, respectively. The flow of L_j may cause loss of client 1's i-th layer packets since these two flows share the same path. As a consequence, packet loss in layer i affects the (i + 1)-th layer flow due to layer dependency. Hence, when packets of a layer are lost, the loss also affects the same client's upper layers and should be considered in the calculation of the packet loss probability. Note that the loss of layer packets transferred over different paths is considered only for SVC layers, due to their layer dependency.
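The effect of layer dependency on the loss seen by upper layers can be illustrated with a small sketch. It assumes that losses on different layers are independent, so that an SVC layer is usable only if it and all of its lower layers arrive, while an MDC layer depends only on its own path. This independence model and the function name are assumptions for illustration and do not reproduce the estimate in (2).

```python
from typing import List


def effective_layer_loss(own_loss: List[float], codec: str) -> List[float]:
    """Loss ratio per layer after accounting for layer dependency.

    own_loss[i] is the loss ratio of the path carrying layer i.
    Assumption: independent losses, so an SVC layer survives with the
    product of the survival probabilities of itself and all lower layers.
    """
    if codec == "MDC":
        return list(own_loss)  # descriptions are decodable independently
    effective = []
    survive = 1.0
    for loss in own_loss:
        survive *= 1.0 - loss
        effective.append(1.0 - survive)
    return effective


# Layer 1 travels over a loss-free path but still inherits the 10% loss
# of the base layer, as in the scenario of Figure 3.
svc = effective_layer_loss([0.1, 0.0, 0.2], "SVC")
mdc = effective_layer_loss([0.1, 0.0, 0.2], "MDC")
```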
The OLAPS algorithm, which is given in Algorithm 2, works in two phases. In the first phase, which corresponds to the first for loop, the algorithm allocates a path for the packets of the first layer of each client in order to ensure that each client receives the video with at least the minimum quality. For SVC, the first layer is the base layer due to the layer dependency rules, while an arbitrary layer can be selected as the first layer for MDC coded video. In each iteration, the algorithm assigns the path with the maximum available bandwidth to the client and updates the path information. In the second phase, the streaming paths for the additional layers are assigned. In this phase, however, a path for each additional layer is assigned by considering the likelihood of packet losses. The paths for additional layers are determined in order. In other words, path assignments for a new layer level start only after paths for the same number of layers have been assigned for all clients in the system. For each layer l, the path with the maximum available bandwidth (P_Max-abw) is assigned to that layer, the available bandwidth value (abw_P_l) is re-calculated, and the packet loss ratio, plr_l,c_type, is estimated by using formula (1) with respect to the codec type. Note that the codec types (c_type) are already determined by Algorithm 1 for each client, and the OLAPS algorithm ensures that the paths are selected with respect to the codec type of each client. If the estimated packet loss ratio is greater than a certain threshold, which is set differently for each layer based on its importance, the path assignment is canceled. Let p be the path that is selected for transferring the packets of layer l, and let thr_l represent the threshold determined for layer l.
When path p is assigned to the new layer l, the controller estimates the effect of the newly assigned layer l on all layers routed via the same path p. If the estimated packet loss ratio of any of these layers is higher than the threshold defined for that layer, the path assignment is canceled. Furthermore, for SVC clients whose layers are transferred over path p, the packet loss ratios are also checked for their other layers transferred over different paths, due to the scenario given in Figure 3.
In summary, in the second phase of the algorithm, the selected path is assigned to the new layer l only if adding the layer does not cause an unacceptable estimated packet loss for the other clients' layers on the same path or for the related layers of those clients on other paths.
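The two phases described above can be sketched in a few lines. The sketch uses a hypothetical path loss model (the share of offered traffic that exceeds the path capacity) in place of the paper's packet loss estimate, assumes independent losses for the SVC dependency check, and skips a client's upper layers once one of its layers has been refused. All names, data structures and values are assumptions for illustration, not the authors' implementation.

```python
from typing import Dict, List


def path_loss(load: float, capacity: float) -> float:
    """Hypothetical loss model: share of offered traffic above capacity."""
    return max(0.0, (load - capacity) / load) if load > 0 else 0.0


def olaps(capacity: Dict[str, float],
          clients: Dict[str, dict],
          thr: List[float]) -> Dict[str, List[str]]:
    """Two-phase layer and path assignment.

    clients[c] = {"codec": "SVC" or "MDC", "bitrates": [per-layer Mbps]}
    thr[l] is the loss threshold of layer index l (0 is the first layer).
    Returns, per client, the path assigned to each accepted layer.
    """
    load = {p: 0.0 for p in capacity}
    assigned: Dict[str, List[str]] = {c: [] for c in clients}

    def max_abw_path() -> str:
        return max(capacity, key=lambda p: capacity[p] - load[p])

    def layer_losses(c: str) -> List[float]:
        # SVC layers also inherit the loss of their lower layers.
        own = [path_loss(load[p], capacity[p]) for p in assigned[c]]
        if clients[c]["codec"] == "MDC":
            return own
        out, survive = [], 1.0
        for loss in own:
            survive *= 1.0 - loss
            out.append(1.0 - survive)
        return out

    # Phase 1: every client gets a path for its first layer.
    for c, info in clients.items():
        p = max_abw_path()
        assigned[c].append(p)
        load[p] += info["bitrates"][0]

    # Phase 2: add layers one level at a time across all clients.
    for l in range(1, len(thr)):
        for c, info in clients.items():
            if len(assigned[c]) != l or l >= len(info["bitrates"]):
                continue  # an earlier layer was refused, or no such layer
            p = max_abw_path()
            assigned[c].append(p)
            load[p] += info["bitrates"][l]
            # Check every client with a flow on p, including the layers
            # it receives over other paths (the Figure 3 scenario).
            affected = [k for k in clients if p in assigned[k]]
            ok = all(loss <= thr[i]
                     for k in affected
                     for i, loss in enumerate(layer_losses(k)))
            if not ok:  # cancel the tentative assignment
                assigned[c].pop()
                load[p] -= info["bitrates"][l]
    return assigned


# Hypothetical topology with two paths and two SVC clients.
demo = olaps({"p1": 5.0, "p2": 3.0},
             {"a": {"codec": "SVC", "bitrates": [1.0, 2.0, 3.0]},
              "b": {"codec": "SVC", "bitrates": [1.0, 2.0, 3.0]}},
             thr=[0.0, 0.05, 0.1])
```

In this example both clients receive two layers. The third layer is refused for both because it would overload a path and push the loss of the lower layers above their thresholds.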
The controller runs the OLAPS algorithm whenever the network conditions change, i.e., when a new flow arrives or a flow terminates. The controller signals the clients and switches with updated information according to the output of the algorithm. Typically, when a video is encoded with layered coding, the number of layers is limited and can be considered a constant. At the beginning, the algorithm orders the paths according to their bandwidth. It then loops over each client, each layer, and the flows on the path selected for the current client. The number of layers and the number of HAS layer flows on a path are treated as constants. Therefore, the complexity of the proposed algorithm is O(c + n log n), where c is the number of clients and n is the number of paths.

D. RATE ADAPTATION ALGORITHM FOR SDN-ASSISTED VIDEO CLIENT
The purpose of the rate adaptation algorithm is to maximize video quality on the client's side by downloading video segments with the highest possible quality while minimizing rebuffering duration. A newly joined client downloads the first several segments at the lowest bitrate at the beginning of the streaming session, which is a typical approach in such systems in order to minimize startup delay. After the buffer fullness reaches a certain level, the client starts requesting segments of the layer recommended by the SDN controller at the beginning of the streaming session.
As explained in the previous section, in the proposed SDN-assisted system, the controller sends recommendation messages for the number of layers that can be requested by the clients as the output of the OLAPS algorithm during the streaming session. An application that runs on the northbound interface of the controller is responsible for sending the recommendations to the clients. Note that this part can easily be moved to a video streaming company's DANE server. In such

Algorithm 2 Optimal Layer and Path Selection (OLAPS) Algorithm
Input: thr_l: threshold level determined for layer l
       P_Max-abw: the path with maximum available bandwidth, updated while the algorithm runs
Output: Optimal number of video layers and the streaming paths
foreach client c do
    P_base ← P_Max-abw
    Assign selected path P_base for base layer of client c
    abw_P_base ← abw_P_base − bitrate_baselayer,c_type
end
foreach layer l do
    foreach client c do
        P_l ← P_Max-abw
        Assign selected path P_l for layer l of client c
        abw_P_l ← abw_P_l − …

Let L_rec represent the number of layers recommended by the controller. When the L_rec value is received from the controller, one option for the clients would be to request L_rec layers until the controller recommends another quality layer, i.e., a layer with a higher or lower bitrate. However, the clients should also consider their current buffer fullness level, which is one of the crucial internal parameters that affect outage or re-buffering events. In a typical buffer-based adaptation algorithm, there is a mapping between buffer occupancy and video representation, such that as the buffer fullness increases, clients start to request higher representations. Conversely, when the buffer fullness decreases, clients request video segments of lower representations. In this study, the clients use such a mapping algorithm and determine a quality based on the buffer fullness value. They also take into account the quality recommended by the controller. The selected quality is the lower of the layer recommended by the controller and the layer determined from the buffer level.
Let L_n represent the number of layers determined by the client, which equals the highest quality that can be received under the constraint of buffer occupancy. In the mapping approach used for this purpose, the buffer is divided into equal levels and a quality is defined for each level. Accordingly, if the buffer fullness is at the lowest level, the lowest quality is selected, and if the buffer fullness is in the highest region of the buffer, the highest video quality is selected by the mapping function. The layer recommended by the controller (L_rec) may be higher or lower than L_n. When a new recommendation is received from the controller, the client requests the segments of L_rec if L_rec is lower than L_n. Conversely, the client may request L_n layers when the buffer fullness is below a certain threshold and L_n is less than L_rec. This approach has several benefits. First, it helps to avoid rebuffering since the rate adaptation algorithm considers both internal information about the buffer level and external information provided by the controller. Second, the algorithm avoids putting an additional burden on the links and provides fair bandwidth allocation, since L_rec is determined by the OLAPS algorithm, which assigns an equal number of layers to each client. Finally, the number of quality oscillations on the client side is reduced because the clients are not exposed to unexpected buffer drains, which are one of the main reasons for oscillations in the requested video quality.
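The client-side decision can be sketched as the minimum of the controller recommendation and the buffer-mapped layer count. The sketch assumes a buffer divided into as many equal levels as there are layers. The function names are hypothetical, and the example uses the 24-second buffer and three layers of the evaluation setup only as illustrative values.

```python
def buffer_mapped_layers(buffer_s: float, buffer_max_s: float,
                         num_layers: int) -> int:
    """Map buffer fullness to a layer count L_n using equal buffer levels."""
    level = int(buffer_s * num_layers / buffer_max_s)  # 0-based level index
    return min(max(level + 1, 1), num_layers)


def layers_to_request(l_rec: int, buffer_s: float, buffer_max_s: float,
                      num_layers: int) -> int:
    """Request the lower of the recommended (L_rec) and buffer-mapped (L_n) count."""
    l_n = buffer_mapped_layers(buffer_s, buffer_max_s, num_layers)
    return min(l_rec, l_n)


# A nearly empty buffer overrides a high recommendation, while a full
# buffer does not exceed the controller's recommendation.
low_buffer = layers_to_request(l_rec=3, buffer_s=4.0, buffer_max_s=24.0, num_layers=3)
high_buffer = layers_to_request(l_rec=2, buffer_s=20.0, buffer_max_s=24.0, num_layers=3)
```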

IV. PERFORMANCE EVALUATION

A. TESTBED AND TOPOLOGY SETUP
For the performance evaluation of the proposed approach, we used the Mininet emulator to set up an SDN environment and to run the tests. Mininet provides an efficient platform for constructing SDN topologies and for implementing and testing SDN applications [61]. The controller modules are built on top of the Floodlight software. OpenFlow is used to provide communication between the controller and the switches. We ran our experiments over a real-world topology, known as "Compuserve", whose information is taken from the Internet Topology Zoo [62].
During the simulations, we used four different network scenarios, in which the link bandwidths were generated from a Poisson distribution with mean values of λ = 8 Mbps, λ = 10 Mbps, λ = 12 Mbps, and λ = 15 Mbps. The total available bandwidth values between source and destination for the defined mean values (λ) are given in Table 3. The Video Codec Selection module selects one of the video codec types, SVC or MDC, for each client. We observed the codec type selections in the simulations. On average, when the network resources are limited, i.e., λ equals 8 or 10, the numbers of MDC and SVC clients are the same. For the higher values of λ, the number of SVC clients is roughly twice the number of MDC clients. In order to analyze the performance of the system under limited bandwidth resources, which shows how well the system adapts to the current conditions, the clients were placed behind bottleneck links.
Elephants Dream-II [63] is used as the streamed video. The video file consists of 327 segments, each with a duration of 2 seconds. On the client's side, the total buffer length is set to 24 seconds. The clients start to play the video after buffering 8 seconds of video at the beginning of the streaming session. There are 10 video clients connected to the system, each capable of decoding both SVC and MDC video. The video server provides the same video with both SVC and MDC codecs. The SVC video has one base layer and two enhancement layers, while the MDC video has three layers. Table 4 gives the bitrate distribution of the layers for both codec types. The bitrates of the SVC enhancement layers are given for each layer alone. The bitrate of an SVC enhancement layer can also be represented cumulatively by considering layer dependencies. In that case, for example, the cumulative bitrate of L2 would equal the sum of the bitrates of L1 and L2.
As mentioned earlier, the controller assigns the paths only if these path assignments keep the estimated packet loss ratios of the layer packets under certain threshold values. We considered two different sets of packet loss thresholds for each layer, which are listed in Table 5. While SVC has a different threshold value for each layer, MDC has only one value for each threshold level. Different thresholds for each SVC layer were defined due to the layer dependency and based on observations of the QoE values affected by the path selection approach based on packet loss estimation. The tests were repeated 10 times for each setting. All test results presented in the figures and tables in the next section are averaged values.
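As a small worked example of the cumulative representation, the sketch below accumulates per-layer bitrates. The bitrate values are hypothetical placeholders and are not the values of Table 4.

```python
from itertools import accumulate
from typing import List


def cumulative_bitrates(layer_bitrates_mbps: List[float]) -> List[float]:
    """Cumulative bitrate needed to decode up to each SVC layer."""
    return list(accumulate(layer_bitrates_mbps))


# Hypothetical per-layer bitrates in Mbps: base layer, then two
# enhancement layers.
example = cumulative_bitrates([1.0, 0.5, 1.5])
```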

B. EVALUATION RESULTS
For the evaluation of the performance, we measure the following QoE metrics: (i) average received bitrate, (ii) rebuffering duration, (iii) the number of received video segments belonging to each layer, and (iv) the number of quality switches. These metrics are among the most important metrics showing the perceived quality on the client's side [64]. Three additional approaches are also implemented, and the performance of each approach is evaluated with the same configuration in order to compare it with the proposed architecture. The comparison approaches are Throughput Based Adaptation (TBA) [15], Buffer Based Adaptation (BBA) [17] and PANDA [20]. In the TBA algorithm, clients measure network throughput while downloading the video segments and adapt the quality according to the estimated network bandwidth. In the BBA algorithm, the clients select the quality based on the buffer fullness level, so that they request the highest possible quality when the buffer fullness is high and the lowest quality when the buffer is almost empty. The TBA and BBA approaches are selected in order to observe the performance of clients that select the quality in different ways. We chose TBA and BBA because they are successful implementations of the throughput-based and buffer-based approaches. In addition, they were selected as comparison approaches in several studies in the literature and can be seen as benchmark algorithms [19], [21], [22].
Unlike TBA and BBA, PANDA uses a special technique to estimate bandwidth on the basis of probes [20]. Since it has a remarkable bandwidth estimation method, PANDA is a suitable baseline for the proposed approach: while the proposed approach utilizes network assistance to select the quality, PANDA's quality selection utilizes a client-side bandwidth estimation method based on HAS characteristics. Therefore, it is possible to observe whether network assistance can provide an improvement over an approach that uses a HAS-specific client-side bandwidth estimation method. The clients using these rate adaptation algorithms measure the bandwidth and select the quality as proposed in [15], [17] and [20]. However, since we stream SVC video to these clients, we added a mechanism that discards an enhancement layer segment when it is delayed for more than 4 seconds. We implemented this mechanism to provide a fair comparison, since the proposed algorithm uses the same approach and discarding the delayed packets prevents long re-buffering durations. However, if the base layer packets are delayed, every type of client experiences re-buffering events.
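The discard rule for delayed enhancement layer segments can be written as a one-line predicate. The sketch assumes that layer index 0 denotes the base layer; the function and parameter names are hypothetical, and only the 4-second limit comes from the setup described above.

```python
def should_discard(layer_index: int, delay_s: float, limit_s: float = 4.0) -> bool:
    """Discard only enhancement layer segments delayed beyond the limit.

    Assumption: layer_index 0 is the base layer, which is never discarded
    because playback cannot continue without it.
    """
    return layer_index > 0 and delay_s > limit_s


late_enhancement = should_discard(layer_index=1, delay_s=4.5)
late_base = should_discard(layer_index=0, delay_s=4.5)
```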
For the proposed approach, the paths for transferring the video layers are determined by the OLAPS algorithm. Therefore, the SDN controller is HAS-aware and considers the characteristics specific to layered video in the selection of the paths. For the other approaches, the controller forwards the layers requested by the clients over the paths with the maximum available bandwidth. The performance results given in this section therefore also show the improvement provided by the proposed path selection approach, where the routing is done by considering layer dependencies, compared to one of the best path selection approaches used for layered video in the literature.
The video bitrates received by the clients according to the network capacity are given in Figure 4. In this figure and in the other figures and tables presented in this section, OLAPS_thr1 and OLAPS_thr2 refer to the OLAPS algorithm results obtained with threshold values thr_1 and thr_2, respectively. The clients using the TBA and BBA algorithms have a higher bitrate than the clients using the OLAPS algorithm when the network has limited bandwidth (λ = 8 Mbps). However, as will be shown later, the clients using the TBA and BBA approaches experience longer re-buffering durations and a higher number of quality changes, which negatively affect the perceived quality. It is worth mentioning that seamless video streaming with a minimum number of video stalls and quality switches is much preferable to a small gain in quality. While requesting video at the lowest bitrate results in poor video quality, requesting video from the higher layers increases the probability of re-buffering if there is not enough bandwidth. The main advantage of the OLAPS algorithm is that it directs clients to a good point in the trade-off between bitrate and re-buffering risk. Compared with TBA, our proposed approach based on the OLAPS algorithm performs better for both threshold levels. PANDA clients also select layers providing a better trade-off between bitrate and re-buffering than the TBA and BBA approaches, due to PANDA's bandwidth estimation approach. Among all approaches, OLAPS_thr1 has the best performance considering this trade-off. This indicates that further improvement can be achieved with the assistance of a HAS-aware network, even compared to the case where the client perfectly observes the network conditions and SDN routes the video packets over the paths with the maximum available bandwidth.
The graphs that show average throughput as a function of time for each bandwidth setting are given in Figure 5. The graphs jointly show the performance of the path selection approaches and the bandwidth utilization obtained by each approach. It is observed that, especially if the bandwidth is limited, as in the tests where λ equals 8 Mbps, the performance obtained with the proposed approach is better than that of the other approaches for both threshold values. The reason is that, in the proposed approach, the available bandwidth is estimated with high precision thanks to the knowledge of the SDN controller about the HAS flows and their characteristics. Since HAS clients request the segments intermittently, the traffic caused by the segments sent to the clients is not continuous. This phenomenon is known as ON and OFF periods [65]. An SDN controller without SAND characteristics interprets the OFF periods as an increase in bandwidth, as is the case in the other approaches. This leads to miscalculations of the available bandwidth. On the other hand, although our approach uses the traffic measurements done by Floodlight like other studies, we also use the knowledge about HAS ON/OFF periods, the number of online flows, and the bitrates of the layers. Hence, our approach estimates the available bandwidth more accurately, which results in better path assignments. In all cases, the PANDA approach achieves higher throughput than the TBA and BBA approaches. This shows that the bandwidth estimation method of the PANDA algorithm is very successful, since its clients were able to utilize more bandwidth than the others in the same position. When we examine Figure 5(d), we see that all approaches obtain similar results. These observations lead us to conclude that better routing decisions can be made if the controller has knowledge about the characteristics of the video streaming application, especially when the network capacity is limited.
Receiving more video segments from the higher layers allows the video to be played with better quality. Figure 6 shows the number of segments received from each video layer during the simulations, with 95% confidence intervals. In the first scenario, it is observed that the BBA clients receive more segments from the first enhancement layer, while the clients with other approaches receive more video segments from the base layer. At first glance, this seems to indicate that the BBA algorithm outperforms the other approaches when the network has less bandwidth. However, these clients experience high delay while downloading segments, which, in turn, causes unacceptable re-buffering durations. In the other scenarios, where the link capacities are increased, we observed that the OLAPS algorithm enables the clients to adapt the quality for both thresholds so that they receive more segments from higher layers. This result shows that the clients with the OLAPS algorithm receive the fewest base layer segments among all algorithms only when the network has enough capacity to transfer packets from higher layers. Another important observation is that the SDN controller assists the clients in selecting a more appropriate codec and quality with the proposed approach. This can be seen especially in the results for λ = 15 Mbps, where all approaches have similar network throughput (as can be observed in Figure 5(d)). Since the bitrates arriving at the clients are similar, the results give more information about the performance of the rate adaptation algorithm than about the improvement provided by the path selection approaches.
Startup delay is one of the parameters that affect the perceived quality and should be as short as possible. Table 6 provides the startup delay values observed for each approach under the different network settings. The results show that OLAPS_thr1 managed to keep the startup delay under a certain level for each network setting. On the other hand, the higher threshold value used in OLAPS_thr2 causes clients to request higher-quality segments and results in longer startup delays.
When the network links are congested, the clients adapt to lower bitrates in order to avoid re-buffering. As shown in Table 7 and Table 8, when λ = 8 Mbps, the proposed algorithm has a shorter re-buffering duration than the other algorithms. The TBA and BBA algorithms in particular experience unacceptable re-buffering values, in terms of both duration and frequency. The main reason for this result is the greedy behavior of the clients in these approaches: greedy reactions of a client to short-term measurements do not ensure appropriate bitrate adaptation. In contrast, since the controller runs the OLAPS algorithm by using its SAND characteristics, it helps clients to request the highest number of video layers under the constraint of network capacity, avoiding sudden reactions to bandwidth measurement changes caused by ON and OFF periods. Note that when λ = 8 Mbps, the re-buffering values observed with OLAPS are also high because the network capacity is so limited.
The dependency among SVC layers forces the clients to wait for all lower layers in order to play the video with maximum quality, which can cause packets to be delayed, especially when the network is highly congested. The greedy behavior of the comparison algorithms triggers the clients to adapt to a higher video quality, which can result in congestion on shared links. On the other hand, in the proposed approach, if the controller detects that the network has more bandwidth for only a very short time period due to the OFF periods of the clients, it prevents the clients from requesting video from higher layers. In addition, selecting the optimal codec by considering SVC and MDC characteristics and network conditions helps to improve QoE. From these facts, we can conclude that clients assisted by the OLAPS algorithm experience less re-buffering because the algorithm considers codec types and HAS characteristics.
The effects of Algorithm 1 on the QoE become clearer when examining Table 7 and Table 8 for the λ values of 12 and 15 Mbps. As explained previously, Algorithm 1 is used for calculating packet loss estimations, and the threshold values are determined by examining the correlation between the outputs of this algorithm and the achieved QoE. The disadvantage of setting high values for the packet loss thresholds is observed for these bandwidth capacities. With increasing network capacity, the rebuffering duration decreases significantly in all approaches except OLAPS_thr2. Increasing the packet loss threshold gives the client more flexibility to request video segments from the layer with the higher bitrate. Hence, the network becomes more congested, which leads clients to experience higher delay and more re-buffering events.
Table 9 shows the number of quality switches during streaming. A smaller number of quality switches means that the client experiences a more stable video quality. It is observed that the OLAPS algorithm provides stable video quality in all scenarios. The main reason is that it prevents clients from switching to higher layers when they estimate that the network has more bandwidth for a short period of time due to OFF periods. In other words, the proposed algorithm bounds the greedy behavior of the clients. In the table, +(−)x shows the increase (decrease) in the video quality, where x is the difference in video layer numbers between successive requests of a client. For example, +2 means that the client receives the next segment from two quality levels higher than the last downloaded one. Clearly, big jumps affect QoE more negatively. In all scenarios, the OLAPS algorithm has the lowest number of increments and decrements. In addition, the BBA algorithm avoids a high number of quality switches since its clients increase the video quality only as the buffer level increases.
In contrast, in the TBA algorithm, clients behave greedily in order to receive video from the upper layers. Accordingly, when the network is congested, clients receive the base layer for a short period of time and may then request video segments from upper layers when the network capacity changes. Hence, TBA has a higher number of quality switches than the other approaches. PANDA adapts better than the TBA and BBA algorithms.
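The +(−)x switch counts reported in Table 9 can be derived from the sequence of layers requested by a client, as in the following sketch. The function name and the example sequence are hypothetical.

```python
from collections import Counter
from typing import List


def quality_switches(requested_layers: List[int]) -> Counter:
    """Count quality switches by signed step size between successive requests.

    A key of +2 means the next segment was requested two layers higher,
    and a key of -1 means one layer lower. Unchanged quality is not counted.
    """
    steps = (b - a for a, b in zip(requested_layers, requested_layers[1:]))
    return Counter(step for step in steps if step != 0)


# Hypothetical request sequence of one client (layer number per segment).
switches = quality_switches([1, 1, 3, 2, 2, 3])
```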
In addition to measuring different QoE parameters, we also measure the overall QoE values. In [64], the QoE parameters that are used in the overall QoE calculation are analyzed in detail. We use the QoE formula given in [64] to calculate the overall QoE as follows:  The QoE formula calculates the QoE value of a client that downloads K segments. The first term in the formula is the bitrate of the segments and the second term is the number of quality changes. D_r, N_r and T_s represent the total re-buffering duration, the total number of re-buffering events, and the startup delay, respectively. We use the same values given in [21] for the coefficients of the terms in this formula. The QoE values are calculated according to (6) for all approaches. The normalized QoE values are given in Figure 7. We theoretically calculate the optimal QoE value by considering the available bandwidth that is shared by the clients and the optimal bitrate selection under this bandwidth constraint. The optimal QoE value for given network conditions always equals 1. The graph shows how close our proposed solution is to the optimal value.
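A QoE score built from the five terms named above can be sketched as a weighted sum. The linear form, the negative sign of the penalty terms and the unit weights are assumptions made for illustration; they are not the coefficients of [21] and do not reproduce formula (6).

```python
from typing import List


def overall_qoe(bitrates_mbps: List[float], d_r: float, n_r: int, t_s: float,
                w_switch: float = 1.0, w_dur: float = 1.0,
                w_cnt: float = 1.0, w_start: float = 1.0) -> float:
    """Hypothetical linear QoE over K downloaded segments.

    bitrates_mbps holds the bitrate of each downloaded segment, d_r is the
    total re-buffering duration, n_r the number of re-buffering events and
    t_s the startup delay. All weights are placeholder values.
    """
    quality = sum(bitrates_mbps)
    # Number of quality changes between consecutive segments.
    changes = sum(1 for a, b in zip(bitrates_mbps, bitrates_mbps[1:]) if a != b)
    return (quality - w_switch * changes - w_dur * d_r
            - w_cnt * n_r - w_start * t_s)


# Three segments, one quality change, one re-buffering event of 0.5 s
# and a startup delay of 0.25 s.
score = overall_qoe([1.0, 2.0, 2.0], d_r=0.5, n_r=1, t_s=0.25)
```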

V. CONCLUSION
In HAS applications, the clients have limited information about network conditions and about the behavior of other clients, which also affects each client's experience. Therefore, network-based approaches, which direct the clients and provide them with more information, help them to adapt the quality optimally.
Layered video coding offers advantages such as the storage optimization provided by SVC and the robustness against lost layers provided by MDC. SVC and MDC have different characteristics. In this paper, we proposed a video streaming system architecture where the SDN controller is aware of video codec types and HAS characteristics. By taking into account the layer dependency constraints of both codecs, the estimated packet loss ratios, and the current network conditions, the controller selects the appropriate codec type for the clients and dynamically assigns streaming paths for each layer of the videos transferred to all clients. Furthermore, the controller recommends to each client the optimal number of layers under the constraint of current network conditions. The clients utilize these recommendations within the rate adaptation algorithm and decide which video segments to request by also considering their own adaptation logic and buffer fullness level.
We showed that HAS-aware SDN controller assistance, combining SAND characteristics with SDN technology, can improve various QoE metrics compared to approaches in which the clients are not directed by the controller. Furthermore, we showed that a HAS-aware controller can estimate the available bandwidth with higher precision than a regular SDN controller, although both controllers use the same information about the current bandwidth and traffic amount. Simulation results show that the proposed architecture provides an increase of up to 76% in received video quality and a decrease of up to 10% in re-buffering duration compared to other approaches in which the paths with the maximum available bandwidth are likewise assigned to the clients.
As future work, we plan to implement an enhanced architecture of this proposal, where HAS-aware web caches cooperate in deciding which videos and which qualities should be kept. In such a system, different layers of the video files can be distributed among caches within a particular proximity by considering their storage capabilities and the number of users connected to each cache.