SECTION I

INTRODUCTION

TRADITIONAL linear television (TV) services are being challenged by the trend of converged multimedia services where high quality audio-visual content is distributed to a variety of end systems via packet-switched content networks. Web-based applications and smart media centers are becoming increasingly important for media services, community building and social networking. The response of the consumer electronics, broadcast, and telecommunications industries thus far has been supplying abundant heterogeneous terminals (e.g., personal TV, mobiles, game consoles, and web interfaces) with converged services using IP networks. Meanwhile, recent developments in peer-to-peer (P2P) technologies have encouraged energy efficient and low-cost delivery for commercial and user-generated multimedia content. For traditional unicast and broadcast services, content from dedicated hosting servers is requested and delivered directly to designated user devices. The data acquisition process on each device is independent of that on other devices. In order to facilitate efficient content distribution, certain technologies (e.g., HTTP adaptive streaming) allow streams to be fragmented into small pieces (a.k.a. chunks). P2P-based services use a similar mechanism, but allow pieces to be distributed over a mesh-based overlay, which enables decentralized content distribution. A participating peer requests and receives pieces of content from multiple other peers, while contributing its upload bandwidth capacity to share received pieces with peers that request them. The P2P mechanism can potentially achieve a very high efficiency of data exchange between end devices, which is especially useful for particular network infrastructures [e.g., fast local area networks (LANs) with limited throughput to content services in external networks].

Unlike file-sharing services where the integrity of files is not affected by the receiving order or arrival time of individual pieces, pieces for a video streaming service must be delivered on time and intact for video decoding. However, many operational issues still exist, such as flash crowds, network address translation (NAT), firewall traversal, and content authentication. Therefore, using P2P technologies for commercial audio-visual services requires a purpose-built P2P core that is optimized for high-throughput and interactive applications, and the deployment of an effective peer sharing network. Furthermore, the user experience of video services can be influenced by distortions caused by impairments from several entities within a P2P-based service including link quality, packet networks, P2P overlay, end system and video coding. Although a number of advanced mechanisms (e.g., [13], [18]) have been designed to better manage the user experience of video services in P2P networks, there is still a lack of a methodology to systematically evaluate P2P video service quality with respect to user perception.

This paper introduces the recent design and deployment of the Lancaster Living Lab, which has been distributing a P2P-based live and video-on-demand (VoD) IPTV service to thousands of users on a university campus and in a rural village. An overview of the infrastructure and functional components presents a number of key designs that facilitate the entire IPTV eco-system including content ingest, transcoding, P2P tracking, distribution, statistics, heterogeneous end systems, and the integration of social networking. A multimodal quality evaluation framework that is specifically designed and implemented for the assessment of video streaming services in P2P-based IPTV systems is also presented. Under this framework, multiple types of subjective and objective evaluation models are designed and integrated to collaboratively measure key service metrics that are relevant to user experience.

The paper is organized as follows. Section II gives an overview of the existing work on P2P-based content distribution networks. Motivations and challenges of exploiting P2P techniques for commercial live and VoD services are summarised in Section III. The design and deployment of IPTV services within a live test-bed (Living Lab) at Lancaster University are described in Section IV, followed by the introduction and test results for a multimodal QoE framework in Section V.

SECTION II

BACKGROUND AND RELATED WORK

A. Multimedia Content Distribution

Consumption of audio-visual media is moving from a collective and passive approach to a personalized and active one. Concurrently, there is a shift towards nonlinear usage patterns from the classical model of linear broadcast TV. The TV set no longer has the monopoly of delivery of audio-visual content, as the PC and related media centers become increasingly important [4]. The response of the consumer electronics, broadcast, and telecommunications industries thus far has been supplying abundant heterogeneous terminals with converged services using IP networks. Such rise of converged platforms is even fostering cooperation and innovation within the traditional content distribution network (CDN) industries. For example, in the United Kingdom, the major actors of terrestrial TV (e.g., the BBC and ITV) which in recent years have been delivering VoD content through web browsers, are co-developing a set-top box (STB) based IPTV platform called YouView [6].

It is expected that in 2012 IPTV will become the Internet's largest traffic type, accounting for over 50% of all consumer internet traffic, and further rising to 62% by 2015 [8]. The current infrastructure of the Internet, however, is not suited to simultaneous transmission of live events to millions of people. Content distribution networks traditionally follow a client-server architecture, delivering dedicated streams of data to individual users (i.e., unicasting). While this approach has proven effective in delivering content for a small number of channels to a large number of synchronous users, questions remain about the efficacy of its ability to scale to the demands of future multimedia. A CDN's operational costs rise as its viewership rises, which introduces potential concessions between viewer count and data stream quality. Furthermore, current consumer internet infrastructures were not designed to withstand abundant niche channels, and delivery of video content to users with asynchronous viewing patterns [24].

Multicasting of content has recently been exalted as a potential solution, whereby data streams are distributed to a consolidated number of intermediary servers (i.e., caching servers), which subsequently re-distribute the content to end users [17]. However, several studies (e.g., [19]) question the technological and economic feasibility of this approach. This concern is exemplified by the sparse deployment of compatible hardware in the Internet's underlying infrastructure, predominantly due to the lack of financial incentive for internet service providers (ISPs) [17]. As a consequence of this there has been an increased focus on the development of applicative solutions using chunk-based mechanisms. The most prevalent example that leverages traditional infrastructures is HTTP adaptive streaming, a pull-based mechanism that allows clients to dynamically switch between streams of differing bitrates, according to that which best suits current network conditions. HTTP adaptive streaming is an umbrella term, which encompasses a broad variety of technologies including MPEG's Dynamic Adaptive Streaming over HTTP (DASH) [9], [34], Apple's HTTP Live Streaming (HLS) [10], and Microsoft's Smooth Streaming (MSS) [25]. Despite these developments, the scalability and resilience of P2P networks are increasingly leading them to be seen as an alternative platform to provide content delivery, for both professional and user generated content.
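As an illustration only, the client-side rate switching that HTTP adaptive streaming relies on can be sketched as follows. The bitrate ladder, the 0.8 safety margin, and the function name `select_bitrate` are assumptions made for this example; they are not taken from DASH, HLS, or MSS, each of which leaves the adaptation logic to the client.

```python
def select_bitrate(available_kbps, measured_throughput_kbps, safety_margin=0.8):
    """Pick the highest stream bitrate that fits within a fraction of the
    measured throughput; fall back to the lowest bitrate if none fits."""
    budget = measured_throughput_kbps * safety_margin
    candidates = [b for b in sorted(available_kbps) if b <= budget]
    return candidates[-1] if candidates else min(available_kbps)


# Hypothetical bitrate ladder in kbit/s (an assumption for this sketch).
ladder = [400, 1200, 2500, 5000]

print(select_bitrate(ladder, 3500))  # budget is 2800 kbit/s, so 2500 is chosen
print(select_bitrate(ladder, 300))   # nothing fits, so the lowest (400) is used
```

A real client would typically smooth the throughput estimate over several chunks and also take the playback buffer level into account before switching.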

B. P2P-Based Content Distribution Systems

A P2P system is a self-organizing system of equal, autonomous nodes (i.e., peers) that aims for the shared usage of distributed resources in a networked environment, while avoiding central services [36]. Unlike the client-server architecture, a peer's upload bandwidth is leveraged to forward content to other peers [26]. Content data is partitioned into tagged pieces. The information about pieces is then exchanged between peers so that each peer can download data pieces from other peers concurrently [33], usually as either a mesh-based or multiple tree-based overlay [15]. In order to support the more strict requirements of piece discovery and exchange efficiency for audio-visual services, swarming technologies such as the well-known BitTorrent are commonly preferred over the traditional tree-like P2P structures. Using the BitTorrent protocol, a peer contacts other peers in the peer-list to trade its pieces for other required parts of content. This tit-for-tat mechanism automatically locks out peers who are unwilling to share, leading to a balanced economy with suppliers meeting demand and achieving fast download at the same time.
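The reciprocity at the heart of tit-for-tat can be illustrated with a minimal sketch. It is a simplification under stated assumptions: the function name, the fixed number of upload slots, and the rate table are hypothetical, and features of the real protocol such as optimistic unchoking are deliberately omitted.

```python
def choose_unchoked(download_rates, slots=4):
    """Return the peers we are willing to upload to.

    download_rates maps a peer id to the rate (bytes/s) at which that peer
    has recently been sending data to us. Peers that gave us the most are
    reciprocated first; peers that gave us nothing are locked out.
    """
    ranked = sorted(download_rates, key=lambda p: download_rates[p], reverse=True)
    return [p for p in ranked if download_rates[p] > 0][:slots]


# Hypothetical recent download rates from four neighbours.
rates = {"a": 50, "b": 0, "c": 120, "d": 80}
print(choose_unchoked(rates, slots=2))  # ['c', 'd']
```

Peer `b`, which contributed nothing, is never selected regardless of how many slots are free, which is the lock-out behaviour described above.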

Using a purpose-built P2P engine and an optimized deployment strategy, a P2P-based CDN could potentially achieve a higher level of overall data exchange rate and energy efficiency compared with traditional linear TV services. In contrast to unicast or multicast streaming, a P2P-based IPTV service offers scalability and redundancy. The availability of digital resources increases as the number of unique users and network connections increases, leading to greater efficiency for file sharing, while potentially delivering a better quality of service. The sharing mechanism adopted between user devices can also reduce the investment in underlying infrastructures for over-provisioned networks and central caching machines, circumventing the inexorable bottlenecks of mass video content distribution, especially during peak hours. However, the actual benefit of exploiting P2P mechanisms for content distribution is still to be investigated in an actual content service.

SECTION III

CHALLENGES AND MOTIVATION

A. Challenges of P2P-Based Multimedia Services

The inexorable development of P2P technologies has created new business opportunities for the production and distribution of audio-visual content. These developments are exemplified by the rising number of large-scale P2P IPTV deployments, such as Sopcast, PPLive, and PPStream. Despite this, it remains challenging to exploit P2P for commercial IPTV services due to the nature of its distribution paradigm. This paradigm involves pieces of distributed content being received and reassembled from a number of intermediate receivers. Although a well deployed P2P-based IPTV service could significantly reduce the throughput of access aggregation and backbone networks, the upload capacity of end systems fundamentally limits the effectiveness of piece exchange. The unreliable and heterogeneous bandwidth capacity, performance, and availability of peers can cause excessive delay and delay variation between requesting and receiving pieces. Contemporary P2P systems face a multitude of further issues in that they are often inefficient for distributing small files, and lack seamless media playback across platforms (e.g., integration within web browsers), incentives to contribute bandwidth, and the ability to adapt to a changing aggregate bandwidth availability [22]. Furthermore, the behaviour of peers operated by end users or other upper-layer applications is also unpredictable, which further exacerbates any issues in the underlying design of the P2P system. For example, certain peers may saturate bandwidth, or may legitimately leave the service at any time and thereby abandon their descendant peers, which could potentially cause content deficiencies to propagate throughout the P2P overlay [37].

There is an abundance of literature which further highlights the fundamental challenges of making P2P CDNs competitive with traditional CDNs (e.g., [23], [40], [15]). These challenges include those of long playback latencies, heterogeneity management, peer dynamics, churn-induced delays, building dedicated infrastructures, the blind construction of the P2P overlay over the underlying network without performance considerations, traffic throttling by ISPs, and NAT and firewall traversal. For example, it has been previously measured that a peer may take tens of seconds to join a channel and pre-buffer for video playback in the popular PPLive service [21]. Increasing the size of the buffer potentially relaxes the requirements of piece distribution in the P2P overlay; however, the resulting long start-up delay and channel switching time would significantly influence customer satisfaction, especially for commercial services [14]. Flash crowds occur when the peer arrival rate increases suddenly and remains high for a sustained period [35]. Two open challenges for P2P-based IPTV systems are servicing this rapid admission of new peers without degrading the quality of service for existing peers, and rapidly repairing the overlay when these peers depart [23]. NAT and firewall traversal is a further crucial issue for service deployment. Around half of the nodes on the Internet are behind a NAT or firewall, which may significantly reduce the efficiency of P2P distribution or even cause some peers to be unreachable by other peers [12].
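The trade-off between buffer size and start-up delay noted above can be made concrete with a back-of-the-envelope sketch. It assumes a constant video bitrate and a constant aggregate download rate, and it ignores the time spent joining the channel and discovering peers; the function name and the example figures are hypothetical.

```python
def prebuffer_delay_s(buffer_s, video_kbps, download_kbps):
    """Seconds needed to download buffer_s seconds of video before playback
    can start, assuming constant video bitrate and download rate."""
    if download_kbps <= 0:
        raise ValueError("download rate must be positive")
    return buffer_s * video_kbps / download_kbps


# A 2000 kbit/s stream fetched at 4000 kbit/s (illustrative figures only).
print(prebuffer_delay_s(10, 2000, 4000))  # 5.0
print(prebuffer_delay_s(30, 2000, 4000))  # 15.0
```

Under these assumptions, tripling the buffer triples the wait before playback and before every channel switch, which is why a larger buffer alone is not an acceptable remedy for commercial services.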

B. P2P-Next Project

P2P-Next is a European Commission Framework Programme 7 supported project that is building a next generation P2P content delivery platform, designed, developed, and applied jointly by a consortium consisting of both academic (e.g., Lancaster University) and industrial partners (e.g., the BBC and Pioneer Digital Design). Specifically, our recent work enables the creation of (user generated) content, provides innovative audio/video technology and user interfaces, develops an integrated consumer electronics device for the NextShare platform, and realises the Living Lab Trials as a large scale public test infrastructure facilitating experimentation with the system architecture.

Research topics of P2P-Next include evolutionary content distribution, easy access to vast amounts of content with metadata federation, social networking, and innovative business models for advertising. The sum of these efforts is a large step towards moving information access from the hands of the producer to the hands of the consumer, and allowing consumers to enjoy and utilise content resources in a mobile and pervasive manner, across the great online space.

SECTION IV

LANCASTER LIVING LAB

There are many examples of promising technical solutions that dramatically failed because users' perception did not match the marketers' vision. Moving television beyond the classical channel paradigm and pioneering new business models is only possible with a reality check from real-world usage. We can only incrementally expand testing services with an active and expanding community of users to test assumptions, identify bottlenecks, and guide the way forward. Technology cannot be disconnected from the social context and audiovisual practices of communities. In P2P-Next an incremental approach has been followed by early and short-cyclic releases involving actual user communities in real-world networks and homes rather than lab environments. The Living Labs are a series of live test-beds across Europe which are used to continuously evaluate the ${\rm NextShare}^{PC}$ and ${\rm NextShare}^{TV}$ reference implementations. Their purpose is to facilitate experimentation and assessment of the proposed system architecture between all project partners and the end-user community, across both TV and PC platforms and the various communications infrastructures in place. The Lancaster Living Lab has the most comprehensive infrastructure and services within the P2P-Next project.

The Living Lab approach benefits the development of a product and also reduces the risk and cost of commercialisation. The short-cyclic process provides rapid feedback to developers, highlighting technical issues, and also motivates stakeholders. The real-world environment ensures the technologies can withstand the rigors of the wild Internet, rather than laboratory conditions only.

A. NextShare Platform

The NextShare platform is comprised of a ${\rm NextShare}^{Core}$ software component, two derivative application platforms ${\rm NextShare}^{PC}$ and ${\rm NextShare}^{TV}$ that target the PC and consumer electronics (CE) environments, respectively, ${\rm NextShare}^{Mobile}$ that allows users to control all aspects of the set-top box from their smartphone, and Living Lab services (Fig. 1). The platform is realized in practice on top of IP-based content distribution networks.

Fig. 1. NextShare platform.

${\rm NextShare}^{Core}$ encapsulates the inner workings and underlying protocols and presents a common API to applications. Therefore applications using ${\rm NextShare}^{Core}$ need not be aware of protocols, messages, or network transport issues; nor be concerned with how the API is implemented. The ${\rm NextShare}^{Core}$ is a BitTorrent-based P2P distribution engine with a number of new features that are specifically designed for live and on-demand streaming of audio-visual content under the NextShare framework. In order to address the incompatibility of BitTorrent's tit-for-tat mechanism and sequential downloading (e.g., as when streaming media), the ${\rm NextShare}^{Core}$ introduces a novel give-to-get mechanism, which “discourages free riding by letting peers favour uploading to peers who have been proven to be good uploaders” [26].
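The idea quoted above, favouring neighbours that have proven to be good uploaders, can be illustrated with a small ranking sketch. This is not the ${\rm NextShare}^{Core}$ implementation: the function name, the slot count, and the way forwarding counts are obtained are all assumptions, and how a peer verifies that a neighbour really forwarded pieces is left out.

```python
def rank_by_forwarding(forwarded_pieces, slots=3):
    """Select the neighbours that receive upload slots.

    forwarded_pieces maps a neighbour id to the number of pieces that
    neighbour has been observed passing on to other peers. Ties are broken
    by neighbour id so that the result is deterministic.
    """
    ranked = sorted(forwarded_pieces.items(), key=lambda kv: (-kv[1], kv[0]))
    return [peer for peer, _ in ranked[:slots]]


# Hypothetical forwarding counts for four neighbours.
observed = {"p1": 3, "p2": 12, "p3": 0, "p4": 7}
print(rank_by_forwarding(observed, slots=2))  # ['p2', 'p4']
```

Unlike the tit-for-tat rule, the ranking in this sketch does not depend on what a neighbour has sent back to the uploader, so a peer that downloads sequentially and has nothing newer to trade can still earn upload slots by forwarding to others.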

${\rm NextShare}^{TV}$ provides an integrated and easy to use consumer electronics device showcasing the feasibility and benefits of the NextShare platform. The NextShare platform services and content are made accessible to low-cost devices at high quality and with ease of use. The ${\rm NextShare}^{TV}$ set-top box [Fig. 2(a)] is an IPTV receiver based on the STB7200 system-on-chip technology provided by ST Microelectronics. Recent work on ${\rm NextShare}^{TV}$ includes the design and implementation of an electronic program guide (EPG) and a generic Atom-based Feed Navigator, allowing content discovery by end-users. Besides ${\rm NextShare}^{Core}$ and video processing functions, a number of social networking features have also been integrated on the ${\rm NextShare}^{TV}$ platform. Users can associate their user accounts in ${\rm NextShare}^{TV}$ with their Facebook accounts and receive Tweets from Twitter that are relevant to the program. To facilitate a better social experience, we have also implemented waving and linking functions which allow content playback to be initialized and synchronized between friends in different locations.

Fig. 2. ${\rm NextShare}^{TV}$. (a) ${\rm NextShare}^{TV}$ set-top box. (b) ${\rm NextShare}^{Mobile}$ interfaces.

The opportunity to exploit the emerging trend towards media multitasking behavior and to complement, rather than compete with, the users' experience of television formed the motivation for the development of a second-screen application, named ${\rm NextShare}^{Mobile}$, which allows users to control all aspects of the set-top box from their smartphone. Fig. 2(b) shows the user interface of a native ${\rm NextShare}^{Mobile}$ application developed for Apple iOS devices that connects to the set-top box using Universal Plug and Play. Functions supported by the application include remote control, media browse and search, and management of playback sessions.

${\rm NextShare}^{PC}$ is a multi-platform component which provides web browser integration to the ${\rm NextShare}^{Core}$. Lancaster University has developed its own web-browser-based user experience for IPTV services. The website within which ${\rm NextShare}^{PC}$ is embedded utilizes Web 2.0 technologies to provide a mixture of dynamic AJAX-enabled HTML pages and XML feeds in order to build the client and its supporting pages. Our web based client interface (Fig. 3) consists of several components including the primary navigation menu enabling a user to choose between live TV, radio and VoD content, the video window based on the ${\rm NextShare}^{Core}$ plug-in, the content carousel showing the live or VoD programs, and the social buzz panel, for showing information retrieved from social networks.

Fig. 3. Web-based ${\rm NextShare}^{PC}$.

The NextShare platform demonstrates a next generation converged TV architecture. Featured and user generated audio-visual content is distributed over an IP-based distribution network to both IPTV set-top boxes and a web-based portal using a unified service architecture. The user experience of TV watching is also enhanced by the seamless integration of social networking. To provide functional and performance tests of our converged NextShare platform, a whole chain of IPTV services must be constructed and maintained. The following sections introduce the design and deployment of such a comprehensive P2P-based IPTV service at the Lancaster Living Lab.

B. Service Architecture

The Lancaster Living Lab needs to support the delivery of technical services to ensure continuous operation of the platform, and also provide the user-support mechanisms and processes to ensure delivery of the ${\rm NextShare}^{TV}$ devices and ${\rm NextShare}^{PC}$ clients to end users. It is also essential to ensure the necessary measurement and evaluation procedures are in place to support the future development and research activities on the platform. Fig. 4 gives an overview of the Lancaster Living Lab architecture. DVB-S and DVB-T channels are received by the headend infrastructure and redistributed to core operational services. To achieve maximum flexibility for service integration and research capacity, most Living Lab services are realized using a high performance virtualization cluster. Every service instance is an independent virtual machine (VM) that can be modified, cloned or moved with minimal (if any) interruption to other services. The converged live and VoD services are distributed to both web clients and set-top boxes in over 6000 households on the University campus via Ethernet networks, and to 300 families in a rural village via a combination of optical fiber links and wireless mesh networks [16].

Fig. 4. Lancaster living lab services architecture.

A brief overview of the operational services is provided here:

  • Content Transcoding—This service takes audio-visual content from the headend and transcodes it using an H.264 profile that is better suited to packet-based, high quality video distribution.
  • Content Ingest—This service takes content from the transcoding service and packages it into the correct format for distribution within NextShare. This includes hosting the P2P tracker and providing an initial seed for content distribution.
  • Living Lab Portal—This provides the front entry point for the Living Lab, a focal point through which each of the specific Living Labs can be reached.
  • Statistics Service—The statistics service provides a centralized point at which relevant service status can be gathered. Functions to support the statistics service are embedded in both ${\rm NextShare}^{TV}$ and ${\rm NextShare}^{PC}$ clients. Statistics obtained from all clients are archived in statistics databases for service diagnosis and research such as user behavior analysis and system performance benchmarking.
  • Software Update Service—This service is specifically designed for firmware updates of ${\rm NextShare}^{TV}$ set-top boxes. It provides device authentication through a PIN code as well as being the central point for software update distribution.
  • Video on Demand Server—A repository of video files which can be distributed over the P2P system is maintained by the VoD server. Using different configurations for the VoD server, the Living Lab is able to provide catch-up content of up to 30 days.
  • Remote Testing Service—This provides a management portal on which remote tests can be designed and executed on the set-top boxes. The remote test allows set-top boxes to run scheduled commands, allowing an experiment to be performed overnight to test various parameters and test streams.

C. Service and Social Analysis

In order to facilitate the streamlining of trials and experiments, the Living Lab maintains a central statistics service, which provides a means to store data collected from both ${\rm NextShare}^{TV}$ and ${\rm NextShare}^{PC}$. Over an 8 month period in 2011 the statistics service reported 1027 unique users, 50 039 playback events, 5 years' worth of content consumed (over 2187 days), an average of 50 views per user, and 1 day and 18 hours' worth of content watched per user. A total volume of 28.7 terabytes of data was exchanged in the services, of which 13.4 terabytes were contributed between peers (end devices). This shows how service load can be greatly reduced using P2P-based distribution mechanisms.
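The reduction in service load implied by these traffic figures can be checked with a few lines of arithmetic. The two volumes are the ones reported above; the variable names are ours, and the calculation assumes that all traffic not exchanged between peers was served by the Living Lab infrastructure.

```python
total_tb = 28.7   # total data exchanged in the services (reported above)
peer_tb = 13.4    # portion contributed between peers (reported above)

# Assumption: everything that did not come from peers came from servers.
server_tb = total_tb - peer_tb
peer_share = peer_tb / total_tb

print(f"served by infrastructure: {server_tb:.1f} TB")  # 15.3 TB
print(f"share carried by peers: {peer_share:.1%}")      # 46.7%
```

In other words, close to half of the delivered volume never had to leave the seed and VoD servers.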

Besides providing a general overview of service usage, the large amount of Living Lab service data hosted by the statistics service is also an invaluable resource for engineers and researchers conducting performance and social analyses that are not possible in any simulation environment.

1) Deployment of ${\rm NextShare}^{TV}$ and ${\rm NextShare}^{PC}$

At the Lancaster Living Lab, ${\rm NextShare}^{TV}$ and ${\rm NextShare}^{PC}$ followed different deployment patterns. The results of these processes provide insights into the changing ways in which end users consume content. ${\rm NextShare}^{TV}$ followed a phased deployment. Students and university staff who had signed up to participate in the service gradually received the set-top box. In March 2011, ${\rm NextShare}^{PC}$ was launched in the Living Lab. The long-expected PC version of the NextShare service has become particularly popular across campus. The number of unique users observed per day between April 2011 and February 2012 is given in Fig. 5. The fluctuating user activity reflects the weekly viewing pattern in the short term and the university term time and national holidays in the long term. Very few users are observed during July and August, when most of the students (our main trialists) left campus for the summer holidays. The same decrease in user activity is also seen near the Christmas holidays. Overall, the Living Lab services have been steadily gaining popularity in the student population. This solid user base is becoming the foundation of various research activities, including studies on VoD program popularity and social informatics.

Fig. 5. Number of unique users per day.

Although the deployment of ${\rm NextShare}^{TV}$ was to some extent influenced by the total number of set-top boxes available (300 in total), the experience from pre-deployment demand, the ease of its execution, and subsequent usage reflect the rapid adoption of web-based TV applications and the rise of media multitasking [7]. In June 2011 at the end of the academic year, trial users within the Lancaster University Living Lab were asked to complete a questionnaire about various aspects of the services. Over 67% of ${\rm NextShare}^{TV}$ users continued to use web browser-based IPTV services, of whom 14% also used the ${\rm NextShare}^{PC}$ service. These figures are only representative of a student population, however, and there may be a multitude of socio-economic, cultural and practical reasons (e.g., limited accommodation space) for such usage.

2) Royal Wedding

Using the open platform of the Living Lab environment, investigators have also been able to analyze service performance efficiently. On Friday, April 29, 2011, Prince William married Catherine Middleton at Westminster Abbey in London. During the early stages and build up to the event, the live stream worked without issues, but as the time of the wedding drew closer, the buffer time increased significantly up to a point at which it was no longer possible to start watching the live stream at all. It was observed that the existing clients could continue to watch the stream without issue.

Fig. 6 shows the concurrent connections to the BBC One live stream for the day of the royal wedding. The graph shows two main spikes of viewing (at 10:30 and 10:50): the first spike represents the peak of 84 users using the service, shortly after which the service was restarted, followed by a second spike of users. The initial investigation into the reason for failure focused on a lack of sharing amongst peers during the event, with the rationale being that a lack of sharing would potentially starve peers of pieces and so prohibit downloading of new pieces. A review showed that good levels of sharing occurred between the majority of peers, with some peers not requiring access to download from the seed server. A number of anomalous peers were identified which had inconsistent sharing behavior.

Fig. 6. Concurrent viewers of BBC1 (P2P stream) for April 29, 2011.

On closer inspection, the peers that performed poorly and shared poorly were those clients that were off-site from Lancaster University's network infrastructure. The majority of users on Lancaster University's campus infrastructure are behind a NAT device and as a consequence they cannot be contacted by users from off-site. Therefore off-campus peers can quickly starve of content when connections at the seed (located on campus) are saturated by on-campus requests. This is an example of the issues that might be faced by any commercial P2P-based IPTV service provider. This NAT issue has since been resolved by increasing the piece availability to external peers. This use case also shows how one of the many challenges of deploying video content using P2P technologies in a complex service environment can be effectively identified in a Living Lab environment.
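The starvation effect described above follows from a simple reachability rule, which can be sketched as follows. The model is a deliberate simplification and not a description of the Living Lab network: the `Peer` type, the site labels, and the rule that peers on the same site can always reach each other are assumptions for illustration, and NAT traversal techniques are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    name: str
    site: str         # hypothetical labels: "campus" or "offsite"
    behind_nat: bool


def can_contact(src: Peer, dst: Peer) -> bool:
    """src can open a connection to dst if dst is publicly reachable
    or if both peers sit on the same site (behind the same NAT)."""
    return (not dst.behind_nat) or src.site == dst.site


def reachable_sources(peer: Peer, swarm: list) -> list:
    """Names of the swarm members that `peer` can contact for pieces."""
    return [p.name for p in swarm if p != peer and can_contact(peer, p)]


seed = Peer("seed", "campus", behind_nat=False)
c1 = Peer("campus-1", "campus", behind_nat=True)
c2 = Peer("campus-2", "campus", behind_nat=True)
ext = Peer("offsite-1", "offsite", behind_nat=True)
swarm = [seed, c1, c2, ext]

print(reachable_sources(c1, swarm))   # ['seed', 'campus-2']
print(reachable_sources(ext, swarm))  # ['seed']
```

In this toy swarm the off-site peer depends entirely on the seed, so once the seed's connections are taken up by on-campus requests it has no alternative source of pieces.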

D. Summary

The Lancaster Living Lab has provided an unprecedented method of validating a research technology in a real world operational environment. The continuous evaluation approach has provided an evaluation process with which we have been able to complete an assortment of assessments on various technologies for content processing, media distribution and human computer interaction.

The vision behind NextShare has always been one of highly distributed services in a server-less environment and as such is low-cost to the distributor. While this P2P technology provided benefits during its deployment at Lancaster, the technology was not as low cost or as infrastructure friendly as originally conceived. The significant benefit from the service is that of the bandwidth saving obtained from the service, yet this comes at the cost of requiring some server infrastructure (similar to a traditional multicast) for delivery of even the most basic of services. Once in place the service has the potential to scale to a significant number of users with the same amount of resource which could only support very limited services using traditional delivery technologies. Using the virtualization technologies, operations like fundamental changes to the server and load balancing of multiple service instances have been made relatively convenient for research and benchmarking purposes. One exclusive feature of the Lancaster Living Lab services is the statistics service. With the rich and comprehensive statistics provided by the report engine in end devices, the statistics server offers a wide spectrum of service information. The statistics service enables a more accurate and also real-time evaluation of service performance as well as social behaviors.

The scalability and expansion we experienced in the Living Lab came with the disadvantage that identifying issues and solving problems was a time-consuming process. Our analysis concludes that, in order for P2P technologies to be used successfully in the wild, a method through which quality of experience (QoE) can be determined and delivery failures tracked must be operational. Without in-the-wild experimentation, such findings would not have been possible.

SECTION V

MULTIMODAL QOE EVALUATION IN P2P-BASED IPTV SYSTEMS

End users' experience of video content can deteriorate because of distortions caused by impairments in several entities of a P2P-based service, including packet networks, the P2P overlay, the end system and video coding. In order to fulfill user expectations of service quality and to provide a benchmarking platform that evaluates designs for audio-visual content distribution systems, a quality evaluation service is required. This evaluation service must provide accurate assessment of the video service with respect to user perception while supporting service diagnosis to identify the root causes of quality degradation.

A. Challenges and Requirements

Commercial video services have strict QoE requirements to guarantee a high quality level as specified by the service level agreement (SLA) between service providers and content consumers. In [5] the quality requirements for IPTV services are specified. It is concluded that video streams are highly sensitive to information loss and that the QoE impact is in turn correlated to variables such as type of data loss, codec, loss profile and decoder concealment algorithms. QoE requirements must therefore be defined with regard to discrete quality violation events [29]. In the ITU Focus Group on IPTV, quality target metrics have been extended to consider the actual quality as perceived by end users. User-level quality metrics such as “Maximum one visible artifact per x hours” were defined to evaluate the delivery of IPTV services [1]. The Broadband Forum has also defined a set of user-level QoE metrics in [31]. For example, a criterion of “one impairment event per 12 hours or better” is defined for HDTV services.
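As an illustration of how a user-level criterion of this kind could be checked against a log of impairment events, the following minimal sketch tests whether any 12-hour window contains more than one event. The function name, the use of timestamps in seconds, and the sliding-window interpretation of the criterion are assumptions made for this example and are not taken from [1] or [31].

```python
from typing import Sequence

def meets_impairment_target(event_times_s: Sequence[float],
                            max_events: int = 1,
                            window_s: float = 12 * 3600) -> bool:
    """Return True if no window of length window_s holds more than max_events events.

    event_times_s: impairment timestamps in seconds (hypothetical log format).
    """
    times = sorted(event_times_s)
    start = 0
    for end, t in enumerate(times):
        # Drop events that are a full window (or more) older than the current one.
        while t - times[start] >= window_s:
            start += 1
        if end - start + 1 > max_events:
            return False
    return True
```

Two events one hour apart would violate the target under this interpretation, whereas two events 13 hours apart would satisfy it.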

The quality of user experience is ultimately determined by users' subjective opinions. To collect users' opinions, subjective experiments are usually carried out using well-specified test plans in dedicated test environments (e.g., [3]). However, conventional subjective experiments are time-consuming, costly and therefore not suitable for an in-service evaluation, especially in commercial IPTV services.

A number of objective quality evaluation models have been designed to conduct quality assessment. Objective models analyze video content displayed at the receiver and evaluate visual distortions using image and video signal processing tools [2], [39]. However, the analysis at end systems is not able to support comprehensive service diagnosis to identify the causes and network location of quality degradation.

Fig. 7. Framework for multimodal QoE evaluation.

Network QoS models are widely used to evaluate the impact of network impairments on services. Some advanced QoS models have recently been developed by integrating application-level metrics [20], [41]. Discrete analysis of the perceptual impact of individual packet losses has also been designed to better capture user-level QoE [27], [29]. However, these models are not specifically designed to evaluate impairments within the P2P overlay layer and within end systems. Furthermore, the compression distortions caused by lossy video coding cannot be evaluated directly using network-layer models.

Overall, the level of evaluation given by existing QoE models is not sufficient to fully support quality assessment and service diagnosis for P2P-based IPTV services. The objective is only achievable through the collaboration of multiple subjective and objective quality models, each providing a required type of evaluation function. This section introduces our design of a multimodal evaluation process as well as the current implementation of analysis algorithms and methods.

B. Framework Overview

Fig. 7 gives the framework for the evaluation system. The framework interacts with five key elements of a video distribution system including source content, audio-visual encoder (transcoder), distribution network, end system and end user. Multimodal assessment of service quality is conducted by the collaboration of measurement, analysis and diagnosis modules, which are realized by groups of functional components.

Relevant service metrics from all key elements of the distribution system are extracted and summarized by the measurement module before data analysis and visualization of the analysis module are initiated. The diagnosis module coordinates analysis results for different measurement functions to enable comprehensive evaluation for service diagnosis. Functional modules and blocks can be selectively activated according to specific test plans and strategies.

Fig. 8 illustrates a use case of a multimodal evaluation to analyze the influence of network impairments on our P2P-based IPTV services. Using the timestamp information that is associated with all evaluation processes, we are able to correlate pre-defined events that are detected in different layers. For instance, one can investigate how some packet losses are repaired by retransmission mechanisms of the P2P overlay, while others reach the video decoder and are eventually perceived by human users as visual distortions. The same setup can also be used to benchmark the robustness of different designs for P2P piece discovery and distribution. An earlier architectural design of the multimodal evaluation framework is also presented in one of our recent works [28].
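The timestamp-based correlation of events across layers can be sketched as follows. This is an illustrative example only: the event labels, the 120 s look-back interval and the function name are hypothetical and do not describe the actual implementation of the framework.

```python
from bisect import bisect_left, bisect_right
from typing import Dict, List, Sequence, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, event label)

def events_before(reports: Sequence[float],
                  layer_events: Sequence[Event],
                  lookback_s: float = 120.0) -> Dict[float, List[str]]:
    """For each report time, list lower-layer events in the preceding look-back interval."""
    ordered = sorted(layer_events)
    times = [t for t, _ in ordered]
    result: Dict[float, List[str]] = {}
    for r in reports:
        lo = bisect_left(times, r - lookback_s)   # first event inside the interval
        hi = bisect_right(times, r)               # last event not later than the report
        result[r] = [label for _, label in ordered[lo:hi]]
    return result
```

Given a subjective report at a certain time, such a lookup returns the network-layer or system-layer events that immediately preceded it, which is the kind of cross-layer tracing the use case describes.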

Fig. 8. Multimodal evaluation associated by timestamp information.

C. Measurement and Analysis Modules

In order to realize a comprehensive evaluation of video distribution services, multiple different functional components have been designed to capture service statistics with respect to transmission, distribution, video codec and human perception. Overall, the measurement module is comprised of subjective feedback, objective video analysis, system statistics, P2P statistics, and network statistics.

The subjective feedback function provides a simple and interactive interface for viewers to report perceived audio-visual distortions and to answer questionnaires regarding the overall service quality. The objective video analysis function captures decoded video signals from the output of the set-top box and analyzes video quality using multiple no-reference video processing models. The system, P2P and network statistics functions report metrics that reflect service status regarding video decoding, P2P piece distribution and packet-based networks, respectively.

On top of each measurement component, an analysis module is realized to conduct data mining (e.g., pattern recognition and time-series analysis) on large-scale raw measurement data and to interpret results for investigators using visualization tools. Depending on the nature of the underlying measurement, an offline analysis, a real-time analysis, or both can be enabled. For the offline analysis, raw measurement results (usually in an XML-based format) are processed (e.g., parsed and concentrated) before being registered in a statistics database. The database exposes interfaces for front-end applications (e.g., a dynamic PHP program) to retrieve and visualize the archived measurement results. Using offline analysis, different evaluations can be applied repeatedly to any specified section of the entire archive, reflecting specific measurement purposes. Unlike the offline analysis, the real-time analysis models are implemented to actively pull data from measurement elements without intermediate procedures. Using real-time analysis, one can specify measurement strategies that are not feasible with offline analysis. For example, a direct connection to a set-top box can be established to request comprehensive live statistics of the decoding process several times per second.
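The offline path (parse an XML report, register it in a database, then query a section of the archive) can be sketched as below. The XML layout, the table schema and the function names are hypothetical and chosen only for illustration; the actual report format and database used in the Living Lab are not specified here, and SQLite stands in for the statistics database.

```python
import sqlite3
import xml.etree.ElementTree as ET
from typing import List, Tuple

def archive_report(conn: sqlite3.Connection, xml_text: str) -> int:
    """Parse one XML report and store its metrics; return the number of rows added."""
    root = ET.fromstring(xml_text)
    device = root.get("device")              # assumed attribute name
    timestamp = int(root.get("timestamp"))   # assumed attribute name
    rows = [(device, timestamp, m.get("name"), float(m.text))
            for m in root.findall("metric")]
    conn.execute("CREATE TABLE IF NOT EXISTS stats "
                 "(device TEXT, ts INTEGER, name TEXT, value REAL)")
    conn.executemany("INSERT INTO stats VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)

def query_metric(conn: sqlite3.Connection, name: str,
                 start_ts: int, end_ts: int) -> List[Tuple[int, float]]:
    """Retrieve (timestamp, value) pairs of one metric within a time range."""
    cur = conn.execute(
        "SELECT ts, value FROM stats WHERE name = ? AND ts BETWEEN ? AND ? ORDER BY ts",
        (name, start_ts, end_ts))
    return cur.fetchall()
```

Because the raw results are archived, a query such as `query_metric` can be re-run with different time ranges or metric names without repeating the measurement.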

The following sections introduce each measurement element with the associated analysis modules.

1) Subjective Feedback

The user feedback function can be embedded in an IPTV set-top box to provide a means for end users to efficiently provide both instantaneous feedback and questionnaire responses regarding the service quality. With the ${\rm NextShare}^{TV}$ set-top box, a user can conveniently report a perceived audio or video distortion, such as picture breakup, by pressing the dedicated blue button on the remote controller. A small icon appears in the bottom-right corner of the screen for a short period of time as an acknowledgement of receiving the user feedback [Fig. 9(a)]. The simple, interactive button press is more practical and user-friendly than other existing proposals such as the clicking-based mechanism [11].

Fig. 9. Subjective feedback. (a) Distortion report. (b) Service questionnaire.

The overall impact of multiple perceived distortions is also monitored taking into account relevant psychological effects. For instance, a forgiveness effect (a.k.a. memory effect [30]) exists due to the fact that the objection felt by an observer immediately following an impaired video segment is compensated after a long period of unimpaired video [32]. Our system uses a sliding attention window, which defines a certain period of time backwards from the current time. If the number of distortion reports triggered within this window reaches a predefined threshold, a notification is issued and a questionnaire is initiated for end users to provide details of corresponding service disruption [ Fig. 9(b)].
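A minimal sketch of such a sliding attention window is given below. The window length, the threshold and the class name are hypothetical values chosen for illustration; the parameters used in the deployed system are not stated here.

```python
from collections import deque

class AttentionWindow:
    """Counts distortion reports within a time window ending at the current time."""

    def __init__(self, window_s: float = 60.0, threshold: int = 3):
        self.window_s = window_s      # assumed window length in seconds
        self.threshold = threshold    # assumed number of reports that triggers a questionnaire
        self._reports = deque()

    def report(self, now_s: float) -> bool:
        """Register a distortion report; return True if a questionnaire should be issued."""
        self._reports.append(now_s)
        # Forget reports that have slid out of the attention window.
        while self._reports and now_s - self._reports[0] > self.window_s:
            self._reports.popleft()
        return len(self._reports) >= self.threshold
```

Reports that are far apart in time never accumulate to the threshold, which loosely mirrors the forgiveness effect described above. Whether the window is cleared after a questionnaire is issued is a design decision that this sketch leaves open.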

Both the instantaneous distortion reports and the service questionnaires are forwarded to the offline analysis module. Details of subjective feedback are extracted and stored in the statistics database. The logical relationship between questionnaires and distortion reports is maintained using the MAC address, questionnaire id and timestamps. A user interface is also implemented to visualize subjective feedback records. Fig. 10(a) shows a recreation of a selected questionnaire recorded in the database and details of all associated distortion reports (specified by the attention window) that triggered this questionnaire. In this specific case, the user reported perception of bad picture breakup and audio loss but not black screen. The service being watched was BBC NEWS 24 distributed over P2P networks. The questionnaire is associated with ten distortion reports around 1:30 on June 30, 2011. An interactive timeline chart application is also designed to give a more intuitive representation of user feedback [Fig. 10(b)]. The blue bars represent the distortions reported, whereas the red dots indicate completed questionnaires. Using this tool, investigators can easily explore responses from customers to facilitate service diagnosis.

Fig. 10. Analysis of subjective feedback. (a) Table view and screen recreation. (b) Chart view.

2) Objective Video Analysis

The objective video analysis function aims to analyze distortions introduced during the life cycle of content processing and delivery (i.e., coding, network transmission, decoding) at the video signal level. The measurement is carried out on the decoded video frames using image signal processing tools.

In the Lancaster Living Lab, digital video signals are transmitted from the set-top box to a display unit using the High-Definition Multimedia Interface (HDMI). For signal processing algorithms to process video frames, a video capture device with an HDMI input is used. In order to conduct objective video analysis and subjective feedback simultaneously, an HDMI splitter is employed to duplicate the HDMI signal from the set-top box into two identical HDMI feeds.

Objective quality models are commonly designed to recognize and evaluate only certain types of distortions, reflecting different quality evaluation strategies. Three quality assessment models are implemented to provide a wide spectrum of video analysis. The picture quality analysis model is a realization of a no-reference perceptual quality assessment algorithm initially designed by Wang [38]. The model measures two main distortions, i.e., blurriness and blockiness, caused by lossy video compression on each video frame. Results of the distortion measurements are combined using an aggregation function to derive an overall quality rating. Fig. 11(a) gives the results of a quality assessment test. The picture quality analysis model is a valuable tool for evaluating the influence of the video encoding/transcoding process on picture quality.

Fig. 11. Objective video analysis. (a) Picture quality analysis. (b) Edge detection. (c) Frame-freezing detection.

In practice, distortions in video frames can be caused by video compression, transmission impairments or system errors. To better identify distortions such as severe blockiness and frame-freezing caused by transmission impairments or system errors, edge detection and frame-freezing detection models are also designed and integrated. The edge detection model extracts visual edges that appear at the boundaries of all 16 × 16 macroblocks and are above a predefined threshold (Fig. 12). This process filters out most of the light blockiness caused by video compression and also the edges of objects within videos. Fig. 11(b) shows the number of edge units detected on all video frames in a test. Sudden impulses in the measurement are considered potential severe distortions in the video. The visualization chart is also made interactive so that investigators can click on a measurement point to verify the results by visually checking the corresponding video frames (archived by the analysis function). Fig. 13(a) shows the video frame associated with the impulse that manifests at 14:41:40 in Fig. 12.
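The idea of counting strong edges located on macroblock boundaries can be sketched as follows. This is a simplified illustration rather than the model used in the Living Lab: the frame is assumed to be a two-dimensional list of luma values, an "edge unit" is assumed to be a single boundary pixel position whose luma step exceeds the threshold, and the threshold value is hypothetical.

```python
from typing import Sequence

def count_boundary_edges(frame: Sequence[Sequence[int]],
                         block: int = 16,
                         threshold: int = 40) -> int:
    """Count pixel positions on macroblock boundaries whose luma step exceeds threshold."""
    height, width = len(frame), len(frame[0])
    count = 0
    # Vertical boundaries: compare the pixels on either side of each block column edge.
    for x in range(block, width, block):
        for y in range(height):
            if abs(frame[y][x] - frame[y][x - 1]) > threshold:
                count += 1
    # Horizontal boundaries: compare the pixels on either side of each block row edge.
    for y in range(block, height, block):
        for x in range(width):
            if abs(frame[y][x] - frame[y - 1][x]) > threshold:
                count += 1
    return count
```

Because only boundary positions are inspected and a threshold is applied, object edges that fall inside macroblocks and mild compression blockiness contribute little to the count, while a frame with severe blockiness produces a sudden increase.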

Fig. 12. Edge detection. (a) Severe blockiness distortions. (b) Edge extraction.
Fig. 13. Visual verification of video analysis. (a) Detected distortion. (b) Detected frame-freezing.

The frame-freezing detection function is realized by exploiting the correlation between consecutive video frames captured by the video capture card. The detection function marks an event as possible frame-freezing when the correlation reaches 0.98 and as very possible frame-freezing when the correlation is over 0.99. The results of a frame-freezing detection test are given in Fig. 11(c), where very possible events, possible events, and others are marked in distinctive colors. In practice, genuine still scenes exist. Therefore, video frames associated with very possible frame-freezing events are made available for visual verification. Fig. 13(b) shows the interface with which the detected frame-freezing events are verified.
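The thresholding described above can be sketched as follows. The use of the Pearson correlation coefficient over flattened pixel values is an assumption made for this example, since the exact correlation measure is not specified; the function names are likewise hypothetical.

```python
from math import sqrt
from typing import Sequence

def frame_correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation between two frames given as flat sequences of pixel values."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Correlation is undefined for flat frames; treat identical frames as fully correlated.
        return 1.0 if list(a) == list(b) else 0.0
    return cov / sqrt(var_a * var_b)

def classify_freeze(corr: float) -> str:
    """Map a correlation value to the labels used in the text."""
    if corr > 0.99:
        return "very possible"
    if corr >= 0.98:
        return "possible"
    return "other"
```

Two identical consecutive frames yield a correlation above 0.99 and are therefore labelled as very possible frame-freezing, which is also the outcome for a genuine still scene; this is why visual verification remains necessary.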

3) System Statistics

Network packets are received and de-capsulated by end systems before P2P pieces are assembled for video decoding. A report engine is implemented in our ${\rm NextShare}^{TV}$ set-top boxes to accumulate and report both system statistics and P2P statistics. The system statistics give details reflecting the software and hardware status of set-top boxes, the handling of incoming packets, and video decoding. Examples of system statistics are box status (playing, standby, off), total number of decoded video frames, minimum, maximum and average bit-rate, buffer overflow and underflow events, and decoder syntax errors. Some system statistics, such as those that indicate errors reported by the decoder, are highly correlated with distortions that manifest in video frames. Other metrics, such as the buffer level report, provide insights into the root causes of decoder errors.

The report of system statistics to a dedicated statistics server is triggered periodically (e.g., every 15 min) and also by a number of pre-defined events such as the start and stop of media playback. All system statistics are parsed and stored in an archive database for service analysis. A visualization interface is designed so that investigators can extract statistics based on specific criteria (e.g., device id, IP address, time range, etc.). Fig. 14(a) shows a number of system statistics reported by a particular set-top box. In order to better support service diagnosis, a more intuitive timeline chart interface illustrates a few selected system statistics. Fig. 14(b) gives an example of statistics analysis to identify the causes of visual distortions. At around 10:40 on August 18, 2011, nearly 20 decoder syntax errors were identified. The number of syntax errors increased to 41 at about 11:40, which caused over 200 display queue lock events. This is an indication of severe blockiness effects or frame-freezing appearing on the user's screen. The set-top box was turned off afterwards, which reset both metrics to zero, as captured just before 12:00. Timeline analysis such as the one shown in Fig. 14(b) is particularly valuable for establishing the correlations between metrics captured in different layers of an IPTV system.

Fig. 14. End system statistics archive. (a) Table view. (b) Chart view.

4) P2P Statistics

Timely and orderly reception of P2P pieces is essential to the smooth playback of audio-visual content. Requested pieces may arrive out of order or fail to arrive on time due to network impairments or lack of content availability in P2P networks. In the ${\rm NextShare}^{Core}$, the request and receive times of all pieces are internally registered and analyzed to derive four metrics for service evaluation. Late and drop are defined according to a prioritized piece download range which shifts with media play time. A piece that arrives beyond the download range is marked as late. Although late pieces may not be useful for video decoding on the same machine, they can still be distributed to other peers. Pieces that fail to arrive within a pre-defined threshold are considered lost and registered as drop.
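One possible reading of the late and drop definitions is sketched below. The parameter names, the 10 s drop threshold and the representation of the download range by the time at which a piece leaves it are assumptions introduced for this example; they are not taken from the ${\rm NextShare}^{Core}$ implementation.

```python
from typing import Optional

def classify_piece(request_s: float,
                   arrival_s: Optional[float],
                   range_exit_s: float,
                   drop_after_s: float = 10.0) -> str:
    """Label a piece as 'on_time', 'late' or 'drop'.

    request_s:    time at which the piece was requested.
    arrival_s:    time at which the piece arrived, or None if it never arrived.
    range_exit_s: time at which the piece leaves the prioritized download range
                  (assumed representation of the shifting range).
    drop_after_s: waiting time after the request beyond which the piece counts as lost
                  (hypothetical threshold).
    """
    if arrival_s is None or arrival_s - request_s > drop_after_s:
        return "drop"
    if arrival_s > range_exit_s:
        return "late"
    return "on_time"
```

Under this reading, a late piece is one that did arrive, but only after the playback position had moved past the range in which it was needed, so it can still be shared with other peers.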

Low availability of pieces can eventually cause the video buffer to starve, leading to video distortions such as frame-freezing. The stall metric gives an estimate of playback stalls by analyzing the pieces in the egress queue for the decoder. When the video buffer is effectively empty and incoming data cannot keep up with content playback, an underrun message is triggered. Fig. 15 shows the results of an experiment where piece availability in a network suddenly deteriorates.

Fig. 15. P2P statistics archive.

5) Network Statistics

Network impairments such as packet delay and loss are the main causes of detrimental quality degradation in audio-visual content distribution networks. Meanwhile, issues raised in end systems (e.g., buffer overflow) and in the P2P layer (e.g., ineffective exchange of pieces) can both cause abnormal behaviors in packet networks (e.g., a TCP reset). The objective of network-layer analysis is to inspect and analyze key network-layer metrics in distribution networks to better identify the root cause of service interruptions as experienced by end users. In order to measure the TCP-based P2P piece distribution, several TCP transaction capture tools (e.g., tcpdump) and analysis tools (e.g., Wireshark) are deployed at a network interface (e.g., an access aggregation point) adjacent to the ${\rm NextShare}^{TV}$ box. Fig. 16 shows a number of network statistics captured, including TCP retransmission, TCP out of order, TCP zero window, TCP lost segment, TCP reset, and TCP congestion window reduced. All metrics are defined as the number of events detected within a second.
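Deriving per-second metrics from a list of detected events amounts to a simple bucketing step, sketched below. The input format, a sequence of (timestamp, event type) pairs, and the event labels are assumptions for illustration; they do not correspond to the output format of any particular capture tool.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

def events_per_second(events: Iterable[Tuple[float, str]]) -> Dict[Tuple[int, str], int]:
    """Count events of each type within each one-second interval.

    events: (timestamp in seconds, event type) pairs with non-negative timestamps.
    """
    counts: Counter = Counter()
    for ts, kind in events:
        counts[(int(ts), kind)] += 1   # int() truncates to the start of the second
    return dict(counts)
```

The resulting counts can be plotted directly as time series of the kind shown for the network statistics.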

Fig. 16. Network statistics.

In order to test system performance under different network conditions, various types of controlled transmission impairments can also be emulated using either traffic control with Netem or radio signal management of wireless mesh boxes within our wireless P2P-based distribution services.
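As a hedged illustration of the traffic control option, the helper below assembles a `tc` command line that attaches a Netem queueing discipline with a fixed delay and a random loss rate. The interface name and impairment values are hypothetical, the helper only builds the argument list without executing it, and the settings actually used in our experiments are not described here.

```python
from typing import List, Optional

def netem_command(interface: str,
                  delay_ms: Optional[float] = None,
                  loss_pct: Optional[float] = None) -> List[str]:
    """Build a 'tc qdisc add ... netem' argument list (not executed here)."""
    cmd = ["tc", "qdisc", "add", "dev", interface, "root", "netem"]
    if delay_ms is not None:
        cmd += ["delay", f"{delay_ms:g}ms"]   # fixed additional delay
    if loss_pct is not None:
        cmd += ["loss", f"{loss_pct:g}%"]     # random packet loss percentage
    return cmd
```

For example, `netem_command("eth0", delay_ms=100, loss_pct=1.0)` yields the arguments of `tc qdisc add dev eth0 root netem delay 100ms loss 1%`, which would normally be run with administrative privileges on the gateway under test.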

D. Use Case

Fig. 17 shows an ongoing evaluation session which captures statistics in all service layers. A ${\rm NextShare}^{TV}$ set-top box is connected to the campus network via a gateway (two wireless mesh network devices). Network statistics are captured at the gateway using packet inspection and transmission analysis tools. P2P pieces are assembled and decoded by the ${\rm NextShare}^{TV}$ box, where P2P statistics and system statistics are acquired and posted to the statistics server. Digital video signals from the ${\rm NextShare}^{TV}$ box are displayed on an LCD TV unit and also fed to a video capture card on an evaluation workstation, where a Matlab program conducts objective quality assessment. An investigator uses a remote controller to submit subjective feedback. The test results are analyzed using a master QoE analysis console.

Fig. 17. Ongoing in-service quality evaluation.

While the table and chart views of individual measurements are used for analyzing a specific service metric (such as the buffer level of a set-top box), the master console is designed to identify the root cause of service interruptions. The console integrates multiple visualization charts and synchronizes all relevant charts when a time range is specified. Fig. 18 shows how the master console is used to identify the causes of a number of subjective feedback reports received around 11:34. The system and frame-freezing detection statistics both report severe errors (“Display queue locked by lack of picture” and “Very possible frame-freezing”) that resonate with users' responses. This is also verified by visual check (as shown in Fig. 13) and survey recreation of user feedback [as shown in Fig. 10(a)].

Fig. 18. Master console showing monitoring results (excerpt).

Tracing back on the timeline of network statistics from the points of subjective responses, a large number of TCP retransmission and TCP out of order events are recognized around 11:32 and 11:33 (Fig. 18). For this specific evaluation test, the cause of the frame-freezing distortions perceived by users is believed to be the impulses of packet loss and jitter that emerged in the distribution networks.

The use case provides a simple example which demonstrates the necessity and effectiveness of a multimodal evaluation framework for the assessment of a complex audio-visual service such as the P2P-based IPTV services of the Lancaster Living Lab. Using different combinations of measurement tools and metrics, comprehensive analysis for service diagnosis and research activities has been made possible.

E. Summary

This section introduces a multimodal QoE evaluation framework that is specifically designed for quality assessment of the IPTV service in the Lancaster Living Lab. A number of subjective and objective evaluation methods are defined and implemented within this framework. Our test demonstrates the advantages of comprehensive measurement for service evaluation and benchmarking when collaborations between service providers, device manufacturers, and network carriers are available. Although the metrics visualized by the framework are relatively complicated, a screened view can be implemented so that end users can carry out self-diagnosis and receive recommendations should service interruptions be experienced. Future work will focus on further improving the efficiency of measurement components such as objective video analysis and packet inspection so that they can be widely distributed in service networks. Moreover, the service metrics derived by the evaluation framework could also be utilized for relevant content and service management mechanisms such as scalable video coding and unequal packet protection.

SECTION VI

CONCLUSION

Traditional linear TV services are being challenged by the trend of converged IPTV services where high quality audio-visual content can be distributed to heterogeneous end systems via a variety of IP-based networks. Meanwhile, recent developments in P2P technologies have encouraged energy efficient and low-cost delivery for commercial and user-generated multimedia content. This paper introduces the design, deployment and QoE measurement of a P2P-based IPTV service that has been deployed to both set-top box and web-based platforms in the Lancaster Living Lab with thousands of users. Using this open service architecture, researchers have been able to investigate and analyze emerging operational and measurement technologies for future audio-visual services in a Living Lab environment. A number of use cases and experiments are also presented to demonstrate how challenges related to service design and quality assurance are tackled. Future work will focus on further integration of mobile platforms, QoE-aware service management, and applications for social networking and community development.

Footnotes

This work was supported by the European Commission within the FP7 Project: P2P-Next and FIRM project (Framework for Innovation and Research in MediaCityUK). The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Oscar Bonastre.

M. Mu, J. Ishmael, W. Knowles, M. Rouncefield, and N. Race are with the School of Computing and Communications, Lancaster University, Lancaster LA1 4WA, U.K. (e-mail: m.mu@lancaster.ac.uk; j.ishmael@lancaster.ac.uk; w.knowles@lancaster.ac.uk; m.rouncefield@lancaster.ac.uk; n.race@lancaster.ac.uk).

M. Stuart is with Pioneer Digital Design, Slough SL2 4QP, U.K. (e-mail: mark@pddresearch.com).

G. Wright is with BBC R&D, London W12 7SB, U.K. (e-mail: george.wright@bbc.co.uk).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Authors

Mu Mu

Mu Mu received the M.Sc. degree from Darmstadt University of Technology, Darmstadt, Germany, in 2006 and the Ph.D. degree in computer science from Lancaster University, Lancaster, U.K.

In 2007, he joined the School of Computing and Communications at Lancaster University, where he leads the research on the topic of quality of experience (QoE). As a Research Associate at Lancaster University, he has been a named researcher of several European and U.K. projects. His publications have appeared in top-rank conferences and journals. His current research interests include user experience in multimedia services and social media technologies.

Johnathan Ishmael

Johnathan Ishmael is a Software Engineer and technologist at the British Broadcasting Corporation (BBC), working in future media and mobile platforms. He is responsible for the development of public facing mobile websites and services. His prior research focus has revolved around the delivery of next-generation Internet access to rural regions and emerging social media technologies. He is also undertaking rapid prototyping and back-office system design and administration. Before joining the BBC, he was a Research Associate at the School of Computing and Communications, Lancaster University, Lancaster, U.K.

William Knowles

William Knowles, biography and photograph not available at time of publication.

Mark Rouncefield

Mark Rouncefield received the B.A. degree in social studies from Exeter University, Exeter, U.K.; the M.A. degree in education from Durham University, Durham, U.K.; and the M.A. and Ph.D. degrees in sociology from Lancaster University, Lancaster, U.K.

He is an ethnographer, sociologist, recent holder of a Microsoft European Research Fellowship for work on social interaction and mundane technologies, and currently a Senior Research Fellow in the School of Computing and Communications, Lancaster University. His research interests embrace various aspects of the empirical study of work, organization, human factors, and interactive computer systems design. He is the author of six books and over 150 conference and journal papers. His most recent book is Doing Design Ethnography (New York: Springer, 2012).

Nicholas Race

Nicholas Race received the Ph.D. degree from Lancaster University, Lancaster, U.K., in 2000, with his thesis examining support for video distribution through multimedia caching.

He is a Senior Lecturer in the School of Computing and Communications at Lancaster University. His research interests are primarily focused around two key areas: wireless mesh networks and content distribution. He has built a Wireless Mesh Network within the village of Wray in the North West of England, which provides a Living Lab environment for a range of research activities focused on media distribution.

Dr. Race has won the Lancaster University Community Prize and the Queen's Anniversary Prize for his work in connecting communities using wireless technology.

Mark Stuart

Mark Stuart received the B.Sc. (Hons) degree in computer science from Warwick University, Coventry, U.K., and the M.Sc. degree in intelligent systems from Brunel University, Uxbridge, U.K.

He is an R&D Manager at Pioneer Digital Design and Technical Director of the P2P-Next project. He led the hardware and software teams that produced the P2P Media Receiver NextShareTV. He also represents Pioneer in Europe within standardization bodies such as EBU, IETF, DVB, and DTG. His research interests relate to P2P content networks and distributed systems, together with their integration and optimization within low-cost consumer electronics devices.

George Wright

George Wright is head of the Internet Research & Future Services Team. He leads a cross platform, multi-discipline team on a number of future facing products, in collaboration with colleagues inside and outside BBC Future Media and Technology. Before joining BBC Research and Development, he worked in Interactive TV Development in BBC Red Button, leading production on services for the Cable and Freeview digital platforms. Before this, he made websites for pre-school children and their parents, reported about pop music and technology for the BBC, and was a professional musician.
