IEEE Conference Publications
| Quick Abstract | PDF (225 KB)
In this paper, we describe disparity, a tool that does parallel, scalable anomaly detection for clusters. Disparity uses basic statistical methods and scalable reduction operations to perform data reduction on client nodes and uses these results to locate node anomalies. We discuss the implementation of disparity and present results of its use on a SiCortex SC5832 system. View full abstract»
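The idea of reducing data on each node and then combining the reduced results can be illustrated with a small sketch. This is not the disparity tool's implementation: the function names, the choice of (count, sum, sum of squares) as the per-node summary, and the three-sigma threshold are all assumptions made for illustration, and every node is assumed to report at least one sample.

```python
import math
from functools import reduce


def local_summary(samples):
    """Per-node data reduction (hypothetical): count, sum, sum of squares."""
    return (len(samples), sum(samples), sum(x * x for x in samples))


def combine(a, b):
    """Associative merge of two summaries, so it can run as a tree reduction."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def find_anomalies(node_samples, threshold=3.0):
    """Return nodes whose mean deviates from the global mean by more than
    `threshold` global standard deviations (an assumed, simple criterion)."""
    summaries = {node: local_summary(s) for node, s in node_samples.items()}
    count, total, total_sq = reduce(combine, summaries.values())
    mean = total / count
    # Clamp at zero to guard against tiny negative values from rounding.
    std = math.sqrt(max(total_sq / count - mean * mean, 0.0))
    if std == 0.0:
        return []
    flagged = []
    for node, (n, s, _) in summaries.items():
        if abs(s / n - mean) / std > threshold:
            flagged.append(node)
    return sorted(flagged)
```

Because `combine` is associative, the same merge could be applied pairwise up a reduction tree rather than in a single loop, which is what makes this kind of summary attractive at cluster scale.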
| Quick Abstract | PDF (5903 KB)
In this paper, a novel framework for scalable multi-view video coding is described. A well-known wavelet-based scalable coding scheme for single-view video sequences has been adopted and extended to match the specific needs of scalable multi-view video coding. Motion compensated temporal filtering (MCTF) is applied to each video sequence of each camera. The use of a wavelet lifting structure guarantees perfect invertibility of this step, and as a consequence of its open-loop architecture, SNR and temporal scalability are attained. Correlations between the temporal subbands of adjacent cameras are reduced by a novel disparity compensated view filtering (DCVF) method, which is also lifting-based and open-loop to enable view scalability. Spatial scalability and entropy coding are achieved by the JPEG2000 spatial wavelet transform and EBCOT coding, respectively. Rate allocation along the temporal-view-filtered subbands is done by means of an RD-optimal algorithm. Experimental results show the high scaling capability in terms of SNR, temporal and view scalability. View full abstract»
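The claim that a lifting structure guarantees perfect invertibility can be seen in a minimal sketch. The example below is an integer Haar lifting step on a one-dimensional signal with no motion or disparity compensation, so it is a simplification of MCTF/DCVF and not the paper's codec; the function names are hypothetical.

```python
def haar_lift_forward(signal):
    """One level of integer Haar lifting: returns (low, high) subbands."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    even, odd = signal[0::2], signal[1::2]
    # Predict step: the high band is the residual of odd samples.
    high = [o - e for e, o in zip(even, odd)]
    # Update step: the low band approximates the pairwise average.
    low = [e + (h >> 1) for e, h in zip(even, high)]
    return low, high


def haar_lift_inverse(low, high):
    """Undo the update step, then the predict step, in reverse order."""
    even = [l - (h >> 1) for l, h in zip(low, high)]
    odd = [h + e for e, h in zip(even, high)]
    out = []
    for e, o in zip(even, odd):
        out.extend((e, o))
    return out
```

The inverse simply subtracts what the forward pass added, so reconstruction is exact even though the update step rounds. In a compensated scheme the predict and update operators would be replaced by motion- or disparity-compensated versions, and the same reversal argument applies.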
| Quick Abstract | PDF (1536 KB)
This paper addresses the evaluation of a new disparity map computing algorithm characterized by a novel spurious removal strategy. Using this algorithm we eliminate a high percentage of wrong values with a low performance penalty. On the test images, the percentage of incorrect values was reduced by 65% and 85%. This algorithm has been designed for scalable architectures with massively parallel processing elements. It works line by line with low memory requirements. To evaluate the scalability of the algorithm we have implemented it on two different architectures, a GPU and a CPU. The GPU was an NVIDIA GTX260 graphics card using CUDA. The CPU was an Intel Core i7 920 with 4 cores and hyperthreading (8 virtual cores). Our implementation of the algorithm on the CPU was capable of using an arbitrary number of threads. The tests presented in this paper show that the CPU performance scales from 1 to 4 threads with a factor of 0.66, while the speedup when comparing the GPU with the single-threaded CPU solution is between 38 and 47 times. From the power consumption point of view, the GPU is more than ten times more efficient than the CPU. View full abstract»
| Quick Abstract | PDF (2068 KB)
This paper describes a new architecture and the corresponding implementation of a stereo vision system that covers the entire stereo vision process including noise reduction, rectification, disparity estimation, and visualization. Dense disparity estimation is performed using the non-parametric rank transform and semi-global matching (SGM), which is among the top performing stereo matching methods and outperforms locally-based methods in terms of quality of disparity maps and robustness under difficult imaging conditions. Stream-based processing of the SGM despite its non-scan-aligned, complex data dependencies is achieved by a scalable, systolic-array-based architecture. This architecture fulfills the demands of real-world applications regarding frame rate, depth resolution and low resource usage. The architecture is based on a novel two-dimensional parallelization concept for the SGM. An FPGA implementation on a Xilinx Virtex-5 generates disparity maps of VGA images (640×480 pixel) with a 128 pixel disparity range under real-time conditions (30 fps) at a clock frequency as low as 39 MHz. View full abstract»
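For readers unfamiliar with the non-parametric rank transform mentioned above, a minimal software sketch follows. It uses the standard definition (each pixel is replaced by the number of neighbours in a window that are darker than it); the window radius, the border handling (out-of-bounds neighbours are ignored), and the function name are assumptions, and this is unrelated to the paper's FPGA design.

```python
def rank_transform(image, radius=1):
    """Replace each pixel by the count of in-window neighbours that are
    strictly less than it. `image` is a list of equal-length rows."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            center = image[y][x]
            count = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    # Neighbours outside the image are skipped (assumption).
                    if 0 <= ny < height and 0 <= nx < width:
                        if image[ny][nx] < center:
                            count += 1
            out[y][x] = count
    return out
```

Since only the ordering of intensities matters, the output is unchanged by any brightness offset or gain applied to the whole image, which is one reason the transform is used ahead of matching under difficult imaging conditions.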
IEEE Journals & Magazines
| Quick Abstract | PDF (1157 KB) | HTML
This paper presents a novel framework to achieve scalable multiview image compression and view synthesis. The open-loop wavelet-lifting scheme for geometric filtering has been exploited to achieve signal-to-noise ratio scalability and view-type scalability (mono, stereo, or multiview). Spatial scalability is achieved by employing in-band prediction which removes correlations among subbands (level-by-level) via shift-invariant references obtained by overcomplete discrete wavelet transforms. We propose a novel in-band disparity compensated view filtering approach, akin to motion compensated temporal filtering, for achieving a scalable multiview codec. In our codec, hybrid prediction is proposed to deal with occlusions, and a novel cost function in dynamic programming (DP) for disparity estimation is introduced to improve view synthesis quality. Experiments show comparable results at full resolution and significant improvements at coarser resolutions, compared to a conventional spatial prediction scheme. View synthesis efficiency is extensively improved by utilizing disparity estimation from the proposed DP approach. View full abstract»
| Quick Abstract | PDF (4202 KB)
The paper proposes a scalable video coding scheme for a vision-based subway platform monitoring system. The proposed scheme is designed to provide flexible video service across various display environments, such as the displays of total traffic control, station employees, and the train driver. The proposed scheme uses an enhanced compatible coding scheme to predict P- and B-type pictures in the right-view sequence. It predicts the matching block by interpolating both motion- and disparity-predicted macroblocks. To provide flexible stereo video service, we define both temporally and spatially scalable layers for each eye's view by using the concept of spatio-temporal scalability. The experimental results show the efficiency of the proposed coding scheme by comparison with already known methods, and the advantages of disparity estimation in terms of scalability overhead. According to the experimental results, we expect that the proposed functionalities will play a key role in establishing a highly flexible stereo video service for ubiquitous computing environments where devices and network connections are heterogeneous. View full abstract»
| Quick Abstract | PDF (3684 KB)
In this paper, we propose two wavelet-based frameworks which allow fully scalable multi-view video coding. Using a 4-D wavelet transform, both schemes generate a bitstream that can be truncated to achieve a temporally, view-directionally, and/or spatially downscaled representation of the coded multi-view video sequence. Well-known wavelet-based scalable coding schemes for single-view video sequences have been adopted and extended to match the specific needs of scalable multi-view video coding. Motion compensated temporal filtering (MCTF) is applied to each video sequence of each camera to exploit temporal correlation, and inter-view dependencies are exploited with disparity compensated view filtering (DCVF). A spatial wavelet transform is utilized either both before and after the temporal-view-directional decomposition (2D+T+V+2D scheme) or only after the temporal-view-directional decomposition (T+V+2D scheme) for spatial decorrelation. The influence of the two different approaches on spatial scalability is shown in this paper, as well as the superior coding efficiency of both codecs compared with simulcast coding. View full abstract»
| Quick Abstract | PDF (376 KB)
A number of lifting-based video coding schemes have recently been proposed for scalable video coding. We present a novel multi-view image codec based on a wavelet lifting scheme. The proposed lifting scheme with disparity compensated channel filtering is very efficient in terms of compression performance, memory requirements and implementation. We propose a number of enhancements to the basic scheme, such as hybrid prediction, adaptive weighting in the update step, and overlapped block disparity compensation, which yield significant improvements in rate-distortion performance. Experimental results show image quality gains of up to 1.5 dB and 1.2 dB against well-established methods such as block-matching Haar and 5/3 wavelet lifting, respectively. View full abstract»
| Quick Abstract | PDF (1158 KB)
Research on error resilience in multi-view coding is currently receiving considerable interest. While there is a multitude of literature concerning error recovery in 2D video, due to the statistical difference in motion compensation among temporal frames and disparity compensation among view points, such methods are inadequate to cater to the requirements of multiview video transmission. This paper addresses the above issue by transmission of redundant disparity vectors for error recovery purposes. The proposed system, which is implemented using the Joint Scalable Video Model (JSVM) codec and tested using a simulated Internet Protocol (IP) packet network environment, can be used along with a suitable error concealment scheme to provide robust multi-view video transmission. The experimental results suggest that the proposed algorithm experiences a slight degradation of quality in error free environments due to the inclusion of redundant data. However, it improves the reconstructed picture quality significantly in error prone environments, specifically for Packet Loss Rates (PLRs) greater than 7%. View full abstract»
| Quick Abstract | PDF (4249 KB)
In this paper, we propose a stereoscopic video coding scheme for a subway accident monitoring system. The proposed scheme is designed to provide flexible video service across various display environments, such as the displays of total traffic control, station employees, and the train driver. The proposed scheme uses an enhanced compatible coding scheme to predict P- and B-type pictures in the right-view sequence. It predicts the matching block by interpolating both motion- and disparity-predicted macroblocks. To provide flexible stereo video service, we define both temporally and spatially scalable layers for each eye's view by using the concept of spatio-temporal scalability. The experimental results show the efficiency of the proposed coding scheme by comparison with already known methods, and the advantages of disparity estimation in terms of scalability overhead. View full abstract»
| Quick Abstract | PDF (952 KB)
This paper describes a novel framework for object extraction from images utilizing multiple cameras. Focused regions in images and disparities of point correspondences among multiple images are 3-D clues for the extraction. We examine the extraction of focused objects from images by these automatically acquired clues. Edges in images captured by the cameras are detected, and disparities of the edges in focused regions become the clues, called disparity keys. A focused object is extracted from an image as a set of edge intervals with the disparity keys. The falsely extracted parts can be detected by discontinuous contours of the object and recovered by contour morphing. Experimental results under different conditions demonstrate the effectiveness and robustness of the proposed method. The method can be applied to image synthesis methods, such as synthetic/natural hybrid coding (SNHC), and to object-scalable coding in MPEG-4. View full abstract»
| Quick Abstract | PDF (184 KB)
We present a VLSI architecture and implementation for a highly parallel trellis-based stereo matching algorithm that has been previously presented by the authors. The algorithm obtains disparity (depth) information from a pair of images and has a complexity of O(N²) for N-pixel scan lines, and O(N) operations can be performed in parallel. The architecture consists of a linear array of identical processing elements with only nearest-neighbor communication. The host provides pixel data to each end of the array during the forward iterations and reads the computed disparity during the backward iterations. The design is highly scalable. A 10-processor array that can handle 340-pixel scan lines has been fabricated using a 0.65 μm CMOS process and has achieved 8 Mpixels/s throughput. View full abstract»
| Quick Abstract | PDF (230 KB)
An approach to scalable multi-view video coding with joint best basis wavelet packets is examined in this paper. A 4-D wavelet transform is used to decorrelate the multi-view video data temporally, view-directionally, and spatially for efficient scalable compression. Motion compensated temporal filtering (MCTF) is used for temporal, and disparity compensated view filtering (DCVF) for view- directional decomposition. Adaptive wavelet packets as a generalized wavelet decomposition are presented for spatial decomposition. Two algorithms to find the best basis wavelet packets are evaluated and compared with classical dyadic wavelet transform: a low complexity entropy based best basis search and a search algorithm in a rate-distortion framework. In both cases, the joint best basis is determined for a group of frames rather than for each frame individually. Therefore, the rate to spend for the tree description is minimal while advantage is taken of the similarity of frames within a temporal-view-directional subband. View full abstract»
MIT Press eBook Chapters
| Quick Abstract | Full Text: PDF
As we enter the "decade of data," the disparity between the vast amount of data storage capacity (measurable in terabytes and petabytes) and the bandwidth available for accessing it has created an input/output bottleneck that is proving to be a major constraint on the effective use of scientific data for research. Scalable Input/Output is a summary of the major research results of the Scalable I/O Initiative, launched by Paul Messina, then Director of the Center for Advanced Computing Research at the California Institute of Technology, to explore software and algorithmic solutions to the I/O imbalance. The contributors explore techniques for I/O optimization, including: I/O characterization to understand application and system I/O patterns; system checkpointing strategies; collective I/O and parallel database support for scientific applications; parallel I/O libraries and strategies for file striping, prefetching, and write behind; compilation strategies for out-of-core data access; scheduling and shared virtual memory alternatives; network support for low-latency data transfer; and parallel I/O application programming interfaces. View full abstract»
| Quick Abstract | PDF (872 KB)
The proliferation of mobile computers and wireless networks requires the design of future distributed real-time applications to recognize and deal with the significant asymmetry between downstream and upstream communication capacities, and the significant disparity between server and client storage capacities. Zdonik et al. (1994) have proposed the use of broadcast disks as a scalable mechanism to deal with this problem. In this paper, we propose a new broadcast disks protocol, based on our Adaptive Information Dispersal Algorithm (AIDA). Our protocol is different from previous ones in that it improves both timeliness and fault tolerance, while allowing for a finer control of multiplexing of prioritized data. We start with a general introduction to broadcast disks. Next, we propose broadcast disk organizations that are suitable for real-time applications. Next, we present AIDA and show its fault-tolerance properties. We conclude with the description and analysis of AIDA-based broadcast disk organizations that achieve both timeliness and fault-tolerance, while preserving downstream communication capacity View full abstract»
| Quick Abstract | PDF (300 KB) | HTML
Fast-growing Internet applications need accurate, high-performance, and scalable packet classification in traffic control systems. Although there are several designs of packet classification implemented on heterogeneous hardware platforms, accurate and ultra-high-speed packet classification remains elementary. The disparity arises because traditional packet classification algorithms, with their imprecise port-based method and packet processing, have unacceptable memory access latency. This paper discusses an efficient hybrid packet classification in gigabit traffic control systems using a second-generation programmable network processor. Firstly, we address the problem of inaccurate packet classification and analyze the payload of applications. Secondly, we present packet classification using not only the packet header but also the first 64 bits of the payload. Finally, we describe the software pipeline architecture and hardware design for our approach with the network processor. Compared with traditional solutions, the hybrid packet classification has 93% accuracy and speeds of up to 7.6 Gbps in a real network environment. View full abstract»
| Quick Abstract | PDF (219 KB)
In this paper, we develop a novel multi-view video coding scheme that can efficiently realize region of interest (ROI) support in scalable multi-view video coding (SMVC). The key idea of the algorithm is that an ROI can be treated as a whole picture while the background can be regarded as another one. Motion compensated temporal filtering (MCTF) and disparity compensated view filtering (DCVF) are then applied separately to each of them in the proper order. Experimental results show that the reconstructed quality of the ROI can be improved at low bit rates compared to SMVC and content-based SMVC (CSMVC). View full abstract»
| Quick Abstract | PDF (842 KB)
Both temporal prediction and inter-view prediction are employed to improve coding efficiency in multiview video coding. Hierarchical B pictures are usually used as the basic structure for temporal prediction. The inter-view prediction in each temporal hierarchy level contributes a different improvement to the overall coding efficiency. We propose a scalable prediction structure in which inter-view prediction is disabled if the picture redundancy can be almost fully exploited by temporal prediction and intra prediction. In this way, the time-consuming computation of disparity estimation can be saved. Experimental results show that lower encoding complexity, a smaller decoded picture buffer size, and better random access ability can be achieved with a slight loss in coding gain. View full abstract»
| Quick Abstract | PDF (249 KB)
A field programmable gate array (FPGA), when used as a platform for implementing special-purpose computing architectures, offers the potential for increased functional parallelism over the alternative approach of software running on a general-purpose microprocessor. However, the increasing disparity between the logic speed and density of a state-of-the-art FPGA versus a state-of-the-art microprocessor has already begun to negate the benefits of this increased functional parallelism for all but a limited set of applications. The authors believe that the solution to this problem is to construct distributed multi-FPGA architectures to aggregate the parallelism of multiple FPGAs. Such a system would require a high-capacity interconnect, and thus arranging the FPGAs onto a scalable direct network was proposed. This strategy requires each FPGA to contain an integrated router that must share the logic fabric with the application logic. This paper proposes a novel routing technique that can significantly boost such a network's capacity and be implemented in compact and efficient routers. The authors begin with an existing lightweight routing algorithm and augment it with a novel technique called predictive load balancing, where routers collect information about the blocking behavior on their output ports and use this information when making routing decisions. View full abstract»
| Quick Abstract | PDF (664 KB)
Commodity hardware and software are growing increasingly more complex, with advances such as chip heterogeneity and specialization, deeper memory hierarchies, fine-grained power management, and most importantly, chip parallelism. Similarly, workloads are growing more concurrent and diverse. With this new complexity in hardware and software, process scheduling in the operating system (OS) becomes more challenging. Nevertheless, most commodity OS schedulers are based on design principles that are 30 years old. This disparity may soon lead to significant performance degradation. Most significantly, parallel architectures such as multicore chips require more than scalable OSs: parallel programs require parallel-aware scheduling. This paper posits that imminent changes in hardware and software warrant reevaluating the scheduler's policies in the commodity OS. We discuss and demonstrate the main issues that the emerging parallel desktops are raising for the OS scheduler. We propose that a new approach to scheduling is required, applying and generalizing lessons from different domain-specific scheduling algorithms, and in particular, parallel job scheduling. Future architectures can also assist the OS by providing better information on process scheduling requirements. View full abstract»
| Quick Abstract | PDF (142 KB)
In this paper we present Energy Aware Random Asynchronous Wakeup (RAW-E), a novel cross-layer power management and routing protocol for heterogeneous wireless sensor and actor networks. RAW-E is an extension of our previously presented Random Asynchronous Wakeup (RAW), a power saving technique for sensor networks that has been shown to reduce energy consumption without significantly affecting the latency or connectivity of the network. RAW-E is a distributed, randomized algorithm in which nodes make local decisions on whether to sleep or to be active based on the energy level of their neighbors. The primary result of RAW-E is the reduction of energy disparity among sensor nodes. Therefore, while the energy reduction is spread uniformly among nodes, the life of network connectivity is prolonged. RAW-E scales with changes in network size, node type, node density and topology. RAW-E seamlessly accommodates such network changes, including the presence of actors in heterogeneous sensor networks. RAW-E takes advantage of actor nodes and uses their resources when possible, thus reducing the energy consumption of sensor nodes. The performance of our protocol remains very good even in large networks, and it scales with density. Previously we have shown by analysis and simulations that RAW improves communication latency and system lifetime compared to current schemes. Through simulation evaluations, we show that RAW-E adds to those features an improvement in energy consumption and an extension of connectivity life for heterogeneous sensor and actor networks. View full abstract»
| Quick Abstract | PDF (630 KB)
In this paper, we present a depth adjustment technique that controls the depth of a region of interest using scalable multiplexing in a 3D display. Autostereoscopic displays offer different 3D images depending on the viewing direction. In order to display various viewpoint images on the autostereoscopic display at the same time, we usually divide the space of the display panel spatially. This is called spatial multiplexing. However, spatial multiplexing degrades the resolution of each viewpoint image. Each viewpoint source is resized by a ratio inversely proportional to the number of views, due to the limited resolution of the 3D display. When the images are downscaled by a certain ratio and the downscaling ratio of the object of interest is bigger than that of the others, the object of interest is enlarged more than the other regions without any loss. Moreover, the enlarged object region covers the other regions that belong to the background of the image. In this case, the covered area can offer enough space to give the object binocular disparity without any interpolation of holes. It can also control the object's depth in the 3D scene. View full abstract»
| Quick Abstract | PDF (1471 KB)
In this paper we present a method for disparity map computing and its corresponding highly parallel hardware accelerator. Our solution considers a two-step processing algorithm. First, we compute a one-dimensional biased sum of absolute differences, and then a spurious removal technique is performed to eliminate wrong estimations. The hardware accelerator introduces a memory organization, an address generation scheme and data-path units that have scalable features for several resolutions, frame rates, silicon use, and power consumption instantiations. We have implemented a five-stage pipelined organization that operates at 174.5 MHz on a VIRTEX II PRO 2vp30fg676-7 FPGA device, carries out an equivalent of 9.074 GOPS and processes 142 frames per second of Common Intermediate Format (CIF). View full abstract»
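The first step of such a two-step method, matching along a single scan line with a sum of absolute differences (SAD), can be sketched in software as follows. The abstract does not define its bias term or its spurious removal step, so this sketch uses a plain, unbiased SAD and omits the second step entirely; the function name, window size, and border clamping are assumptions.

```python
def sad_disparity_row(left, right, max_disp, half_window=1):
    """For each pixel of a left scan line, pick the disparity d in
    [0, max_disp] that minimises a 1-D SAD against the right scan line."""
    width = len(left)
    disparities = [0] * width
    for x in range(width):
        best_cost, best_d = None, 0
        for d in range(max_disp + 1):
            cost = 0
            for k in range(-half_window, half_window + 1):
                # Clamp indices at the row borders (assumed border policy).
                xl = min(max(x + k, 0), width - 1)
                xr = min(max(x + k - d, 0), width - 1)
                cost += abs(left[xl] - right[xr])
            # Strict comparison keeps the smallest d on ties.
            if best_cost is None or cost < best_cost:
                best_cost, best_d = cost, d
        disparities[x] = best_d
    return disparities
```

Each row is processed independently and each pixel touches only a small window, which is the property that makes line-by-line designs of this kind easy to pipeline and to scale across resolutions.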
A not-for-profit organization, IEEE is the world's largest professional association for the advancement of technology. © Copyright 2013 IEEE - All rights reserved. Use of this web site signifies your agreement to the terms and conditions.