GPU-Based PostgreSQL Extensions for Scalable High-Throughput Pattern Matching | IEEE Conference Publication | IEEE Xplore

Abstract:

Numerous fields require large-scale pattern matching to achieve a variety of computational goals. Herein, we present novel graphics processing unit (GPU) extensions that facilitate high-throughput pattern matching in a PostgreSQL database. We have developed an extension framework that processes large pattern data sets in data blocks, using a stream processing design that yields global k-nearest neighbor matches. This framework was specifically designed to support pattern matching on the GPU from within the database environment. This approach avoids the need to store an entire data set on GPU hardware, which facilitates significant scale-up of pattern databases. It also provides enormous potential to incorporate or exploit auxiliary (meta)data as part of the pattern matching process, and to pipeline the results into traditional relational algebra expressions. By pipelining pattern matching results into a relational expression, the power of the database can be leveraged to build result sets based on various parameterized correlations between the query pattern(s) and the results. In this preliminary work, we have integrated GPU-based high-throughput p-norm metric functions into the database server. This allows one to design heterogeneous data processing techniques that combine large-scale content-based image retrieval (CBIR) with traditional data processing capabilities of the database, such as relational, spatial, or text search. We present timing characteristics for various pattern sizes and metric combinations, and address the balancing of database and GPU parameterization. Our feature vector data sets range from 18 to 85 GB in database table storage size, reaching 100 million 128-dimensional vectors. We are able to efficiently execute global top-k searches from within the database.
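The block-streaming idea in the abstract can be illustrated with a minimal CPU-only sketch. It is not the authors' GPU or PostgreSQL implementation: the function names (`p_norm_distance`, `streaming_top_k`) and the block layout of `(row_id, vector)` pairs are hypothetical, chosen only for illustration. The sketch shows how a bounded heap carried across blocks produces global k-nearest-neighbor results under a p-norm metric without ever holding the full data set in memory.

```python
import heapq
from typing import Iterable, List, Sequence, Tuple

Vector = Sequence[float]
# Assumed block layout (hypothetical): a sequence of (row id, feature vector) pairs.
Block = Sequence[Tuple[int, Vector]]


def p_norm_distance(a: Vector, b: Vector, p: float = 2.0) -> float:
    """Minkowski (p-norm) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def streaming_top_k(
    query: Vector, blocks: Iterable[Block], k: int, p: float = 2.0
) -> List[Tuple[float, int]]:
    """Return the k nearest (distance, row_id) pairs over all blocks.

    Only one block plus a k-element heap is live at any time, which
    mirrors the idea of not storing the entire data set on the device.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # heapq is a min-heap, so distances are negated: heap[0] is then
    # the worst (largest-distance) match currently kept.
    heap: List[Tuple[float, int]] = []
    for block in blocks:
        for row_id, vec in block:
            d = p_norm_distance(query, vec, p)
            if len(heap) < k:
                heapq.heappush(heap, (-d, row_id))
            elif -d > heap[0][0]:
                # Strictly closer than the current worst match: swap it in.
                heapq.heapreplace(heap, (-d, row_id))
    # Undo the negation and return matches in ascending distance order.
    return sorted((-neg_d, row_id) for neg_d, row_id in heap)


if __name__ == "__main__":
    demo_blocks = [
        [(1, [3.0, 4.0]), (2, [1.0, 0.0])],
        [(3, [0.0, 0.5]), (4, [10.0, 10.0])],
    ]
    print(streaming_top_k([0.0, 0.0], demo_blocks, k=2))
```

In a GPU setting, the inner loop over a block is the part that would plausibly be computed in parallel on the device, with only the running top-k state merged between blocks; that division of work is an assumption of this sketch rather than a detail stated in the abstract.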
Date of Conference: 24-28 August 2014
Date Added to IEEE Xplore: 06 December 2014
Electronic ISBN: 978-1-4799-5209-0
Print ISSN: 1051-4651
Conference Location: Stockholm, Sweden
