We propose the PRDC (Pattern Representation based on Data Compression) scheme for media data analysis. PRDC is composed of two parts: an encoder that translates input data into text, and a set of text compressors that generates a compression-ratio vector (CV). The CV is used as a feature of the input data. By preparing a set of media-specific encoders, PRDC becomes widely applicable. Both analysis tasks, categorization (class formation) and recognition (classification), can be realized using CVs. After a mathematical discussion on the realizability of PRDC, the wide applicability of this scheme is demonstrated through the automatic categorization and/or recognition of music, voices, genomes, handwritten sketches, and color images.
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence (Volume: 24, Issue: 5)
Date of Publication: May 2002
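
The idea in the abstract, a set of text compressors that turns one encoded input into a compression-ratio vector, can be illustrated with a minimal Python sketch. It is not the authors' implementation: as an assumption, each "text compressor" is modeled as zlib primed with a different preset dictionary, and the media-specific encoder step is assumed to have already produced the byte strings. The names `compression_ratio`, `compression_vector`, and `nearest_label` are hypothetical, and the nearest-CV rule is only one simple way recognition could be done with CVs.

```python
import math
import zlib


def compression_ratio(text: bytes, dictionary: bytes) -> float:
    """Compressed size / original size, with zlib primed by `dictionary`.

    Assumption: a zlib preset dictionary stands in for one of the
    "text compressors" mentioned in the abstract.
    """
    if not text:
        raise ValueError("text must be non-empty")
    comp = zlib.compressobj(level=9, zdict=dictionary)
    compressed = comp.compress(text) + comp.flush()
    return len(compressed) / len(text)


def compression_vector(text: bytes, dictionaries: list[bytes]) -> list[float]:
    """One compression ratio per dictionary: the CV used as a feature."""
    return [compression_ratio(text, d) for d in dictionaries]


def nearest_label(cv: list[float], labelled_cvs: list[tuple[str, list[float]]]) -> str:
    """Hypothetical recognition rule: label of the closest CV (Euclidean)."""
    return min(labelled_cvs, key=lambda item: math.dist(cv, item[1]))[0]


# Two toy dictionaries, one per kind of (already encoded) text.
dictionaries = [
    b"the quick brown fox jumps over the lazy dog",
    b"ACGTTGCAAGCTTAGGCTAACCGGTTAACGTACGATCGAT",
]

cv_english = compression_vector(b"the lazy dog jumps over the quick brown fox", dictionaries)
cv_dna = compression_vector(b"GCTTAGGCTAACCGGTTAACGTACG", dictionaries)

labelled = [("english", cv_english), ("dna", cv_dna)]
query = compression_vector(b"the quick brown dog", dictionaries)
print(nearest_label(query, labelled))
```

Each input compresses best under the dictionary that resembles it, so its CV has a small component in that position; inputs of the same kind therefore end up with similar CVs. On inputs this short the fixed zlib header and checksum dominate the ratios, so the sketch shows only the mechanism, not realistic feature values.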