This paper analyzes the performance of content-based identification using binary fingerprints and constrained list-based decoding. We formulate content-based identification as a multiple hypothesis test and develop analytical models of its performance in terms of the probabilities of correct detection, miss, and false acceptance for a class of statistical models that captures the correlation between elements of either the content or its extracted features. Furthermore, to determine the impact of the block (codeword) length on identification accuracy, we analyze the exponents of these error probabilities. Finally, we develop a probabilistic model that justifies the accuracy of identification based on list decoding by evaluating the position of the queried entry on the output list. The obtained results make it possible to characterize the performance of traditional unique decoding, based on maximum likelihood, in situations where the decoder fails to produce the correct index. The paper also presents experimental results that confirm the theoretical findings.
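To make the setting concrete, the following Python sketch illustrates identification with binary fingerprints under simplifying assumptions that are not taken from the paper: fingerprint bits are independent and equiprobable (the paper considers correlated models), the query is degraded by independent bit flips, and the list decoder accepts entries within a fixed Hamming distance. All function names and parameters (`list_decode`, `max_distance`, `max_list_size`, and so on) are hypothetical and chosen for illustration only.

```python
import random


def hamming(a, b):
    """Number of positions in which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def make_database(num_items, length, rng):
    """Assumed model: independent, equiprobable fingerprint bits."""
    return [[rng.randint(0, 1) for _ in range(length)] for _ in range(num_items)]


def degrade(fingerprint, flip_prob, rng):
    """Assumed distortion: flip each bit independently with probability flip_prob."""
    return [bit ^ 1 if rng.random() < flip_prob else bit for bit in fingerprint]


def list_decode(query, database, max_distance, max_list_size):
    """Return indices of entries within max_distance of the query,
    ordered by increasing distance (ties broken by index) and
    truncated to max_list_size. An empty list means rejection."""
    scored = [(hamming(query, fp), idx) for idx, fp in enumerate(database)]
    accepted = sorted(pair for pair in scored if pair[0] <= max_distance)
    return [idx for _, idx in accepted[:max_list_size]]


def unique_decode(query, database):
    """Minimum-distance decoding, which coincides with maximum likelihood
    under the assumed bit-flip model when flip_prob < 0.5 and all
    entries are equally likely."""
    return min(range(len(database)), key=lambda i: hamming(query, database[i]))


def position_on_list(true_index, decoded_list):
    """1-based position of the queried entry on the output list, or None."""
    if true_index in decoded_list:
        return decoded_list.index(true_index) + 1
    return None


# Illustrative run with arbitrary parameters.
rng = random.Random(0)
database = make_database(num_items=100, length=64, rng=rng)
true_index = 17
query = degrade(database[true_index], flip_prob=0.1, rng=rng)
output_list = list_decode(query, database, max_distance=24, max_list_size=5)
position = position_on_list(true_index, output_list)
```

In this sketch, a miss corresponds to the true index being absent from the output list, a false acceptance to a non-empty list returned for a query unrelated to any database entry, and a failure of unique decoding to `unique_decode` returning an index whose list position is not first for the true entry.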