Detection of design pattern occurrences is part of several solutions to software engineering problems, and high accuracy of detection is important to help solve the actual problems. The improvement in accuracy of design pattern occurrence detection requires some way of evaluating various approaches. Currently, there are several different methods used in the community to evaluate accuracy. We show that these differences may greatly influence the accuracy results, which makes it nearly impossible to compare the quality of different techniques. We propose a benchmark suite to improve the situation and a community effort to contribute to, and evolve, the benchmark suite. Also, we propose fine-grained metrics assessing the accuracy of various approaches in the benchmark suite. This allows comparing the detection techniques and helps improve the accuracy of detecting design pattern occurrences.