Buggy software is a reality, and automated techniques for discovering bugs are highly desirable. A specification describes the correct behavior of a program; for example, a file must eventually be closed once it has been opened. Specifications can be learned by contrasting patterns in normal program execution traces with patterns in erroneous ones. The more traces are available, the more specifications can be learned, and the more accurately. Sufficiently many traces can be obtained by combining traces from multiple parties that possess distinct programs but use a common library. However, obtaining traces from competing parties is problematic: revealed traces may make it possible to learn that one party writes buggier code than another. We present an algorithm by which mutually distrusting parties can work together to learn program specifications while preserving their privacy. We use a perturbation algorithm to obfuscate individual trace values while still allowing statistical trends to be mined from the data. Despite the noise introduced to safeguard privacy, empirical evidence suggests that our algorithm learns specifications that find 85 percent of the bugs that a no-privacy approach would find.
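
To illustrate how perturbed values can still support statistical mining, the following minimal sketch uses randomized response, a standard perturbation technique. It is an assumption chosen for illustration and is not claimed to be the perturbation algorithm of this work. Each trace is reduced to one hypothetical boolean observation (for example, "the opened file was eventually closed"), and the names `perturb`, `estimate_true_rate`, `demo`, and `keep_prob` are all invented for this example.

```python
import random


def perturb(bit: bool, keep_prob: float, rng: random.Random) -> bool:
    """Report the true bit with probability keep_prob, else a fair coin flip.

    Any single report is therefore deniable: it may be pure noise.
    """
    if rng.random() < keep_prob:
        return bit
    return rng.random() < 0.5


def estimate_true_rate(reports: list[bool], keep_prob: float) -> float:
    """Recover the aggregate rate of True values from perturbed reports.

    Since E[observed] = keep_prob * true_rate + (1 - keep_prob) * 0.5,
    solving for true_rate gives an unbiased estimate.
    """
    if not reports:
        raise ValueError("at least one report is required")
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - keep_prob) * 0.5) / keep_prob


def demo(n: int = 100_000, true_rate: float = 0.9,
         keep_prob: float = 0.5, seed: int = 0) -> tuple[float, float]:
    """Simulate n traces; return (actual rate, rate estimated from noisy reports)."""
    rng = random.Random(seed)
    # Hypothetical ground truth: did each trace close the file it opened?
    truth = [rng.random() < true_rate for _ in range(n)]
    # Only the perturbed values would be shared with the other parties.
    reports = [perturb(b, keep_prob, rng) for b in truth]
    return sum(truth) / n, estimate_true_rate(reports, keep_prob)


if __name__ == "__main__":
    actual, estimated = demo()
    print(f"actual rate: {actual:.3f}, estimated from noisy reports: {estimated:.3f}")
```

In this sketch, a lower `keep_prob` makes each individual report less informative but increases the variance of the aggregate estimate, so more traces are needed for the same accuracy. This mirrors the trade-off between privacy and the quality of the mined trends.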