Provably Valid and Diverse Mutations of Real-World Media Data for DNN Testing

Abstract:

Deep neural networks (DNNs) often accept high-dimensional media data (e.g., photos, text, and audio) and understand their perceptual content (e.g., a cat). To test DNNs, diverse inputs are needed to trigger mis-predictions. Some preliminary works use byte-level mutations or domain-specific filters (e.g., foggy), whose enabled mutations may be limited and error-prone. State-of-the-art (SOTA) works employ deep generative models to generate (infinite) inputs. To keep the mutated inputs perceptually valid (e.g., a cat remains a “cat” after mutation), existing efforts rely on imprecise and less generalizable heuristics. This study rigorously revisits two key objectives in media input mutation, perception diversity (Div) and validity (Val), based on manifold, a well-developed theory capturing perceptions of high-dimensional media data in a low-dimensional space. We show that Div and Val inextricably bound each other, and prove that SOTA generative model-based methods fundamentally fail to mutate real-world media data (sacrificing either Div or Val). In contrast, we discuss the feasibility of mutating real-world media data with provably high Div and Val based on manifold. We then concretize a technical solution for mutating media data of various formats (images, audio, text) in a unified manner based on manifold. Specifically, when media data are projected onto a low-dimensional manifold, the data can be mutated by walking on the manifold with certain directions and step sizes. Compared with the input data, the mutated data exhibit encouraging Div in perceptual traits (e.g., lying vs. standing dog) while retaining reasonably high Val (i.e., a dog remains a dog). We implement our techniques in DeepWalk for testing DNNs. DeepWalk constructs manifolds for media data offline. In online testing, DeepWalk walks on manifolds to generate mutated media data with provably high Div and Val.
Our evaluation tests DNNs execu...
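
The idea of mutating data by walking on a manifold with a chosen direction and step size can be illustrated with a minimal sketch. This is not the DeepWalk implementation: it assumes a linear stand-in for the manifold (a PCA subspace fitted with NumPy) and synthetic data, and every name in it (`fit_linear_manifold`, `project`, `reconstruct`, `walk`) is hypothetical. It shows only the three steps named in the abstract: project an input onto a low-dimensional space, move along a direction by a step size, and map the result back to the input space.

```python
import numpy as np

def fit_linear_manifold(data, dim):
    """Fit a linear stand-in for a manifold: the data mean plus the
    top `dim` principal directions (an assumption of this sketch)."""
    mean = data.mean(axis=0)
    # Rows of vt are orthonormal principal directions of the centered data.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:dim]

def project(x, mean, basis):
    """Map a high-dimensional point to low-dimensional coordinates."""
    return (x - mean) @ basis.T

def reconstruct(z, mean, basis):
    """Map low-dimensional coordinates back to the input space."""
    return z @ basis + mean

def walk(x, mean, basis, direction, step):
    """Mutate x by moving `step` units along a unit-normalized direction
    expressed in the low-dimensional coordinates."""
    z = project(x, mean, basis)
    unit = direction / np.linalg.norm(direction)
    return reconstruct(z + step * unit, mean, basis)

# Synthetic data: 16-D points that lie on a 2-D subspace (hypothetical).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 16))
data = latent @ mixing

mean, basis = fit_linear_manifold(data, dim=2)
x = data[0]
mutated = walk(x, mean, basis, direction=np.array([1.0, 0.0]), step=0.5)
```

In this sketch the step size controls how far the mutated point moves from the original (a rough analogue of diversity), while staying inside the fitted subspace plays the role of validity. A real media manifold is nonlinear, so a learned projection would replace the PCA step.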

Published in: IEEE Transactions on Software Engineering (Volume: 50, Issue: 5, May 2024)

Page(s): 1040 - 1064

Date of Publication: 07 March 2024

DOI: 10.1109/TSE.2024.3370807
