Conferences >2024 IEEE/ACM 32nd Internatio...

Understanding Regular Expression Denial of Service (ReDoS): Insights from LLM-Generated Regexes and Developer Forums

Download PDF
Download References
Request Permissions
Save to
Alerts

Abstract:

Regular expression Denial of Service (ReDoS) represents an algorithmic complexity attack that exploits the processing of regular expressions (regexes) to produce a denial...Show More

Metadata

Abstract:

Regular expression Denial of Service (ReDoS) represents an algorithmic complexity attack that exploits the processing of regular expressions (regexes) to produce a denial-of-service attack. This attack occurs when a regex’s evaluation time scales polynomially or exponentially with input length, posing significant challenges for software developers. The advent of Large Language Models (LLMs) has revolutionized the generation of regexes from natural language prompts, but not without its risks. Prior works showed that LLMs can generate code with vulnerabilities and security smells. In this paper, we examined the correctness and security of regexes generated by LLMs as well as the characteristics of LLM-generated vulnerable regexes. Our study also examined ReDoS patterns in actual software projects, aligning them with corresponding regex equivalence classes and algorithmic complexity. Moreover, we analyzed developer discussions on GitHub and StackOverflow, constructing a taxonomy to investigate their experiences and perspectives on ReDoS. In this study, we found that GPT-3.5 was the best LLM to generate regexes that are both correct and secure. We also observed that LLM-generated regexes mainly have polynomial ReDoS vulnerability patterns, and it is consistent with vulnerable regexes found in open source projects. We also found that developers’ main discussions around insecure regexes is related to mitigation strategies to remove vulnerable regexes. CCS CONCEPTS • Software and its engineering

$\rightarrow$ State based definitions; • Security and privacy

$\rightarrow$ Denial-of-service attacks; • Computing methodologies

$\rightarrow$ Multi-task learning.

Published in: 2024 IEEE/ACM 32nd International Conference on Program Comprehension (ICPC)

Date of Conference: 15-16 April 2024

Date Added to IEEE Xplore: 18 June 2024

ISBN Information:

ISSN Information:

Conference Location: Lisbon, Portugal

Contents

References is not available for this document.

Understanding Regular Expression Denial of Service (ReDoS): Insights from LLM-Generated Regexes and Developer Forums

Abstract:

Metadata

Abstract:

ISSN Information:

References

IEEE Account

Purchase Details

Profile Information

Need Help?

Understanding Regular Expression Denial of Service (ReDoS): Insights from LLM-Generated Regexes and Developer Forums

Alerts

Abstract:

Metadata

Abstract:

ISSN Information:

Authors

Figures

References

Citations

Keywords

Metrics

Footnotes

References

IEEE Account

Purchase Details

Profile Information

Need Help?