Abstract:
Stack Overflow hosts millions of solutions that aim to solve developers’ programming issues. Through this crowdsourced question-answering process, Stack Overflow has become a code hosting website where developers actively share their code. However, code snippets on Stack Overflow may contain security vulnerabilities, and if shared carelessly, such snippets can introduce security problems into software systems. In this paper, we empirically study the prevalence of Common Weakness Enumeration (CWE) weaknesses in the code snippets of C/C++-related answers. We explore the characteristics of Code_w, i.e., code snippets that contain CWE instances, in terms of the types of weaknesses, the evolution of Code_w, and who contributed such code snippets. We find that: 1) 36 percent (i.e., 32 out of 89) of CWE types are detected in Code_w on Stack Overflow. In particular, CWE-119, i.e., improper restriction of operations within the bounds of a memory buffer, is common in both answer code snippets and real-world software systems. Furthermore, the proportion of Code_w doubled from 2008 to 2018 after normalizing by the total number of C/C++ snippets in each year. 2) In general, code revisions are associated with a reduction in the number of code weaknesses. However, most Code_w snippets had their weaknesses introduced in the first version of the code and have never been revised since. Only 7.5 percent of users who contributed C/C++ code snippets posted or edited code with weaknesses. More active users (i.e., those who revised more code snippets or had a higher reputation) contributed less code with CWE weaknesses. We also find that some users tended to repeat the same CWE type across their code snippets. Our empirical study provides insights to users who share code snippets on Stack Overflow so that they are aware of the potential security issues. To understand the community feedback about improving code weaknesses by answer revisions, we also conduct a qualitat...
Published in: IEEE Transactions on Software Engineering (Volume: 48, Issue: 7, 01 July 2022)