Skip to Main Content
Increased realism in software engineering experiments is often promoted as an important means of increasing generalizability and industrial relevance. In this context, artificiality, e.g., the use of constructed tasks in place of realistic tasks, is seen as a threat. In this paper, we examine the opposite view that deliberately introduced artificial design elements may increase knowledge gain and enhance both generalizability and relevance. In the first part of this paper, we identify and evaluate arguments and examples in favor of and against deliberately introducing artificiality into software engineering experiments. We find that there are good arguments in favor of deliberately introducing artificial design elements to 1) isolate basic mechanisms, 2) establish the existence of phenomena, 3) enable generalization from particularly unfavorable to more favorable conditions (persistence of phenomena), and 4) relate experiments to theory. In the second part of this paper, we summarize a content analysis of articles that report software engineering experiments published over a 10-year period from 1993 to 2002. The analysis reveals a striving for realism and external validity, but little awareness of for what and when various degrees of artificiality and realism are appropriate. Furthermore, much of the focus on realism seems to be based on a narrow understanding of the nature of generalization. We conclude that an increased awareness and deliberation as to when and for what purposes both artificial and realistic design elements are applied is valuable for better knowledge gain and quality in empirical software engineering experiments. We also conclude that time spent on studies that have obvious threats to validity that are due to artificiality might be better spent on studies that investigate research questions for which artificiality is a strength rather than a weakness. However, arguments in favor of artificial design elements should not be used to justify studies - - that are badly designed or that have research questions of low relevance.