The development process for a given software system is a combination of an idealized, prescribed model and a messy set of ad hoc practices. To some degree, process compliance can be enforced by supporting tools that require steps to be followed in a fixed order; however, developers often perceive this approach as heavyweight and inflexible, and generally prefer tools that support their desired work habits rather than limit their choices. An alternative approach to monitoring process compliance is to instrument the tools and repositories that developers use (such as version control systems, bug trackers, and mailing-list archives) and to build models of the de facto development process through observation, analysis, and inference. In this paper, we present a technique for recovering a project's software development processes from a variety of existing artifacts. We first apply unsupervised and supervised techniques (including word-bags, topic analysis, summary statistics, and Bayesian classifiers) to annotate software artifacts with related topics, maintenance types, and non-functional requirements. We then map the analysis results onto a timeline-based view of the Unified Process development model, which we call Recovered Unified Process Views. We demonstrate our approach for extracting these process views in two case studies: FreeBSD and SQLite.
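To make the annotation step concrete, the following is a minimal sketch of one way a word-bag representation and a Bayesian classifier could be combined to label commit messages with a maintenance type. It is an illustration only and not the implementation used in this work: the training messages, the three labels (`corrective`, `adaptive`, `perfective`), and the function names `tokenize`, `train`, and `classify` are all assumptions introduced for the example.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical hand-labeled commit messages (not taken from FreeBSD or SQLite).
EXAMPLES = [
    ("fix null pointer crash in parser", "corrective"),
    ("fix memory leak bug in pager", "corrective"),
    ("add support for new index option", "adaptive"),
    ("add new port for arm platform", "adaptive"),
    ("refactor and clean up comments", "perfective"),
    ("clean up whitespace and rename variables", "perfective"),
]


def tokenize(text):
    """Turn a message into a word-bag: lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count words per label for a multinomial naive Bayes model."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of messages
    vocab = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def classify(model, text):
    """Return the label with the highest posterior log-probability."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior of the label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothed word likelihood.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(EXAMPLES)
print(classify(model, "fix crash bug in btree"))
print(classify(model, "add new platform support"))
```

Labels assigned this way could then be aggregated per time window to form one signal in a timeline-based process view; a realistic setup would need far more training data and a held-out evaluation.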