By Topic

Validating the Use of Topic Models for Software Evolution

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

4 Author(s)
Thomas, S.W. ; Software Anal. & Intell. Lab. (SAIL), Queen''s Univ., Kingston, ON, Canada ; Adams, B. ; Hassan, A.E. ; Blostein, D.

Topics are collections of words that co-occur frequently in a text corpus. Topics have been found to be effective tools for describing the major themes spanning a corpus. Using such topics to describe the evolution of a software system's source code promises to be extremely useful for development tasks such as maintenance and re-engineering. However, no one has yet examined whether these automatically discovered topics accurately describe the evolution of source code, and thus it is not clear whether topic models are a suitable tool for this task. In this paper, we take a first step towards deter-mining the suitability of topic models in the analysis of software evolution by performing a qualitative case study on 12 releases of JHotDraw, a well studied and documented system. We define and compute various metrics on the identified topics and manually investigate how the metrics evolve over time. We find that topic evolutions are characterizable through spikes and drops in their metric values, and that the large majority of these spikes and drops are indeed caused by actual change activity in the source code. We are thus encouraged by the use of topic models as a tool for analyzing the evolution of software.

Published in:

Source Code Analysis and Manipulation (SCAM), 2010 10th IEEE Working Conference on

Date of Conference:

12-13 Sept. 2010