Concept identification is the task of locating concepts (e.g., domain concepts) in code regions or, more generally, in artifact chunks. It is fundamental to program comprehension, software maintenance, and evolution. Static, dynamic, and hybrid approaches to concept identification exist in the literature. Static and dynamic techniques each have advantages and limitations, and they can be considered complementary. Indeed, recent work has focused on hybrid techniques to improve both the running time and the accuracy (i.e., precision and recall) of the concept location process. Sometimes, however, only a single execution trace is available, and to the best of our knowledge only a few works attempt to automatically identify concepts in a single execution trace. We propose an approach built upon a dynamic-programming algorithm that splits an execution trace into segments likely to represent concepts. The approach improves performance and scalability with respect to currently available techniques. We also plan to use techniques derived from Latent Dirichlet Allocation (LDA) to automatically assign meanings to segments.
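To make the idea of splitting a trace by dynamic programming concrete, the following is a minimal sketch and not the approach's actual algorithm. It assumes a trace is a list of method names and uses a toy cost invented for illustration: a segment costs its number of distinct methods times its length, plus a fixed per-segment penalty. The names `segment_cost`, `split_trace`, and `penalty` are hypothetical.

```python
from typing import List, Tuple


def segment_cost(distinct_methods: int, length: int, penalty: float) -> float:
    # Toy cost (an assumption, not the paper's measure): long segments that
    # mix many different methods are expensive; the penalty discourages
    # breaking the trace into many tiny segments.
    return distinct_methods * length + penalty


def split_trace(trace: List[str], penalty: float = 3.0) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index pairs that partition the trace
    with minimum total cost under the toy cost above."""
    n = len(trace)
    # best[i] is the minimum cost of segmenting the prefix trace[:i].
    best = [0.0] + [float("inf")] * n
    # prev[i] is the start index of the last segment in that optimal split.
    prev = [0] * (n + 1)

    for end in range(1, n + 1):
        seen = set()
        # Grow the candidate last segment leftwards so the set of distinct
        # methods is updated incrementally instead of being rebuilt.
        for start in range(end - 1, -1, -1):
            seen.add(trace[start])
            cost = best[start] + segment_cost(len(seen), end - start, penalty)
            if cost < best[end]:
                best[end] = cost
                prev[end] = start

    # Walk the back-pointers from the end of the trace to recover segments.
    segments = []
    end = n
    while end > 0:
        segments.append((prev[end], end))
        end = prev[end]
    segments.reverse()
    return segments


if __name__ == "__main__":
    demo = ["open", "read", "read", "open", "draw", "paint", "draw", "paint"]
    print(split_trace(demo))
```

The sketch runs in quadratic time in the trace length. A realistic cost would more likely be built from textual or structural similarity between the methods in a segment; the toy cost here only shows how the recurrence and the back-pointers fit together.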