Skip to Main Content
Class names represent the concepts implemented in object-oriented source code and are key elements in program comprehension and, thus, software maintenance. Programming conventions often state that class names should be noun-phrases, but there is little further guidance for developers on the composition of class names. Other researchers have observed that the majority of Java class identifier names are composed of one or more nouns preceded, optionally, by one or more adjectives. However, no detailed analysis of class identifier name structure has been undertaken that could be leveraged to support program comprehension activities. We investigate the lexical and syntactic composition of Java class identifier names in two ways. Firstly, as others have done for C function and Java method names, we identify conventional patterns found in the use of parts of speech. Secondly, we identify the origin of words used in class names within the name of any super class and implemented interfaces to identify patterns of class name construction related to inheritance. Through the analysis of 120,000 unique class names found in 60 open source projects we identify both common and project specific class naming conventions. We apply this knowledge in a case study of the mind-mapping tool Freemind to investigate whether class names that follow unconventional naming schemes are candidates for refactoring either a name refactoring that conforms to established naming conventions within the code base, or refactoring of the class that results in conventionally named classes.