Data-intensive systems are designed to handle data at massive scale, and over the years they may evolve into very large, complex systems. To support the maintenance of these systems, several techniques have been developed that analyze the source code of applications or the underlying databases for reverse engineering purposes such as quality assurance or program comprehension. However, only a few techniques take into account the specific characteristics of data-intensive systems (e.g. dependencies arising via database accesses). In this thesis we conducted research to analyze and improve data-intensive applications with different methods based on static analysis: methods for recovering the architecture of data-intensive systems, and a quality assurance methodology for applications developed in Magic 4GL. We targeted SQL because the most widespread databases are relational databases that use some dialect of SQL for their queries. With the proposed techniques we were able to analyze large-scale industrial projects, such as banking systems with more than 3 million lines of code, and we successfully recovered architecture maps and identified quality issues in these systems.