The civil aviation medicine certification management information system (CAMC system) has been in use in China for five years. As the associated dataset grows rapidly, data mining techniques are becoming increasingly important. A significant first step toward effective mining is the development of a civil aviation medicine data warehouse that integrates data from the CAMC system and other sources into separate, fully integrated databases. This paper describes the process of building the Civil Aviation Medicine Data Warehouse (CAM DW), which integrates multiple databases and documents to support the investigation of potential flight-safety issues. Based on an analysis of actual requirements and of the structure of the data sources, we present the data warehouse model and define a snowflake schema for the disease subject. We also describe the implementation strategy and the methods used for data extraction, transformation, and loading. Finally, we outline future work and remaining challenges.
Published in:
2010 International Conference on Bioinformatics and Biomedical Technology (ICBBT)
Date of Conference: 16-18 April 2010
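
The abstract mentions a snowflake schema for the disease subject and an extract-transform-load (ETL) step. The following is a minimal sketch of what such a design might look like, not the schema used in the paper. All table names, column names, and sample rows (`dim_disease_category`, `dim_disease`, `fact_diagnosis`, `load_rows`, `cases_by_category`) are hypothetical and chosen for illustration only. The disease dimension is normalized into a separate category table, which is what distinguishes a snowflake schema from a star schema.

```python
import sqlite3


def build_warehouse(conn):
    """Create a hypothetical snowflake schema for a 'disease' subject."""
    conn.executescript(
        """
        -- Outer (normalized) dimension: disease categories.
        CREATE TABLE dim_disease_category (
            category_id   INTEGER PRIMARY KEY,
            category_name TEXT NOT NULL UNIQUE
        );
        -- Inner dimension: diseases, each pointing at one category.
        CREATE TABLE dim_disease (
            disease_id   INTEGER PRIMARY KEY,
            disease_name TEXT NOT NULL UNIQUE,
            category_id  INTEGER NOT NULL
                REFERENCES dim_disease_category(category_id)
        );
        -- Fact table: one row per (disease, year) count from a source system.
        CREATE TABLE fact_diagnosis (
            fact_id    INTEGER PRIMARY KEY,
            disease_id INTEGER NOT NULL REFERENCES dim_disease(disease_id),
            exam_year  INTEGER NOT NULL,
            case_count INTEGER NOT NULL
        );
        """
    )


def load_rows(conn, rows):
    """Transform and load (disease, category, year, count) source rows."""
    for disease, category, year, count in rows:
        # Transform: normalize whitespace and case so names match across sources.
        disease = disease.strip().lower()
        category = category.strip().lower()

        # Load the category, reusing the existing row if it is already present.
        conn.execute(
            "INSERT OR IGNORE INTO dim_disease_category(category_name) VALUES (?)",
            (category,),
        )
        category_id = conn.execute(
            "SELECT category_id FROM dim_disease_category WHERE category_name = ?",
            (category,),
        ).fetchone()[0]

        # Load the disease, linked to its category.
        conn.execute(
            "INSERT OR IGNORE INTO dim_disease(disease_name, category_id) VALUES (?, ?)",
            (disease, category_id),
        )
        disease_id = conn.execute(
            "SELECT disease_id FROM dim_disease WHERE disease_name = ?",
            (disease,),
        ).fetchone()[0]

        # Load the fact row.
        conn.execute(
            "INSERT INTO fact_diagnosis(disease_id, exam_year, case_count) "
            "VALUES (?, ?, ?)",
            (disease_id, int(year), int(count)),
        )
    conn.commit()


def cases_by_category(conn):
    """Aggregate case counts per category by joining through both dimensions."""
    return conn.execute(
        """
        SELECT c.category_name, SUM(f.case_count)
        FROM fact_diagnosis AS f
        JOIN dim_disease AS d ON d.disease_id = f.disease_id
        JOIN dim_disease_category AS c ON c.category_id = d.category_id
        GROUP BY c.category_name
        ORDER BY c.category_name
        """
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_warehouse(conn)
    # Made-up sample rows, for illustration only.
    load_rows(conn, [(" Hypertension ", "Cardiovascular", 2009, 3)])
    print(cases_by_category(conn))
```

In this sketch, a query by category must join through `dim_disease` to reach `dim_disease_category`. That extra join is the usual cost of a snowflake layout, accepted in exchange for storing each category name only once.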