This paper presents a systematic approach to mining co-location patterns in Sloan Digital Sky Survey (SDSS) data. SDSS Data Release 5 (DR5) contains 3.6 TB of data. The availability of such a large amount of useful data is an opportunity to apply data mining techniques to generate interesting information. The major reason for the lack of such data mining applications in SDSS is the unavailability of data in a suitable format. This work illustrates a procedure to derive additional galaxy types from the available attributes and to transform the data into maximal cliques of galaxies, which in turn can be used as transactions for data mining applications. An efficient algorithm, GridClique, is proposed to generate maximal cliques from large spatial databases. It should be noted that the general problem of enumerating all maximal cliques of a graph is known to be NP-hard. The experimental results show that the GridClique algorithm successfully generates all maximal cliques in the SDSS data and enables the generation of useful co-location patterns.
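To make the clique-to-transaction idea concrete, the following is a minimal sketch and not the GridClique algorithm itself, whose details are not given here. It assumes 2-D point coordinates, a distance threshold `d`, a uniform grid with cell size `d` to limit neighbor searches, and the standard Bron-Kerbosch procedure with pivoting to enumerate maximal cliques. The function names and the galaxy type labels "A" and "B" are hypothetical.

```python
from collections import defaultdict
from itertools import product
from math import floor, hypot


def neighbor_graph(points, d):
    """Adjacency sets for points within distance d, found via a grid of cell size d."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(floor(x / d), floor(y / d))].append(i)
    adj = {i: set() for i in range(len(points))}
    for (cx, cy), members in grid.items():
        # Only the cell itself and its 8 neighbors can hold points within d.
        for dx, dy in product((-1, 0, 1), repeat=2):
            for j in grid.get((cx + dx, cy + dy), ()):
                for i in members:
                    if i < j and hypot(points[i][0] - points[j][0],
                                       points[i][1] - points[j][1]) <= d:
                        adj[i].add(j)
                        adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate all maximal cliques with Bron-Kerbosch and pivoting."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))
            return
        # Pivot on the vertex with the most neighbors among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)

    expand(set(), set(adj), set())
    return sorted(cliques)


def clique_transactions(points, types, d):
    """Turn each maximal clique into a transaction of distinct object types."""
    adj = neighbor_graph(points, d)
    return [sorted({types[i] for i in c}) for c in maximal_cliques(adj)]


# Hypothetical toy data: three mutually close objects and one isolated object.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
types = ["A", "B", "B", "A"]
transactions = clique_transactions(points, types, d=1.5)
```

Each resulting transaction lists the types that occur together in one maximal clique, which is the form an itemset miner expects. Real SDSS positions are sky coordinates, so a practical version would need an angular or 3-D distance in place of the planar one assumed here.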