DNA sequence analysis has been widely studied by gene-expression microarray techniques. Few results, however, have been provided by terahertz spectroscopy, which measures the percentage absorption or reflectance of different DNA sequences. Previous terahertz methods have lacked a quantitative analysis of the spectroscopic features, so definitive conclusions are difficult to draw from the data. In this paper, we use a signal processing approach that gives a quantitative interpretation of the DNA spectroscopy. Because of physical noise, the data can be contaminated by both random fluctuations and impulsive noise. A new signal processing tool called empirical mode decomposition (EMD) is employed to remove the noise and extract the trend of the signal. The data are then partitioned by clustering methods. Experimental results from terahertz spectroscopy of several different DNA samples show that EMD aids the clustering process and yields clusterings of higher validity than those obtained from the raw data.
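To make the trend-extraction step concrete, the following is a minimal sketch of an EMD-style sifting procedure, not the implementation used in the paper. It assumes piecewise-linear envelopes (standard EMD typically uses cubic-spline envelopes), a fixed number of sifting passes instead of a convergence criterion, and crude boundary handling in which the signal endpoints serve as envelope knots. The function names (`local_extrema`, `envelope`, `sift`, `emd_trend`) and the parameters `n_sift` and `n_remove` are hypothetical, and the input is a synthetic ramp plus a fast oscillation standing in for a noisy spectrum.

```python
import math


def local_extrema(x):
    """Return indices of interior local maxima and minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, knots):
    """Piecewise-linear envelope through x at the knots.

    Assumption: the two endpoints are added as knots (crude boundary handling),
    and linear interpolation replaces the cubic splines of standard EMD.
    """
    pts = [0] + knots + [len(x) - 1]  # strictly increasing: knots are interior
    env = [0.0] * len(x)
    for a, b in zip(pts, pts[1:]):
        for i in range(a, b + 1):
            t = (i - a) / (b - a)
            env[i] = x[a] + t * (x[b] - x[a])
    return env


def sift(x, n_sift=10):
    """Extract one fast component by repeatedly subtracting the mean envelope."""
    h = list(x)
    for _ in range(n_sift):
        maxima, minima = local_extrema(h)
        if not maxima or not minima:
            break
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
    return h


def emd_trend(x, n_remove=2):
    """Subtract the n_remove fastest components and return the slow residue."""
    residue = list(x)
    for _ in range(n_remove):
        maxima, minima = local_extrema(residue)
        if not maxima or not minima:
            break  # no oscillation left to remove
        imf = sift(residue)
        residue = [r - m for r, m in zip(residue, imf)]
    return residue


# Synthetic stand-in for a spectrum: slow ramp plus a fast oscillation.
signal = [0.01 * i + math.sin(2 * math.pi * i / 8) for i in range(400)]
trend = emd_trend(signal, n_remove=1)
```

Away from the boundaries, `trend` follows the slow ramp while the fast oscillation is removed; near the endpoints the simplified boundary handling distorts the result. In a pipeline like the one described above, the extracted trends of the individual spectra, rather than the raw data, would be passed to the clustering step.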