Skip to Main Content
Data mining is the process of posing queries and extracting patterns, often previously unknown from large quantities of data using pattern matching or other reasoning techniques. Data mining has many applications in security including for national security as well as for cyber security. The threats to national security include attacking buildings, destroying critical infrastructures such as power grids and telecommunication systems. Data mining techniques are being investigated to find out who the suspicious people are and who is capable of carrying out terrorist activities. Cyber security is involved with protecting the computer and network systems against corruption due to Trojan horses, worms and viruses. Data mining is also being applied to provide solutions such as intrusion detection and auditing. The first part of the presentation will discuss my joint research with Prof. Latifur Khan and our students at the University of Texas at Dallas on data mining for cyber security applications For example; anomaly detection techniques could be used to detect unusual patterns and behaviors. Link analysis may be used to trace the viruses to the perpetrators. Classification may be used to group various cyber attacks and then use the profiles to detect an attack when it occurs. Prediction may be used to determine potential future attacks depending in a way on information learnt about terrorists through email and phone conversations. Data mining is also being applied for intrusion detection and auditing. Other applications include data mining for malicious code detection such as worm detection and managing firewall policies. This second part of the presentation will discuss the various types of threats to national security and describe data mining techniques for handling such threats. Threats include non real-time threats and real-time threats. We need to understand the types of threats and also gather good data to carry out mining and obtain useful results. The challenge i- s to reduce false positives and false negatives. The third part of the presentation will discuss some of the research challenges. We need some form of real-time data mining, that is, the results have to be generated in real-time, we also need to build models in real-time for realtime intrusion detection. Data mining is also being applied for credit card fraud detection and biometrics related applications. While some progress has been made on topics such as stream data mining, there is still a lot of work to be done here. Another challenge is to mine multimedia data including surveillance video. Finally, we need to maintain the privacy of individuals. Much research has been carried out on privacy preserving data mining. In summary, the presentation will provide an overview of data mining, the various types of threats and then discuss the applications of data mining for malicious code detection, cyber security and national security. Then we will discuss the consequences to privacy.