Security applications that deal with sensitive information conveyed in human language must be able to handle speech, the most basic and natural form of human communication, of which a huge amount of data is generated daily. Dealing with such data naturally raises typical big-data problems in terms of both computational complexity and storage space. Unfortunately, compared with written text, speech is inherently more difficult to browse without technical support. In this paper we focus on spotting keywords, which can reflect a security agent's information needs, and study how useful keyword spotting is for automatically detecting topic changes (boundaries) in the speech data of interest. Our results show that keyword spotting can help identify topics with competitive performance.