Skip to Main Content
Searching and mining medical time series databases is extremely challenging due to large, high entropy, and multidimensional datasets. Traditional time series databases are populated using segments extracted by a sliding window. The resulting database index contains an abundance of redundant time series segments with little to no alignment. This paper presents the idea of "salient segmentation". Salient segmentation is a probabilistic segmentation technique for populating medical time series databases. Segments with the lowest probabilities are considered salient and are inserted into the index. The resulting index has little redundancy and is composed of aligned segments. This approach reduces index sizes by more than 98% over conventional sliding window techniques. Furthermore, salient segmentation can reduce redundancy in motif discovery algorithms by more than 85%, yielding a more succinct representation of a time series signal.