Abstract:
Training support vector regression (SVR) on large-scale problems is time consuming, even with efficient quadratic programming solvers. The issue is particularly serious when tuning the model's parameters, because the model must be retrained many times. One way to address it is to reduce the problem's scale by selecting a subset of the training set. This paper presents a fast pattern selection method that reduces a problem's scale by scanning the training data set. In particular, we find the k-nearest neighbors (kNNs) in a local region around each pattern's target value, and then decide whether to retain the pattern according to the distribution of its nearest neighbors, so that a retained pattern has a high probability of lying outside the ε-tube. Since the kNNs of a pattern are searched for only in a very small region, scanning the whole training data set is fast. The proposed method processes the year prediction Million Song data set, which contains 463 715 patterns, within 10 s on a personal computer with an Intel Core i5-4690 CPU at 3.50 GHz and 8 GB of DRAM. An additional advantage of the proposed method is that the size of the selected subset can be predefined according to the training set. Comprehensive empirical evaluations demonstrate that the proposed method can eliminate a significant number of redundant patterns for SVR training with only a slight decrease in performance.
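The selection idea described above can be illustrated with a minimal sketch. The abstract does not give the paper's actual retention criterion, so everything below is an assumption for illustration only: the function name `select_patterns`, the parameters `k`, `window`, and `n_keep`, the use of a window in target-sorted order as the "local region", and the score (mean input-space distance to the k nearest neighbors within that window) are all hypothetical stand-ins, not the authors' method.

```python
import numpy as np

def select_patterns(X, y, k=5, window=50, n_keep=100):
    """Hypothetical sketch: rank patterns by a local kNN score and keep n_keep.

    The score used here is a placeholder assumption, not the paper's criterion.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Scan patterns in order of target value so that "local region around
    # the target value" becomes a contiguous window in the sorted order.
    order = np.argsort(y, kind="stable")
    scores = np.zeros(n)
    for pos, i in enumerate(order):
        lo = max(0, pos - window)
        hi = min(n, pos + window + 1)
        cand = order[lo:hi]
        cand = cand[cand != i]          # exclude the pattern itself
        kk = min(k, len(cand))
        if kk == 0:                      # no candidates (n == 1 or window == 0)
            continue
        # kNN search restricted to the small candidate window.
        d = np.linalg.norm(X[cand] - X[i], axis=1)
        nearest = np.partition(d, kk - 1)[:kk]
        # Assumed score: mean distance to the k nearest window neighbors.
        scores[i] = nearest.mean()
    # The subset size is fixed in advance by n_keep.
    keep = np.argsort(-scores, kind="stable")[:n_keep]
    return np.sort(keep)
```

Because each neighbor search touches at most `2 * window` candidates, one pass over the data costs roughly O(n · window) distance computations after the initial sort, which is the property that makes a whole-set scan cheap. The returned indices could then be used to train an SVR on `X[idx], y[idx]` instead of the full set.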
Published in: IEEE Transactions on Neural Networks and Learning Systems (Volume: 29, Issue: 8, August 2018)