Classification can often benefit from efficient feature selection. However, the presence of linearly nonseparable data, quick response requirement, small sample problem and noisy features makes the feature selection quite challenging. In this work, a class separability criterion is developed in a high-dimensional kernel space, and feature selection is performed by the maximization of this criterion. To make this feature selection approach work, the issues of automatic kernel parameter tuning, the numerical stability, and the regularization for multi-parameter optimization are addressed. Theoretical analysis uncovers the relationship of this criterion to the radius-margin bound of the SVMs, the KFDA, and the kernel alignment criterion, providing more insight on using this criterion for feature selection. This criterion is applied to a variety of selection modes with different search strategies. Extensive experimental study demonstrates its efficiency in delivering fast and robust feature selection.