We prove that, given a nearly log-concave distribution, in any partition of the space into two well-separated sets, the measure of the points that belong to neither set is large. We apply this isoperimetric inequality to derive lower bounds on the generalization error in learning. We further consider regression problems and show that if the inputs and outputs are sampled from a nearly log-concave distribution, the measure of points for which the prediction is wrong by more than $\epsilon_0$ and less than $\epsilon_1$ is (roughly) linear in $\epsilon_1 - \epsilon_0$, as long as $\epsilon_0$ is not too small and $\epsilon_1$ is not too large. We also show that when the data are sampled from a nearly log-concave distribution, the margin cannot be large in a strong probabilistic sense.
Published in: IEEE Transactions on Information Theory (Volume: 53, Issue: 3)
Date of Publication: March 2007
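
As a rough numerical illustration of the regression claim in the abstract, the sketch below assumes the simplest possible setting: a one-dimensional prediction error that follows a standard normal distribution, which is log-concave. This setting, and the helper names `std_normal_cdf` and `band_mass`, are assumptions made for the example and do not come from the paper; the paper's result is a bound for nearly log-concave distributions in general, not this exact computation. The sketch only shows that the mass of the band of errors between $\epsilon_0$ and $\epsilon_1$ grows almost linearly with $\epsilon_1 - \epsilon_0$ when both values are moderate.

```python
import math


def std_normal_cdf(x: float) -> float:
    """CDF of the standard normal distribution (a log-concave density)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def band_mass(eps0: float, eps1: float) -> float:
    """Probability that eps0 < |e| < eps1 for a standard normal error e.

    Toy setting assumed for illustration only; not the paper's construction.
    """
    if not 0.0 <= eps0 <= eps1:
        raise ValueError("require 0 <= eps0 <= eps1")
    # The band is symmetric around zero, so double the one-sided mass.
    return 2.0 * (std_normal_cdf(eps1) - std_normal_cdf(eps0))


# Compare band mass against band width for a fixed eps0.
for eps0, eps1 in [(0.1, 0.2), (0.1, 0.3), (0.1, 0.5)]:
    width = eps1 - eps0
    mass = band_mass(eps0, eps1)
    print(f"width={width:.1f}  mass={mass:.4f}  mass/width={mass / width:.4f}")
```

In this toy case the ratio of mass to width stays nearly constant across the three bands, which is the sense in which the measure is "roughly linear" in the width. The ratio drifts downward as $\epsilon_1$ grows, consistent with the abstract's condition that $\epsilon_1$ not be too large.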