Training Data Distribution Estimation for Optimized Pre-training Data Management | IEEE Conference Publication | IEEE Xplore