The injection of software faults in software components to assess the impact of these faults on other components or on the system as a whole, allowing the evaluation of fault tolerance, is relatively new compared to decades of research on hardware fault injection. This paper presents an extensive experimental study (more than 3.8 million individual experiments in three real systems) to evaluate the representativeness of faults injected by a state-of-the-art approach (G-SWFIT). Results show that a significant share (up to 72 percent) of injected faults cannot be considered representative of residual software faults as they are consistently detected by regression tests, and that the representativeness of injected faults is affected by the fault location within the system, resulting in different distributions of representative/nonrepresentative faults across files and functions. Therefore, we propose a new approach to refine the faultload by removing faults that are not representative of residual software faults. This filtering is essential to assure meaningful results and to reduce the cost (in terms of number of faults) of software fault injection campaigns in complex software. The proposed approach is based on classification algorithms, is fully automatic, and can be used for improving fault representativeness of existing software fault injection approaches.