While processing a SARS data set, we found a number of erroneous values. Because the data set contains more than 700,000 values, the mistakes cannot be identified and corrected manually. In addition, because different doctors observed or measured different attributes for different patients, the information table as a whole contains a large number of missing values, so conventional methods such as ANN, SVM, and Bayesian networks cannot be applied to this case. Fortunately, some attributes were measured together. We can therefore use rough set theory to induce the dependencies among these attributes and use the dependencies to identify and correct the mistakes. Furthermore, we give a measure for finding and correcting the mistakes. The values corrected by the algorithm agree with those corrected by medical experts, which indicates that the algorithm is effective.
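The general idea of using dependencies among jointly measured attributes to flag suspicious values can be illustrated with a minimal sketch. It is not the algorithm or the measure of this paper, which the abstract does not specify. It assumes a simple majority rule inside each rough-set indiscernibility class. The attribute names (`fever`, `cough`, `diagnosis`), the thresholds `min_support` and `min_purity`, and the toy rows are all hypothetical and serve only as illustration.

```python
from collections import Counter, defaultdict

def _complete(row, attrs):
    # A row is usable only if every attribute of interest was measured.
    return all(row.get(a) is not None for a in attrs)

def dependency_degree(rows, cond, dec):
    """Rough-set dependency degree of `dec` on `cond`, computed only on
    rows where all of these attributes are observed (assumption)."""
    classes = defaultdict(list)
    for r in rows:
        if _complete(r, cond + [dec]):
            classes[tuple(r[a] for a in cond)].append(r[dec])
    total = sum(len(v) for v in classes.values())
    if total == 0:
        return 0.0
    # Positive region: classes whose members all share one decision value.
    positive = sum(len(v) for v in classes.values() if len(set(v)) == 1)
    return positive / total

def suspect_values(rows, cond, dec, min_support=3, min_purity=0.75):
    """Flag rows whose `dec` value disagrees with a dominant value in their
    indiscernibility class. Thresholds are illustrative assumptions."""
    classes = defaultdict(list)
    for i, r in enumerate(rows):
        if _complete(r, cond + [dec]):
            classes[tuple(r[a] for a in cond)].append(i)
    suspects = []
    for idxs in classes.values():
        counts = Counter(rows[i][dec] for i in idxs)
        value, n = counts.most_common(1)[0]
        if len(idxs) >= min_support and n / len(idxs) >= min_purity:
            for i in idxs:
                if rows[i][dec] != value:
                    # (row index, observed value, suggested correction)
                    suspects.append((i, rows[i][dec], value))
    return suspects

# Toy, made-up rows; None marks a value that was never measured.
rows = [
    {"fever": "high", "cough": "yes", "diagnosis": "pos"},
    {"fever": "high", "cough": "yes", "diagnosis": "pos"},
    {"fever": "high", "cough": "yes", "diagnosis": "pos"},
    {"fever": "high", "cough": "yes", "diagnosis": "pos"},
    {"fever": "high", "cough": "yes", "diagnosis": "neg"},
    {"fever": "low", "cough": "no", "diagnosis": "neg"},
    {"fever": "low", "cough": "no", "diagnosis": "neg"},
    {"fever": "low", "cough": None, "diagnosis": "neg"},
    {"fever": None, "cough": "yes", "diagnosis": "pos"},
]

suspects = suspect_values(rows, ["fever", "cough"], "diagnosis")
gamma = dependency_degree(rows, ["fever", "cough"], "diagnosis")
```

Rows with a missing value in any of the chosen attributes are simply skipped, so the check needs only the subset of attributes that were measured together and not a complete table. A suggested correction produced this way is a candidate for expert review, not a confirmed fix.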