Data quality is crucial to any data analysis task. Many imperfection-handling techniques aim only to avoid overfitting or simply remove the offending portions of the data. Polishing, by contrast, identifies blemishes in the data and corrects them, so as to retain and recover as much information as possible. Data quality is a particular concern when information is collected through channels susceptible to disturbances, especially when the primary objective is to assimilate and understand the data. Imperfections can arise from many sources, including transmission and bandwidth constraints, faults in sensor devices, irregularities in sampling, and transcription errors. An intuitive application that exemplifies handling data imperfections is the spell-checker. Developing such a spell-checker would require novel techniques for repairing data imperfections. We are exploring such techniques using a data correction method called polishing. Here, we compare polishing to two alternative approaches to handling data imperfections, focusing on how to evaluate and validate data correction mechanisms.