ActiveClean has been featured in today’s “the morning paper”. The ActiveClean project aims to develop tools and algorithms for one of the key steps in model training pipelines: dealing with dirty or inconsistent data, including extracting structure, imputing missing values, and handling incorrect data.