
Scientists created AI system to recognize and remove mistakes from your data

Posted April 28, 2019

We live in a world of data. Our devices are smart, our surroundings are full of switches and sensors, and our data is analysed and used in many different ways. However, humans are not present in these processes and a lot of this data is actually dirty. How do we sift through it to find what actually matters? Scientists at the University of Waterloo, University of Wisconsin and Stanford University developed a tool called HoloClean, which can recognize and remove dirty data.

The more data you have to analyse, the more mistakes there may be. Identifying and correcting them is very important. Image credit: W.Rebel via Wikimedia (CC BY 3.0)

Dirty data is essentially noise that is collected by various sensors or algorithms. Imagine a system that analyses data from your websites. It can access all kinds of information, but not all of it is relevant. In fact, some of it is not even real – it is noise, which naturally occurs in all electronic systems. HoloClean is the world’s first artificial intelligence-based technology designed to recognize dirty data and correct it before passing it on for processing. Scientists say that this tool could become useful for various organizations that work with vast amounts of data.

Scientists note that banks, utility companies and many other enterprises work with a lot of data. Inevitably, some of it is bad – it can be inaccurate, false or simply irrelevant. HoloClean can be trained to find errors and correct them on its own. Of course, training AI is a long process in itself, but eventually HoloClean would work through that data, single out the errors and correct them, or exclude them from the data pool if that is the best decision. This would provide users with a cleaner dataset to use in their analytics. The end goal is an easier analysis with more accurate, dependable results.
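To make the idea of finding and correcting errors concrete, here is a minimal Python sketch. It is not HoloClean's algorithm, which the article does not describe. It assumes a much simpler rule: records that share one field (a hypothetical `zip` column) should agree on another (a hypothetical `city` column), and the most common value is taken as correct. The function name `repair_by_majority` and the sample records are invented for illustration.

```python
from collections import Counter, defaultdict


def repair_by_majority(rows, key, field):
    """Toy cleaner (illustrative only, not HoloClean's method).

    Within each group of rows sharing the same `key` value, replace
    minority values of `field` with the most common value in the group.
    Returns (cleaned_rows, corrections) and leaves `rows` untouched.
    """
    # Collect every observed value of `field` for each key.
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[field])

    # Pick the most frequent value per key as the assumed-correct one.
    majority = {k: Counter(v).most_common(1)[0][0] for k, v in groups.items()}

    cleaned, corrections = [], []
    for index, row in enumerate(rows):
        expected = majority[row[key]]
        if row[field] != expected:
            # Record what was changed so a human can review it later.
            corrections.append((index, row[field], expected))
        cleaned.append({**row, field: expected})
    return cleaned, corrections


# Made-up sample data with one misspelled city name.
records = [
    {"zip": "60601", "city": "Chicago"},
    {"zip": "60601", "city": "Chicago"},
    {"zip": "60601", "city": "Chicgo"},
    {"zip": "53703", "city": "Madison"},
]

cleaned, corrections = repair_by_majority(records, key="zip", field="city")
for index, old, new in corrections:
    print(f"row {index}: {old!r} corrected to {new!r}")
```

A majority vote like this only works when errors are rare and the rule linking the two fields is known in advance. Keeping a log of corrections, rather than overwriting values silently, lets an analyst check the repairs before relying on the cleaned dataset.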

Until now, incorrect data has had to be identified and corrected manually. It is a long and expensive process, and it is not even entirely accurate. Scientists hope that HoloClean could make this job faster, easier and more accurate. Ihab Ilyas, one of the developers of HoloClean, said: “This system addresses the problem where the information is out there, and people are using it to run analytics, but it is not correct. It doesn’t provide information that was not there, but instead corrects information you assume is correct”.

Operating on accurate data is hugely important. Only in this way can you hope to reach accurate results and make meaningful decisions. This is one of those jobs that is probably better off in the hands of artificial intelligence. Such a system can be trained to sift through a lot of data, recognize errors and correct them, and this process can be both speedy and accurate.


Source: University of Waterloo
