The importance of clean data in powering effective AI
By Barley Laing, the UK Managing Director at Melissa
Artificial intelligence (AI) is rapidly changing the way we work and is adding value in ways that were unthinkable only twelve months ago. However, in the rush to get on the AI bandwagon it’s vital to recognise that AI tools are only as good as the data they have access to.
If the data is incorrect or out of date, AI will not add value; in fact, it will do the reverse. Poor data leads to poor results.
For example, suppose someone living in a low-crime part of a city applies for home insurance, but the postcode the insurer holds for them belongs to an area of the city with a higher crime rate. The AI will calculate the premium from the wrong postcode, and a higher than expected quote may encourage the customer to source a quote elsewhere.
Data decays quickly
A major obstacle to implementing AI effectively is that data decays rapidly. People move home, divorce and die, which is the main reason why user contact data degrades at 25 per cent a year without regular intervention. In addition, 20 per cent of addresses entered online contain errors, including spelling mistakes, wrong house numbers and incorrect postcodes, largely because contact information is typed on small mobile screens. Together, these factors explain why 91 per cent of organisations have common data quality problems.
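To put the 25 per cent figure in perspective, here is a minimal sketch in Python of how much of a database might remain accurate if nothing is done. It assumes the decay rate stays constant and compounds each year, which is a simplification rather than a measured curve, and the function name is purely illustrative.

```python
def fraction_still_accurate(annual_decay: float, years: float) -> float:
    """Share of records still accurate after `years`.

    Assumes a constant annual decay rate that compounds year on year
    (an illustrative assumption, not a measured decay curve).
    """
    if not 0.0 <= annual_decay <= 1.0:
        raise ValueError("annual_decay must be between 0 and 1")
    return (1.0 - annual_decay) ** years


if __name__ == "__main__":
    for year in range(1, 5):
        share = fraction_still_accurate(0.25, year)
        print(f"After {year} year(s): {share:.1%} of records still accurate")
```

Under that assumption, a little over 40 per cent of an untouched database would still be accurate after three years.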
The good news is that incorrect contact data can be avoided by having verification processes in place at the point of data capture, and when cleaning held data in batch. This usually involves simple and cost-effective changes to the data quality process.
Use address autocomplete
A valuable piece of technology to use at the customer onboarding stage is an address autocomplete or lookup service. It delivers accurate address data in real time by offering a properly formatted, correct address as soon as a new customer starts to input theirs. It also reduces the number of keystrokes required to type an address by up to 81 per cent. This speeds up the onboarding process and reduces the probability of the user abandoning an application to access a service.
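The idea behind address lookup can be sketched in a few lines of Python. This is a toy illustration only: the addresses are invented, the reference list is held in memory, and the function names are hypothetical. A real service would query a licensed, regularly updated address file and handle far messier input.

```python
# Hypothetical reference data; a real service would query a licensed,
# regularly updated address file rather than an in-memory list.
REFERENCE_ADDRESSES = [
    "1 High Street, Exampletown, EX1 1AA",
    "10 High Street, Exampletown, EX1 1AB",
    "12 Station Road, Exampletown, EX2 3CD",
]


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so matching ignores formatting."""
    return " ".join(text.lower().split())


def suggest(partial: str, limit: int = 5) -> list:
    """Return reference addresses that start with what the user has typed."""
    prefix = normalise(partial)
    if not prefix:
        return []
    matches = [a for a in REFERENCE_ADDRESSES if normalise(a).startswith(prefix)]
    return matches[:limit]


def keystrokes_saved(typed: str, chosen: str) -> float:
    """Fraction of keystrokes avoided by selecting `chosen` after typing `typed`."""
    if not chosen:
        raise ValueError("chosen address must not be empty")
    return 1 - len(typed) / len(chosen)
```

Typing "12 st" is enough for `suggest` to offer the full Station Road address, and `keystrokes_saved` shows how much typing the selection avoided. Because the user picks from verified entries, the address arrives already correctly formatted.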
This approach to first point of contact verification can be extended to email and phone, so that these valuable contact data channels can also be verified in real-time.
Deduplicate data
Data duplication is a big issue, with the average database containing 8-10 per cent duplicate records. It happens for a number of reasons, such as when two departments merge their data, or when mistakes in contact data collection occur at different touchpoints. Duplication not only has the potential to confuse an AI application; it also adds cost in time and money, particularly with printed communications, and damages the sender’s reputation.
The way to deduplicate data is to use an advanced fuzzy matching tool. By merging and purging even the most challenging records, it’s possible to create a ‘single user record’ and obtain an optimum single customer view (SCV) that AI can learn from. Organising contact data in this way also maximises efficiency and reduces costs, because multiple outreach efforts will not be made to the same person. An additional benefit is that the potential for fraud is reduced, because a unified record is established for each customer.
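As a rough illustration of what fuzzy matching means in practice, the Python sketch below compares records using the standard library's `difflib.SequenceMatcher`. It is not how any particular commercial tool works: the record fields, the 0.85 threshold and the function names are all assumptions, and it simply keeps the first record in each group of near-duplicates rather than merging their fields.

```python
from difflib import SequenceMatcher


def match_key(record: dict) -> str:
    """Build a comparable key from name and postcode, ignoring case and spacing."""
    name = " ".join(record["name"].lower().split())
    postcode = record["postcode"].replace(" ", "").lower()
    return f"{name}|{postcode}"


def similarity(a: dict, b: dict) -> float:
    """Similarity between two records, from 0.0 (different) to 1.0 (identical)."""
    return SequenceMatcher(None, match_key(a), match_key(b)).ratio()


def deduplicate(records: list, threshold: float = 0.85) -> list:
    """Keep the first record of each group of near-duplicates.

    Greedy pairwise comparison: fine for a small illustration, but too slow
    for large databases. The threshold is an assumed value that would need
    tuning against real data.
    """
    kept = []
    for record in records:
        if not any(similarity(record, k) >= threshold for k in kept):
            kept.append(record)
    return kept
```

Given "Jonathan Smith, EX1 1AA" and "Jonathon Smith, ex11aa", the two keys differ by a single character, so the second record is treated as a duplicate of the first, while an unrelated customer is kept.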
Undertake data suppression / cleaning
Data suppression, or cleaning, is a vital part of the data cleansing process and therefore of supporting AI efforts. It uses technology that highlights people who have moved or are otherwise no longer at the address on file. As well as removing incorrect addresses, these services can include deceased flagging, which stops mail and other communications being sent to those who have passed away and so avoids causing distress to their friends and relatives. By implementing suppression strategies, organisations can save money, protect their reputations, avoid fraud and support their AI efforts.
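In outline, suppression means screening held records against a file of people known to have moved or died. The Python sketch below shows the principle under simple assumptions: the suppression file is a plain dictionary, matching is on normalised name and postcode only, and every name in the code is hypothetical. Real suppression services draw on much richer sources and matching rules.

```python
def record_key(name: str, postcode: str) -> tuple:
    """Normalise name and postcode so formatting differences do not block a match."""
    return (" ".join(name.lower().split()), postcode.replace(" ", "").upper())


def apply_suppression(records: list, suppression: dict) -> tuple:
    """Split records into (mailable, suppressed).

    `suppression` maps a record key to a reason such as "moved" or "deceased".
    """
    mailable, suppressed = [], []
    for record in records:
        reason = suppression.get(record_key(record["name"], record["postcode"]))
        if reason is None:
            mailable.append(record)
        else:
            # Keep the reason so flagged records can be reviewed, not silently lost.
            suppressed.append({**record, "suppression_reason": reason})
    return mailable, suppressed
```

Only the mailable list would go forward to a mailing or into an AI model, while the suppressed list records why each entry was held back.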
Source a SaaS data cleaning platform
It has never been easier or more cost-effective to manage data quality in real time to support AI and wider business efficiencies. It’s possible to source a scalable data cleaning software-as-a-service (SaaS) platform that requires no coding, integration or training. This technology cleanses and corrects names, addresses, email addresses and telephone numbers worldwide. It matches records in real time to prevent duplication, and provides data profiling to help identify issues for further action. A single, intuitive interface provides the opportunity for data standardisation, validation and enrichment, resulting in high-quality contact information across multiple databases. It can do this with held data in batch and as new data is being collected, and can be accessed on-premise if required.
AI can give your business a competitive edge, but only if the data fed into the AI models is of high quality. Poor data leads to unreliable predictions and poor decisions. To maximise the success of your AI efforts, therefore, implement best practice data quality procedures.