When choosing a place to stay, eat, visit or enjoy extracurricular activities, most of us rely on the information provided by the business listing of our choice. However, it is highly probable that, at least once, one of the following has happened to us: we followed directions that led to the wrong place, or we called a restaurant and reached a real estate agent instead. Countless directories publish contact information for organizations of all categories. The problem is that many businesses do not update their corporate information across all of them, so the official business website is usually (but not always) the place to go for the most accurate information.
One of our customers, a large experience provider, faced this exact problem: how to ensure that the validity of the data in a database of 40 million places is not called into question. The company was looking for cost-effective ways to proactively avoid a poor user experience, such as a wasted trip or plain frustration.
Business information that can be verified and/or enriched includes name, address, coordinates, phone, email, category, working hours, photos and description. To ensure accuracy and freshness, smaller batches of data should be verified on a monthly basis, and recently purchased data should be verified as well.
The results of the sample analysis for the first month showed that 10% of the businesses were permanently closed down.
Of the remaining 90%, 20% of the businesses had incorrect categories, 20% had wrong phone numbers and at least 5% needed their business names updated.
Approximately 20% of the places had inaccurate coordinates.
In addition, 85% of the data was enriched with content from the official business website.
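The figures above use two different bases: the closure rate refers to the whole sample, while the category, phone and name figures refer to the 90% of businesses that were still operating. The short sketch below converts the latter into shares of the whole sample. It assumes the percentages are relative to the remaining 90%, as stated; the function and variable names are purely illustrative.

```python
def share_of_total(share_of_remaining: float, remaining: float) -> float:
    """Convert a share of the remaining records into a share of the full sample."""
    return share_of_remaining * remaining


closed = 0.10               # permanently closed businesses
remaining = 1.0 - closed    # businesses still operating (90%)

# Shares quoted relative to the remaining 90% (assumption based on the text)
overall = {
    "wrong category": share_of_total(0.20, remaining),
    "wrong phone": share_of_total(0.20, remaining),
    "name needs update": share_of_total(0.05, remaining),
}

for issue, share in overall.items():
    print(f"{issue}: {share:.1%} of the full sample")
```

In other words, 20% of the remaining 90% corresponds to 18% of the whole sample, and "at least 5%" corresponds to at least 4.5%.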
PlaceLab's process is automated. Verification is performed by cross-checking the most relevant online sources. In addition, machine learning is used to extract data from unstructured website content, which is common on small businesses' official websites. The system is trained to understand the content and recognize the appropriate category. Most importantly, all of this is done 15 times faster than a human could do it, proving that there is a cost-effective way to provide a seamless user experience in the ever-changing world of business data.
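To make the idea of cross-checking sources more concrete, here is a minimal sketch in which a phone number is accepted only when enough sources agree on it. This is not PlaceLab's actual implementation: the majority-vote rule, the `min_agreement` threshold, the source names and the (fictitious) phone numbers are all assumptions made for illustration.

```python
from collections import Counter
from typing import Optional


def normalize_phone(raw: str) -> str:
    """Keep digits only so formatting differences do not count as disagreement."""
    return "".join(ch for ch in raw if ch.isdigit())


def cross_check(values_by_source: dict[str, str], min_agreement: int = 2) -> Optional[str]:
    """Return the value most sources agree on, or None if agreement is too weak."""
    normalized = [normalize_phone(v) for v in values_by_source.values() if v]
    if not normalized:
        return None
    value, count = Counter(normalized).most_common(1)[0]
    return value if count >= min_agreement else None


# Hypothetical sources and fictitious numbers, for illustration only
phones = {
    "official_website": "+1 (555) 010-0199",
    "directory_a": "1-555-010-0199",
    "directory_b": "+1 555 010 0142",
}
verified = cross_check(phones)
print(verified)
```

Here two of the three sources agree once formatting is stripped, so their shared value is returned. A record where no value reaches the threshold returns `None` and could be flagged for further review.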