Data matching is the task of finding the records that refer to the same entity across several data sources. You typically need it after collecting data from different sources, such as other websites, partners, or different sales channels. One of the major challenges in data matching is the lack of common identifiers in the data sets that need to be matched. As a result, you often have to match data using attributes that contain only partial information. Moreover, the available information is sometimes of low quality. Personal details in particular suffer from frequent typographical variations and outright errors.
Data matching is now used in a steadily growing number of domains, ranging from e-commerce and mailing lists to more traditional areas such as the health sector and national censuses.
In this blog post, we will show a real-world use case and we will discuss how WinPure Clean & Match can help you.
Real World Use Case
Let’s say we are running an e-commerce store specializing in photographic equipment. Sales are not as good as expected, so we decide to extract the prices from several competitors’ websites and compare them with our own.
This looks like a simple thing to do: just get the product descriptions and match them against our own to enable the price comparison. Unfortunately, data from different websites rarely share a common key that can be used for data matching.
It is true that some items have unique identifiers, like ISBN. But even as RFID and barcode solutions become widespread, the description of the same product on your site can be quite different from the one on a competitor’s site.
Let’s suppose that your data extraction exercise provided these results: “Olympus 14-42 mm EZ Pancake”, “Olympus M.Zuiko Digital 14-42mm f/3.5-5.6 II R Lens”, “Olympus 14-42 EZ”, “Olympus M.Zuiko Digital ED 14-42mm F3.5-5.6 EZ”, and “Olympus 14-42mm f/3.5-5.6 EZ ED”.
As you already guessed, all these descriptions belong to a single product.
The similarities between these descriptions may be obvious to you, but for a computer algorithm things are not so easy. If you try to use a deterministic approach, the process will be resource-intensive and time-consuming. Most importantly, you will not be able to match your records accurately.
To perform accurate data matching that, in turn, allows a comprehensive price comparison, you need software that can rigorously identify which product descriptions in fact point to the same product.
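To make the problem concrete, here is a minimal Python sketch that compares the descriptions above first by exact equality and then by a simple token-overlap (Jaccard) score. It is only an illustration and not how WinPure's engine works; the function names, the normalization rules, and the unrelated Canon description are assumptions made for this example.

```python
import re

def tokens(description):
    """Lowercase a product description and split it into comparable tokens."""
    text = description.lower()
    # Assumed normalization rules, chosen only for these sample descriptions:
    text = text.replace("f/", "f")              # "f/3.5-5.6" -> "f3.5-5.6"
    text = re.sub(r"(\d)\s*mm\b", r"\1", text)  # "14-42 mm", "14-42mm" -> "14-42"
    return set(re.findall(r"[a-z0-9]+(?:[.\-][a-z0-9]+)*", text))

def jaccard(a, b):
    """Share of tokens the two descriptions have in common (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

ours = "Olympus 14-42 EZ"
scraped = [
    "Olympus 14-42 mm EZ Pancake",
    "Olympus 14-42mm f/3.5-5.6 EZ ED",
    "Canon EF 50mm f/1.8 STM",  # hypothetical unrelated product
]

for candidate in scraped:
    exact = ours == candidate  # deterministic comparison of the raw strings
    score = jaccard(ours, candidate)
    print(f"exact={exact}  overlap={score:.2f}  {candidate}")
```

Exact comparison fails for every pair, while the overlap score at least separates the Olympus descriptions from the unrelated one. A cut-off score would still have to be chosen and validated, and longer descriptions of the same product can score low, which is why a naive approach like this one does not go far enough.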
WinPure Intelligent Data Matching Engine (IDME)
WinPure’s Intelligent Data Matching Engine (IDME) is unique in its capacity to understand and parse varied data sets, which results in an unmatched ability to find more true matches and fewer false matches than any other similar product. The IDME was specially designed to accurately identify duplicate records in mailing lists, databases, spreadsheets, and so on, that would otherwise be very difficult to find. Here are some of the most important features provided by WinPure IDME:
- Non-Phonetic Fuzzy Matching – matches records despite keying errors, transpositions, and reading errors.
- Firstname Nickname Matching – identifies near-matching person names and even nicknames like Dave & David, Debora & Debra, and so on.
- Company Name Matching – matches company names written in different forms, like Marketing World PLC & Marketing World.
- Street Address Matching – identifies matching addresses with different abbreviations.
- Advanced Telephone Number Matching – matches numbers with international codes or local codes, like +447958 12 34 56, 0044 7958123456, and 07958 123 456.
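To give a feel for what telephone number matching involves, the following Python sketch reduces the three example numbers above to one canonical form before comparing them. It is a simplified illustration, not WinPure's implementation; the function name `normalize_phone`, the default country code of 44, and the handling of the "00" and "0" prefixes are assumptions that fit these UK-style examples only.

```python
import re

def normalize_phone(raw, default_country_code="44"):
    """Return a '+<country><number>' form for simple UK-style inputs.

    Assumptions for this sketch: '00' is the international dialling prefix,
    a single leading '0' is a national trunk prefix, and numbers without
    any prefix belong to the default country.
    """
    digits = re.sub(r"\D", "", raw)  # keep digits only
    if raw.strip().startswith("+"):
        return "+" + digits
    if digits.startswith("00"):
        return "+" + digits[2:]
    if digits.startswith("0"):
        return "+" + default_country_code + digits[1:]
    return "+" + default_country_code + digits  # assumed national number

samples = ["+447958 12 34 56", "0044 7958123456", "07958 123 456"]
normalized = {normalize_phone(number) for number in samples}
print(normalized)  # a single element means all three numbers match
```

Once every number is in the same form, matching becomes a plain equality check. Real-world numbers need per-country rules, so treat this as a starting point only.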
Download the 30-day free trial and see for yourself how our Intelligent Data Matching Engine can help your business.