Most companies have multiple sources of data. These include transaction systems that track customer purchases, e-mail systems, CRM databases, and numerous other data sets that track individual customer interactions with the business. Often, these data sit in silos, with little interaction or cross-pollination of information between databases. As companies take on more analytics projects, they often ask how to merge such data sets into a unified whole. The first step is to establish a Universal Customer ID (UCID).

If your company tracks individuals, you already have some idea of what a customer ID is: a unique identifier meant to track customers within the software tools you use. In eCommerce settings, the most common type of ID is the e-mail address. Each address is assumed to belong to exactly one person, and customers use it to receive receipts, subscribe to newsletters, and perform other functions.

Matching people by e-mail address is easy: the address ties a person’s purchases to their newsletter subscriptions. Step out of the simple eCommerce scenario, however, and things get very tricky. Suppose you have a brick-and-mortar store where individuals do not provide e-mail addresses, or suppose you use an affiliate program that does not share personal e-mails. Now you’re out of luck.
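
To make the limitation concrete, here is a minimal sketch of e-mail-based matching. The record layout and the field names (`email`, `item`, `list`) are assumptions made up for illustration, not taken from any particular system.

```python
from collections import defaultdict


def normalize_email(email):
    """Trim and lower-case an e-mail so trivial variants compare equal."""
    if not email:
        return None
    cleaned = email.strip().lower()
    return cleaned or None


def match_by_email(purchases, subscriptions):
    """Group records from both sources under their normalized e-mail.

    Returns (matched, unmatched). Records with no e-mail cannot be tied
    to anyone and end up in `unmatched`.
    """
    matched = defaultdict(lambda: {"purchases": [], "subscriptions": []})
    unmatched = []
    sources = (("purchases", purchases), ("subscriptions", subscriptions))
    for kind, records in sources:
        for record in records:
            key = normalize_email(record.get("email"))
            if key is None:
                unmatched.append(record)
            else:
                matched[key][kind].append(record)
    return dict(matched), unmatched


# Hypothetical sample data
purchases = [
    {"email": "Ann@Example.com ", "item": "shoes"},
    {"email": None, "item": "hat"},  # in-store sale, no e-mail captured
]
subscriptions = [{"email": "ann@example.com", "list": "weekly"}]

matched, unmatched = match_by_email(purchases, subscriptions)
```

In this sketch the online purchase and the newsletter subscription land under the same key, while the in-store purchase has nothing to match on.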

This is where a UCID comes in. A UCID is a universal identifier that tracks customers regardless of what information is available. If you are familiar with Social Security Numbers (SSNs), then you are familiar with one form of UCID: the US government issues each citizen and eligible resident a number, often at birth, that can track them throughout their lives, across multiple governmental systems. Your passport number is another example.

Such tracking is crucial if you are merging across databases and want to understand your customers from a holistic perspective. Indeed, it is the most reliable way to map people accurately across sales channels, transactions, and other interactions. Unfortunately, creating a UCID is no easy task. Below are some tips to get you started:

  • Make the ID independent of all other data points or data sets. Don’t use e-mail or phone number or anything else of the sort! Come up with an ID that can be generated without relying on any other piece of customer information. Otherwise, you will come across customers who are missing that data point and you will be out of luck.
  • Track the ID in a centralized database. UCIDs are universal and should be tracked in a centralized data warehouse or database. Ideally, as much customer-level information as possible (demographics, names, addresses, and so on) should also be stored in this database. Such additional information will allow you to check which customers already have an existing UCID, and which do not.
  • Have a fuzzy lookup function. Do “Bobby Jones” and “Bob Jones” represent two customers or one? What if they share the same address and zip code? Build a function that can match people in ways that account for typos, mistakes, and other issues. These are called “fuzzy matches” because they need not be exact.
  • Test, test, test. Then test some more! Not every fuzzy matching rule you make will be correct, and some customers will have data so dirty that your matching process might fail outright. Dedicate time to ensuring that your system is robust, and regularly pick customers at random to verify that their records are correct and that matches are being done properly.
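
The tips above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `UCIDRegistry` class, its method names, the in-memory dictionary standing in for a centralized database, the 0.85 similarity threshold, and the rule "similar name plus same zip code means same customer" are all hypothetical choices, not a recommended production design. It uses random UUIDs from the standard `uuid` module for the IDs and `difflib.SequenceMatcher` for the fuzzy comparison.

```python
import difflib
import random
import uuid


class UCIDRegistry:
    """Toy in-memory stand-in for a centralized UCID database."""

    def __init__(self, threshold=0.85):
        self.threshold = threshold  # assumed cutoff; tune against real data
        self.customers = {}         # ucid -> {"name": ..., "zip": ...}

    @staticmethod
    def _clean(text):
        """Normalize case and whitespace before comparing."""
        return " ".join(text.lower().split())

    def _similarity(self, a, b):
        """Ratio in [0, 1]; 1.0 means the cleaned strings are identical."""
        matcher = difflib.SequenceMatcher(None, self._clean(a), self._clean(b))
        return matcher.ratio()

    def find(self, name, zip_code):
        """Return the UCID of the best fuzzy name match in the same zip
        code, or None if no candidate reaches the threshold."""
        best_ucid, best_score = None, 0.0
        for ucid, record in self.customers.items():
            if record["zip"] != zip_code:
                continue
            score = self._similarity(name, record["name"])
            if score >= self.threshold and score > best_score:
                best_ucid, best_score = ucid, score
        return best_ucid

    def get_or_create(self, name, zip_code):
        """Reuse an existing UCID when a match is found, else issue one."""
        ucid = self.find(name, zip_code)
        if ucid is None:
            ucid = str(uuid.uuid4())  # random; derived from no customer data
            self.customers[ucid] = {"name": name, "zip": zip_code}
        return ucid

    def sample_for_review(self, k, seed=None):
        """Pick up to k customers at random for a manual spot check."""
        rows = list(self.customers.items())
        return random.Random(seed).sample(rows, min(k, len(rows)))


registry = UCIDRegistry()
bob = registry.get_or_create("Bob Jones", "12345")
bobby = registry.get_or_create("Bobby Jones", "12345")  # fuzzy match: reuses bob's UCID
other = registry.get_or_create("Bobby Jones", "99999")  # different zip: new UCID
```

Note that a rule this simple would also merge two different people with similar names in the same zip code, such as a parent and child. That is exactly the kind of false match the random spot checks are meant to catch, and a real system would likely compare more fields before deciding.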

Have any stories or challenges around UCIDs? Let us know; we’re happy to help.