What Is Identity Resolution? How It Works and Why You Need It

Identity resolution is critical to customer-centric marketing and to providing exceptional customer experiences. Here’s what you need to know about how it works, who needs it, and how to get started.

What is identity resolution?

Identity resolution is the process of creating a unified record for a person. In its simplest form, it can just be joining two datasets together on a common piece of personal information, like an email address, phone number, or ID. However, it can get complex quickly when there are many datasets to link together. Contradictory data points and data quality issues can torpedo even the most clever plan. Even the question “What is a person?” is not straightforward.
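The simplest form described above, joining two datasets on one shared field, can be sketched in a few lines of Python. This is a minimal illustration, not a production approach: the sample rows, the field names (crm_id, rewards_id, email), and the join_on helper are all assumptions made up for the example.

```python
# Hypothetical sample data; field names are assumptions for illustration.
crm_rows = [
    {"crm_id": "C1", "email": "ann@example.com", "phone": "555-0100"},
    {"crm_id": "C2", "email": "bob@example.com", "phone": "555-0101"},
]
rewards_rows = [
    {"rewards_id": "R9", "email": "ann@example.com"},
]


def join_on(left, right, key):
    """Inner-join two lists of dicts on a shared key, ignoring missing keys."""
    index = {}
    for row in right:
        if row.get(key):
            index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        # Rows without a value for the key never match anything.
        for match in index.get(row.get(key), []):
            joined.append({**row, **match})
    return joined


unified = join_on(crm_rows, rewards_rows, "email")
```

Here only the first CRM row finds a partner in the rewards data. Everything that follows in this article is about what to do when a single clean join like this is not enough.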

How does identity resolution work? An example

Here are seven steps to perform successful identity resolution. Let’s take the example of emailing a credit card rewards program offer to someone who has abandoned their cart. You want to suppress the campaign for people who are already rewards members.

1. Identify datasets
First, you’ll want to identify your business objective and the datasets you need to support it.

In this example you’ll need web form data to get abandoned carts; CRM data to get email addresses, phone numbers, and street addresses; and rewards program data to know who is a rewards member.

Here are four sample records: two from web forms, one from the rewards data, and one from CRM. A quick visual inspection suggests that the record in the upper left is one person and that the remaining three records should be merged into a second person.

2. Choose Personal Information to merge datasets
Then you’ll want to analyze each dataset to determine which pieces of personal information can be used to merge them together.

When considering which attributes can be used to link the data sets, consider the trustworthiness or reliability of each attribute. For example, if a data set has three email fields—one of which is the primary—will it always have a unique value? Also, if two data sets have a CRM (or another system) ID, are they both referring to the same ID record as the CRM, or is one referring to something else (or is it possibly unpopulated)? This exercise is crucial before finalizing the attributes that will be used for merging.

In this example, email address, phone number, and the CRM User ID would be great candidates to use to merge datasets.

First and last names are not good choices, as many people can have the same name. Diminutives, such as Jim and Jimmy for James, complicate name matching. Non-English names might have characters with diacritics, like an acute accent or umlaut, that might not be represented the same way in every dataset.

Street addresses can be unique identifiers, but the same address can be written in several ways: “st,” “st.,” and “street” all mean the same thing, and directional prefixes and suffixes like “NE” are frequently omitted.

In this example, it is likely that every column is nullable. The web form data will only have the User ID if the user has logged in. All other fields will only be available if they have been filled out.
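One way to judge how trustworthy a candidate attribute is, sketched here under assumptions, is to profile how often it is populated and how often its values repeat. The profile_column helper and the sample rows below are hypothetical; the right thresholds depend on your data.

```python
def profile_column(rows, column):
    """Summarize how well populated and how unique a candidate key is."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v not in (None, "")]
    return {
        # Share of rows where the column has a value at all.
        "fill_rate": len(present) / len(values) if values else 0.0,
        # Number of different values seen.
        "distinct": len(set(present)),
        # Populated rows beyond the first occurrence of each value.
        "duplicated": len(present) - len(set(present)),
    }


# Hypothetical web form rows; user_id is only set when the user logged in.
web_form_rows = [
    {"user_id": "U1", "email": "ann@example.com"},
    {"user_id": None, "email": "bob@example.com"},
    {"user_id": None, "email": "bob@example.com"},
    {"user_id": None, "email": None},
]

email_profile = profile_column(web_form_rows, "email")
user_id_profile = profile_column(web_form_rows, "user_id")
```

A low fill rate is not disqualifying on its own, since a sparsely populated ID can still be a very reliable link when it is present.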

In the example below the highlighted fields have been selected as the merging personal information, but as you can see the unification did not go well as there are some more steps needed to clean the data.

3. Remove bogus values
Bogus values can wreak havoc in an identity resolution algorithm. Imagine that all records with a given email address are considered to be the same customer. While this seems like a safe assumption, if the data contains string representations of NULL, like “Unspecified,” “Unknown,” or “Not Provided,” everyone whose email field holds one of those values would be considered the same person. A good way to audit for bogus values is to review values with very high occurrence counts. Adding regular expressions to check data format can also be useful to sniff out filler data.
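Both audits can be sketched with the Python standard library. The placeholder list below extends the three strings named above with a few common variants, which are assumptions; the email pattern is deliberately loose and only meant to catch obvious filler, not to validate addresses.

```python
import re
from collections import Counter

# "unspecified", "unknown", "not provided" come from the text above;
# the remaining entries are assumed variants.
PLACEHOLDERS = {"", "unspecified", "unknown", "not provided", "null", "none", "n/a"}

# Loose shape check: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def clean_email(value):
    """Return the email, or None if it is a placeholder or malformed."""
    if value is None:
        return None
    text = value.strip()
    if text.lower() in PLACEHOLDERS or not EMAIL_PATTERN.match(text):
        return None
    return text


def frequent_values(values, threshold):
    """List (value, count) pairs occurring at least `threshold` times."""
    counts = Counter(v for v in values if v is not None)
    return [(v, n) for v, n in counts.most_common() if n >= threshold]


raw_emails = ["ann@example.com", "Unspecified", "Unspecified", "Unspecified", None]
suspects = frequent_values(raw_emails, threshold=3)
cleaned = [clean_email(v) for v in raw_emails]
```

Reviewing the output of frequent_values by hand is still worthwhile, because some high-count values (a shared office phone number, for example) are real but just as harmful as placeholders when used as merge keys.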

4. Standardize Personal Information
Since we want to use the personal information columns as key dimensions to merge datasets, it is imperative to make sure the formats are identical. For case-insensitive columns, pick a convention of always representing that value as uppercase or lowercase. Trim off any extraneous white space or quoting characters that may be present.

Values like phone numbers are often represented in various formats, sometimes containing country codes or formatting characters like parentheses, spaces, and dashes. Pick a convention and represent all values in that convention.
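A minimal sketch of these conventions for emails and phone numbers follows. The choices here are assumptions, not requirements: emails are lowercased, and phone numbers are reduced to digits with a leading plus sign, treating any 10-digit number as belonging to an assumed default country code of 1.

```python
import re


def normalize_email(value):
    """Trim whitespace and stray quotes, then lowercase."""
    if value is None:
        return None
    cleaned = value.strip().strip("'\"").strip().lower()
    return cleaned or None


def normalize_phone(value, default_country_code="1"):
    """Keep digits only and prefix a country code when one is missing.

    Assumes a 10-digit number is a national number for the default
    country; other lengths are passed through with a leading '+'.
    """
    if value is None:
        return None
    digits = re.sub(r"\D", "", value)
    if not digits:
        return None
    if len(digits) == 10:
        digits = default_country_code + digits
    return "+" + digits


same_phone = normalize_phone("(555) 010-0199") == normalize_phone("+1 555-010-0199")
same_email = normalize_email('  "Ann@Example.COM" ') == normalize_email("ann@example.com")
```

Whatever convention you pick matters less than applying the same one to every dataset before merging.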

Any values for names and addresses should be checked for non-English characters to ensure that the encoding is representing those characters identically in each dataset.
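To illustrate the encoding concern, the same accented name can be stored either as a single code point or as a base letter plus a combining mark, and the two do not compare equal until they are normalized. The sketch below uses Python's unicodedata module; the helper names are hypothetical, and stripping accents entirely is an optional, lossy choice shown only for comparison.

```python
import unicodedata


def normalize_text(value):
    """Compose characters (NFC) and fold case for comparison."""
    return unicodedata.normalize("NFC", value).casefold()


def strip_accents(value):
    """Lossy alternative: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


composed = "Jos\u00e9"      # "e with acute" as one code point
decomposed = "Jose\u0301"   # plain "e" followed by a combining acute accent

equal_before = composed == decomposed
equal_after = normalize_text(composed) == normalize_text(decomposed)
```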

For street addresses, there are APIs that can be used to standardize the address format.
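Where an address API is not available, a crude local normalization can still remove the most common differences. The sketch below is only an approximation under stated assumptions: the abbreviation table is a small hypothetical sample, “st” is always read as “street” (it could also mean “saint”), and dropping directionals is optional because it can merge genuinely different addresses.

```python
import re

# Small illustrative table; a real one would be much longer.
SUFFIXES = {
    "st": "street",
    "ave": "avenue",
    "rd": "road",
    "blvd": "boulevard",
    "dr": "drive",
}
DIRECTIONALS = {"n", "s", "e", "w", "ne", "nw", "se", "sw"}


def normalize_address(value, drop_directionals=False):
    """Lowercase, remove punctuation, and expand common abbreviations."""
    tokens = re.sub(r"[^\w\s]", " ", value.lower()).split()
    result = []
    for token in tokens:
        if drop_directionals and token in DIRECTIONALS:
            continue
        result.append(SUFFIXES.get(token, token))
    return " ".join(result)


a = normalize_address("123 Main St.")
b = normalize_address("123 MAIN STREET")
c = normalize_address("123 Main St NE", drop_directionals=True)
```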

Let’s look at the example again. Replacing the ‘Unspecified’ string with a true NULL and standardizing emails and phone numbers completes a successful resolution. However, there are still duplicate address records that look like the same address.

Standardizing the address in this example gives an additional link between the web form and CRM data, in addition to resolving duplicate entries in the resolved record.

5. Build a resolution algorithm
Once the bogus values have been sanitized and merging personal information has been standardized, you’re ready to merge the data. Commonly, when building an identity resolution algorithm, there won’t be one piece of personal information that is present in every dataset. New customer IDs can be created by generating random UUIDs or hashing existing columns. Your algorithm will need to collect every record that shares a common piece of personal information and assign it the same customer ID.
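One common way to implement “collect every record that shares a common piece of personal information” is a union-find (disjoint set) structure, which also links records transitively: if A shares an email with B and B shares a phone number with C, all three get the same ID. This is a sketch of that technique, not a description of any particular product's algorithm. The record layout and the resolve_identities function are assumptions, and the input values are presumed to be already cleaned and standardized.

```python
import uuid


def resolve_identities(records, keys):
    """Return one customer ID per record; linked records share an ID."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # (key, value) -> index of first record holding it
    for i, record in enumerate(records):
        for key in keys:
            value = record.get(key)
            if value is None:
                continue  # NULLs must never link records together
            token = (key, value)
            if token in first_seen:
                union(first_seen[token], i)
            else:
                first_seen[token] = i

    ids_by_root = {}
    customer_ids = []
    for i in range(len(records)):
        root = find(i)
        if root not in ids_by_root:
            ids_by_root[root] = str(uuid.uuid4())
        customer_ids.append(ids_by_root[root])
    return customer_ids


# Hypothetical records mirroring the four-record example.
records = [
    {"source": "web", "email": "ann@example.com", "phone": None, "crm_id": None},
    {"source": "web", "email": "bob@example.com", "phone": "+15550100101", "crm_id": None},
    {"source": "rewards", "email": None, "phone": "+15550100101", "crm_id": "C2"},
    {"source": "crm", "email": "robert@example.com", "phone": None, "crm_id": "C2"},
]
customer_ids = resolve_identities(records, keys=["email", "phone", "crm_id"])
```

The first record stays on its own, while the other three end up with one shared ID even though no single field is present in all of them. Random UUIDs change on every run, so hashing a stable column is the alternative if IDs must be reproducible.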

The data model used for storing the resolved customer ID and all of its personal information should be capable of storing multiple values for a property, since many people may have multiple phone numbers, email addresses, etc. It’s a good idea to store dates with each value for personal information, in case you need to arbitrate the multiple values for a property down to the most recent one.
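A data model along these lines could look like the following sketch. The class and field names are hypothetical; the point is only that each property holds a list of dated observations and that arbitration down to one value is a separate step.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ObservedValue:
    value: str
    seen_on: date   # when this value was recorded
    source: str     # dataset it came from, e.g. "crm"


@dataclass
class CustomerProfile:
    customer_id: str
    # Property name -> every value observed for it.
    attributes: dict = field(default_factory=dict)

    def add(self, name, value, seen_on, source):
        self.attributes.setdefault(name, []).append(
            ObservedValue(value, seen_on, source)
        )

    def most_recent(self, name):
        """Arbitrate multiple values down to the latest one, if any."""
        observed = self.attributes.get(name, [])
        if not observed:
            return None
        return max(observed, key=lambda item: item.seen_on).value


profile = CustomerProfile(customer_id="cust-1")
profile.add("email", "bob@example.com", date(2021, 3, 1), "web")
profile.add("email", "robert@example.com", date(2023, 6, 15), "crm")
latest_email = profile.most_recent("email")
```

Keeping the source alongside the date also leaves room for other arbitration rules, such as preferring CRM values over web form values.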

6. Analyze the results
After you’ve run the identity resolution algorithm, check whether the results make sense. If you see customers with large numbers of unique values for a given piece of personal information, like a customer with 15 phone numbers and 12 different last names, then you probably have bogus data that is combining disparate customers into one.
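This check can be automated by counting distinct values per resolved customer and flagging anyone above a limit. The flag_suspicious helper, the row layout, and the limit of two used in the sample call are all assumptions for illustration; a sensible limit depends on the attribute.

```python
from collections import defaultdict


def flag_suspicious(rows, id_field, attribute, max_distinct):
    """Map customer ID -> distinct value count, for counts above the limit."""
    distinct = defaultdict(set)
    for row in rows:
        value = row.get(attribute)
        if value is not None:
            distinct[row[id_field]].add(value)
    return {
        customer_id: len(values)
        for customer_id, values in distinct.items()
        if len(values) > max_distinct
    }


# Hypothetical resolved rows: cust-2 has collected three phone numbers.
resolved_rows = [
    {"customer_id": "cust-1", "phone": "+15550100100"},
    {"customer_id": "cust-2", "phone": "+15550100101"},
    {"customer_id": "cust-2", "phone": "+15550100102"},
    {"customer_id": "cust-2", "phone": "+15550100103"},
    {"customer_id": "cust-2", "phone": None},
]
flagged = flag_suspicious(resolved_rows, "customer_id", "phone", max_distinct=2)
```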

7. Enhance original datasets with the resolved customer ID
Finally, you’re ready to inject your resolved customer ID into the original datasets. From there you can run your campaigns.
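For the rewards offer example, once every dataset carries the resolved customer ID, building the campaign audience becomes a set difference: customers with an abandoned cart, minus customers who are already rewards members. The rows and the campaign_audience function below are hypothetical.

```python
# Hypothetical datasets, already enhanced with the resolved customer ID.
web_rows = [
    {"customer_id": "cust-1", "abandoned_cart": True},
    {"customer_id": "cust-2", "abandoned_cart": True},
    {"customer_id": "cust-3", "abandoned_cart": False},
]
rewards_rows = [
    {"customer_id": "cust-2"},
]


def campaign_audience(web_rows, rewards_rows):
    """Customers who abandoned a cart and are not rewards members."""
    members = {row["customer_id"] for row in rewards_rows}
    abandoned = {row["customer_id"] for row in web_rows if row["abandoned_cart"]}
    return sorted(abandoned - members)


audience = campaign_audience(web_rows, rewards_rows)
```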

Who needs it and what are the benefits of identity resolution?

So who can benefit from identity resolution and where/how is it best used? The simple answer is that identity resolution benefits marketers and anyone trying to move their customers through the sales funnel.

1. Customer-centric marketing
Expanding on that, identity resolution is critical to customer-centric marketing and to providing exceptional customer experiences. Each one of us wants the brands we love to know us, i.e., to know what we like and don’t like, and to reach us at the right time, with the right message, on our preferred channel.

This could look like an email reminder of what I left in my cart with a 10% me-only promo code to get me to complete that transaction; sending me a push offer on my phone when I’m near a physical store; having my loyalty info ready to apply when I’m dining at my favorite restaurant (instead of asking me for it for the hundredth time); recommending similar articles to ones I’ve been reading on your site, while I’m on-site; or having the call center representative know about my last few transactions and preferences so that when I call in, I don’t have to wait for 5 minutes to “pull up that information.”

2. Excellent experiences in non-marketing scenarios
This benefit also applies in non-marketing scenarios, where your customer is internal. For example, when two companies merge, linking identities across systems and creating a unified view would provide critical input into decisions such as which systems to retire, system access, how many people will be affected by an upgrade or change, notifications, etc.

3. Privacy and regulation compliance
Identity resolution can help with compliance with privacy laws. It can assist in fulfilling data deletion requests. It can also help determine which region a person is in, so you know which regulations apply to their data.

4. Real-time insights
If you are taking the time and effort to bring together all the data you have on your customers in a single system, say a Customer Data Platform (CDP), and then perform identity resolution to get those unique customer profiles, you have the identity and behavioral information necessary to market at a 1:1 level with your customers, i.e., hyper-personalization! Note: this will only be effective if it’s done in real time. For example, sending that “Get it Now with 10% off” email a week later is a waste of marketing dollars because, at that point, the customer may have already bought the item somewhere else or lost interest.

5. Stellar business impact
The deeper understanding of and insight into your customers that identity resolution brings will help make your marketing efforts more effective. If you have the right people, process, and technology in place, it will also help make your efforts more efficient. That means better ROI, better Return on Ad Spend (ROAS), and increased customer retention. Not only can you market to customers more efficiently, but you can also avoid bad practices: poor retargeting (e.g., sending recommendations for a product that they’ve already purchased), violating compliance rules, sending emails after they’ve opted out, and the many other completely preventable ways in which a brand can lose a customer.

Taken together, these benefits give your organization a competitive advantage.

Is identity resolution privacy-friendly?

It depends on how it is being used.

On one hand, no. It can be used to de-anonymize activities your customers are doing, and they may be unaware that they are being tracked.

On the other hand, it can help greatly with compliance with privacy laws. As a patchwork of legislation crops up in various regions, knowing where your customers are is critical to understanding which laws apply to their data. If a person requests that you delete their data, having a map of all of their identities and a single universal ID can streamline these requests. It can also enable cross-channel opting out: if someone opts out of email, you could opt them out of text messages and any secondary email addresses as well.

How to get started

There’s no single piece of software or tool that magically performs identity resolution. Because it is based on your business, how you store data, what you trust within that data, and how you want to use that data, you need a trusted partner who can build the right, custom solution for your business.

We are that organization.

Fill out the form below today to get started down your journey with Identity Resolution.
