
Data Cloud Match Rules vs. Salesforce Duplicate Rules

By Lucy Mazalon

Data Cloud takes care of identity resolution. In short, this means working out a single view of one individual. What the cocktail of systems in your tech stack portrays is likely different to reality; an individual in real life is one person, but they could have several ‘identities’ across your organization’s databases.

Consider the variation that one individual could have when engaging on all the platforms in your organization. If my name is Rebecca, I could refer to myself as ‘Rebecca’, ‘Becky’, or ‘Becca’ depending on the situation or my mood. I could have also moved house, and so have different mailing addresses – however, I would be the same person.

You may be familiar with Salesforce duplicate and matching rules; Data Cloud’s rules can take this further. In this guide, we’ll share an overview of how the two work.

Salesforce Duplicate Rules

In order to fully understand duplicate rules, you need to learn about how matching rules come into the picture.

Matching rules identify which field to compare and how to compare it. For example, “Email Field, Exact Match” or “Account Name, Fuzzy Match”. Matching rules alone don’t do anything. Think of them as a recipe without a chef.

Duplicate rules use those matching rules to control when and where to look for duplicates. For example, “Use Account Name, Fuzzy Match” to find duplicates on the account object upon creation, or “Use Email, Exact Match” to find duplicates on leads and contacts upon create and edit. Think of this as a chef with the recipe (i.e. the matching rule).
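To make the recipe/chef split concrete, here is a minimal Python sketch. It is not how Salesforce implements these rules; the class names, the "exact"/"fuzzy" labels, the lowercasing of values, and the 0.85 threshold are all assumptions for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher

Record = dict[str, str]


@dataclass(frozen=True)
class MatchingRule:
    """The 'recipe': which field to compare and how. Does nothing on its own."""
    field: str
    method: str              # "exact" or "fuzzy" (hypothetical labels)
    threshold: float = 0.85  # illustrative cut-off for fuzzy comparisons

    def matches(self, a: Record, b: Record) -> bool:
        # Assumption: values are compared after trimming and lowercasing.
        left = a.get(self.field, "").strip().lower()
        right = b.get(self.field, "").strip().lower()
        if not left or not right:
            return False
        if self.method == "exact":
            return left == right
        return SequenceMatcher(None, left, right).ratio() >= self.threshold


@dataclass(frozen=True)
class DuplicateRule:
    """The 'chef': decides when (events) and where (object) to apply the recipe."""
    object_name: str
    events: frozenset[str]   # e.g. {"create", "edit"}
    matching_rule: MatchingRule

    def find_duplicates(self, event: str, new: Record, existing: list[Record]) -> list[Record]:
        if event not in self.events:
            return []  # the rule is not active for this event
        return [r for r in existing if self.matching_rule.matches(new, r)]


email_exact = MatchingRule(field="Email", method="exact")
lead_rule = DuplicateRule("Lead", frozenset({"create", "edit"}), email_exact)

existing_leads = [{"Email": "rebecca@example.com"}, {"Email": "sam@example.com"}]
hits = lead_rule.find_duplicates("create", {"Email": "Rebecca@Example.com"}, existing_leads)
print(hits)
```

The matching rule only answers "do these two records match?", while the duplicate rule decides whether the question gets asked at all.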


The limitation is that these are strictly rules-based. Traditional deduplication rules may pick up that I’m ‘Rebecca’, but it would be challenging to build a fuzzy match that also accounts for ‘Becky’, ‘Becca’, and any other variation.
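A rough way to see the problem is to score a few name variations with a generic string-similarity measure. This sketch uses Python's standard `difflib`, not Salesforce's fuzzy matching algorithm, and the 0.85 threshold is an arbitrary assumption.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive character similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_fuzzy_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Treat two names as the same when their similarity clears the threshold."""
    return similarity(a, b) >= threshold


# A typo ("Rebeca") versus two nicknames of the same person.
for candidate in ["Rebeca", "Becca", "Becky"]:
    score = similarity("Rebecca", candidate)
    print(candidate, round(score, 2), is_fuzzy_match("Rebecca", candidate))
```

A typo stays close to the original spelling, while a nickname like ‘Becky’ shares few characters with ‘Rebecca’. Lowering the threshold far enough to catch it would also start matching unrelated names.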

Data Cloud Doesn’t Merge Records

It’s important to remember that Data Cloud doesn’t merge records – the records will still exist in the source system as they were when they were ingested. What Data Cloud is doing is compiling records together to render a ‘golden record’, which can be leveraged in the activation stages.
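The idea of compiling rather than merging can be sketched as follows. This is a hypothetical illustration, not Data Cloud's data model: `UnifiedProfile`, `unify`, the record shape, and the "first non-empty value wins" choice are all assumptions.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class UnifiedProfile:
    """A hypothetical 'golden record' that points back to its source records."""
    unified_id: str
    source_keys: list = field(default_factory=list)   # (source system, record id) pairs
    attributes: dict = field(default_factory=dict)


def unify(unified_id: str, records: list[dict]) -> UnifiedProfile:
    """Compile a profile from source records without modifying them."""
    profile = UnifiedProfile(unified_id)
    for record in records:
        profile.source_keys.append((record["source"], record["id"]))
        for name, value in record["fields"].items():
            # Naive assumption: the first non-empty value seen is kept.
            if value and name not in profile.attributes:
                profile.attributes[name] = value
    return profile


crm = {"source": "crm", "id": "003A", "fields": {"first_name": "Rebecca", "city": ""}}
shop = {"source": "ecommerce", "id": "u-17", "fields": {"first_name": "Becky", "city": "Leeds"}}

before = deepcopy([crm, shop])
profile = unify("unified-1", [crm, shop])

print(profile.attributes)
print(profile.source_keys)
print([crm, shop] == before)  # the source records are left as they were
```

The profile is a separate object that links back to both source records, so nothing is overwritten or deleted in the systems the data came from.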

Salesforce duplicate and matching rules, by contrast, are intended to clean up obvious duplicates before they enter Data Cloud.

Note how I said “obvious” duplicates. Every organization defines “duplicate” differently, which makes the situation less straightforward. In some organizations, there are deliberate duplicates, typically seen in Salesforce orgs with strict (private) sharing rules between business units or regional divisions. How your organization defines duplicates is up to you, but once that definition is agreed, duplicate rules need to be put in place to prevent and purge actual duplicates.

Duplicate records pose great risks to businesses. They mislead users who have to sift through multiple records, skew reporting, and jeopardize the customer/prospect experience if sales or service teams get their ‘wires crossed’ with colleagues who may already be working on that opportunity or case. That’s why you need to take duplicate rules seriously, to ‘nip that in the bud’ before you begin introducing other data sources.


Data Cloud Match Rules

Assuming you have a clean Salesforce database (i.e. one without preventable duplicates), you can now introduce your CRM data into Data Cloud. If preventable duplicates still exist, you’re essentially making the identity resolution effort harder than it needs to be.

Data goes into Data Cloud, becoming a data stream. The reason why these are ‘streams’ is that data is continually being pumped into Data Cloud according to the frequency you define. Here, your CRM data joins forces with data from various other sources, such as your eCommerce platform. 

As mentioned, Data Cloud’s rules can take this further. There are two broad categories to how Data Cloud performs matching:

  • Deterministic: There is no doubt that the data points belong to the same individual, even if there is a difference in capitalization (i.e. not case-sensitive). This is comparable to exact matches in Salesforce duplicate rules.
  • Probabilistic: This is comparable to fuzzy matches in Salesforce duplicate rules. It caters to the nuances in how people represent themselves in their data, such as abbreviations and nicknames. These rules can be set to a precision level (high, medium, low) depending on how much freedom you want the match rules to have.
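The two categories above can be sketched in a few lines of Python. This is only an illustration of the concepts: the nickname table, the function names, and the thresholds attached to each precision level are assumptions, not Data Cloud's actual algorithm or values.

```python
from difflib import SequenceMatcher

# Hypothetical nickname table; a real one would be far larger.
NICKNAMES = {"becky": "rebecca", "becca": "rebecca", "bob": "robert"}

# Illustrative thresholds only; not Data Cloud's actual values.
PRECISION_THRESHOLDS = {"high": 0.9, "medium": 0.8, "low": 0.7}


def deterministic_match(a: str, b: str) -> bool:
    """Exact match that ignores capitalization and surrounding whitespace."""
    return a.strip().casefold() == b.strip().casefold()


def canonical_name(name: str) -> str:
    """Map a known nickname to its full form; otherwise return the cleaned name."""
    key = name.strip().casefold()
    return NICKNAMES.get(key, key)


def probabilistic_match(a: str, b: str, precision: str = "medium") -> bool:
    """Compare canonical names and accept scores above the chosen precision level."""
    score = SequenceMatcher(None, canonical_name(a), canonical_name(b)).ratio()
    return score >= PRECISION_THRESHOLDS[precision]


print(deterministic_match("Rebecca@Example.com", "rebecca@example.com"))
print(deterministic_match("Becky", "Rebecca"))
print(probabilistic_match("Becky", "Rebecca", "high"))
print(probabilistic_match("Rebeka", "Rebecca", "high"))
print(probabilistic_match("Rebeka", "Rebecca", "low"))
```

In this sketch, a lower precision level gives the rule more freedom: a looser spelling such as ‘Rebeka’ is accepted at low precision but rejected at high, while a known nickname resolves to the same canonical name at any level.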

Data Cloud Reconciliation Rules

Again, comparing this concept to Salesforce deduplication: if you’ve ever used a third-party tool, you will know that you can apply certain ‘rules of thumb’ to help deduplicate en masse. Reconciliation rules play a similar role in Data Cloud. They determine which value should be used for a field when there are multiple values to choose from. Reconciliation rules can use:

  • Last updated
  • Most frequent 
  • Source priority

Note: Reconciliation rules can only be used on certain types of fields (text, number, alphanumeric) – in other words, they can’t be used for phone or email fields. 
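The three options listed above can be sketched as simple selection functions. This is a minimal illustration under assumptions: the `Candidate` structure, function names, source names, and sample values are hypothetical and not taken from Data Cloud.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Candidate:
    """One possible value for a field, with where and when it came from."""
    value: str
    source: str
    updated_at: datetime


def last_updated(candidates: list[Candidate]) -> str:
    """Pick the value from the most recently updated record."""
    return max(candidates, key=lambda c: c.updated_at).value


def most_frequent(candidates: list[Candidate]) -> str:
    """Pick the value that appears most often across sources."""
    return Counter(c.value for c in candidates).most_common(1)[0][0]


def source_priority(candidates: list[Candidate], priority: list[str]) -> str:
    """Pick the value from the highest-ranked source; unranked sources come last."""
    rank = {source: position for position, source in enumerate(priority)}
    return min(candidates, key=lambda c: rank.get(c.source, len(rank))).value


first_names = [
    Candidate("Rebecca", "crm", datetime(2023, 1, 10)),
    Candidate("Becky", "ecommerce", datetime(2024, 3, 2)),
    Candidate("Rebecca", "support", datetime(2022, 6, 1)),
]

print(last_updated(first_names))                          # Becky
print(most_frequent(first_names))                         # Rebecca
print(source_priority(first_names, ["crm", "ecommerce"])) # Rebecca
```

The same three source values produce different answers depending on the rule chosen, which is why the rule is worth deciding field by field.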

Summary

The main takeaway is that Salesforce duplicate rules work on your Salesforce CRM data, whereas Data Cloud identity resolution (that is, the match rules and reconciliation rules) works on any data source you connect.

While Salesforce duplicate rules prevent the creation of duplicate records based on rule criteria, Data Cloud allows multiple records to co-exist for the same individual. The identity resolution process determines which attributes to use in a unified profile to represent the individual while maintaining all source data and lineage, instead of flattening the data into a super record. As the source data changes for an individual, the identity resolution process updates the unified profile to provide the best representation of that individual, based on “moment-in-time” data.

Remember the saying: ‘garbage in, garbage out’. Clean your Salesforce CRM data before you bring it into Data Cloud, or you could end up exacerbating the problem there. Get rid of preventable duplicates (i.e. ones that have been accidentally created and not caught by existing duplicate/matching rules).

There’s no denying that deduplication is a tedious exercise. While ‘rules of thumb’ are useful, probabilistic matching is where Data Cloud shines.

The Author

Lucy Mazalon

Lucy is the Operations Director at Salesforce Ben. She is a 10x certified Marketing Champion and founder of The DRIP.
