CRM Core provides support for identifying duplicate contacts through CRM Core Match. It is designed to let administrators control the logical rules through which the system decides whether or not a contact already exists, and to pass back consistent information that other modules can use to control how records are handled.
Matching Engines and CRM Core Match
CRM Core Match is designed to allow administrators to configure logical matching rules for each contact type in the system. CRM Core Match does not do the matching itself. Instead, it processes contact information in an orderly way through one or more matching engines.
Matching engines are what actually identify the contacts. CRM Core ships with a module called CRM Core Default Matching Engine, which provides a basic matching engine that is appropriate for most Drupal installations. With it, you can identify duplicate contacts based on the contents of fields in the contact record, by assigning weights to each field and comparing them to a threshold score that indicates when a contact is a match.
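The weight-and-threshold idea described above can be sketched in a few lines. This is only an illustration in Python, not the code of CRM Core Default Matching Engine: the function names, the dictionary-based contact records, and the case-insensitive comparison are all assumptions made for the example.

```python
def match_score(incoming, existing, weights):
    """Sum the weights of every field whose values are equal (ignoring case)."""
    score = 0
    for field, weight in weights.items():
        a = incoming.get(field)
        b = existing.get(field)
        # Only non-empty values on both sides can count as a match.
        if a and b and a.strip().lower() == b.strip().lower():
            score += weight
    return score


def is_match(incoming, existing, weights, threshold):
    """Treat a contact as a duplicate once its score reaches the threshold."""
    return match_score(incoming, existing, weights) >= threshold
```

For instance, with weights of 50 for email and 20 for city and a threshold of 50, two records sharing only an email address would count as a match in this sketch, while two records sharing only a city would not.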
CRM Core does not presume that simple field-based matching is the only way you might need to identify potential duplicates in a system. As noted above, CRM Core Match is capable of handling multiple matching engines. You don't have to use the Default Matching Engine; you can write something that handles contacts exactly the way you want.
There is complete documentation on how to create your own matching engine in the developer documentation for CRM Core.
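To make the relationship between CRM Core Match and its engines more concrete, here is a rough conceptual sketch in Python. It is not the CRM Core API (which is documented in the developer documentation): the `MatchingEngine` interface, the `find_matches` method, the toy `EmailEngine`, and the way results are merged are all hypothetical and only illustrate the idea of passing a contact through several engines in order.

```python
from typing import Protocol


class MatchingEngine(Protocol):
    """Hypothetical interface: an engine returns IDs of likely duplicates."""

    def find_matches(self, contact: dict) -> list[int]: ...


class EmailEngine:
    """Toy engine that matches on an identical email address."""

    def __init__(self, known: dict[int, dict]):
        self.known = known

    def find_matches(self, contact: dict) -> list[int]:
        email = contact.get("email", "").lower()
        return [cid for cid, c in self.known.items()
                if email and c.get("email", "").lower() == email]


def run_engines(engines: list[MatchingEngine], contact: dict) -> list[int]:
    """Run each enabled engine in order and merge results without repeats."""
    seen: list[int] = []
    for engine in engines:
        for cid in engine.find_matches(contact):
            if cid not in seen:
                seen.append(cid)
    return seen
```

The point of the design is that the dispatcher knows nothing about how an engine decides; it only collects what each engine reports.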
Configuring CRM Core Match
Here are the steps you would take to configure matching rules for contacts using CRM Core Match and CRM Core Default Matching Engine.
- Ensure that CRM Core Match and CRM Core Default Matching Engine are enabled on the modules page for your Drupal website.
- The screen for administering matching engines is available at Administration > Configuration > CRM Core > Matching Engines. Alternatively, you can go directly to admin/config/crm-core/match.
- The Matching Engines screen provides a list of all matching engines provided for your Drupal website. As a precaution, matching engines are not enabled by default. You should see a record for Default Matching Engine in the table on the screen. Click the link that says 'Enable' in order to turn it on.
- Once Default Matching Engine is enabled, you can configure it by clicking the link for Configuration, all the way to the right. This will take you to a screen allowing you to configure the matching rules for specific contact types.
- Select a contact type. You will be brought to a screen that allows you to select the fields that will be used for matching.
- On the Matching Rules for Contact Type screen, you will be presented with a series of options that control how duplicate records are identified. For matching to work effectively, complete the following for each contact type.
- Check the box at the top that says 'Enable Matching for this Contact Type.' This will allow the matching engine to actually find matches for this specific contact type.
- Select a threshold. This should be a positive whole number. A contact is identified as a match when the combined weight of its matching fields reaches this value.
- Select a sorting option. Occasionally, CRM Core Match might need to decide between a few records with the same score. The sorting option will allow CRM Core to decide how to break a tie.
- Select fields for matching from the list below. To enable matching on a specific field, check the box to the left of the field name. Select a logical operator for each field. Most field types are capable of identifying exact and fuzzy matches.
- For each field you have selected, assign a weight to be used in calculating matches. This should be a positive, whole number that will be used to create a score for each matching contact.
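The settings chosen in the steps above can be pictured as a small data structure. The following Python sketch is only a model for illustration: the class names, the `sort_by` key, and the "lowest value wins" tie-break are assumptions, not the way CRM Core stores or applies its configuration.

```python
from dataclasses import dataclass, field


@dataclass
class FieldRule:
    operator: str  # "exact" or "fuzzy" in this sketch
    weight: int


@dataclass
class MatchingRules:
    enabled: bool
    threshold: int
    sort_by: str  # hypothetical tie-breaker key, e.g. "created"
    fields: dict = field(default_factory=dict)

    def validate(self):
        """Return a list of problems; an empty list means the rules look sane."""
        problems = []
        if not (isinstance(self.threshold, int) and self.threshold > 0):
            problems.append("threshold must be a positive whole number")
        for name, rule in self.fields.items():
            if not (isinstance(rule.weight, int) and rule.weight > 0):
                problems.append(f"weight for {name} must be a positive whole number")
        return problems


def rank(candidates, rules):
    """Order candidates by score (highest first), breaking ties with sort_by."""
    return sorted(candidates, key=lambda c: (-c["score"], c[rules.sort_by]))
```

Here `rank` shows why a sorting option matters: two candidates with the same score are ordered by the chosen key (in this sketch, the lower `created` value comes first).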
A Practical Example
Let's say you have the Individual contact type configured with the following fields:
- First name
- Last name
- Email address
- City
On the matching rules screen, you have some options for how to customize the way matches are identified. Assuming a threshold of 50 for the contact type, you can identify contacts in a number of different ways:
- Assign a weighted score of 50 for exact matches on the email address. This means CRM Core will identify any contacts with the same email address as a match 100% of the time.
- Assign a weighted score of 20 for the first name, and 30 for the last name. This means CRM Core will find any contacts with the same last name, but a last name match alone (30) falls short of the threshold. A matching first name brings the total to 50.
- Assign a weighted score of 20 for the city name. This means CRM Core will find any contacts with a matching city name, but will not identify them as a match without a little help from another field. Because a matching city (20) and last name (30) together reach the threshold, this is a great way to handle situations where the spelling of the first name does not exactly match the one recorded in the database.
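The arithmetic behind this example can be checked with a short script. It is a sketch only: the field keys such as `first_name` are made-up labels for the four fields above, and the script simply adds weights rather than calling anything in CRM Core.

```python
THRESHOLD = 50
WEIGHTS = {"email": 50, "first_name": 20, "last_name": 30, "city": 20}


def score(matching_fields):
    """Add up the weights of the fields that matched."""
    return sum(WEIGHTS[f] for f in matching_fields)


scenarios = {
    "email only": ["email"],
    "city only": ["city"],
    "last name only": ["last_name"],
    "first + last name": ["first_name", "last_name"],
    "city + last name": ["city", "last_name"],
    "city + first name": ["city", "first_name"],
}

for label, fields in scenarios.items():
    total = score(fields)
    verdict = "match" if total >= THRESHOLD else "no match"
    print(f"{label}: {total} -> {verdict}")
```

With these weights, an email match alone reaches the threshold, a last name needs either the first name or the city to get there, and city plus first name (40) still falls short.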