How to Deduplicate a Lead List Before Importing to HubSpot or Salesforce

Last updated: March 10, 2026 | 6 min read

Table of Contents

  1. Which column to deduplicate on for leads
  2. Step-by-step: clean a lead list before CRM import
  3. Reviewing duplicate groups before deleting
  4. Multi-column matching for B2B lists
  5. After deduplication: validate before import
  6. Frequently Asked Questions

You collected leads from five sources: a webinar registration form, a trade show scan, a LinkedIn export, a purchased list, and your own website form. Now you want to import them all into HubSpot or Salesforce. Before you do, you need to deduplicate.

CRMs handle duplicates differently — some merge automatically, some create separate records, some skip them. None of them do it well. A duplicate in your CRM means the same prospect getting two emails, two sales reps reaching out to the same company, and your pipeline metrics being inflated by phantom records.

The CSV Deduplicator cleans duplicates out of your lead list before it ever touches the CRM. You control what counts as a duplicate, the tool shows you each flagged group, and you download a clean list ready to import.

Choosing the Right Deduplication Key for Lead Lists

The deduplication key is the column (or columns) that defines uniqueness. For lead lists, you have a few options:

Email address: best for B2C and most B2B lists. Email is the most reliable unique identifier. One person = one email address (in most cases). Deduplicate on email first. The tool normalizes case, so two addresses that differ only in capitalization (for example, "Jane.Doe@Example.com" and "jane.doe@example.com") match.

Phone number — useful when email is missing. Some leads have phone but no email. Run a second deduplication pass on phone for records where email is blank. The tool normalizes phone formats, so "(555) 867-5309" and "5558675309" match.

Email OR phone (ANY mode). Select both columns and choose "Match on ANY selected column". A row is flagged as a duplicate if either the email OR the phone matches another row. This catches more duplicates but risks false positives if phone numbers are shared (like a company main line used by multiple contacts).

Email AND company (ALL mode). For enterprise B2B lists where you want one contact per company domain, match on both email domain and company name. More nuanced, but requires a column with the email domain already extracted.
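To make the normalization idea concrete, here is a minimal Python sketch of an email pass followed by a phone pass for rows with no email. It is an illustration, not the tool's actual implementation: the column names "email" and "phone" and all function names are assumptions, and the phone pass compares blank-email rows only with each other.

```python
import re

def normalize_email(value):
    """Trim and lowercase so case and whitespace variants compare equal."""
    return value.strip().lower()

def normalize_phone(value):
    """Keep digits only, so "(555) 867-5309" and "5558675309" compare equal."""
    return re.sub(r"\D", "", value)

def dedupe(rows, key_func):
    """Keep the first row per key; rows whose key is empty are never dropped."""
    seen, kept = set(), []
    for row in rows:
        key = key_func(row)
        if key and key in seen:
            continue
        if key:
            seen.add(key)
        kept.append(row)
    return kept

def email_key(row):
    return normalize_email(row.get("email", ""))

def phone_key_if_no_email(row):
    # Only rows with a blank email take part in the phone pass.
    return "" if email_key(row) else normalize_phone(row.get("phone", ""))

leads = [
    {"email": "Jane@Example.com", "phone": ""},
    {"email": " jane@example.com", "phone": ""},
    {"email": "", "phone": "(555) 867-5309"},
    {"email": "", "phone": "5558675309"},
]
result = dedupe(dedupe(leads, email_key), phone_key_if_no_email)
print(len(result))  # 2
```

Digit-only phone matching is deliberately simple: a number written with a country code, such as "+1 555 867 5309", would not match its ten-digit form in this sketch.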

The Full Lead List Deduplication Workflow

Here is the complete process, start to finish:

  1. Combine all your source CSVs into one file. If you have leads from five sources, merge them first. Use the CSV Merger to combine multiple CSVs with matching headers into one file.
  2. Standardize the email column name. Make sure all sources use the same column name for email. If one file has "email" and another has "Email Address", rename them to match before merging. The CSV Column Mapper handles this.
  3. Sort by the most complete/recent records first. If you have a date column, sort descending so the most recent record is first. The deduplicator keeps the first occurrence — sorting first means you keep the best version.
  4. Open the CSV Deduplicator. Drop in your merged CSV.
  5. Select the email column. Leave normalization options on (case, whitespace, phone).
  6. Click Find Duplicates. Review the duplicate groups panel — spot-check a few groups to confirm the matches are real.
  7. Download the deduplicated CSV.
  8. Optional: download the duplicates-only CSV to keep a record of what was removed.

Now import the clean, deduplicated CSV into your CRM.
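For readers who prefer to script the same workflow, the steps above can be sketched with Python's standard csv module. This is a rough equivalent under stated assumptions, not the tool's code: the column names "email" and "created", the sample data, and every function name are hypothetical, and dates are assumed to be in ISO format (YYYY-MM-DD) so they sort correctly as text.

```python
import csv
import io

def read_rows(csv_text, rename=None):
    """Parse CSV text, optionally renaming headers ("Email Address" -> "email")."""
    rename = rename or {}
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{rename.get(k, k): v for k, v in row.items()} for row in reader]

def split_duplicates(rows, key="email", sort_by="created"):
    """Return (clean, removed): the newest row per normalized key, and the rest."""
    ordered = sorted(rows, key=lambda r: r.get(sort_by, ""), reverse=True)
    seen, clean, removed = set(), [], []
    for row in ordered:
        k = row.get(key, "").strip().lower()
        if k and k in seen:
            removed.append(row)  # duplicates-only output
        else:
            if k:
                seen.add(k)
            clean.append(row)    # first occurrence is kept
    return clean, removed

def to_csv(rows, fieldnames):
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

# Two hypothetical sources with different header names for the email column.
webinar = "email,created\nJane@Example.com,2026-01-05\n"
website = ("Email Address,created\n"
           "jane@example.com,2026-02-20\n"
           "bob@example.com,2026-02-01\n")

rows = read_rows(webinar) + read_rows(website, rename={"Email Address": "email"})
clean, removed = split_duplicates(rows)
print(to_csv(clean, ["email", "created"]))
print(to_csv(removed, ["email", "created"]))
```

Because the rows are sorted newest first before the pass, the February record for Jane is kept and the January one lands in the removed list.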

Why You Should Review Groups Before Removing

The tool shows you each duplicate group — the row being kept and the rows being removed. This matters because automated deduplication is not perfect.

Common cases where you want to review before accepting:

Shared email addresses. A company's shared inbox (for example, "info@example.com") appears as the email for five different contacts from that company. All five will be flagged as duplicates of the first. But they may be five different people who share a company email. In this case, you might prefer to keep all of them and remove the actual duplicates manually.

Different people at the same company with similar names. If you are deduplicating on company name + last name, "John Smith at Acme" and "Jane Smith at Acme" match on both columns and would be flagged as duplicates, even though they are different people. Likewise, if two "John Smith" records belong to different companies whose names both normalize to "acme", you have a false positive.

The group review panel makes it fast to catch these cases. Spot-check a few groups, especially any where the "duplicate" rows look meaningfully different.
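If you are reviewing a large file, one way to decide which groups deserve a closer look is to check whether the rows in a group disagree on some other column. The sketch below illustrates that idea only; the "email" and "name" columns, the sample rows, and the function names are assumptions, and it is not how the tool's review panel works internally.

```python
from collections import defaultdict

def duplicate_groups(rows, key="email"):
    """Group rows by normalized key and return only groups with 2+ rows."""
    groups = defaultdict(list)
    for row in rows:
        k = row.get(key, "").strip().lower()
        if k:
            groups[k].append(row)
    return {k: g for k, g in groups.items() if len(g) > 1}

def needs_review(group, compare="name"):
    """True when rows in a group disagree on another column, e.g. the name."""
    values = {row.get(compare, "").strip().lower() for row in group}
    return len(values) > 1

rows = [
    {"email": "info@example.com", "name": "Ana Ruiz"},
    {"email": "info@example.com", "name": "Tom Lee"},
    {"email": "jo@example.com", "name": "Jo Park"},
    {"email": "JO@example.com", "name": "Jo Park"},
]

groups = duplicate_groups(rows)
for key, group in groups.items():
    label = "REVIEW" if needs_review(group) else "ok"
    kept, dropped = group[0], group[1:]
    print(label, key, "keep:", kept["name"], "remove:", [r["name"] for r in dropped])
```

Here the shared inbox group is marked for review because the two names differ, while the two "Jo Park" rows look like a true duplicate.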

Multi-Column Matching for Enterprise B2B Lead Lists

For B2B lead lists where you want one representative per company, single-column email deduplication is not enough — the same company might have ten legitimate contacts with different emails, all of which you want to keep.

Multi-column matching handles this. Select "Company" and "Job Title" and choose "Match on ALL selected columns". Now two rows are duplicates only if both the company name AND job title match — meaning you keep one VP of Marketing per company, but different roles at the same company remain separate.

Alternatively, if your goal is one contact per domain rather than per company name (to avoid "Acme" vs "Acme Corp" variations), extract the email domain into a separate column first, then deduplicate on domain.
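Extracting the domain is a small transformation. A minimal sketch, assuming an "email" column and a new, hypothetical "email_domain" column:

```python
def email_domain(email):
    """Return the lowercased part after the last "@", or "" if there is none."""
    _, sep, domain = email.strip().lower().rpartition("@")
    return domain if sep else ""

def add_domain_column(rows, email_col="email", domain_col="email_domain"):
    """Return copies of the rows with the extracted domain added."""
    return [{**row, domain_col: email_domain(row.get(email_col, ""))} for row in rows]

rows = add_domain_column([
    {"email": "Ana@Acme.com", "company": "Acme"},
    {"email": "tom@acme.com", "company": "Acme Corp"},
])
print([r["email_domain"] for r in rows])  # ['acme.com', 'acme.com']
```

One caveat with domain-level matching: contacts using free mail providers share a domain without sharing an employer, so those rows are worth excluding from a domain pass.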

The "Match on ANY" vs "Match on ALL" toggle is powerful. Think through what "duplicate" means for your specific use case before clicking Find Duplicates.
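The difference between the two modes is easiest to see on a small example. The following sketch is one plausible reading of ANY and ALL matching, not the tool's actual algorithm: the function name, the column names, and the treatment of blank values (blank cells never cause a match) are all assumptions.

```python
def find_duplicates(rows, columns, mode="ALL"):
    """Return indexes of rows flagged as duplicates of an earlier row."""
    def normalize(value):
        return " ".join(value.lower().split())

    seen_all = set()                          # full tuples seen so far (ALL mode)
    seen_any = {col: set() for col in columns}  # per-column values (ANY mode)
    flagged = []
    for i, row in enumerate(rows):
        values = [normalize(row.get(col, "")) for col in columns]
        if mode == "ALL":
            key = tuple(values)
            is_dup = all(values) and key in seen_all
            seen_all.add(key)
        else:  # "ANY"
            is_dup = any(v and v in seen_any[c] for c, v in zip(columns, values))
            for c, v in zip(columns, values):
                if v:
                    seen_any[c].add(v)
        if is_dup:
            flagged.append(i)
    return flagged

rows = [
    {"company": "Acme", "title": "VP of Marketing"},
    {"company": "Acme", "title": "Sales Director"},
    {"company": "acme", "title": "vp of marketing"},
    {"company": "Globex", "title": "VP of Marketing"},
]
print(find_duplicates(rows, ["company", "title"], mode="ALL"))  # [2]
print(find_duplicates(rows, ["company", "title"], mode="ANY"))  # [1, 2, 3]
```

With ALL, only the second Acme VP of Marketing is flagged. With ANY, every later row is flagged because each one shares either a company or a title with an earlier row, which shows how quickly ANY mode can over-match on columns with common values.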

One More Step: Validate Email Addresses

Deduplication removes duplicate rows. It does not check whether email addresses are real, deliverable, or formatted correctly. Before importing a deduped lead list, run the emails through the Email Validator.

The Email Validator flags invalid and role-based addresses. Removing them before CRM import improves your bounce rate, protects your sender reputation if you send outreach, and keeps your pipeline metrics accurate.
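As a rough illustration of what a format check and a role-based check can look like, here is a small Python sketch. It is not the Email Validator's logic: the pattern is deliberately loose, the list of role-based local parts is a short illustrative sample, and no script of this kind can tell you whether a mailbox actually accepts mail.

```python
import re

# Loose shape check: something@something.tld, with no spaces or extra "@".
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Illustrative sample only; real role-based lists are much longer.
ROLE_LOCAL_PARTS = {"info", "sales", "support", "admin", "contact", "hello"}

def classify_email(email):
    """Return "invalid", "role-based", or "ok" for a single address."""
    value = email.strip().lower()
    if not EMAIL_PATTERN.match(value):
        return "invalid"
    local_part = value.split("@", 1)[0]
    if local_part in ROLE_LOCAL_PARTS:
        return "role-based"
    return "ok"

for address in ["jane@example.com", "Info@example.com", "jane@@example", "not-an-email"]:
    print(address, "->", classify_email(address))
```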

The full clean lead list pipeline: merge sources → deduplicate → validate emails → rename columns for CRM → import. Each step takes 2-5 minutes. The whole thing is under 20 minutes and produces a genuinely clean dataset.

Try It Free — No Signup Required

Runs 100% in your browser. No data is collected, stored, or sent anywhere.


Frequently Asked Questions

Should I deduplicate before or after merging multiple source CSVs?

After merging. Combine all your source files into one CSV first, then deduplicate the combined file. This catches cross-source duplicates — the same person appearing in both your webinar list and your purchased list.

My CRM already has 10,000 contacts. Will importing a deduped CSV create duplicates with existing records?

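The CSV Deduplicator only deduplicates within the CSV file itself. It does not know about records already in your CRM. For CRM-aware deduplication, export your existing CRM contacts and merge them with your new leads CSV, placing the CRM rows first and adding a source column that marks where each row came from. Deduplicate the combined file so the existing CRM row is the one kept in each group, then filter on the source column and import only the net-new records (those not already in the CRM).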
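If you are comfortable with a short script, the net-new filter can also be expressed directly as a lookup against the CRM export. This is a sketch under assumptions: both files are already loaded as lists of dictionaries, "email" is the matching column, and the function name is hypothetical.

```python
def net_new_leads(crm_rows, lead_rows, key="email"):
    """Return leads whose normalized key is not in the CRM export.

    Leads with a blank key are kept, since they cannot be matched either way.
    Repeats within the lead list itself are also dropped (first one wins).
    """
    def norm(row):
        return row.get(key, "").strip().lower()

    existing = {norm(r) for r in crm_rows if norm(r)}
    seen, fresh = set(), []
    for row in lead_rows:
        k = norm(row)
        if k and (k in existing or k in seen):
            continue
        if k:
            seen.add(k)
        fresh.append(row)
    return fresh

crm = [{"email": "jane@example.com"}]
leads = [
    {"email": "Jane@Example.com"},  # already in the CRM
    {"email": "bob@example.com"},   # net-new
    {"email": "BOB@example.com"},   # repeat within the lead list
]
print(net_new_leads(crm, leads))  # [{'email': 'bob@example.com'}]
```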

How do I keep the most complete record when there are duplicates?

Sort your CSV before deduplicating so the most complete record is first. The tool keeps the first occurrence of each duplicate group. If you sort by a "completeness score" column, or just ensure the most data-rich records come first, those will be the ones kept.
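A completeness score can be as simple as the number of non-empty cells in a row. A minimal sketch, with hypothetical function names and sample data:

```python
def completeness(row):
    """Count the cells that contain something other than whitespace."""
    return sum(1 for value in row.values() if value and value.strip())

def sort_most_complete_first(rows):
    """Stable sort: rows with equal scores keep their original order."""
    return sorted(rows, key=completeness, reverse=True)

rows = [
    {"email": "jane@example.com", "phone": "", "company": ""},
    {"email": "jane@example.com", "phone": "5558675309", "company": "Acme"},
]
ordered = sort_most_complete_first(rows)
print(ordered[0]["company"])  # Acme
```

Running a keep-first deduplication on the sorted rows then retains the record with the phone number and company filled in.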

What if I want to merge duplicate records rather than delete one?

The tool does not merge records — it keeps one and removes the rest. Merging duplicate records (combining fields from both rows into one complete record) requires a more advanced tool like a Python script or a CRM deduplication feature. Most CRMs offer native merge for duplicates already in the system.
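As an example of the Python-script route, the sketch below merges each duplicate group by keeping the first row and filling its blank fields from later rows. It is one simple merge policy among many, and the "email" key and function names are assumptions. When two rows hold different non-blank values for the same field, the first row's value wins and the other is discarded.

```python
def merge_group(group):
    """Start from the first row and fill its blank fields from later rows."""
    merged = dict(group[0])
    for row in group[1:]:
        for col, value in row.items():
            if not (merged.get(col) or "").strip() and (value or "").strip():
                merged[col] = value
    return merged

def merge_duplicates(rows, key="email"):
    """Merge rows sharing a normalized key; rows with a blank key stay separate."""
    groups = {}
    for i, row in enumerate(rows):
        # Blank keys get a unique placeholder so they never group together.
        k = (row.get(key) or "").strip().lower() or ("__blank__", i)
        groups.setdefault(k, []).append(row)
    return [merge_group(g) for g in groups.values()]

rows = [
    {"email": "jane@example.com", "phone": "", "company": "Acme"},
    {"email": "JANE@example.com", "phone": "5558675309", "company": ""},
    {"email": "", "phone": "5550100", "company": "Globex"},
]
merged = merge_duplicates(rows)
print(merged[0])
```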

Jennifer Hayes Business Documents & PDF Writer

Jennifer spent a decade as an executive assistant and office manager handling every type of business document imaginable. She writes about PDF tools and document workflows for professionals who need reliable solutions without enterprise pricing.
