How Recruiters Clean Candidate CSV Exports — Free Tool
Recruiting data is some of the messiest CSV data you will encounter. ATS exports from Greenhouse, Lever, Workday, and similar systems come with names in all caps, inconsistent phone formats, email addresses that failed to normalize, and duplicate candidates from multiple sourcing channels. Then you merge that export with a LinkedIn sourcing list and an Indeed download — each with different column names and formatting conventions — and you have a genuinely messy dataset to work with.
The free CSV Data Sanitizer handles the common formatting fixes before you import the list anywhere. It runs entirely in your browser — candidate data (which is personal and often confidential) never leaves your device.
Common Formatting Issues in ATS and Sourcing CSV Exports
Each export source has its own formatting quirks:
ATS exports (Greenhouse, Lever, Workday, iCIMS):
- Candidate names exported in ALL CAPS or inconsistent mixed case
- Phone numbers in raw format (no separators, or multiple inconsistent formats)
- Email addresses with trailing spaces that prevent matching against existing records
- Empty rows between batches or at the end of the export
- Duplicate candidate entries when a candidate applied to multiple roles
LinkedIn exports:
- Names generally well-formatted but company names may be inconsistent
- Email addresses are usually not included (LinkedIn withholds them)
- Exported notes fields may have trailing whitespace from the text editor
Indeed and job board exports:
- Phone numbers vary — some applicants enter "(555) 123-4567", others enter "5551234567", others "555.123.4567"
- Email case is inconsistent — applicants enter their own email and capitalization varies
Merged lists (multiple sources combined):
- Duplicate candidates appearing under slightly different formats (different capitalization, trailing spaces)
- Inconsistent column order between files (handled by the CSV Column Mapper, not the sanitizer)
Which Fixes to Run on Recruiting CSVs
The CSV Sanitizer has six toggleable fixes. For recruiting data:
Trim whitespace — always enable. Trailing spaces in email addresses prevent matching against existing records in your CRM or outreach tool. Trimming is safe for all recruiting data.
Capitalize names — enable for ATS exports. The tool auto-detects columns with "name", "first", "last", or "contact" in the header and applies Title Case. Converts "JOHN SMITH" to "John Smith". Note: if your ATS export has a "Job Title" column, that will also be Title-Cased — check the preview to confirm this looks right.
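To make the header-based detection concrete, here is a minimal Python sketch of the idea. It is not the tool's actual code: it assumes detection is a simple case-insensitive substring match on the four words listed above, and it uses Python's built-in `str.title()` as a stand-in for the tool's Title Case logic. The function names are hypothetical.

```python
# Substrings the article lists for name-column detection.
NAME_HINTS = ("name", "first", "last", "contact")


def is_name_column(header: str) -> bool:
    """Return True if the header contains any name hint (case-insensitive)."""
    lowered = header.lower()
    return any(hint in lowered for hint in NAME_HINTS)


def title_case_names(rows: list[dict[str, str]]) -> list[dict[str, str]]:
    """Return new rows with Title Case applied only to detected name columns."""
    return [
        {col: (val.title() if is_name_column(col) else val) for col, val in row.items()}
        for row in rows
    ]


if __name__ == "__main__":
    sample = [{"Full Name": "JOHN SMITH", "Email": "JS@EXAMPLE.COM"}]
    print(title_case_names(sample))
```

Under these assumptions a header such as "Company Name" also matches, because it contains "name". With only these four substrings "Job Title" would not match, so the real tool's matching rules may be broader than this sketch. Either way, the preview is the place to confirm which columns were changed.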
Lowercase emails: enable. Standardizes email case for consistent dedup and matching. An outreach tool comparing an address such as "John.Smith@Example.com" against "john.smith@example.com" in its existing contact list may treat them as different people.
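The effect of trimming and lowercasing on matching can be illustrated with a short Python sketch. The addresses and function names below are made up for illustration, and the real tool runs in the browser rather than in Python. The point is only that a plain string comparison fails until both sides are normalized the same way.

```python
def normalize_email(value: str) -> str:
    """Trim surrounding whitespace and lowercase an email address."""
    return value.strip().lower()


def is_known_contact(email: str, existing: set[str]) -> bool:
    """Check an address against existing contacts after normalizing both sides."""
    normalized_existing = {normalize_email(e) for e in existing}
    return normalize_email(email) in normalized_existing


if __name__ == "__main__":
    existing_contacts = {"john.smith@example.com"}
    exported = "John.Smith@Example.com "  # mixed case plus a trailing space
    print(exported in existing_contacts)                  # raw comparison
    print(is_known_contact(exported, existing_contacts))  # normalized comparison
```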
Format phone numbers — enable for outreach lists. Standardizes US numbers to (xxx) xxx-xxxx. Useful when the list is going into a dialer or SMS platform that expects a consistent format. The tool only formats 10-digit and 11-digit (leading 1) US numbers — international numbers are left unchanged.
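The phone rule described above can be sketched in Python as follows. This is an illustration of the stated behavior (10-digit numbers, or 11-digit numbers with a leading 1, become (xxx) xxx-xxxx and everything else is left alone), not the tool's own implementation. The extra check that skips values starting with a "+" other than "+1" is an assumption added here so that short international numbers are not mistaken for US ones.

```python
import re


def format_us_phone(value: str) -> str:
    """Format 10-digit or 1-prefixed 11-digit US numbers as (xxx) xxx-xxxx.

    Anything else, including international numbers, is returned unchanged.
    """
    stripped = value.strip()
    # Assumption: an explicit country code other than +1 means "not a US number".
    if stripped.startswith("+") and not stripped.startswith("+1"):
        return value
    digits = re.sub(r"\D", "", stripped)  # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    if len(digits) != 10:
        return value
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


if __name__ == "__main__":
    for raw in ["(555) 123-4567", "5551234567", "555.123.4567", "+44 20 7946 0958"]:
        print(raw, "->", format_us_phone(raw))
```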
Remove empty rows — enable. Blank rows between candidate records cause issues in CRM imports and outreach tool uploads.
Remove duplicate rows — enable with caution. Removes exact duplicate rows (same data in every column). Safe for straightforward dedup. If a candidate appears twice with slightly different data (one row has a phone, the other does not), they will not be detected as duplicates — you will need a dedicated dedup tool for that.
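The difference between exact duplicates and near-duplicates is easier to see in code. The Python sketch below is a hypothetical illustration of the two row-level fixes, not the tool's source. It assumes a row counts as empty when every cell is blank or whitespace, and that two rows are duplicates only when every cell is identical.

```python
def drop_empty_and_duplicate_rows(rows: list[list[str]]) -> list[list[str]]:
    """Remove blank rows and exact duplicate rows, keeping first occurrences in order."""
    seen: set[tuple[str, ...]] = set()
    kept: list[list[str]] = []
    for row in rows:
        if all(cell.strip() == "" for cell in row):
            continue  # every cell is blank: treat as an empty row
        key = tuple(row)
        if key in seen:
            continue  # identical in every column to an earlier row
        seen.add(key)
        kept.append(row)
    return kept


if __name__ == "__main__":
    candidates = [
        ["John Smith", "john.smith@example.com", "(555) 123-4567"],
        ["", "", ""],
        ["John Smith", "john.smith@example.com", "(555) 123-4567"],  # exact duplicate
        ["John Smith", "john.smith@example.com", ""],                # near-duplicate
    ]
    for row in drop_empty_and_duplicate_rows(candidates):
        print(row)
```

The last row survives because its phone cell differs, which is the case the text above says needs a dedicated dedup tool.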
Candidate Data Privacy and Local Processing
Candidate data is personal data. In many jurisdictions, it falls under GDPR, CCPA, or similar privacy regulations. Even where it does not, uploading a candidate list to a third-party server creates data exposure risk — you may not know where that data is stored, who can access it, or how long it is retained.
The CSV Sanitizer processes your file entirely in the browser using JavaScript. The file is read by your browser, cleaned in memory, and the cleaned result is offered as a download. Nothing is transmitted to any server. There are no accounts, no data retention, no analytics on your file content.
Close the tab when done and the data is gone. This makes it appropriate for candidate lists, employee data, and any other personally identifiable information that should not leave your device.
For context on how browser-based processing works and why it is private by design, see the Private CSV Cleaner guide.
Before Importing to Outreach Tools, CRMs, and Dialers
After cleaning the CSV, take a few additional steps before import, depending on your destination:
Outreach tools (Apollo, Instantly, Lemlist, Smartlead):
- Validate email addresses — outreach tools care deeply about bounce rates. Run the cleaned CSV through the Email Validator before importing. Catch invalid, disposable, and role-based addresses before they hurt your sender score.
- Match column headers to what the tool expects. Apollo expects "First Name", "Last Name", "Email" — your ATS export may use "firstname", "Candidate Email Address", etc. Use the CSV Column Mapper to rename headers before upload.
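For readers who prefer to script the header rename instead of using the CSV Column Mapper, a minimal Python sketch might look like this. The mapping is illustrative: "firstname" and "Candidate Email Address" come from the example above, while "lastname" is a hypothetical source header added to complete the set. Check your outreach tool's import documentation for the exact headers it expects.

```python
import csv
import io

# Source header -> header expected by the destination (illustrative mapping).
HEADER_MAP = {
    "firstname": "First Name",
    "lastname": "Last Name",  # hypothetical source header
    "Candidate Email Address": "Email",
}


def rename_headers(csv_text: str, header_map: dict[str, str]) -> str:
    """Return the CSV text with first-row headers renamed according to header_map."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    # Unmapped headers pass through unchanged.
    rows[0] = [header_map.get(header.strip(), header) for header in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    source = "firstname,lastname,Candidate Email Address\nJohn,Smith,john.smith@example.com\n"
    print(rename_headers(source, HEADER_MAP))
```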
CRMs (HubSpot, Salesforce, Recruitee):
- Check for existing contacts before import. Most CRMs deduplicate on email — if the same candidate is already in the CRM, the import will update rather than create. With clean, lowercase emails this works correctly.
- Map custom fields. ATS-specific fields (rejection reason, pipeline stage) may not have direct equivalents in your CRM.
Dialers (RingCentral, Aircall, JustCall):
- Phone format matters: most dialers expect E.164 format (+1XXXXXXXXXX) or (xxx) xxx-xxxx. The sanitizer outputs (xxx) xxx-xxxx, which is widely accepted. If your dialer accepts only E.164, convert the numbers before upload.
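If a dialer accepts only E.164, the sanitizer's (xxx) xxx-xxxx output needs one more conversion step. The Python sketch below shows one way to do that for US numbers. It is a hypothetical helper, not part of the sanitizer, and it returns None for anything it cannot confidently read as a US number so those rows can be reviewed by hand.

```python
import re
from typing import Optional


def to_e164_us(value: str) -> Optional[str]:
    """Convert a US phone number to E.164 (+1XXXXXXXXXX), or return None if unsure."""
    stripped = value.strip()
    # Assumption: an explicit country code other than +1 is not a US number.
    if stripped.startswith("+") and not stripped.startswith("+1"):
        return None
    digits = re.sub(r"\D", "", stripped)  # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        return None
    return "+1" + digits


if __name__ == "__main__":
    print(to_e164_us("(555) 123-4567"))
    print(to_e164_us("+44 20 7946 0958"))
```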
Step-by-Step: Cleaning a Merged Candidate List
1. Export from each source: ATS, LinkedIn, Indeed, spreadsheet.
2. Merge into one CSV. Combine the files manually or with a tool. At this point the data is messy: inconsistent columns, mixed formatting, duplicates.
3. Align column headers. Use the CSV Column Mapper to rename columns across sources so they match. You want one "First Name" column and one "Email" column, not three differently named variants.
4. Run the CSV Sanitizer with trim whitespace, capitalize names, lowercase emails, format phones, remove empty rows, and remove exact duplicates all enabled.
5. Validate emails. Run the sanitized file through the Email Validator.
6. Filter the list. Use the CSV Row Filter to remove candidates on a suppression list (previous applicants you have already decided not to move forward with, or people who have asked to be removed from outreach).
7. Import to your outreach tool or CRM.
This sequence turns a messy merged export into a clean, validated, filtered candidate list ready for outreach — without any code and without uploading sensitive data to third-party cleaning services.
Try It Free — No Signup Required
Runs 100% in your browser. No data is collected, stored, or sent anywhere.
Frequently Asked Questions
My ATS exports names in "Last, First" format. Will capitalize names fix that?
The capitalize names fix applies Title Case to the values in detected name columns. "SMITH, JOHN" becomes "Smith, John" — the capitalization is fixed, but the order is not changed. If you need to split or reorder name columns, use the CSV Column Mapper tool to handle the column restructuring.
Can I filter out candidates who are already in our database?
Not directly with the sanitizer. The sanitizer handles formatting. For suppression list filtering (removing rows that appear in another list), use the CSV Row Filter tool — enter your existing contact emails as the word bank and set the action to "Remove matched rows." This removes anyone already in your database from the new import list.
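The suppression logic itself is simple, and a short Python sketch may help clarify what "Remove matched rows" does conceptually. This is an illustration only, not the CSV Row Filter's implementation. It assumes matching is done on a column named "Email" and that both lists are trimmed and lowercased before comparison.

```python
def remove_suppressed(
    rows: list[dict[str, str]],
    suppressed_emails: list[str],
    email_column: str = "Email",
) -> list[dict[str, str]]:
    """Return only the rows whose email is not on the suppression list."""
    blocked = {email.strip().lower() for email in suppressed_emails}
    return [
        row
        for row in rows
        if row.get(email_column, "").strip().lower() not in blocked
    ]


if __name__ == "__main__":
    new_list = [
        {"First Name": "John", "Email": "john.smith@example.com"},
        {"First Name": "Jane", "Email": "jane.doe@example.com"},
    ]
    already_in_database = ["John.Smith@Example.com"]
    print(remove_suppressed(new_list, already_in_database))
```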
Is this GDPR compliant?
The tool does not store, transmit, or process any data on a server — everything runs locally in your browser. This means there is no data controller or processor relationship with the tool itself. Your own obligations under GDPR for storing and using candidate data are separate and depend on your organization's policies and the legal basis for processing that data.

