Sift Data cleaning for CRM imports

Should you paste your contact list into ChatGPT to clean it?

ChatGPT is a genuinely useful tool, and for some data chores it's the right one. Cleaning a real contact list before a CRM import is not one of them. Three things get in the way: the data leaves your device, the results are nondeterministic and come with no before/after diff to audit, and there's no guarantee every row was processed. Here's an honest account of what ChatGPT does well, where it breaks, and when it's still the better pick.

What ChatGPT is good at

Credit where due. ChatGPT is excellent for ad hoc, one-off reshaping when you're poking at a file interactively. It'll write you a formula or a script that you run yourself, explain a cryptic import error in plain English, and reshape a small, non-sensitive sample on the spot. For exploratory work and for generating the code you'll run locally, a language model is often the fastest way to an answer.

Where it breaks for a real contact list

1. The data leaves your device

Pasting a contact list into ChatGPT sends real names, emails, and phone numbers off your machine to a third party. One analysis found sensitive corporate data in more than 20 percent of files uploaded to tools like this. You usually don't have the consent to share that data, and this is exactly the objection practitioners raise. When someone posted an upload-your-CSV cleaner in a CRM community, the first replies were "What do you do with all of the data that is submitted?" and, to the promise that files are "deleted after processing", a flat "That seems like totally applicable with GDPR. *sarcasm*". There's even a widely shared dev.to piece titled "Stop Uploading Your CSV Files to Random Tools". The instinct is right.

2. It's nondeterministic

A language model doesn't apply fixed rules; it predicts plausible output. That means it can silently alter values, drop rows, or invent rows that were never in your file, and it can give you a different answer each time you run the same prompt. Worst of all, it hands you a result with no before/after diff, so you can't see what it changed or verify that it left the rest alone. For a file you're about to push into a CRM, "probably right" is not good enough.
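
If you do run a file through a tool you can't inspect, one way to recover the missing audit trail is to diff the result against the original yourself. The sketch below is a minimal Python illustration, not a complete audit tool. It assumes every row carries a stable, unique `id` column (an assumption: many contact exports don't have one, and you would need another key). The function name `diff_rows` and the sample rows are hypothetical.

```python
def diff_rows(before, after, key="id"):
    """Compare two lists of row dicts and report what changed.

    Assumes `key` is a unique, stable identifier present in every row.
    """
    before_by_key = {row[key]: row for row in before}
    after_by_key = {row[key]: row for row in after}

    # Rows that vanished, and rows that were never in the input.
    dropped = sorted(before_by_key.keys() - after_by_key.keys())
    invented = sorted(after_by_key.keys() - before_by_key.keys())

    # Cell-level changes for rows present on both sides.
    altered = []
    for k in sorted(before_by_key.keys() & after_by_key.keys()):
        old, new = before_by_key[k], after_by_key[k]
        for column in old:
            if old[column] != new.get(column):
                altered.append((k, column, old[column], new.get(column)))

    return {"dropped": dropped, "invented": invented, "altered": altered}


before = [
    {"id": "1", "email": "ANA@EXAMPLE.COM ", "name": "Ana"},
    {"id": "2", "email": "bo@example.com", "name": "Bo"},
    {"id": "3", "email": "cy@example.com", "name": "Cy"},
]
after = [
    {"id": "1", "email": "ana@example.com", "name": "Ana"},  # intended fix
    {"id": "2", "email": "bo@example.com", "name": "Bob"},   # silent alteration
    {"id": "4", "email": "di@example.com", "name": "Di"},    # never in the input
]

report = diff_rows(before, after)
print(report["dropped"])   # ['3']
print(report["invented"])  # ['4']
for change in report["altered"]:
    print(change)
```

Here the diff surfaces one intended change (the email on row 1) alongside one unrequested edit, one dropped row, and one invented row, which is the kind of result you would want to see before importing anything.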

3. It can't guarantee it touched every row

Paste a large file and the model will happily process the top of it and summarize the rest, or quietly truncate. There's no guarantee every row was actually cleaned, and no count you can trust telling you how many rows it saw. On a list of a few thousand contacts, that gap is where the errors you never notice hide.
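
A row count is the cheapest check against truncation. The following is a minimal Python sketch, assuming both the original and the returned data are CSV text with a single header row; the helper names `count_data_rows` and `check_row_coverage` are hypothetical. It uses the standard `csv` module so that a quoted field containing a line break is counted as one record rather than two.

```python
import csv
import io


def count_data_rows(csv_text):
    """Count records after the header, respecting quoted newlines."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    next(reader, None)  # skip the header row
    return sum(1 for record in reader if record)  # ignore blank lines


def check_row_coverage(input_csv, output_csv):
    """Return (rows_in, rows_out, ok) so a mismatch is visible before import."""
    rows_in = count_data_rows(input_csv)
    rows_out = count_data_rows(output_csv)
    return rows_in, rows_out, rows_in == rows_out


original = (
    "name,email\n"
    "Ana,ana@example.com\n"
    '"Bo\nJr",bo@example.com\n'  # one record spanning two physical lines
    "Cy,cy@example.com\n"
)
returned = "name,email\nAna,ana@example.com\n"  # truncated output

print(check_row_coverage(original, returned))  # (3, 1, False)
```

If the cleanup is expected to merge duplicates, the counts will legitimately differ, so compare against the input count minus the number of merged rows instead of requiring equality.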

Paste into ChatGPT | Clean in Sift
Data sent to a third party | Processed on your device, nothing uploaded
Nondeterministic, different each run | Deterministic rules, same result every time
No before/after diff to audit | Every change is a rule you approve with a diff
May skip rows in a large file | Every row processed, with a change log

The same job in Sift

  1. Drop the file in. Sift profiles every column in your browser; nothing is uploaded and it works offline once loaded.
  2. Approve each cleanup with a before/after diff: trim, name-safe casing, broken-email repair, mojibake repair, dates standardized, phones normalized to E.164, countries and postcodes standardized. Every change is a deterministic rule, no AI.
  3. Dedupe exactly and fuzzily, then merge duplicates into one golden record with survivorship rules instead of deleting rows.
  4. Map to your CRM's template (HubSpot, Salesforce, Pipedrive, Dynamics 365, Zoho, Mailchimp) and run the import pre-flight: required fields, types, and allowed values, flagged before you import.
  5. Export a clean file, a change log, and a "needs your eyes" list, and save the whole thing as a reusable pipeline.

Privacy note: Sift is a static web app with no backend. Your file is processed entirely on your device, never uploaded, and it works offline once the page has loaded, which you can verify by disconnecting your internet and watching the cleaning still run.
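
To make "deterministic rule with a change log" concrete, here is a minimal Python sketch of the general idea behind steps 2 and 3. It is an illustration only and not Sift's implementation: the rule functions, the `RULES` mapping, and the "first non-empty value survives" merge policy are all assumptions chosen for the example.

```python
def trim(value):
    return value.strip()


def lowercase_email(value):
    return value.strip().lower()


# Hypothetical mapping of column name to the rule applied to it.
RULES = {"name": trim, "email": lowercase_email}


def apply_rules(rows, rules):
    """Return cleaned rows plus a log of (row_number, column, before, after)."""
    cleaned, change_log = [], []
    for row_number, row in enumerate(rows, start=1):
        new_row = dict(row)
        for column, rule in rules.items():
            if column in row:
                after = rule(row[column])
                if after != row[column]:
                    change_log.append((row_number, column, row[column], after))
                new_row[column] = after
        cleaned.append(new_row)
    return cleaned, change_log


def merge_exact_duplicates(rows, key="email"):
    """Merge rows sharing a key; the first non-empty value per column survives."""
    merged = {}
    for row in rows:
        record = merged.setdefault(row[key], dict(row))
        for column, value in row.items():
            if not record.get(column):
                record[column] = value
    return list(merged.values())


rows = [
    {"name": " Ana ", "email": "ANA@Example.com", "phone": ""},
    {"name": "Ana", "email": "ana@example.com ", "phone": "+14155550100"},
    {"name": "Bo", "email": "bo@example.com", "phone": ""},
]

cleaned, change_log = apply_rules(rows, RULES)
golden = merge_exact_duplicates(cleaned)

for entry in change_log:
    print(entry)
print(len(golden))  # 2
```

Because each rule is a pure function of the cell value, running `apply_rules` twice on the same input produces the same rows and the same change log, and the two Ana rows merge into one record that keeps the phone number instead of losing it.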

When ChatGPT is the better tool

The honest split most people land on: use ChatGPT to explore and to write the code, and a deterministic local tool for the actual clean, dedupe, map, and check steps on real customer data.
