Dedupe
Find the fuzzy duplicates Excel can't see
Excel's Remove Duplicates only catches exact matches. So "Abraham" and "Abrahamm" both
survive, "Jon Smith" and "Jonathan Smith" with the same phone number both survive, and your
list stays quietly full of near-duplicates. The duplicates that actually pollute a CRM are
the ones a byte-for-byte comparison never sees. Sift finds them in your browser, groups them
for review, and merges each group into one record. Nothing is uploaded.
The manual fixes (and why they hurt)
Excel can do fuzzy matching, but not out of the box. Every native route asks you to build
the matcher yourself before you can use it:
- Microsoft's Fuzzy Lookup add-in. You install the add-in, build a self-join of your table against itself, and tune a similarity threshold until it stops missing real duplicates without flagging everyone. It works, but it is famously fiddly to set up, and the threshold is guesswork on a list you haven't seen yet.
- Power Query M code. The no-add-in route is to write M, comparing records with a distance function and grouping the matches. That is real code to write, test, and maintain, for a one-off cleanup most people just want done.
- A paid add-in like Ablebits. Ablebits sells a duplicate finder that handles similar entries. It is a solid tool, but it costs money and still lives inside Excel, which means the same CSV-mangling and manual-merge problems once you export.
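To see why a similarity threshold is guesswork, here is a minimal Python sketch. It assumes the standard library's difflib.SequenceMatcher as the similarity measure; that is an illustrative choice, not what Fuzzy Lookup, Power Query, or Ablebits actually use, and the sample names and the helper names similarity and pairs_above are made up for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def pairs_above(names: list[str], threshold: float) -> list[tuple[str, str]]:
    """Return every pair of names whose similarity meets the threshold."""
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if similarity(a, b) >= threshold
    ]


# Hypothetical sample: two real near-duplicates plus "Joan Smith",
# who is a different person.
names = ["Abraham", "Abrahamm", "Jon Smith", "Jonathan Smith", "Joan Smith"]

# Sweep a few cutoffs to see which pairs each one would flag.
for threshold in (0.95, 0.85, 0.75):
    print(threshold, pairs_above(names, threshold))
```

With this measure, "Jon Smith" scores closer to "Joan Smith" than to "Jonathan Smith", so any cutoff loose enough to catch the nickname also flags a different person. No single number fixes that, which is why flagged pairs need a human look before anything is merged.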
As one r/excel poster described it, they had a column of 40,000 names full of "typos or very
similar duplicate entries but they aren't exact match, like Abraham & Abrahamm". Remove
Duplicates does nothing for that. And even when Excel does find a match, it deletes the
second row rather than merging the best values from both into one record.
Near-duplicates Excel keeps → One merged record
Abraham Cohen / Abrahamm Cohen → Abraham Cohen (one record)
Jon Smith / Jonathan Smith (same phone) → Jonathan Smith, both fields kept
ACME / Acme Ltd (same domain) → Acme Ltd (canonical company)
jane@acme.com / Jane@Acme.com → jane@acme.com (one contact)
Find fuzzy duplicates in Sift
No add-in, no Power Query M code, no threshold to guess at blind. Sift groups similar records
for you and lets you review before anything merges:
- Load the file. Sift profiles every column in your browser; nothing is uploaded.
- Choose what to match on: email, phone, or a name-and-company fingerprint, depending on which field is your best identifier.
- Let fuzzy matching catch the near-misses a byte-for-byte comparison drops: "Abraham" and "Abrahamm", "Jon" and "Jonathan", casing and spacing noise. It compares a similarity fingerprint rather than exact values, so it can occasionally miss a match or group records that don't belong together, which is why the next step exists.
- Review each cluster with a before/after diff. You see the grouped rows side by side and decide whether they really are the same person or company before anything changes.
- Merge to a golden record with survivorship rules, keeping the phone from one row and the job title from another, instead of deleting a row and losing data.
- Export a clean, deduped file ready for your CRM import.
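The match, review, and merge steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Sift's actual implementation: the fingerprint function (lowercased email, falling back to a normalized name and company), the "longest non-empty value wins" survivorship rule, and the sample rows are all hypothetical.

```python
import re
from collections import defaultdict

Record = dict[str, str]


def fingerprint(record: Record) -> str:
    """Hypothetical match key: lowercased email, else normalized name + company."""
    email = record.get("email", "").strip().lower()
    if email:
        return "email:" + email
    name = re.sub(r"[^a-z0-9]+", " ", record.get("name", "").lower()).strip()
    company = re.sub(r"[^a-z0-9]+", " ", record.get("company", "").lower()).strip()
    return f"name:{name}|company:{company}"


def cluster(records: list[Record]) -> list[list[Record]]:
    """Group records sharing a fingerprint; only groups of 2+ need review."""
    groups: dict[str, list[Record]] = defaultdict(list)
    for record in records:
        groups[fingerprint(record)].append(record)
    return [group for group in groups.values() if len(group) > 1]


def merge(group: list[Record]) -> Record:
    """Assumed survivorship rule: per field, keep the longest non-empty value.

    On a tie, the value from the earlier row is kept.
    """
    golden: Record = {}
    for record in group:
        for field, value in record.items():
            value = value.strip()
            current = golden.setdefault(field, "")
            if len(value) > len(current):
                golden[field] = value
    return golden


# Hypothetical sample rows: the first two share an email apart from casing.
rows = [
    {"name": "Jon Smith", "email": "jon@acme.com", "phone": "555-0100", "title": ""},
    {"name": "Jonathan Smith", "email": "Jon@Acme.com", "phone": "", "title": "CFO"},
    {"name": "Jane Doe", "email": "jane@example.com", "phone": "", "title": "CTO"},
]

for group in cluster(rows):
    print("review:", [r["name"] for r in group])  # a person confirms the group first
    print("golden:", merge(group))
```

The merged record keeps the phone from the first row and the job title from the second, where deleting either row would have lost one of them. A real survivorship policy might prefer the most recent or most complete row instead; "longest value" is only the simplest rule to show.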
Privacy note: Sift is a static web app with no backend. Your file is processed entirely on
your device, which you can verify by disconnecting your internet after the page loads; the
matching still works. That is the opposite of pasting your customer list into a hosted web tool or ChatGPT.
When Excel's exact Remove Duplicates is enough
Fuzzy matching is the wrong tool when your duplicates really are exact. If a list has literal
copy-paste repeats, the same row entered twice with identical values, then Excel's Remove
Duplicates handles it in one click and there is nothing to review. Reach for fuzzy matching
only when the duplicates are near-misses: typos, nicknames, casing, or the same person under
two slightly different spellings. Sift does exact dedupe too, so you can start with the safe
exact pass and only bring in fuzzy matching where the near-duplicates actually live.
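The "exact first, fuzzy second" order can be sketched like this in Python. It assumes rows are tuples with the name in the first column and uses difflib.SequenceMatcher with a 0.9 cutoff for the fuzzy pass; both are illustrative choices rather than a description of how Sift or Excel work internally.

```python
from difflib import SequenceMatcher
from itertools import combinations

Row = tuple[str, ...]


def exact_dedupe(rows: list[Row]) -> list[Row]:
    """Drop identical rows, keeping the first occurrence in original order."""
    return list(dict.fromkeys(rows))


def near_duplicate_names(rows: list[Row], threshold: float = 0.9) -> list[tuple[str, str]]:
    """Return name pairs (column 0) that look similar but are not identical."""
    pairs = []
    for a, b in combinations(rows, 2):
        name_a, name_b = a[0].lower(), b[0].lower()
        if name_a != name_b and SequenceMatcher(None, name_a, name_b).ratio() >= threshold:
            pairs.append((a[0], b[0]))
    return pairs


# Hypothetical sample: one literal repeat and one typo variant.
rows = [
    ("Abraham Cohen", "abraham@example.com"),
    ("Abraham Cohen", "abraham@example.com"),   # exact copy-paste repeat
    ("Abrahamm Cohen", "abraham@example.com"),  # near-duplicate (typo)
    ("Jane Doe", "jane@example.com"),
]

unique = exact_dedupe(rows)                 # safe pass: nothing to review
candidates = near_duplicate_names(unique)   # fuzzy pass: pairs to review
print(len(rows), "->", len(unique), "rows; review:", candidates)
```

Running the exact pass first shrinks the list without any judgment calls, so the fuzzy pass only has to compare, and a person only has to review, the rows that are actually ambiguous.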