Excel Files: Search columns and find ALL matches

Products and tips

Moderator: Site Mods

Post Reply
qz33
Member
Member
Posts: 27
Joined: 2008 Feb 15, 17:04

Excel Files: Search columns and find ALL matches

Post by qz33 »

Hello everyone.


I would like to compare some columns of names. There are over 100,000.

So far I have used the MATCH function but it only gives me the first match and then stops. How can I get the references of all the matches that might exist? Also I cannot figure out how to stop MATCH from matching the name from the line itself. In other words the cell that I use to reference the name value is matching itself.

Any suggestions for a start? I am sure there has to be some advanced programs for this kind of thing. I am sure you guys need more than what I have provided and I will give whatever you need.

I'll take whatever you have.
User avatar
fgagnon
Site Admin
Site Admin
Posts: 3737
Joined: 2003 Sep 08, 19:56
Location: Springfield

Post by fgagnon »

The MATCH function is ill-suited for what is sounds like you want to do.
But you don't say what you really want to accomplish.
Why do you think getting the cell references will solve your (unstated) problem?
It may be that a database tool is more appropriate than Excel, or there is a totally different approach.  Probably it would be better if you describe the problem at a higher level and solicit ideas for how to approach the solution, rather than assuming you want to use a MATCH-like function.
narayan
Platinum Member
Platinum Member
Posts: 1430
Joined: 2002 Jun 04, 07:01

Post by narayan »

If it is only the names you are after, take a text file of both lists
* one name per line; and
* both lists must be sorted using the same key

now use kdiff3 (freeware) to compare them. It will highlight the matches and ALSO the non-matches (List has it but List B doesn't have it; and vice versa) with color coding.

Another trick is to insert both lists in Excels one after another. Insert an extra column to identify source (e.g. ListA amd ListB). Notw create a pivot table for "count of names". If there are any duplicate names, the count will be 2. Otherwise the count will be 1. You can hide duplicated names or unique names, and copy+paste the others in a new list.

As Fred said, I too am not sure what you want, but you can experiment around this.
qz33
Member
Member
Posts: 27
Joined: 2008 Feb 15, 17:04

Post by qz33 »

Ok, here is what I am trying to do.

I have 3 lists.  One is a list of around 6000 employees.  I have 2 other security lists: one has about 45,000 names and the other with 60,000.  I want to see if anyone from the employee list is on either other list.


I reduce all 3 files to contain the same columns of information, which is one column each for LAST NAME, FIRST NAME, MIDDLE NAME, and DATE OF BIRTH.

I want to see if any name appears in the other lists.  What I have been doing the past couple of days is copying all the names from a security list, changing the font color to red, and pasting into the employee list.  

Then I sort the list by LAST NAME, then by FIRST NAME.

Then I look through the list for any last names that are both in black and red.  If those match I then look at the first name, middle name, and date of birth for a complete match.


After 1 list is complete I reopen the plain employee list and do the exact same thing with the other list.  Sometimes I post both lists into the Employee list at the same time and use green and red font colors.

Now another trick is that I am supposed to check and see if a first and last name are transposed and for slight variations of spelling like Marylou vs. Mary-lou.  This part can wait though if I can just get the above search to work it will be a start.
narayan
Platinum Member
Platinum Member
Posts: 1430
Joined: 2002 Jun 04, 07:01

Post by narayan »

You need to use similarity function, where a near-equivalent name is shown as a possibility (e.g. the soundex function will detect names that SOUND similar). The final result would be manual.
qz33
Member
Member
Posts: 27
Joined: 2008 Feb 15, 17:04

Post by qz33 »

Can you please give me some more information about the soundex function?  How can I intergrate this into excel?
narayan
Platinum Member
Platinum Member
Posts: 1430
Joined: 2002 Jun 04, 07:01

Post by narayan »

It is used to compare two strings by their sound (regardless of their spelling). So names with spelling variations are quickly found.

There are multiple variants; and their actual implementations would be different. But in general, it will produce a value that indicates how close the two strings are by their sound.

A Google with "Soundex Excel" yields excel sheets with that function.
More here:http://en.wikipedia.org/wiki/Soundex

Hope that helps you.
All the best!

(And don't forget to post the solution you will be finding! )
pj
Gold Member
Gold Member
Posts: 477
Joined: 2006 Jan 26, 14:01
Location: Florida

Post by pj »

QZ:  I highly recommend you visit the relevant Yahoo Groups on Excel and Excel VBA for recommendations.  The ExcelVBA group is probably the best for on-topic messages instead of spam and OT junk.  It's moderated by a couple of MVP's so the content is very good.

You will get this problem in front of many knowledgable experts in both native Excel functions and creating custom Visual Basic for Applications (VBA) solutions to problems just like yours.

Narayan and Fgagnon have offered great ideas, and I don't want to dismiss them.  I only wish to suggest other resources you can use to come to a solution.

The only thing I would add is to look at a custom function or VBA routine that would scan the list and return the count and location of the matches, or even possibly clean up the list for you in the process.

--------------

PJ in (sunny and warm!) FL
qz33
Member
Member
Posts: 27
Joined: 2008 Feb 15, 17:04

Post by qz33 »

These have been great responses.  Every single one has helped me.  


Thank you all so far.
Post Reply