top of page
  • Writer's pictureAdmin

Fuzzy Lookup in Excel

Have you been in a situation where Vlookup/ Hlookup/ Lookups function just don't work for you? Because the dataset you have does not provide an exact matching key/ texts, but rather just having similar texts (such as "George Street B no 31" vs "George Street no 31, B"). If that's the case, then you might want to take a look at Fuzzy Logic addin developed by Microsoft to solve the problem for you. It's free of use, so just try it out.

The addin can be used starting from Excel 2007.

Check out this link to download the add-in:

https://www.microsoft.com/en-us/download/details.aspx?id=15011


Note that as part of the download package, Microsoft also provides you with a sample data set you can use for practice; however, the example presented below uses a different data set.


Below is the guidance on how you'll use the add-in:

CREATING A FUZZY LOOKUP

Keep in mind that the Fuzzy Lookup Add-in matches inexact textual data and allows you to join data from multiple tables into one. Therefore, for your fuzzy lookup to work, you must first convert your data ranges to Excel tables.

Figure 1 presents two tables we will use for the basis of our fuzzy lookup. SSN is the name of the table on the left and Comp is the name of the table on the right. We would like to match the Name field in SSN to the Name field in Comp and create a results list that shows employees' names, Social Security numbers, and compensation. The challenge we face is that not all of the names are spelled and arranged the same way; therefore, if we attempt to use a VLOOKUP function to complete this task, we will not be successful. For example, consider the name Jan Kotas on row 4 of SSN. In looking at row 8 of Comp, we see that what should be the matching name is arranged as Kotas, Jan; thus, a VLOOKUP function would not match the two names together. Likewise, Robert Zare on row 8 and Thomas Axen on row 13 of SSN are not exact matches to Robert Zare, III on row 11 and T G Axen on row 17 of Comp, again presenting problems if we try to match based on traditional methods.

FIGURE 1 - TABLES FOR FUZZY LOOKUP


However, we can match the data in these two tables using the Fuzzy Lookup Add-in. To do so, click Fuzzy Lookup on the Fuzzy Lookup tab of the Ribbon to open the Fuzzy Lookup task pane on the right side of the window shown in Figure 2. Upon doing so, Excel automatically senses and inserts the names of the tables into the Fuzzy Lookup task pane. Excel also automatically analyzes the columns in the tables and joins the tables if a column in one table has the same column header as a column in the other table. Of course, you can edit both the table and column selections, if necessary.

FIGURE 2 - CREATING A FUZZY LOOKUP IN EXCEL


In the Output Columns section, check the box next to each field you want included in your results list. In this example, we are indicating that we want the Name and SSN fields from the SSN table and the Compensation field from the Compensation table to be included in the output. We are also indicating that we want the system-generated Similarity score to be included. The Similarity score provides a measure of how "confident" Excel is in its matching of inexact data. With these processes complete, all that is left to do is click Go. Upon doing so, the Fuzzy Lookup Add-in analyzes the data in the two tables and provides the results list pictured in Figure 3.

FIGURE 3 - SAMPLE FUZZY LOOKUP RESULTS


A close inspection of the data in Figure 3 reveals that the tool worked well, though not flawlessly. In particular, it did not find matches for Mariya Sergienko and Thomas Axen. Returning to the Fuzzy Lookup task pane, let us modify the Similarity Threshold setting to see if reducing the confidence level slightly increases the number of matches. To do this, drag the Zoom Slider to the left so that a smaller value – in this case, 0.4 – is in place.

FIGURE 4 - MODIFYING THE SIMILARITY THRESHOLD FOR A FUZZY LOOKUP


After modifying the Similarity Threshold, click Go to generate a new set of results. Upon doing so, as shown in Figure 5, the Fuzzy Lookup Add-in matches all of the data in the two tables and does so without any errors.

FIGURE 5 - COMPLETED FUZZY LOOKUP EXAMPLE


SUMMARY

When faced with the task of matching inexact data, including names and addresses, many Excel users expend vast amounts of time cleaning up the data before using traditional techniques such as VLOOKUP. With the Fuzzy Lookup Add-in for Excel, that need not be the case. By downloading and installing this free tool from Microsoft, we now have an extremely powerful and efficient means for matching data when perfect matches do not exist.


Below is the video demonstration on how to use the Fuzzy Lookup add-in.

reference: k2e.com

212 views0 comments

Recent Posts

See All
bottom of page