Batch Matching Tutorial

Overview

When you have more than a few Legal Entity Identifier (LEI) codes to look up on lei-lookup.com, the batch matching function can save you time.

The batch matching process works like this:

  1. Create a CSV file containing a list of entity names to match.
  2. Use the “BATCH MATCH” screen to upload the file to lei-lookup.com.
  3. Download the results file containing the matched LEI codes and corresponding LEI data.

Example 1 - Simple Name Matching

First we create a CSV file with just the entity names we are looking for. We will see later that there are advantages to providing more information, but for now we will keep it simple. You will hopefully be able to extract these names from your own systems. Our example file has 50 lines: the first line contains the column header “Entity name”, and the other 49 lines contain the names of the entities whose LEIs we want to match.

To follow along with the tutorial, download our example input file.
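If you would rather build a name-only input file from your own data, a minimal Python sketch might look like the following. The function name `write_name_only_csv` and the two entity names are placeholders invented for illustration; only the “Entity name” header comes from the example file described above.

```python
import csv
import io


def write_name_only_csv(stream, entity_names, header="Entity name"):
    """Write a one-column CSV: a header row, then one entity name per line."""
    writer = csv.writer(stream)
    writer.writerow([header])
    for name in entity_names:
        name = name.strip()
        if name:  # skip blank entries rather than uploading empty rows
            writer.writerow([name])


# Placeholder names, not real entities.
names = ["Example Holdings Ltd", "Sample Trading, Inc."]

buffer = io.StringIO()
write_name_only_csv(buffer, names)
print(buffer.getvalue())
```

To produce a real file, pass a file opened with `open("my-names.csv", "w", newline="", encoding="utf-8")` instead of the in-memory buffer. Using the `csv` module rather than joining strings by hand means names that contain commas are quoted correctly.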

To start the batch match, first log in to lei-lookup.com and select “BATCH-MATCH” from the top menu. Then drag the downloaded file onto the window where it says “Drag your CSV file onto here” (or use the browse button to find it).

Next you may see lei-lookup.com work through the lines of the file. Once the processing is complete you will be presented with this screen. Click on the download button.

You will see the screen below telling you that the batch matching is complete and that the file has been downloaded. Note: how you see the downloaded file and where it is stored will depend on the browser you are using (in this example we are using Chrome, and you can see the file as a button on the bottom left of the window called batchmatchresults.csv).

If we open this results file in Excel, we can see how the supplied entity name has been matched to a corresponding LEI code. The results file provides quite a few columns of information about the entry that was matched. Below we can see an example: the column highlighted in yellow is the entity name we supplied in our CSV file above. The columns highlighted in green show the matched LEI code, Legal Name, Registry Identifier and so on. Note: this screenshot only shows the first 5 columns for clarity. You can also check how the names you supplied compare with the legal names found by the match.

You can see how it only takes a few minutes to find the LEI codes for potentially hundreds or thousands of entities.

(An example of this results file is also available to download - batch match results file)

Matching Status and Score

When looking at the results in Excel you should scroll right, as the last three columns are important. In the example below they are highlighted in red and show the status of the match. Each match is given a score based on how successful the match has been. In this simple case everything matched OK, so we got good results.

Depending on the quality of the input data this may not always be the case. The more information you can supply about the entity, the better the matching result is likely to be and the higher the score. In this case the score is low, indicating a low level of confidence in the match. This is purely because in this simple example we have only provided the entity name. If you have the business registry identifier this can work well in some cases (in the UK, for example, this is the Companies House reference and so is unique to a company), but the address, postcode and country can also be used to help improve the match. We will look at this in more detail in our second example.

Example 2 - Improved matching

When providing the input CSV file, you can provide as much information as you have for each entity. You may be able to export this from one of your existing systems containing entity information. The more information you provide, the better the chance of an accurate match. The fields you can supply are shown in the batch matching screen. Only Entity Name is mandatory.

In Example 1 we got good matches, but the confidence score was low because we only provided the entity name. In this example we will see how the data supplied affects the confidence score in the results file.

Input File

To follow along with this example, download our input file.

Go to the BATCH-MATCH screen as before and this time use the downloaded file match-example-2.csv. Download the results.

This is what the input file looks like (shown in Excel):

In this example you can see all the information that can be supplied: Entity Name, Registry Identifier, Address 1, Address 2, Address 3, City, State, Postcode and Country. In some cases we have all of that information and in other cases only part of it.
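As a rough illustration of preparing such a file from your own records, the Python sketch below writes the nine columns listed above and leaves any field you do not have blank. The helper name `write_batch_input` and the sample records are hypothetical, and the exact header spelling that lei-lookup.com expects should be checked against the batch matching screen; the sketch simply reuses the names from this tutorial.

```python
import csv
import io

# Column headers as listed in this tutorial (check the batch matching
# screen for the exact spelling the site expects).
FIELDS = ["Entity Name", "Registry Identifier", "Address 1", "Address 2",
          "Address 3", "City", "State", "Postcode", "Country"]


def write_batch_input(stream, records):
    """Write dict records as CSV, leaving any field that is not supplied blank."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS, restval="")
    writer.writeheader()
    for number, record in enumerate(records, start=1):
        unknown = set(record) - set(FIELDS)
        if unknown:
            raise ValueError(f"record {number}: unknown fields {sorted(unknown)}")
        # Entity Name is the only mandatory field.
        if not str(record.get("Entity Name", "")).strip():
            raise ValueError(f"record {number}: Entity Name is mandatory")
        writer.writerow(record)


# Placeholder records, not real entities or identifiers.
records = [
    {"Entity Name": "Example Holdings Ltd", "Registry Identifier": "00000000"},
    {"Entity Name": "Sample Trading Inc", "City": "Springfield",
     "Postcode": "00000", "Country": "US"},
]

buffer = io.StringIO()
write_batch_input(buffer, records)
print(buffer.getvalue())
```

Rejecting rows without an entity name before upload is a design choice made here for convenience; the site may handle such rows differently.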

Results

Taking a look at the results file in Excel:

(an example of the results file is also available to download - results file)

We can see how it works. Looking at the input file and the results for each of the rows we can see:

  • Row 2 - no registry identifier is supplied but the address, postcode and country are, resulting in a confidence score of 83.3% and a good match.
  • Row 3 - we have the registry identifier and full address details, resulting in a good match and a score of 100%.
  • Rows 4 and 5 - we have a registry identifier but no address. This still results in a good match and a score of 100% because the matching algorithm favours the registry identifier (more details on this below).
  • Row 6 - we have an incomplete entity name, “SPECTRIS”, which could refer to any of the Spectris entities we matched in rows 3, 4 and 5. As we have the registry identifier, lei-lookup.com is still able to match it, but with a slightly lower level of confidence at 95.8% because the entity name does not completely match the Legal Name.
  • Row 7 - again the entity name is not a complete match for the legal name, and we have address details but no registry identifier, so we have a lower score of 79.3%.
  • Rows 8 and 9 - matching with only postcode and country produces scores in the 60s.
  • Row 10 - an incomplete name match with no other data. The LEI has still been matched, but with a very low confidence of 14.1%.
  • Row 11 - another complete match.

Scoring Detail

To understand how the confidence score is built, here is the classification:

  • 83-100: name match was found based on Business Registry Identifier
  • 66-83: name match was found based on address and ZIP or Postcode matches
  • 50-66: name match was found based on ZIP or Postcode matches
  • 34-50: name match was found based on other supplied address matches
  • 16-33: name match was found based on Country ISO Code matches
  • 0-16: name match was found but with no address or country match

The value within each range is determined by a ‘distance measure’ between the supplied entity name and the LEI Legal or Other Legal Names. A score of 100 therefore represents an exact name match together with an exact Business Registry Identifier match.
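To make the ranges above easier to apply to a results file, here is a small Python sketch that maps a score to its band. It rests on two assumptions that are not confirmed by lei-lookup.com: that the six bands are equal in width (100/6, which the listed boundaries appear to round to whole numbers), and that each band includes its upper boundary, which is consistent with Row 2 scoring 83.3% without a registry identifier. The name `band_for_score` is invented for this example.

```python
# Evidence behind each band, lowest first, paraphrased from the list above.
BANDS = [
    "name only, no address or country match",
    "country ISO code",
    "other supplied address details",
    "ZIP or postcode",
    "address and ZIP or postcode",
    "business registry identifier",
]


def band_for_score(score):
    """Return (lower, upper, evidence) for a confidence score from 0 to 100."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for index, evidence in enumerate(BANDS):
        lower = 100 * index / 6
        upper = 100 * (index + 1) / 6
        if score <= upper:  # assumption: the upper boundary belongs to the band
            return lower, upper, evidence


# Scores quoted in Example 2.
for score in (100, 95.8, 83.3, 79.3, 14.1):
    lower, upper, evidence = band_for_score(score)
    print(f"{score:5.1f} -> {lower:.1f}-{upper:.1f}: {evidence}")
```

Reading a score this way tells you which supplied data drove the match; how close the score sits to the top of its band reflects how closely the names agreed.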

Conclusion

You can see from the examples that the more data you include in your input CSV file, the higher the score and the more confident you can be that the correct entity has been matched. Try to extract as much information as you can from your own systems to feed into the input CSV file. Bear in mind that it may not always be possible to match the correct LEI from the information provided, and you should always “eyeball” the results to check how good the confidence scores are. Where they are low, check that you really have the LEI you intended. In some cases you may have to revert to a manual search and check, but for many of your lookups the lei-lookup.com batch matching process should save a considerable amount of time and effort.
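For large results files, the manual check can be narrowed down with a short script. The Python sketch below is only an outline: the column name "Score", the threshold of 50 and the sample rows are all assumptions made for illustration, so substitute the actual score column header from your own results file.

```python
import csv
import io


def low_confidence_rows(stream, score_column="Score", threshold=50.0):
    """Return the rows whose score is below the threshold or cannot be read."""
    flagged = []
    for row in csv.DictReader(stream):
        raw = (row.get(score_column) or "").strip().rstrip("%")
        try:
            score = float(raw)
        except ValueError:
            score = None  # missing or unreadable score: flag for a manual check
        if score is None or score < threshold:
            flagged.append(row)
    return flagged


# Invented sample data standing in for a downloaded results file.
sample = io.StringIO(
    "Entity name,Score\n"
    "Placeholder A,95.8\n"
    "Placeholder B,14.1%\n"
    "Placeholder C,\n"
)

for row in low_confidence_rows(sample):
    print(row["Entity name"], row["Score"])
```

To run this against a real download, pass a file opened with `open("batchmatchresults.csv", newline="", encoding="utf-8")`; the encoding is a guess and may need adjusting.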