Extrapolating Merchant Names from Transaction Descriptions

Jan 21, 2021

Introduction

At FinGoal, we build tools for banks, credit unions, and FinTech developers that analyze consumer credit and debit card transactions to Find Money in their existing spending patterns. One of our data science team’s recent projects was to group similar merchants so that we could suggest money-saving alternatives to customers based on their shopping history. A major hurdle we encountered was that the same merchant name is presented in many different ways across transaction descriptions.

This article provides an overview of the data cleaning process used to extract merchant names from financial transaction descriptions. Our data consists of transactions from a variety of financial institutions, each with its own system for formatting transaction descriptions. For example, each of these transaction descriptions denotes a transaction that took place at King Soopers, a regional grocery store:

‘king soopers 5135 lafayette us’
‘king soopers xx 995 s. h’
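‘king soopers 0008’

To map all of these descriptions to a single merchant, we created a dictionary of partial strings, in which a substring common to every description points to the merchant name:

{ “king soopers”: “King Soopers” }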

In this article, I explain the steps we took to create the partial string dictionary and map merchant names based on it.

Preprocessing the Data

Before extracting merchant names from transaction descriptions, I preprocessed the data to remove transactions that were not useful. Since we were only interested in purchases of goods and services, I reviewed all of the transaction categories and dropped every transaction whose category did not fit our needs, mostly categories related to income and banking. I also dropped any transaction with a description that was too broad to be useful, such as “direct withdrawal,” “interest payment,” and “purchase interest charge.” Once these transactions were removed from the data, I was ready to start the merchant mapping process.
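A minimal sketch of this kind of filtering is shown here. It uses plain Python dictionaries as a stand-in for rows of the data frame, and the category labels and the name keep_transaction are hypothetical, since the article does not list the real categories. Only the three broad descriptions come from the text above.

```python
# Hypothetical category labels; the real category names are not given in the article.
EXCLUDED_CATEGORIES = {"income", "banking"}

# Descriptions named in the article as too broad to identify a merchant.
BROAD_DESCRIPTIONS = {
    "direct withdrawal",
    "interest payment",
    "purchase interest charge",
}


def keep_transaction(txn):
    """Return True if the transaction looks like a purchase of goods or services."""
    category = txn["category"].strip().lower()
    description = txn["description"].strip().lower()
    return (
        category not in EXCLUDED_CATEGORIES
        and description not in BROAD_DESCRIPTIONS
    )


# Illustrative rows only, not real data.
transactions = [
    {"description": "king soopers 0008", "category": "groceries"},
    {"description": "direct withdrawal", "category": "banking"},
    {"description": "interest payment", "category": "income"},
    {"description": "purchase interest charge", "category": "fees"},
]

purchases = [txn for txn in transactions if keep_transaction(txn)]
```

With these sample rows, only the King Soopers transaction survives the filter.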

Determining Most Common Merchants

To create a dictionary that could map partial strings from the transaction description to a merchant name, I began by reviewing the transaction descriptions in a Jupyter Notebook. For each transaction category, I created a dictionary with the transaction description as the key and the number of times that description appeared as the value. I then limited the results to the fifty descriptions that appeared most frequently.
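One way to sketch this counting step is with collections.Counter from the Python standard library. The function name top_descriptions_by_category and the sample rows are assumptions made for illustration. The default limit of fifty mirrors the cutoff described above.

```python
from collections import Counter, defaultdict


def top_descriptions_by_category(transactions, limit=50):
    """Count each description within its category and keep the most frequent ones."""
    counts_by_category = defaultdict(Counter)
    for txn in transactions:
        counts_by_category[txn["category"]][txn["description"]] += 1
    # most_common() returns (description, count) pairs sorted by count, highest first.
    return {
        category: dict(counter.most_common(limit))
        for category, counter in counts_by_category.items()
    }


# Illustrative rows only; the category label is hypothetical.
sample_transactions = [
    {"category": "groceries", "description": "king soopers 0008"},
    {"category": "groceries", "description": "king soopers 0008"},
    {"category": "groceries", "description": "king soopers 5135 lafayette us"},
    {"category": "groceries", "description": "king soopers xx 995 s. h"},
]

top_descriptions = top_descriptions_by_category(sample_transactions)
```

Each category maps to a dictionary of description counts, ordered from most to least frequent, which makes it easy to scan for strings that recur across descriptions.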

This process returned a list of transaction descriptions I could use to create the partial string dictionary for mapping transaction descriptions to merchant names.

Even though the transaction descriptions and formatting vary, many of the merchants represented by these descriptions are actually the same. This is an instance where the task is very easy for a human to understand, but very difficult for a machine learning model. I was able to review the descriptions returned in the Jupyter Notebook and use the information to create a partial string dictionary where the common string in all transactions for a specific merchant (e.g. “king soopers”) is the key and the merchant name we want to return for those descriptions is the value (“King Soopers”). By creating this dictionary, we can remove all of the duplicate versions of the same merchant name and make it easier for the model to understand that King Soopers is only one merchant, no matter how many ways the transaction description is written.
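To make the idea concrete, here is a small sketch of such a dictionary using the King Soopers entry and the three descriptions from the introduction. The variable names are assumptions, and a real dictionary would hold many more entries.

```python
# Key: substring shared by every description for a merchant.
# Value: the merchant name to return for those descriptions.
partial_string_to_merchant = {"king soopers": "King Soopers"}

descriptions = [
    "king soopers 5135 lafayette us",
    "king soopers xx 995 s. h",
    "king soopers 0008",
]

# Collect the merchant name for every description that contains a known key.
merchants = {
    merchant
    for description in descriptions
    for partial, merchant in partial_string_to_merchant.items()
    if partial in description
}
```

All three descriptions collapse to the single merchant name in the resulting set.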

Mapping Merchant Names

With the partial string dictionary in place, I added a new column to the data frame named “suggested_merchant” with an initial value of “unknown.”

To map a suggested merchant to the transaction using the partial string dictionary, I created a function to iterate through the data frame and, for each row, check to see if any of the keys in the partial string dictionary were included in the transaction description. If they were, the “suggested_merchant” column was updated with the value from the partial string dictionary.

At the end of the process, I dropped all of the rows where the “suggested_merchant” column still had a value of “unknown,” since those transactions were less useful to our model.
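The three steps above can be sketched as follows. This is an illustration in plain Python rather than the original notebook code: rows are dictionaries standing in for data frame rows, the function name suggest_merchant is hypothetical, and the “acme hardware” row is made up to show an unmatched description.

```python
partial_string_to_merchant = {"king soopers": "King Soopers"}


def suggest_merchant(description, partial_strings):
    """Return the merchant for the first key found in the description, else "unknown"."""
    lowered = description.lower()
    for partial, merchant in partial_strings.items():
        if partial in lowered:
            return merchant
    return "unknown"


# Illustrative rows only; "acme hardware 123" is a made-up unmatched description.
rows = [
    {"description": "king soopers 5135 lafayette us"},
    {"description": "king soopers xx 995 s. h"},
    {"description": "king soopers 0008"},
    {"description": "acme hardware 123"},
]

# Step 1: add the new column with its initial value.
for row in rows:
    row["suggested_merchant"] = "unknown"

# Step 2: update the column wherever a dictionary key appears in the description.
for row in rows:
    row["suggested_merchant"] = suggest_merchant(
        row["description"], partial_string_to_merchant
    )

# Step 3: drop the rows that are still "unknown".
matched_rows = [row for row in rows if row["suggested_merchant"] != "unknown"]
```

In this sketch the first matching key wins. If two keys could match the same description, the dictionary order would decide the result, so checking longer keys first may be worth considering.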

Conclusion

At the end of this process, I was able to match a merchant name to over 113,000 transactions from an initial dataset of about 212,000 transactions, or roughly 53% of the data. That was enough clean data for our data science team to begin training a merchant recommender model. Creating the partial string dictionary was very time-consuming, and it will need to be re-evaluated and possibly extended as we add more data, but it gave the team a good starting point for training models and understanding our transaction data.