Full-Text Search with Entity Framework Core & SQL Server

Mahdi Karimipour
5 min readJul 3, 2022

--

The story began when our address database started growing, and we needed really fast queries over our addresses with exact-match criteria. This post highlights all the challenges we went through to get Full-Text search up and running using Asp.NET, Entity Framework and MS SQL Server.

When you want to look up an address in a table, you typically are looking for an exact match, so your customers can find their address quickly. For that scenario, the first choice would be the SQL Like (%) operator.

SELECT * FROM ADDRESSES WHERE FULLADDRESS LIKE ‘%1/18 CLIFFORD%’

The problem with the above query is, it only works on small number of records, but when you have millions and millions of addresses, this will take ages to return results to you, and hence it is not a solution you could use in an enterprise application.

Here is where Full-Text search comes into play. It provides you with a super fast query language allowing you to query large data sets super fast, however there are a bunch of parameters you need to get right so that it matches your requirements.

We will go through the detailed setup a bit later, but if I wanted to rewrite the above query using Full-Text Search, it will be like:

SELECT * FROM dbo.Addresses a WHERE CONTAINS(a.FullAddress, ‘“1/18 Clifford”’)

1. Exact Match in Full-Text Search

By default SQL discards commonly occurring strings from Full-Text search. This means it is likely your exact match won’t work, cause some of the strings have been removed from the Full-Text Search Index. Here is a quote from SQL Server Docs:

To prevent a full-text index from becoming bloated, SQL Server has a mechanism that discards commonly occurring strings that do not help the search. These discarded strings are called stopwords. During index creation, the Full-Text Engine omits stopwords from the full-text index. This means that full-text queries will not search on stopwords.

Source: Configure & manage stopwords & stoplists for Full-Text Search — SQL Server | Microsoft Docs

With that in mind, it is likely you won’t get exact matches for all the queries, if we have the stoplist enabled. We can however turn it off, and enable the exact match by including “ ” inside the ‘ ’. I will show you later how to disable the stoplist.

2. Free-Text vs Full-Text

These are the two types of searches available when it comes down to fast look-up over a pile of data in DB columns, Full-Text and Free-Text. While Full-Text scans the exacts words to find matches, Free-Text also scans the meaning of those words to see if there are any semantic matches available. For the purpose of address lookup, we use Full-Text.

In terms of operators, CONTAINS - as mentioned above - is used for Full-Text, and FREETEXT for Free-Text search.

3. Setup SQL Server for Full-Text Search

Every Full-Text search has a couple of elements that we are covering here:

  1. Catalogue
  2. Unique Index
  3. FullText Index

3.1 Catalogue

A full-text catalog is a logical container for a group of full-text indexes. You have to create a full-text catalog before you can create a full-text index.

A full-text catalog is a virtual object that does not belong to any filegroup, and here is the command to create a catalogue:

CREATE FULLTEXT CATALOG FTCAddress AS DEFAULT;

3.2 Unique Index

Once a catalogue was created, we would need a unique index to distinguish the results based on. For that reason, we simply can rely on the Primary Key of our table, which is a GUID column.

3.3 Full-Text Index

And finally, the full-text index, that we create using the Catalogue and the Unique index using the below command:

As you can see in the above command, we simply create the index on the FullAddress column of the Addresses column. Then we mention the Unique index and the Catalogue.

We talked about the role of STOPLIST, and here you can see how I have turned it off for the purpose of the work.

3.3.1 Maintain Full-Text Index

When the Full-Text Index is created, database starts populating the the Index, and generally an initial Full Population is required. However what happens when the data changes, rows get added or get removed?

There are three ways to readjust the Index using different method post the initial full population:

1- Full population

2- Automatic or manual population based on change tracking

3- Incremental population based on a timestamp

3.3.1.1 Full Population

Performing another full population is costly and consumes a lot of resources, and it impacts our searches while the population is taking place as the Catalogue is not usable until the population is done. Hence ideally we should rely on the first full population, and then use one of the below Change Tracking or Time Stamp methods.

3.3.1.2 Change Tracking

SQL Engine keeps track of all the changes internally by maintaining another table, and then applies those changes to the Full-Text Index once that is done. This has a small overhead of our database.

For our Address Lookup application, I have relied on Automatic Change Tracking to keep track of all the updates.

3.3.1. 3 Time Stamp

This methods needs a Time Stamp column to be added to our Address table, and track those stamps to find which rows have changed.

“An incremental population is an alternative mechanism for manually populating a full-text index. If a table experiences a high volume of inserts, using incremental population can be more efficient that using manual population.”

Populate Full-Text Indexes — SQL Server | Microsoft Docs

4. Entity Framework Setup

Now it’s time to set up our ASP.NET API and Entity Framework to run the Full-Text query.

As we are using Code-First, every change that goes into our database should be through codified Migrations managed by the ASP.NET API.

The first step in doing so is to create a database migration:

dotnet-ef migrations — startup-project ..\Api\LocationApi.csproj add FullAddressIndex — verbose

This create a migration in which we can inject all the Catalogue, and Full-Text index set up:

To keep my migrations clean, I am putting the actual SQL scripts in a separate file, and load them at the time of migration as embedded resources.

Here is the Roll Forward migration:

and here is the Roll Back migration:

Once run, the initial Full Population gets started, and after some time, the Full-Text Queries can be run against the database.

But how do you run a Full-Text Query from a .NET application?

5. Run Full-Text Query

To run Full-Text Queries, Entity Framework provides a set of helper functions to execute the Contains method using the EF construct as below:

And finally here is how to call this method to fetch all the records fast:

Other Options for Search

Once solving these challenges, we also looked at SQL Server Full-Text alternatives.

The first one was Azure Cognitive Search which comes with a hefty and expensive price tag, which grows quite quickly with the size of your data.

The free alternative was Apache SOLR, but the problem is to run and maintain it on your infrastructure and give it adequate love, it will take some time and effort.

--

--