Blog

Speeding up database searches using an invisible column

Here at Mailcoach, we handle email lists with up to millions of subscribers. Keeping queries, and especially search queries fast is paramount for a good user experience.

One such query is for searching subscribers based on their email address or the domain of their email address. A typical search query looks like this:

SELECT *
FROM subscribers
WHERE email_list_id = 1
AND email LIKE '%example.com%'

We already have an index on the email column. However BTREE indices do not work when you’re searching on a suffix.

In this example the issue is the leading % in the LIKE query which results in a full scan filtering only on the email_list_id.

You can solve this by adding an extra column to the subscribers table called email_suffix, however we don’t want this column to show up in normal queries or when fetching subscribers, so it’s marked as “invisible”.

This is what the Laravel migration looks like for this column:

Schema::table('subscribers', function (Blueprint $table) {
    $table->string('email_suffix')
           ->after('email')
           ->nullable()
           ->invisible();

	$table->index(['email_list_id', 'email_suffix'], 'email_suffix_index');
});

Or in sql:

ALTER TABLE mailcoach_subscribers
  ADD COLUMN email_suffix varchar(255) INVISIBLE;
  
ALTER TABLE mailcoach_subscribers
  ADD INDEX `email_suffix_index` (email_list_id, email_suffix) USING BTREE;

When a subscriber is updated or created, we update this column with the value after the @ sign. You could generate this inside your database using a generated column, however we chose to keep the logic in our codebase and add it to a Laravel model hook instead:

protected static function booted()
{
    static::saved(function (Subscriber $subscriber) {
        $subscriber->update([
            'email_suffix' => Str::after($subscriber->email, '@'),
        ]);
    });
}

This now allows us to update our initial search query to the following:

SELECT *
FROM subscribers
AND (
  subscribers.id in (
    SELECT id
    FROM subscribers
    WHERE email_list_id = 1
    AND email LIKE 'example.com%'
  )
  OR
  subscribers.id in (
    SELECT id
    FROM subscribers
    WHERE email_list_id = 1
    AND email_suffix LIKE '%example.com'
  )
)

The two subqueries in this case allow each of the indices to be used to their full extend for fast results!

Ready to get started?