How to speed up querying for autocomplete fields with MongoDB
In a project I am working on I have an autocomplete input field on a form which searches in a collection of about 200000 documents of potential customers. Normal autocomplete logic is to call back-end on every key pressed. Server performs regular expression search on a database field (usually name of company) and returns first n documents which meet the criteria.
The problem which I encountered with usual approach was that the responses were slow. Response for particular example took half a second which was very noticeable when entering data. So, what can be done to speed up queries.
All examples shown are written in Ruby on Rails with Mongoid ODM for MongoDB. I have also added search_field to the model for first two examples. This field concatenates customer names with vat number and other identification numbers, which are used for searching customer.
1. Standard query
Standard query which is used for autocomplete field looks like this:
This is standard MongoDB regular expression query. Field "search_field'' is queried for value params[:search]. "search" is parameter passed through url. In this particular case the query takes around 500ms and lag before results are shown is very noticeable. Which is expected since all 200000 documents must be read into memory for regular expression queries.
2. Full text search index
Full text search is available in MongoDB since version 2.6. Full text search field can be simply defined in model:
and search can be performed in Mongoid with this expression:
This query takes 9ms which is unnoticeable when typing on the input form field. But full text search is not usable in this scenario, since it matches only values which exactly match query parameter. Unfortunately MongoDB FTS does not support searches of partial values.
3. Array field indexing
One of the field types which can be used in MongoDB is Array. And it happens so, that array fields can also be indexed. From here came an idea to break search field into individual words and save them as array. Indexing on array is defined same as any other Mongoid index
And searches can be performed like this:
Update: Regular expression queries have changed since mongoid v4.2.0. and the query looks like this since.
Pay attention to caret at the beginning of the regular expression. It forces query to start from the beginning of the word and thus uses array index. Without it, query takes about 50 times more. Searching from the beginning of the word is normal, because users always start entering name of the company and almost never look for words in the middle of the name. This query takes around 30ms and lag is not noticeable.
Below is the whole solution which also includes Mongoid model definition, clearing up search field from unwanted characters and words and is able to search for multiple words when more than one word is entered in the input field.
include Mongoid::Document
include Mongoid::Timestamps
field :name, type: String
field :id_vat, type: String
field :id_1, type: String
field :id_2, type: String
# .. lots of other fields
field :search_array, type: Array
index search_array: 1
before_save :prepare_search
#####################
# Will prepare search array_field before document is saved to database.
#####################
def prepare_search
# downcase
search = UnicodeUtils.downcase(name)
# remove unwanted characters and words
search.gsub!(/\,|\.|\)|\(|\:|\;|\?\"\'/,' ')
search.gsub!('ltd',' ')
search.gsub!('company',' ')
# ensure single blank between words
search.squish!
# add id numbers
search << " #{id_vat} #{id_1} #{id_2}"
self.search_array = search.split(' ')
end
#####################
# Create search query
#####################
def self.search_query(search)
# downcase search
search = UnicodeUtils.downcase(search.chomp)
# more than one search word
if search.match(' ')
# split words into array
a = search.split(' ')
# query foundation
query = self.where(search_array: BSON::Regexp::Raw.new("^#{a.first}") )
# and other query words
1.upto(a.size - 1) { |i| query = query.and(search_array: BSON::Regexp::Raw.new("^#{a[i]}") )}
query
else
# single word query
self.where(search_array: /^#{search}/)
end
end
#####################
# Query method required for text autocomplete. The query must return
# two dimensional array of names and ids. We also limit query to return
# only name and id fields.
#####################
def self.search_autocomplete(search)
self.search_query(search).only(:id, :name).limit(20).inject([]) do |result,value|
result << [value.name, value.id]
end
end
end
And this is all you need to put autocomplete field on DRG Form:
name: customer_id
type: text_autocomplete
size: 80
search:
table: customer
field: name
method: search_autocomplete
This solution is not only faster but is probably also a lot less stressful on the MongoDB server. Querying on regular expressions takes whole collection to be searched and read into memory. With this solution only array index is used.