Add search field priority to Elasticsearch (works for every Lucene based framework)

Last week a ‘bug’ was filed: the end users of our application wanted search results with a match on the name to rank higher than results with a match on the address (we’re talking about searching for a company). Since I have used Lucene a lot, I thought it was just a matter of ‘boosting’ the name field. It turned out to be a bit more difficult, maybe because Elasticsearch behaves differently, but more likely because my Lucene knowledge has gotten rusty.

The first part of the solution is searching on both the _all and name fields. When there is a match on name there is also a match on _all, which I expected to produce a higher total score.
The problem with searching on multiple fields is that Lucene applies something called query normalization (the queryNorm in the explanation). Fiddling with the boost factor won’t help much, because the queryNorm changes along with it.
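To see why the queryNorm moves when a boost changes, here is a minimal sketch of the normalization as Lucene's classic TF-IDF similarity computes it: one over the square root of the summed squared clause weights. That this similarity is the one in use is an assumption, the idf values are made up, and the class and method names are hypothetical.

```java
// Sketch of query normalization under Lucene's classic TF-IDF similarity.
// Assumption: that similarity is in use; the idf values are invented.
final class QueryNormSketch {

    // queryNorm = 1 / sqrt( sum over clauses of (idf * boost)^2 )
    static double queryNorm(double[] idf, double[] boost) {
        double sumOfSquaredWeights = 0.0;
        for (int i = 0; i < idf.length; i++) {
            double weight = idf[i] * boost[i];
            sumOfSquaredWeights += weight * weight;
        }
        return 1.0 / Math.sqrt(sumOfSquaredWeights);
    }

    // Normalized query-side weight of clause i: idf * boost * queryNorm.
    static double queryWeight(double[] idf, double[] boost, int i) {
        return idf[i] * boost[i] * queryNorm(idf, boost);
    }

    public static void main(String[] args) {
        double[] idf = {2.0, 2.0};            // hypothetical idf for _all and name
        double[] noBoost = {1.0, 1.0};
        double[] nameBoosted = {1.0, 4.0};    // name boosted by a factor of 4

        // The larger the boost, the smaller the queryNorm becomes.
        System.out.println("queryNorm without boost: " + queryNorm(idf, noBoost));
        System.out.println("queryNorm with boost:    " + queryNorm(idf, nameBoosted));
    }
}
```

With a single clause the boost cancels out of the normalized weight completely, which is one way to read the shifting numbers in the explain output.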

Even without query normalization there would still be the problem of (MATCH) max. When there are multiple search fields, only the one with the highest score counts. This means there is still no difference between a match on any of the fields in _all and a match on name.
What we want instead is a sum.
How do I know all this? Just enable explain on your query and it will show you a lot of useful information!
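The difference between max and sum can be sketched in plain Java. The per-field scores below are invented, the class name is hypothetical, and the max variant assumes a tie breaker of 0, so treat it as an illustration of the idea rather than of Lucene's actual scoring code.

```java
// Illustration of combining per-field scores by max versus by sum.
// All scores are hypothetical; a dis_max tie breaker of 0 is assumed.
final class ScoreCombination {

    // dis_max style: only the best-scoring field counts.
    static double max(double... fieldScores) {
        double best = 0.0;
        for (double score : fieldScores) {
            best = Math.max(best, score);
        }
        return best;
    }

    // Boolean "should" style: every matching field adds to the total.
    static double sum(double... fieldScores) {
        double total = 0.0;
        for (double score : fieldScores) {
            total += score;
        }
        return total;
    }

    public static void main(String[] args) {
        // Company A matches on its name, and therefore on _all as well.
        double[] companyA = {0.5, 0.25};   // {_all score, name score}
        // Company B matches on its address only, so only _all scores.
        double[] companyB = {0.5, 0.0};

        // With max both companies tie; with sum the name match ranks higher.
        System.out.println("max: A=" + max(companyA) + " B=" + max(companyB));
        System.out.println("sum: A=" + sum(companyA) + " B=" + sum(companyB));
    }
}
```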

Replacing MAX by SUM is done by disabling use of the DisjunctionMaxQuery (also known as disMax or dis_max):
queryStringQueryBuilder.useDisMax(false);
When the property is false you will see (MATCH) sum in the explanation instead of (MATCH) max.
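For anyone who talks to Elasticsearch over HTTP instead of through the Java builder, the same setting can be put in the search request body. The sketch below only assembles that JSON as a string. It assumes a version of Elasticsearch whose query_string query still accepts use_dis_max; the helper name and the search text are hypothetical, and there is no JSON escaping.

```java
// Builds a search body with explain enabled and a query_string query.
// Assumption: the target Elasticsearch version accepts "use_dis_max".
// For illustration only: the query text is not JSON-escaped.
final class QueryStringBody {

    static String build(String queryText, boolean useDisMax, String... fields) {
        StringBuilder fieldList = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                fieldList.append(", ");
            }
            fieldList.append('"').append(fields[i]).append('"');
        }
        return "{ \"explain\": true, \"query\": { \"query_string\": { "
                + "\"query\": \"" + queryText + "\", "
                + "\"fields\": [" + fieldList + "], "
                + "\"use_dis_max\": " + useDisMax + " } } }";
    }

    public static void main(String[] args) {
        // Search both _all and name, summing the field scores instead of taking the max.
        System.out.println(build("acme", false, "_all", "name"));
    }
}
```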

Conclusion

I’m sure this is not the perfect solution, but it’s working fine for our project. This solution works without boosting, but I can imagine that when using multiple fields you will want to fiddle around with the boost factor.

Sources

Unfortunately I had too many tabs open and forgot to mark the one with the golden tip. I think it’s important to give credit to the rightful owner, but I can’t figure out who it is. So once again: someone else found the first part of the solution, I just built my solution on top of that.
ElasticSearch – Dis Max Query
ElasticSearch – Explain
