Add search field priority to Elasticsearch (works for every Lucene-based framework)
Last week a ‘bug’ was filed in which the end users of our application wanted search results with a match on the name to get more priority than a match on the address (we’re talking about searching for a company). Since I have used Lucene a lot, I thought it was just a matter of ‘boosting’ the name field. It turned out to be a bit more difficult. Maybe because Elasticsearch behaves differently, but more likely because my Lucene knowledge has become a bit rusty.
The first part of the solution is searching on both the _all and name fields. When there is a match on name there is also a match on _all, which I thought would result in a higher total score.
The problem with searching on multiple fields is that Lucene does something called query normalization (the queryNorm in the explanation). Fiddling with the boost factor won’t help much because then the queryNorm also changes.
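To make the queryNorm effect a bit more concrete, here is a rough sketch based on the formula in the documentation of Lucene's classic TF-IDF similarity: queryNorm is one divided by the square root of the sum of squared clause weights (idf times boost). The idf values and the boost of 2 are made-up numbers, the helper name is hypothetical, and the top-level query boost is ignored.

```python
import math

def query_norm(clauses):
    """clauses: list of (idf, boost) pairs, one per query clause."""
    sum_of_squared_weights = sum((idf * boost) ** 2 for idf, boost in clauses)
    return 1.0 / math.sqrt(sum_of_squared_weights)

# Hypothetical idf values for the same term in _all and in name.
IDF_ALL, IDF_NAME = 1.0, 2.0

plain = query_norm([(IDF_ALL, 1.0), (IDF_NAME, 1.0)])
boosted = query_norm([(IDF_ALL, 1.0), (IDF_NAME, 2.0)])

# Normalized query weight of the name clause: idf * boost * queryNorm.
name_weight_plain = IDF_NAME * 1.0 * plain
name_weight_boosted = IDF_NAME * 2.0 * boosted

print(f"queryNorm without boost: {plain:.4f}")
print(f"queryNorm with boost 2:  {boosted:.4f}")
print(f"name weight: {name_weight_plain:.4f} -> {name_weight_boosted:.4f}")
```

With these numbers, doubling the boost on name makes queryNorm shrink, so the normalized weight of the name clause grows by far less than a factor of two.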
Even without query normalization there would still be the problem of (MATCH) max. When there are multiple search fields, only the one with the highest score counts. This means there still is no difference between a match in any of the fields that make up _all and a match on name.
What we want instead is a sum.
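The difference between max and sum can be illustrated with a tiny sketch. The per-field scores below are invented for the example and are not real Lucene output; the helper name is hypothetical as well.

```python
# Hypothetical per-field scores for two companies matching the search term.
# Company A matches in the name (and therefore also in _all);
# company B matches only in the address, which is part of _all.
scores = {
    "company_a": {"_all": 0.5, "name": 0.4},
    "company_b": {"_all": 0.5},
}

def combine(field_scores, how):
    """Combine per-field scores the way (MATCH) max or (MATCH) sum would."""
    values = field_scores.values()
    return max(values) if how == "max" else sum(values)

for how in ("max", "sum"):
    ranked = {name: combine(fields, how) for name, fields in scores.items()}
    print(how, ranked)
```

With max both companies end up with the same score. With sum the company that also matched on name comes out on top.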
How do I know all this stuff? Just enable explain on your query and this will show you a lot of useful information!
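In Elasticsearch, explain can be switched on with the `explain` flag in the search request body. A minimal sketch of such a body, built as a Python dict: the use of a query_string query and the search term "acme" are assumptions for illustration, not necessarily what our application sends.

```python
import json

# Search request body with explain enabled.
# The query_string query and the term "acme" are placeholders.
search_body = {
    "explain": True,
    "query": {
        "query_string": {
            "query": "acme",
            "fields": ["_all", "name"],
        }
    },
}

# This JSON can be sent as the body of a _search request.
print(json.dumps(search_body, indent=2))
```

Each hit in the response then carries an explanation tree, which is where entries such as queryNorm and (MATCH) max show up.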
Replacing max by sum is done by disabling the use of the DisjunctionMaxQuery (also known as disMax or dis_max) on the query.
When that property is false you will see (MATCH) sum in the explanation instead of (MATCH) max.
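The exact query is not reproduced here, so the following is only a sketch of what it may have looked like. It assumes a query_string query, which in the Elasticsearch versions of that time accepted a `use_dis_max` option (true by default); as far as I know later versions dropped this option. The helper name and the search term are hypothetical.

```python
import json

def company_query(term, use_dis_max=False):
    """Build a query_string search over _all and name (hypothetical helper)."""
    return {
        "query": {
            "query_string": {
                "query": term,
                "fields": ["_all", "name"],
                # False combines the per-field queries with a bool query (sum)
                # instead of a dis_max query (max).
                "use_dis_max": use_dis_max,
            }
        },
        # Keep explain on to check for (MATCH) sum in the result.
        "explain": True,
    }

print(json.dumps(company_query("acme"), indent=2))
```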
I’m sure this is not the perfect solution, but it’s working fine for our project. This solution works without boosting, but I can imagine that when using multiple fields you will want to fiddle around with the boost factor.
Unfortunately I had too many tabs open and forgot to mark the one with the golden tip. I think it’s important to give credit to the rightful owner, but I can’t figure out who it is. So once again: someone else found the first part of the solution, I just built my solution on top of that.
ElasticSearch – Dis Max Query
ElasticSearch – Explain