My main goal is to give higher weight to exact hits.
I would like to recreate the behavior of the old Sphinx version (2.1.9).
The relevant Manticore (6.0.4) settings (abbreviated) look like this:
dict = keywords
index_exact_words = 0
expand_keywords = 0
min_word_len = 3
min_infix_len = 3
Here is a simplified query (my_index is a placeholder for the real index name). It searches for the word “sport” as exact/prefix/infix variants:
select field1, weight()
from my_index
where MATCH('
@(field1) (sport|sport*|*sport*)
')
option
ranker=expr('sum(word_count*user_weight)'),
field_weights=(field1=100)
Result table with search weights for both engines:
-----------------------------------------------------------------------------
ID | field1                                  | Manticore (new) | Sphinx (old)
-----------------------------------------------------------------------------
10 | sunglasses sports, glasses sport design | 200             | 300
20 | Shower 250ml Sport for Men              | 100             | 300
30 | Motorsport helmet                       | 100             | 100
40 | Sports glasses                          | 100             | 200
-----------------------------------------------------------------------------
My interpretation:
word_count behaves differently in combination with the OR group in parentheses.
The problem is that the exact hit “sport” gets the same weight as “motorsport”.
With Manticore, each distinct term in the record that matches any search term from the parentheses adds the field weight once.
(ID 10: sports and sport => 2 x 100 = 200)
With Sphinx, each search term that hits adds the field weight once per field, regardless of how often the word form occurs in that field.
That gives a maximum of three matching variants (3 x 100 = 300) for this field.
(ID 10: sport counts, but sports does NOT add anything => 3 x 100 = 300)
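To make this interpretation concrete, here is a small Python model of the two counting schemes. It is only a sketch of my reading of the results, not how either engine is actually implemented. The function names (term_matches, sphinx_like_weight, manticore_like_weight) and the simple tokenizer are made up for illustration.

```python
import re

FIELD_WEIGHT = 100                           # field_weights=(field1=100)
QUERY_TERMS = ["sport", "sport*", "*sport*"]  # exact, prefix, infix


def term_matches(pattern, word):
    """Return True if a document word matches an exact, prefix (x*) or infix (*x*) term."""
    core = pattern.strip("*")
    if pattern.startswith("*") and pattern.endswith("*"):
        return core in word
    if pattern.endswith("*"):
        return word.startswith(core)
    return word == core


def tokenize(text):
    """Very rough stand-in for the engine tokenizer: lowercase word characters only."""
    return re.findall(r"\w+", text.lower())


def sphinx_like_weight(text):
    """Assumed Sphinx behavior: count query terms that hit at least one word in the field."""
    words = tokenize(text)
    hits = sum(1 for q in QUERY_TERMS if any(term_matches(q, w) for w in words))
    return hits * FIELD_WEIGHT


def manticore_like_weight(text):
    """Assumed Manticore behavior: count distinct field words that hit at least one query term."""
    words = set(tokenize(text))
    hits = sum(1 for w in words if any(term_matches(q, w) for q in QUERY_TERMS))
    return hits * FIELD_WEIGHT


ROWS = {
    10: "sunglasses sports, glasses sport design",
    20: "Shower 250ml Sport for Men",
    30: "Motorsport helmet",
    40: "Sports glasses",
}

for doc_id, field1 in ROWS.items():
    print(doc_id, manticore_like_weight(field1), sphinx_like_weight(field1))
```

Under these assumptions the model gives the same weights as both columns of the table, which is why I think the difference lies in what word_count counts: matched query terms (Sphinx) versus matched document terms (Manticore).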
Is there a way to rank exact hits higher using the word_count factor (without setting index_exact_words or expand_keywords to 1)?