How can I configure the indexer so that the dot is treated as a character between numbers? I want “foo.bar” to be indexed as two words “foo” and “bar”, but “12.34” as a single “word” “12.34”. Thanks!
Probably by using the settings described in Manticore Search Manual: Creating a table > NLP and tokenization > Low-level tokenization to distinguish a period between digits from a period between letters.
The regexp_filter should then replace the period with some other character, and that character should be in the charset_table, so that 12.34 ends up being treated as a single word.
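To make the idea concrete, here is a rough Python emulation of that two-step pipeline. It is only a sketch of the logic, not Manticore itself: the underscore placeholder, the function names, and the simplified character set are assumptions for illustration, and Python's re module stands in for the RE2 engine that regexp_filter uses. The pattern uses capture groups rather than lookarounds, since RE2 does not support lookarounds.

```python
import re

# Hypothetical placeholder character. In Manticore it would have to be
# listed in charset_table so that it counts as part of a word.
PLACEHOLDER = "_"


def apply_filter(text: str) -> str:
    """Emulate the regexp_filter step: rewrite a dot only between digits."""
    return re.sub(r"(\d)\.(\d)", r"\1" + PLACEHOLDER + r"\2", text)


def tokenize(text: str) -> list[str]:
    """Emulate a simplified charset_table: digits, a-z and the placeholder
    are word characters; everything else (including '.') is a separator."""
    filtered = apply_filter(text.lower())
    return re.findall(r"[0-9a-z_]+", filtered)


print(tokenize("foo.bar"))        # ['foo', 'bar']
print(tokenize("costs 12.34"))    # ['costs', '12_34']
```

The dot in foo.bar is left alone and acts as a separator, while the dot in 12.34 is rewritten to a character that the tokenizer keeps, so the number stays in one piece.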
Thanks! Does that mean I have to do the same later when searching? That is, pre-process all search queries to replace 12.34 with 12 something 34?
The regexp_filter is run during indexing and also on the query.
So in theory, since it runs in both places, the indexed text and the query end up with the same token and matching should work.
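A toy illustration of why that symmetry is enough, again emulated in Python rather than run against Manticore. The tiny inverted index, the sample documents, and the underscore placeholder are all made up for the example; the only point is that the same filter is applied to both the stored text and the query.

```python
import re
from collections import defaultdict


def tokens(text):
    # Same normalization for indexed text and for queries:
    # a dot between digits becomes '_' (assumed to be a word character).
    filtered = re.sub(r"(\d)\.(\d)", r"\1_\2", text.lower())
    return re.findall(r"[0-9a-z_]+", filtered)


def build_index(docs):
    # Map each token to the set of document ids that contain it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for tok in tokens(text):
            index[tok].add(doc_id)
    return index


def search(index, query):
    # A document matches when it contains every token of the query.
    sets = [index.get(tok, set()) for tok in tokens(query)]
    return set.intersection(*sets) if sets else set()


docs = {1: "price is 12.34", 2: "see foo.bar", 3: "rows 12 and 34"}
index = build_index(docs)

print(search(index, "12.34"))  # {1}
print(search(index, "12"))     # {3}
print(search(index, "bar"))    # {2}
```

The raw query 12.34 is passed in unchanged and still finds document 1, because the filter rewrites it the same way it rewrote the document. A query for 12 alone does not match document 1, which is the price of treating 12.34 as one word.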