Realtime index: partial match returns no results, exact match works

I’m working on a PHP-powered site that uses plain indexes for a product catalog. The owners want to migrate to realtime indexes so they no longer have to reindex constantly to keep the catalog up to date. Part of the migration involves figuring out why the catalog queries return incomplete data from the realtime index. The old plain indexes are now queried over a MySQLi connection and work fine. However, when I switch the catalog to the realtime index, MATCH queries start behaving strangely.

I’m using match queries for 2 operations:

  1. finding departments by their tree-structure path so we can include child departments, and
  2. word search. Duh. lol

Example match query:
SELECT department_path, COUNT(DISTINCT id) count FROM rt_catalog WHERE MATCH('(@department_path *Path3_*)') GROUP BY department LIMIT 50

Example data for paths:

Example results from plain index:

Example results from realtime index:

As you can see, the child departments are missing when querying for partial matches.

The word search is also affected by this issue, though I am debugging the issue using the @department_path operator.

Both indexes are set up using the same wordforms file and the same conf file settings. Has anyone else had this issue before? Can you share a known issue or fix, please? I’m currently on Manticore v5, so a little behind on updates but not extremely… and I doubt this issue is caused by that. I’m not using the PHP API because it’s incompatible with their server. I’d rather keep everything in the same SQL syntax style anyway, because that’s much easier to maintain. One thing I noticed is missing when I hit Manticore with SQL queries from PHP is facet results, though. The facet queries work in the CLI, so I tried the MATCH queries in the CLI as well… but those return the same results in the CLI as in the PHP script.

Could you check the output of SHOW META right after the query to see which terms got matched?

It is also worth checking the output of CALL KEYWORDS('*Path3_*', 'index_name', 1 AS stats) on both indexes to make sure the query got expanded the same way.
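
The two checks suggested above could be run roughly as follows. This is only a sketch: rt_catalog is the realtime index from the question, while plain_catalog is a placeholder name for the plain index, which the thread never names. The argument order follows the documented CALL KEYWORDS(text, table, options) form.

```sql
-- Run the failing query, then inspect which keywords were actually matched.
SELECT department_path, COUNT(DISTINCT id) count FROM rt_catalog WHERE MATCH('(@department_path *Path3_*)') GROUP BY department LIMIT 50;
SHOW META;

-- Compare how the wildcard term is expanded on each index.
-- 'plain_catalog' is an assumed name; substitute the real plain index.
CALL KEYWORDS('*Path3_*', 'rt_catalog', 1 AS stats);
CALL KEYWORDS('*Path3_*', 'plain_catalog', 1 AS stats);
```

If the two CALL KEYWORDS results differ, the indexes are not expanding the wildcard the same way, which points at the index settings rather than the query.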

The SHOW META output confirms the issue I described: the docs and hits counts are much lower on the realtime index. There is one odd difference in behavior that I can’t make sense of:

Realtime index shows keyword[0] of “path3_*”
Plain index shows keyword[0] of “*path3_*”

Notice the extra asterisk at the beginning on plain indexes. That’s suspicious.

I ran “call keywords” on each index for the given “*Path3_*” example and it is vastly different on plain indexes. The realtime index shows only the single “path3_” token, meaning only exact matches are found. Plain indexes show a bunch of partial matches tokenized as “*path3_*”. Is there anything that could prevent tokenization on realtime indexes?

It is hard to picture the problem from the description alone.
It would be better to post the output I asked for, so I can be sure I understand your description correctly.

Please also post the output of
SHOW TABLE name SETTINGS

for both indexes.

I successfully recreated the issue in a new realtime index.
The create table query is as follows:
CREATE TABLE rt_catalog (department int, department_path text, products_name text, manufacturers_name text, model text, upc text) engine='rowwise' blend_chars = '&' charset_table = '0..9, english, A..Z->a..z, _, U+2215, U+0022, U+0027, U+002E, U+002F, U+002D' wordforms = '/etc/manticoresearch/rt_catalog_wordforms.txt'

The wordforms file is currently empty for testing purposes.

Please see the screenshot below for the insert queries and select query results:

You should use the min_infix_len option for infix search to work.
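
For anyone landing here later, a minimal sketch of that fix applied to the CREATE TABLE statement from earlier in the thread might look like this. The table name rt_catalog_infix, the value 2 for min_infix_len, and the two sample rows are all assumptions made for illustration; the real path format was not posted.

```sql
-- Same columns and settings as the original rt_catalog, plus min_infix_len.
-- 'rt_catalog_infix' and the value '2' are illustrative choices, not from the thread.
CREATE TABLE rt_catalog_infix (
    department int,
    department_path text,
    products_name text,
    manufacturers_name text,
    model text,
    upc text
) engine='rowwise'
  min_infix_len = '2'
  blend_chars = '&'
  charset_table = '0..9, english, A..Z->a..z, _, U+2215, U+0022, U+0027, U+002E, U+002F, U+002D'
  wordforms = '/etc/manticoresearch/rt_catalog_wordforms.txt';

-- Made-up sample rows: a parent path and a child path that contains it.
INSERT INTO rt_catalog_infix (id, department, department_path) VALUES
    (1, 3, 'Path3_'),
    (2, 4, 'Path3_Path4_');

-- With infix indexing enabled, the wildcard query should be able to match both rows.
SELECT id, department_path FROM rt_catalog_infix WHERE MATCH('(@department_path *Path3_*)');
```

Without min_infix_len the index stores only whole keywords, so *Path3_* has nothing to expand against and falls back to the exact token, which is consistent with the CALL KEYWORDS output described above.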

That was the missing piece of the puzzle! Thanks.