Error occurs when using max_matches

glukkkk · October 6, 2020, 11:24am

We have two nodes in a cluster. After creating the cluster and adding empty indexes to it, everything works well. However, after one index is filled, we get the following error:

mysql> SELECT un_sku_id FROM sku OPTION max_matches = 1;
ERROR 1064 (42000): unknown option ‘max_matches’ (or bad argument type)

The error occurs only on the first node. The second node returns proper results for this query. It happens when using version 3.5.2. With version 3.5.0 everything works properly.

adrian · October 6, 2020, 11:33am

Can you put together a step-by-step scenario that reproduces this and open a ticket on GitHub? I just tried it on our replication course (Manticore tables replication) and I can’t reproduce it.

glukkkk · October 6, 2020, 11:39am

Unfortunately, the only way to reproduce it is to provide you with an SQL dump of ~400,000 insert queries, but this is sensitive data. We have no idea after which of these queries the problem occurs. The only thing we know is that the problem is not reproduced on an empty index.

It’s OK for us to stay at 3.5.0, if there is no other way to investigate the issue.

tomat · October 6, 2020, 12:27pm

Could you provide the output of the desc sku and show index sku status commands after the error happens?

alexey · October 6, 2020, 1:12pm

If you have a debugger on the machine and have installed the debug symbols package for the release, you can provide a backtrace to us.
That is most probably from searchdsql.cpp line 532. If you set a breakpoint there (after attaching to the running instance with gdb) and then reproduce the error so that the breakpoint is hit, you can issue the command ‘where’ and attach the output (of course, you can first check that the output doesn’t contain your sensitive data).
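
For readers who want to try this, the steps above might look roughly like the sketch below. It only prints a gdb command line so it can be reviewed before running. The process name `searchd`, the use of `pidof`, and the helper name `build_gdb_command` are assumptions, and line 532 is only meaningful for the exact source revision the installed binary was built from.

```shell
#!/bin/sh
# Sketch only: build a gdb command line that attaches to a running searchd,
# sets a breakpoint at the suspected location, waits for it to be hit,
# prints the backtrace, and then detaches so the daemon keeps running.
# Assumption: debug symbols matching the installed binary are available.

build_gdb_command() {
    pid="$1"                               # PID of the searchd process to attach to
    location="${2:-searchdsql.cpp:532}"    # file:line taken from the post above
    printf 'gdb -p %s -ex "break %s" -ex "continue" -ex "where" -ex "detach" -ex "quit"\n' \
        "$pid" "$location"
}
```

A possible way to use it: run `build_gdb_command "$(pidof searchd)"`, check the printed command, run it as a user allowed to attach to the daemon, and then reproduce the failing query from another terminal so the breakpoint is hit. If `pidof` prints more than one PID, pick the right one manually.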

glukkkk · October 7, 2020, 6:24am

Unfortunately, we are not able to provide this information at the moment, since we have downgraded to version 3.5.0. If we upgrade Manticore in the future and experience the same problem, we will provide you with the necessary information then.

Thank you!

glukkkk · November 6, 2020, 3:21pm

Ok, we have experienced the problem again with Server version: 3.5.3 c419a324@201105 release git branch HEAD (no branch).

Index config is here: https://pastebin.com/raw/BjQGRk8v

tomat · November 6, 2020, 6:40pm

I built the daemon from the master branch (Manticore 3.5.3 c419a3248@201105 dev) and everything works well:

mysql -h 127.0.0.1 -P 8306 -vv -e "select grls_id from grls LIMIT 0,50 OPTION max_matches = 50;"
--------------
select grls_id from grls LIMIT 0,50 OPTION max_matches = 50
--------------

Empty set (0.00 sec)

Could you provide a complete example that reproduces the issue locally?
Could you use the binary from the dev repo that was built by our CI?

glukkkk · November 7, 2020, 4:00am

I’m not sure at which point the problem occurs. When indexes are created from scratch and added to the cluster, everything works well. The problem appears suddenly while the indexes are being filled.

We use the same binary as you mentioned.

The problem does not occur with version 3.5.0. Only with 3.5.2 and higher.

tomat · November 7, 2020, 6:29pm

Could you try the suggestion from Alexey and provide that information?

glukkkk · November 9, 2020, 12:39pm

We can try, if you provide step-by-step instructions on how to achieve this.

glukkkk · November 12, 2020, 2:11pm

Could you assist please?

tomat · November 16, 2020, 7:24am

There was a fix in commit dc7285890c8816d53b19f791c25f37bce4eac1f1 for a reduce/shift conflict in the SphinxQL parser's handling of OPTION. Could you try a recent dev version of the daemon and post here which package you use?

I will then provide step-by-step debugging instructions for that package, since the GDB commands rely on source code line numbers.

glukkkk · November 17, 2020, 6:45am

It seems that the fix has helped! The error is gone.

Thank you! I will let you know if something goes wrong.

glukkkk · November 17, 2020, 2:50pm

I have managed to reproduce the problem on Server version: 3.5.3 4367c82a@201116 release git branch HEAD (no branch). The steps were as follows:

1. I created 12 RT indexes on a server, created a cluster and added these indexes to the cluster.
2. I added two additional nodes to the cluster.
3. Then I filled all indexes in the cluster.
4. After this I stopped Manticore on the 3rd node and cleaned the data and binlog folders. Then I restarted the server.
5. After the server was restarted, I joined the 3rd node to the cluster again (since it had become a vanilla Manticore instance with no indexes).
6. The node joined the cluster properly, however the problem appeared again on this node. On the 1st and 2nd nodes the problem is not reproduced; the query is executed without any errors there.
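
For anyone trying to follow these steps, the replication statements involved might be generated with something like the sketch below. It is not the exact set of commands used here: the helper names are made up for illustration, and the cluster name, index names, and node address you pass in are placeholders to replace with your own.

```shell
#!/bin/sh
# Sketch only: print the SphinxQL statements behind the steps above so they
# can be reviewed and then piped into the mysql client of the relevant node.
# All names passed in (cluster, indexes, node address) are placeholders.

# Step 1: on the first node, create the cluster and add existing RT indexes.
cluster_setup_sql() {
    cluster="$1"
    shift
    printf 'CREATE CLUSTER %s;\n' "$cluster"
    for index in "$@"; do
        printf 'ALTER CLUSTER %s ADD %s;\n' "$cluster" "$index"
    done
}

# Steps 2 and 5: on a joining node, point at a node that is already a member.
cluster_join_sql() {
    printf "JOIN CLUSTER %s AT '%s';\n" "$1" "$2"
}

# Step 6: the failing query from the first post, to run on each node.
check_query_sql() {
    printf 'SELECT un_sku_id FROM sku OPTION max_matches = 1;\n'
}
```

For example, `cluster_setup_sql test_cluster sku grls | mysql -h 127.0.0.1 -P 9306` would apply the first step, assuming the daemon listens for the MySQL protocol on port 9306 on that host; adjust the host and port to your configuration.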

Sergey · November 18, 2020, 4:29am

Can you please create an issue about this on GitHub with detailed steps to reproduce it?

glukkkk · November 18, 2020, 4:37am

github.com/manticoresoftware/manticoresearch

Error occurs when using max_matches

opened 04:37AM - 18 Nov 20 UTC

closed 07:10AM - 19 Nov 20 UTC

glukkkk

_**mysql> SELECT un_sku_id FROM sku OPTION max_matches = 1; ERROR 1064 (42000):… unknown option ‘max_matches’ (or bad argument type)**_

The problem is reproduced on some nodes when using cluster. However, sometimes it is reproduced, sometimes not.

**To Reproduce**

1. I created 12 RT indexes on a server, created a cluster and added these indexes to the cluster.
2. I added two additional nodes to the cluster.
3. Then I filled all indexes in the cluster.
4. After this I stopped Manticore on the 3rd node and cleaned **data** and **binlog** folders. Then I restarted the server.
5. After the server is restarted, I joined the 3rd node to the cluster again (as long as it became a vanilla Manticore instance with no indexes).
6. The node joined to the cluster properly, however the problem appeared again on this node. On the 1st and 2nd nodes the problem is not reproduced, the query is executed without any errors there.

**Describe the environment:**

- Server version: 3.5.3 4367c82a@201116 release git branch HEAD (no branch)
- Centos 8