unknown local index(es) 'rt_index' in search request

glukkkk · August 7, 2021, 7:35pm

Strange things happened in our production environment. We have a cluster with two nodes. When we tried to add a third node to the cluster, we got errors in our application logs saying unknown local index(es) ‘rt_index’ in search request. The error repeated for every index in the cluster. However, when we went to the server and ran the same SQL request manually via the MySQL client on both nodes, everything worked properly.

Once the third node had joined the cluster, all errors disappeared and the application started working as it should. We tried to reproduce this problem on our development servers, but we didn’t see any errors while a node was joining the cluster.

No further errors were found in the Manticore logs. Any thoughts on what could be the reason for this?

tomat · August 7, 2021, 8:46pm

It would be better to restart the nodes with the --logreplication option and then check the replication events from the donor and joiner nodes to get more information about what goes wrong when a node joins.

But restarting the joiner node could clean up the node state and let the join process start afresh, since a failure on the joiner node while it joins the cluster can break the node state in ways that are hard to figure out without checking the replication events in the daemon log.

glukkkk · August 9, 2021, 12:30pm

Where can we upload these logs? We already have the --logreplication flag enabled.

tomat · August 9, 2021, 12:54pm

You can upload them to the write-only FTP for customer data:

ftp: dev.manticoresearch.com
user: manticorebugs
pass: shithappens

Use github-issue-XXX as the folder name.
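
If the logs sit on several nodes, the upload can be scripted. The following is only a rough sketch using Python's standard `ftplib`; the helper names `issue_folder` and `upload_logs` and the log file names are hypothetical, and it assumes the write-only server allows creating and entering a new folder, which may not hold.

```python
from ftplib import FTP
from pathlib import Path


def issue_folder(issue_number: int) -> str:
    """Build a folder name in the github-issue-XXX form."""
    return f"github-issue-{issue_number}"


def upload_logs(ftp, folder: str, log_paths) -> list:
    """Upload each log file into `folder` over an already logged-in FTP session.

    Assumption: the folder does not exist yet and the server permits MKD/CWD.
    """
    ftp.mkd(folder)
    ftp.cwd(folder)
    uploaded = []
    for path in map(Path, log_paths):
        # Binary mode keeps the log bytes unchanged in transit.
        with path.open("rb") as fh:
            ftp.storbinary(f"STOR {path.name}", fh)
        uploaded.append(path.name)
    return uploaded


# Example usage (not run here; host and credentials are the ones given above,
# the log file names are placeholders):
#
# with FTP("dev.manticoresearch.com") as ftp:
#     ftp.login(user="manticorebugs", passwd="...")
#     upload_logs(ftp, issue_folder(123), ["node1.log", "node2.log"])
```

Keeping the FTP session as a parameter means the upload logic can be tried against a stub before touching the real server.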

glukkkk · August 9, 2021, 1:43pm

Uploaded to github-issue-465 (though it is not related to issue 465, so please delete the folder after the logs are downloaded).

Each log file is from a particular cluster node. We tried to join the w3 node.