Something strange happened in our production environment. We have a cluster with two nodes. When we tried to add a third node to the cluster, we got errors in our application logs saying "unknown local index(es) 'rt_index' in search request". The error repeated for every index in the cluster. However, when we logged in to the server and ran the same SQL request manually via the MySQL client on both nodes, everything worked properly.
Once the third node had joined the cluster, all errors disappeared and the application started working as it should. We tried to reproduce the problem on our development servers, but we didn't see any errors while a node was joining the cluster.
No further errors were found in the Manticore logs. Any thoughts on what the reason for this could be?
It would be better to restart the nodes with the --logreplication option and then check the replication events from the donor and joiner nodes to get more information about issues during the node join.
However, restarting the joiner node could clean up its state and let the join process start afresh. A failure on the joiner node while it joins the cluster can leave the node in a broken state that is hard to figure out without checking the replication events in the daemon log.
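If the daemon log gets large after enabling --logreplication, a small filter can help pull out the lines worth comparing between the donor and the joiner. This is only a rough sketch: the keyword list and the function names are my own assumptions, not something Manticore defines, and the real log wording may differ, so adjust the keywords to what you actually see in your log.

```python
from typing import Iterable, List, Sequence

# Hypothetical keywords, chosen for illustration only.
# Replace them with terms that really appear in your daemon log.
DEFAULT_KEYWORDS = ("replication", "cluster", "donor", "joiner")


def filter_replication_lines(
    lines: Iterable[str], keywords: Sequence[str] = DEFAULT_KEYWORDS
) -> List[str]:
    """Return the lines that contain any keyword, matched case-insensitively."""
    lowered = tuple(k.lower() for k in keywords)
    return [
        line.rstrip("\n")
        for line in lines
        if any(k in line.lower() for k in lowered)
    ]


def read_replication_lines(
    log_path: str, keywords: Sequence[str] = DEFAULT_KEYWORDS
) -> List[str]:
    """Read a log file and keep only the lines that match the keywords."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(log_path, encoding="utf-8", errors="replace") as fh:
        return filter_replication_lines(fh, keywords)
```

You would call read_replication_lines with the path of the daemon log on each node (the path depends on your configuration) and compare the output from the donor and the joiner around the time of the join.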