When a node joins a cluster, what happens if it already has local indexes?

I managed to crash a node in a cluster (I inserted too much data; the combined total of rt_mem_limit across the indexes was larger than the resource limit, so the container was forcibly restarted by k8s).
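
(For context: rt_mem_limit is a per-table setting, so it is the sum across all the RT tables, plus the other buffers, that has to fit inside the container's memory limit. A rough sketch of capping it, with a made-up table name and value:)

### illustrative only - the table name and limit are made up, not from this cluster
CREATE TABLE example_rt (title TEXT) rt_mem_limit = '256M';
### for an existing RT table, I believe the limit can also be changed in place:
ALTER TABLE example_rt rt_mem_limit = '256M';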

Now the replication is broken. The current service IPs:

manticorert-worker-0: 10.72.38.198
manticorert-worker-1: 10.72.42.238
manticorert-worker-2: 10.72.45.210

manticorert-worker-2 is the node that was restarted, and I think it USED to have the IP 10.72.47.81

… manticorert-worker-2 won't start, because it can't contact 10.72.47.81 - its old IP!

# php scripts/runsphrt.php "show status like 'uptime'" | grep Value
0:                                    Value: 254864
1:                                    Value: 254933
2:                                    Value: 2334

Log from: manticorert-worker-2
[Wed Aug 24 14:19:26.878 2022] [42] WARNING: cluster 'manticore': no available nodes (10.72.47.81,10.72.43.101,10.72.45.210), replication is disabled, error: '10.72.47.81:9312': connect timed out;'10.72.43.101:9312': connect timed out

Frankly, I'm not sure what 10.72.43.101 is!

And each node has a different list of nodes:

# php scripts/runsphrt.php "show status like 'cluster%node%'"
0: Counter, Value
0:   cluster_manticore_node_state, synced
0:   cluster_manticore_nodes_set, 10.72.38.198,10.72.42.238,10.72.45.210
0:   cluster_manticore_nodes_view, 10.72.42.238:9312,10.72.42.238:9315:replication,10.72.38.198:9312,10.72.38.198:9315:replication

1: Counter, Value
1:   cluster_manticore_node_state, synced
1:   cluster_manticore_nodes_set, 10.72.47.81,10.72.42.238,10.72.45.210
1:   cluster_manticore_nodes_view, 10.72.42.238:9312,10.72.42.238:9315:replication,10.72.38.198:9312,10.72.38.198:9315:replication

2: success but zero rows returned

… so I intend to run UPDATE nodes on instances 0 and 1, and possibly promote one to master for the bootstrap.
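
Roughly what I have in mind (just a sketch; ‘manticore’ is the cluster name from the status output above, and the bootstrap step would only be needed if the remaining nodes lose quorum):

### on worker-0 and worker-1: refresh the node lists to the current membership
ALTER CLUSTER manticore UPDATE nodes;
### only if the remaining nodes drop out of the primary component: bootstrap from one of them
SET CLUSTER manticore GLOBAL 'pc.bootstrap' = 1;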

On 2, I will have to run JOIN CLUSTER. But as it already has local copies of all the indexes, won't JOINing fail? I guess I need to clear out the data folder so it can ‘start fresh’ (syncing the data from either 0 or 1).

A node that joins a cluster replaces its index files with the files from the donor, then reloads the index.

During the replace it uses SST file transfer (like rsync does: the file is split into chunks, a hash is calculated for every chunk, and only the chunks whose hashes do not match are transferred from the donor).

Well, I kinda answered my own question. It crashes!

mysql -hmanticorert-worker-2.manticorert-worker-svc.staging.svc.cluster.local -P9306 -A --prompt='RT2>'

RT2>show status like 'cluster%';
Empty set (0.001 sec)

RT2>join cluster manticore at '10.72.42.238:9312';
ERROR 2013 (HY000): Lost connection to MySQL server during query
RT2>show status like 'cluster%';
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    12
Current database: *** NONE ***

Empty set (0.003 sec)

RT2>show status like 'cluster%';
Empty set (0.001 sec)

RT2>show status like 'uptime';
+---------+-------+
| Counter | Value |
+---------+-------+
| uptime  | 17    |
+---------+-------+
1 row in set (0.001 sec)

RT2>show tables;
+-----------------+------+
| Index           | Type |
+-----------------+------+
| gridprefix      | rt   |
| gridsquare      | rt   |
| loc_placenames  | rt   |
| os_gaz          | rt   |
| os_gaz_250      | rt   |
| placename_index | rt   |
| snippet         | rt   |
+-----------------+------+
7 rows in set (0.003 sec)

RT2>drop gridprefix;
ERROR 1064 (42000): sphinxql: syntax error, unexpected IDENT, expecting FUNCTION or PLUGIN or TABLE near 'gridprefix'
### This was my own typo, but I can use this mistake to correlate with the logs!
### Normal DROP queries don't appear in the query log, but the parse error does!

RT2>drop table gridprefix;
Query OK, 0 rows affected (0.005 sec)

###... dropped each one manually

RT2>show tables;
Empty set (0.001 sec)

RT2>join cluster manticore at '10.72.42.238:9312';
Query OK, 0 rows affected (3.323 sec)

RT2>show status like 'cluster%node%';
+------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| Counter                      | Value                                                                                                                                           |
+------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| cluster_manticore_node_state | synced                                                                                                                                          |
| cluster_manticore_nodes_set  | 10.72.45.210:9312,10.72.42.238:9312,10.72.38.198:9312                                                                                           |
| cluster_manticore_nodes_view | 10.72.45.210:9312,10.72.45.210:9315:replication,10.72.42.238:9312,10.72.42.238:9315:replication,10.72.38.198:9312,10.72.38.198:9315:replication |
+------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.002 sec)

RT2>show tables;
+-----------------+------+
| Index           | Type |
+-----------------+------+
| gridprefix      | rt   |
| gridsquare      | rt   |
| loc_placenames  | rt   |
| os_gaz          | rt   |
| os_gaz_250      | rt   |
| placename_index | rt   |
| snippet         | rt   |
+-----------------+------+
7 rows in set (0.001 sec)

RT2>

If I drop the local indexes, then I can join the cluster!

(I haven't been able to access the logs from the pod yet to find out what searchd says.)

Could you create a ticket at GitHub and attach searchd.log from the node that crashed and from the donor node to that ticket?

OK, I have put what I have been able to recover into the ticket.

To be honest, I suspect searchd was killed by the OS when I issued the JOIN, probably for using too much memory.

I'm going to try spinning up two clusters: one with more memory, so I can do all the data-insertion testing,
and another that I can then intentionally try to crash :slight_smile:

I'm also going to start logging searchd.log separately from query.log.
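
Something along these lines added to the searchd section of the config, I think (the paths are just examples):

searchd
{
    # daemon and replication messages go here
    log = /var/log/manticore/searchd.log
    # queries kept separate
    query_log = /var/log/manticore/query.log
}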

BTW, what's a ‘donor node’? Do you mean the node I tried to join the failed node TO?

I.e. the one that the data would be resynced FROM?

Yes, the donor node is the node that the node trying to join the cluster selects as the donor to resync from.

OK, thanks. I have added the log from worker node 1 (I entered its IP in the JOIN CLUSTER … AT).

It just sees the connection getting closed.