OK, I have written up what I have been able to recover.
opened 10:39AM - 25 Aug 22 UTC
closed 01:01AM - 02 Nov 22 UTC
waiting for reply
wontfix
**Describe the bug**
Was unable to manually issue 'JOIN CLUSTER' following a po… d failure.
I THINK it might have been memory related. See
https://forum.manticoresearch.com/t/when-a-node-joins-a-cluster-what-happens-if-alrady-local-indexes/1155
for context. When rejoining the cluster the worker pod still had local indexes. Possibly the replication needed too much memory to resync the files on JOIN'ing.
So searchd was killed for using too much memory, rather than actually crashing.
Once I deleted the local indexes, was able to join the cluster successfully.
**To Reproduce**
**Describe the environment:**
- Manticore Search version: 5.0.0 b4cb7da02@220518 release
- OS version: Manticore Search Helm Chart Version: 5.0.0.2
**Messages from log files:**
https://staging.data.geograph.org.uk/facets/manticorert2.2022-08-24.log.filtered.txt
This is the entirety of the searchd.log that I was able to recover (complicated because the pod puts query_log into the same stream, so I had to filter out the queries; we have multi-line queries, so it was tricky!)
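For anyone needing to do the same filtering, here is a minimal Python sketch of one way to pull daemon lines out of a mixed stream. It is not the script used for the file above. It assumes daemon lines start with a bracketed timestamp followed by a bracketed thread or process id, and that query log entries start with a `/* ... */` comment header (the `sphinxql` query log format). Both patterns and the name `filter_daemon_lines` are assumptions to adjust against your own logs.

```python
import re

# Assumed shape of a searchd.log line:
#   [Wed Aug 24 10:00:00.000 2022] [1] message
DAEMON_LINE = re.compile(r"^\[\w{3} \w{3} +\d{1,2} [\d:.]+ \d{4}\] \[\d+\] ")

# Assumed start of a query log entry (sphinxql format):
#   /* Wed Aug 24 10:00:01.000 2022 conn 2 ... */ SELECT ...
QUERY_START = re.compile(r"^/\* ")


def filter_daemon_lines(lines):
    """Yield lines that belong to searchd.log entries.

    Query log entries are dropped together with their continuation
    lines, so multi-line queries do not leak into the output.
    """
    in_query = False
    for line in lines:
        if DAEMON_LINE.match(line):
            in_query = False
            yield line
        elif QUERY_START.match(line):
            in_query = True
        elif not in_query:
            # Unrecognised line following a daemon line: treat it as a
            # continuation of that message and keep it.
            yield line


sample = [
    "[Wed Aug 24 10:00:00.000 2022] [1] accepting connections",
    "/* Wed Aug 24 10:00:01.000 2022 conn 2 real 0.001 wall 0.001 found 3 */ SELECT id",
    "FROM gridprefix",
    "WHERE MATCH('x');",
    "[Wed Aug 24 10:00:02.000 2022] [1] WARNING: something",
    "  continuation detail",
]
result = list(filter_daemon_lines(sample))
```

Unrecognised lines inherit the state of the last recognised entry, which is what keeps multi-line queries out while still preserving multi-line daemon messages. To run it over a real file, pass an open file object (or `sys.stdin`) instead of `sample`.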
The first KILL is known. That is when I inserted too much data and the rt_mem_limit exceeded the resources.limit for the worker pod.
The second KILL is when I tried to get the worker pod to rejoin the cluster manually.
The 'drop gridprefix' syntax error is from when searchd had come back just after the second KILL. It was me attempting to delete the local indexes to retry joining the cluster.
I don't know why the logs end there. I get nothing after that.
**Additional context**
To be honest, I suspect searchd was killed by the OS when I issued the JOIN, probably for using too much memory.
Going to try spinning up two clusters: one with more memory, so I can do all the data insertion testing,
and another that I can then intentionally try to crash.
Also going to start logging searchd.log separately from query.log.
opened 09:52AM - 25 Aug 22 UTC
waiting for reply
I know we can configure this by redefining values.config.content, but wonder if… there is virtue in piping searchd to `stderr` by default?
(leaving query log to `stdout`)
log = /dev/stderr
https://github.com/manticoresoftware/manticoresearch-helm/blob/master/chart/values.yaml#L66
... This makes it much easier to inspect the logs, in systems that can separate stderr/stdout streams. We use loki to ingest logs from containers.
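To illustrate why the split helps, here is a small Python sketch of a consumer that captures the two streams separately. The child process is only a stand-in for a container writing queries to `stdout` and daemon messages to `stderr`. It is not searchd, and `run_and_split` is a hypothetical helper name.

```python
import subprocess
import sys


def run_and_split(cmd):
    """Run cmd and return (stdout_text, stderr_text), captured separately."""
    proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return proc.stdout, proc.stderr


# Stand-in for the container: writes a "query" to stdout and a
# "daemon message" to stderr.
child = [
    sys.executable,
    "-c",
    "import sys; print('SELECT 1;'); print('WARNING: demo', file=sys.stderr)",
]
out, err = run_and_split(child)
```

Once the streams are separate like this, no filtering of multi-line queries is needed: everything in `err` is daemon output.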