I am currently running a Manticore cluster on Kubernetes. Unfortunately, my pods sometimes restart, which means that they lose all their data. I’ve managed to get them to sync again with an entrypoint.sh script and, in order not to lose data (and to avoid costly syncs), I’ve bound a Persistent Volume to each node.
I still don’t understand why my pods sometimes crash (they use RT indexes). The logs are also hard to interpret because they’re not timestamped. It would be nice if manticoresearch’s output logs were prefixed with a datetime, to make it easier to sort out what’s happening.
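As a stopgap, timestamps could be added outside the daemon. The sketch below is a hypothetical helper (the name prefix_ts is an assumption for illustration, not something Manticore provides) that prefixes every line read from stdin with the current UTC time. It is a minimal sketch, not a tested setup for this cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of Manticore): prefix each line read
# from stdin with the current UTC time in ISO 8601 format.
prefix_ts() {
  # The "|| [ -n "$line" ]" part keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    # One `date` call per line: simple, but slow for very chatty logs.
    printf '%s %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$line"
  done
}
```

To be useful, the daemon's output would have to be piped through prefix_ts inside the container (for instance from entrypoint.sh), since the function stamps lines when it reads them, not when they were written. Alternatively, kubectl logs accepts a --timestamps flag that prepends the time recorded by the container runtime to each line, which needs no change to the image.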
Here is the output of the previously crashed pod, obtained with the following command:
kubectl logs manticore-xxxx-pod-name --previous
starting daemon version '3.5.4 13f8d08d@201211 release' ...
listening on all interfaces for mysql, port=9306
listening on 10.244.6.188:9312 for sphinx and http(s)
accepting connections
prereading 0 indexes
prereaded 0 indexes in 0.000 sec
WARNING: Could not open state file for reading: '/var/lib/manticore/grastate.dat'
WARNING: No persistent state found. Bootstraping with default state
WARNING: Fail to access the file (/var/lib/manticore/gvwstate.dat) error (No such file or directory). It is possible if node is booting for first time or re-booting after a graceful shutdown
rt: index digsty: diskchunk 2(1), segments 32 saved in 2.220 sec
Here is the current pod’s output:
kubectl logs manticore-xxxx-pod-name
[Wed Dec 30 11:07:55.724 2020]  using config file '/etc/manticoresearch/manticore.conf' (626 chars)...
[Wed Dec 30 11:07:55.728 2020]  Set max_open_files to 1048576 (previous was 1048576), hardlimit is 1048576.
starting daemon version '3.5.4 13f8d08d@201211 release' ...
listening on all interfaces for mysql, port=9306
listening on 10.244.6.188:9312 for sphinx and http(s)
Manticore 3.5.4 13f8d08d@201211 release
Copyright (c) 2001-2016, Andrew Aksyonoff
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Copyright (c) 2017-2020, Manticore Software LTD (http://manticoresearch.com)
precaching index 'suggests'
precaching index 'digsty'
binlog: replaying log /var/lib/manticore/binlog.001
binlog: index digsty: recovered from tid 732 to tid 4289
binlog: index suggests: recovered from tid 0 to tid 167
binlog: replay stats: 451252 rows in 4456 commits; 0 updates, 0 reconfigure; 0 pq-add; 0 pq-delete; 2 indexes
binlog: finished replaying /var/lib/manticore/binlog.001; 247.9 MB in 4.779 sec
binlog: finished replaying total 1 in 4.781 sec
prereading 2 indexes
WARNING: no nodes found, created new cluster 'profiles'
prereaded 2 indexes in 0.248 sec
FATAL: It may not be safe to bootstrap the cluster from this node. It was not the last one to leave the cluster and may not contain all the updates. To force cluster bootstrap with this node, edit the grastate.dat file manually and set safe_to_bootstrap to 1 .
FATAL: replication connection failed: 7 'error in node state, must reinit'
accepting connections