I set up a multi-master cluster in k8s:
worker:
  replicaCount: 3
  ...
  config:
    path: /mnt/manticore.conf
    content: |
      searchd
      {
        listen = /var/run/mysqld/mysqld.sock:mysql41
        listen = 9306:mysql41
        listen = 9308:http
        listen = 9301:mysql_vip
        listen = $hostname:9312
        listen = $hostname:9315-9415:replication
        node_address = $hostname
        binlog_path = /var/lib/manticore
        query_log = /dev/stdout
        query_log_format = sphinxql
        pid_file = /var/run/manticore/searchd.pid
        data_dir = /var/lib/manticore
        shutdown_timeout = 25s
        auto_optimize = 0
        max_packet_size = 32M
        net_workers = 4
        seamless_rotate = 1
        unlink_old = 1
        watchdog = 1
        max_filter_values = 10000
        persistent_connections_limit = 256
      }
Everything worked fine until worker-0 and worker-2 ran out of disk space. I added more space; worker-0 came back up and rejoined the cluster, but worker-2 refuses to join no matter what. I also deleted the PVC to start worker-2 empty, and the errors are the same:
Manticore 6.3.8 d17bd2b6b@24112202 (columnar 2.3.0 88a01c3@24052206) (secondary 2.3.0 88a01c3@24052206) (knn 2.3.0 88a01c3@24052206)
Mount success
2025-03-14 10:44:36,098 CRIT Supervisor is running as root. Privileges were not dropped because no user is specified in the config file. If you intend to run as root, you can set user=root in the config file to avoid this message.
2025-03-14 10:44:36,101 INFO RPC interface 'supervisor' initialized
2025-03-14 10:44:36,101 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2025-03-14 10:44:36,102 INFO supervisord started with pid 43
2025-03-14 10:44:37,104 INFO spawned: 'quorum_recover' with pid 44
2025-03-14 10:44:37,106 INFO spawned: 'searchd_replica' with pid 45
[2025-03-14T10:44:37.135800+00:00] Logs.INFO: Replication mode: multi-master [] []
[2025-03-14T10:44:37.162518+00:00] Logs.INFO: Pods count 2 [] []
[2025-03-14T10:44:37.162551+00:00] Logs.INFO: Empty conf with more than one node in cluster [] []
2025-03-14 10:44:37,278 INFO spawned: 'searchd' with pid 50
[Fri Mar 14 10:44:37.290 2025] [50] using config file '/etc/manticoresearch/manticore.conf' (866 chars)...
starting daemon version '6.3.8 d17bd2b6b@24112202 (columnar 2.3.0 88a01c3@24052206) (secondary 2.3.0 88a01c3@24052206) (knn 2.3.0 88a01c3@24052206)' ...
listening on UNIX socket /var/run/mysqld/mysqld.sock
listening on all interfaces for mysql, port=9306
listening on all interfaces for sphinx and http(s), port=9308
listening on all interfaces for VIP mysql, port=9301
listening on 10.2.249.19:9312 for sphinx and http(s)
prereading 0 tables
preread 0 tables in 0.000 sec
accepting connections
[BUDDY] started v2.3.12 '/usr/share/manticore/modules/manticore-buddy/bin/manticore-buddy --listen=http://0.0.0.0:9308 --bind=127.0.0.1 --threads=2 --skip=manticoresoftware/buddy-plugin-sharding --skip=manticoresoftware/buddy-plugin-queue' at http://127.0.0.1:46437
[BUDDY] Loaded plugins:
[BUDDY] core: empty-string, backup, emulate-elastic, create, insert, alias, select, show, cli-table, plugin, test, alter-distributed-table, alter-rename-table, modify-table, knn, replace
[BUDDY] local:
[BUDDY] extra:
2025-03-14 10:44:38,355 INFO success: searchd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-03-14 10:44:38,355 INFO success: quorum_recover entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-03-14 10:44:38,356 INFO success: searchd_replica entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
[2025-03-14T10:44:38.373358+00:00] Logs.INFO: Wait until manticoresearch-worker-2 came alive [] []
[2025-03-14T10:45:39.134675+00:00] Logs.INFO: Wait for NS... [] []
[2025-03-14T10:45:40.158843+00:00] Logs.INFO: Wait until join host come available ["manticoresearch-worker-1.manticoresearch-worker-replication-svc",9306] []
[2025-03-14T10:45:40.161707+00:00] Logs.INFO: Check is cluster exist at manticoresearch-worker-1.manticoresearch-worker-replication-svc [] []
[2025-03-14T10:45:40.163045+00:00] Logs.INFO: Join to manticoresearch-worker-1.manticoresearch-worker-replication-svc [] []
WARNING: No persistent state found. Bootstraping with default state
WARNING: (78aea525, 'tcp://0.0.0.0:9315') address 'tcp://10.2.249.19:9315' points to own listening address, blacklisting
FATAL: failed to open gcomm backend connection: 110: failed to reach primary view (pc.wait_prim_timeout): 110 (Connection timed out)
at /__w/manticoresearch/manticoresearch/build/galera-build/_deps/galera_populate-src/gcomm/src/pc.cpp:connect():159
FATAL: /__w/manticoresearch/manticoresearch/build/galera-build/_deps/galera_populate-src/gcs/src/gcs_core.cpp:gcs_core_open():209: Failed to open backend connection: -110 (Connection timed out)
FATAL: /__w/manticoresearch/manticoresearch/build/galera-build/_deps/galera_populate-src/gcs/src/gcs.cpp:gcs_open():1514: Failed to open channel 'usmall_cluster' at 'gcomm://manticoresearch-worker-1.manticoresearch-worker-replication-svc.manticoresearch.svc.cluster.local:9315,manticoresearch-worker-2.manticoresearch-worker-replication-svc.manticoresearch.svc.cluster.local:9315': -110 (Connection timed out)
FATAL: gcs connect failed: Connection timed out
/* Fri Mar 14 10:46:10.704 2025 conn 4 (127.0.0.1:50914) */ JOIN CLUSTER usmall_cluster at 'manticoresearch-worker-1.manticoresearch-worker-replication-svc:9312' # error=replication connection failed: 7 'error in node state, must reinit'
[2025-03-14T10:46:10.704503+00:00] Logs.ERROR: Exception until query processing. Query: JOIN CLUSTER usmall_cluster at 'manticoresearch-worker-1.manticoresearch-worker-replication-svc:9312' . Error: mysqli_sql_exception: replication connection failed: 7 'error in node state, must reinit' in /etc/manticoresearch/vendor/manticoresoftware/manticoresearch-auto-replication/src/Manticore/ManticoreMysqliFetcher.php:33 Stack trace: #0 /etc/manticoresearch/vendor/manticoresoftware/manticoresearch-auto-replication/src/Manticore/ManticoreMysqliFetcher.php(33): mysqli->query() #1 /etc/manticoresearch/vendor/manticoresoftware/manticoresearch-auto-replication/src/Manticore/ManticoreConnector.php(193): Core\Manticore\ManticoreMysqliFetcher->query() #2 /etc/manticoresearch/replica.php(219): Core\Manticore\ManticoreConnector->joinCluster() #3 {main} [] []
[2025-03-14T10:46:10.704694+00:00] Logs.ERROR: Error until query processing. Query: JOIN CLUSTER usmall_cluster at 'manticoresearch-worker-1.manticoresearch-worker-replication-svc:9312' . Error: replication connection failed: 7 'error in node state, must reinit' [] []
WARNING: No persistent state found. Bootstraping with default state
WARNING: (8b7ba2c9, 'tcp://0.0.0.0:9315') address 'tcp://10.2.249.19:9315' points to own listening address, blacklisting
FATAL: failed to open gcomm backend connection: 110: failed to reach primary view (pc.wait_prim_timeout): 110 (Connection timed out)
at /__w/manticoresearch/manticoresearch/build/galera-build/_deps/galera_populate-src/gcomm/src/pc.cpp:connect():159
FATAL: /__w/manticoresearch/manticoresearch/build/galera-build/_deps/galera_populate-src/gcs/src/gcs_core.cpp:gcs_core_open():209: Failed to open backend connection: -110 (Connection timed out)
FATAL: /__w/manticoresearch/manticoresearch/build/galera-build/_deps/galera_populate-src/gcs/src/gcs.cpp:gcs_open():1514: Failed to open channel 'usmall_cluster' at 'gcomm://manticoresearch-worker-1.manticoresearch-worker-replication-svc.manticoresearch.svc.cluster.local:9315,manticoresearch-worker-2.manticoresearch-worker-replication-svc.manticoresearch.svc.cluster.local:9315': -110 (Connection timed out)
FATAL: gcs connect failed: Connection timed out
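The FATAL lines above are timeouts against the addresses in the gcomm:// URL, so one basic check is whether those addresses accept TCP connections from inside the worker-2 pod. Below is a minimal sketch of such a check. It assumes Python 3 is available in the pod, which may not be true for this image. The helper names parse_gcomm, can_connect and check_peers are hypothetical and are not part of Manticore or Galera. The second address in the URL is worker-2 itself, which the log shows being blacklisted as its own listening address, so the result that matters is the one for worker-1.

```python
import socket

# gcomm URL copied from the "Failed to open channel" log line above.
GCOMM = (
    "gcomm://manticoresearch-worker-1.manticoresearch-worker-replication-svc"
    ".manticoresearch.svc.cluster.local:9315,"
    "manticoresearch-worker-2.manticoresearch-worker-replication-svc"
    ".manticoresearch.svc.cluster.local:9315"
)


def parse_gcomm(url):
    """Split 'gcomm://host1:port,host2:port' into [(host, port), ...]."""
    prefix = "gcomm://"
    if not url.startswith(prefix):
        raise ValueError(f"not a gcomm URL: {url!r}")
    peers = []
    for item in url[len(prefix):].split(","):
        item = item.strip()
        if not item:
            continue
        host, sep, port = item.rpartition(":")
        if not sep or not host:
            raise ValueError(f"expected host:port, got {item!r}")
        peers.append((host, int(port)))
    return peers


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


def check_peers(url, timeout=3.0):
    """Map each 'host:port' in the gcomm URL to its reachability."""
    return {
        f"{host}:{port}": can_connect(host, port, timeout)
        for host, port in parse_gcomm(url)
    }
```

Running print(check_peers(GCOMM)) inside the worker-2 pod would show which peers accept connections on 9315. A False for worker-1 would point at DNS or network reachability rather than at node state. A True only shows that the port is open; it does not show that worker-1 is in a primary state.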