Manticore memory capping on Kubernetes

JeffPY · December 30, 2020, 2:57pm

Dear all,

I am trying to get my Manticore cluster to work properly on Kubernetes, but I am running into memory usage issues that lead to the pod being systematically evicted or stuck in CrashLoopBackOff.

I have two nodes with 4 CPUs and 16 Gi of RAM each, and each node runs one Manticore container. I have set Requests and Limits for each of them at 80% of full capacity, but my pods keep getting OOMKilled (Out Of Memory) by Kubernetes.

Containers:
  manticore:
    Container ID:   docker://2c71c25298154b09ecb00****6c58d1103096d0fc42732aa316516ec82a9
    Image:          *****/manticore:7e740fe43b50
    Image ID:       docker-pullable://*****/manticore@sha256:c38b116***38cf383fa583c8a76705c9ac2dd417851649df608134ee2cf8a2
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Wed, 30 Dec 2020 13:06:58 +0100
      Finished:     Wed, 30 Dec 2020 13:11:18 +0100
    Ready:          False
    Restart Count:  8
    Limits:
      cpu:     3700m
      memory:  6000Mi
    Requests:
      cpu:        50m
      memory:     20Mi
    Environment:  <none>
    Mounts:
      /var/lib/manticore from manticore-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-csf8n (ro)
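
To compare a limit against the node size, the arithmetic can be sketched in Python. This is only an illustration: the helper names are hypothetical, it handles just the binary suffixes seen in this thread, and it uses the 16Gi node figure quoted above rather than the node's actual allocatable memory, which is normally somewhat lower.

```python
# Binary suffixes used by Kubernetes memory quantities, expressed in bytes.
UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def to_bytes(quantity):
    """Convert a quantity such as '6000Mi' or '16Gi' to bytes."""
    for suffix, factor in UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: already plain bytes


def fraction_of_node(limit, node_memory):
    """Share of the node's memory that the given limit represents."""
    return to_bytes(limit) / to_bytes(node_memory)


def limit_for_fraction(node_memory, fraction):
    """Memory limit in Mi (rounded down) covering `fraction` of the node."""
    mebibytes = int(to_bytes(node_memory) * fraction) // UNITS["Mi"]
    return f"{mebibytes}Mi"


# The limit shown in the describe output, relative to a 16Gi node.
print(round(fraction_of_node("6000Mi", "16Gi"), 3))
# What 80% of a 16Gi node would be, expressed in Mi.
print(limit_for_fraction("16Gi", 0.8))
```

With the values from the output above, the 6000Mi limit works out to well under 80% of a 16Gi node, so it may be worth double-checking which manifest was actually applied.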

I am using a RealTime index, and each time I run “REPLACE INTO” queries, memory usage keeps increasing. I have been monitoring it with this command:

kubectl top pod manticore-6446***6c-6bcd8

NAME                        CPU(cores)   MEMORY(bytes)
manticore-64466f86c-6bcd8   1m           3429Mi

CPU usage stays low, but memory keeps increasing, and when it reaches the pod’s limit, the pod gets killed, again and again. Is there a way to “cap” memory usage or to flush automatically to disk before the pod gets destroyed?

Sergey · December 30, 2020, 4:23pm

What’s your rt_mem_limit? See the Manticore Search Manual.

JeffPY · December 30, 2020, 5:11pm

It is not set, so it should default to 256MB.

Here is the output of “kubectl top” sampled over about 30 minutes of fairly intensive “REPLACE INTO” queries. I’ve upgraded the servers to 32 GB to see if it helps.

kubectl top pod manticore-5467cb5c74-4j7nz
NAME                         CPU(cores)   MEMORY(bytes)
manticore-5467cb5c74-4j7nz   1m           276Mi
kubectl top pod manticore-5467cb5c74-4j7nz
NAME                         CPU(cores)   MEMORY(bytes)
manticore-5467cb5c74-4j7nz   21m          7925Mi
kubectl top pod manticore-5467cb5c74-4j7nz
NAME                         CPU(cores)   MEMORY(bytes)
manticore-5467cb5c74-4j7nz   18m          8677Mi
kubectl top pod manticore-5467cb5c74-4j7nz
NAME                         CPU(cores)   MEMORY(bytes)
manticore-5467cb5c74-4j7nz   31m          11463Mi
kubectl top pod manticore-5467cb5c74-4j7nz
NAME                         CPU(cores)   MEMORY(bytes)
manticore-5467cb5c74-4j7nz   27m          13238Mi
kubectl top pod manticore-5467cb5c74-4j7nz
NAME                         CPU(cores)   MEMORY(bytes)
manticore-5467cb5c74-4j7nz   1m           16226Mi
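
To turn pasted readings like these into numbers, a small parser can help. The following is a rough sketch, not a kubectl feature: the function names are hypothetical, and it assumes the column layout shown above with CPU in millicores (`m`) and memory in `Mi`.

```python
import re

# Matches a data row of `kubectl top pod`, for example:
# "manticore-5467cb5c74-4j7nz   21m          7925Mi"
ROW = re.compile(r"^(\S+)\s+(\d+)m\s+(\d+)Mi$")


def memory_readings(pasted):
    """Return the memory readings (in Mi) found in pasted `kubectl top pod` output."""
    readings = []
    for line in pasted.splitlines():
        match = ROW.match(line.strip())
        if match:  # command lines and the NAME header do not match
            readings.append(int(match.group(3)))
    return readings


def growth_steps(readings):
    """Difference in Mi between each reading and the one before it."""
    return [later - earlier for earlier, later in zip(readings, readings[1:])]


sample = """\
kubectl top pod manticore-5467cb5c74-4j7nz
NAME                         CPU(cores)   MEMORY(bytes)
manticore-5467cb5c74-4j7nz   1m           276Mi
kubectl top pod manticore-5467cb5c74-4j7nz
NAME                         CPU(cores)   MEMORY(bytes)
manticore-5467cb5c74-4j7nz   21m          7925Mi
kubectl top pod manticore-5467cb5c74-4j7nz
NAME                         CPU(cores)   MEMORY(bytes)
manticore-5467cb5c74-4j7nz   18m          8677Mi
"""

readings = memory_readings(sample)
print(readings)                # memory per sample, in Mi
print(growth_steps(readings))  # change between consecutive samples
```

Since the samples here were taken by hand at irregular intervals, the steps only show the direction of change, not a rate.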

JeffPY · December 30, 2020, 5:21pm

After 30 minutes, memory usage peaked at around 16Gi and then began to drop.

kubectl top pod manticore-5467cb5c74-4j7nz
NAME                         CPU(cores)   MEMORY(bytes)
manticore-5467cb5c74-4j7nz   2m           9892Mi
kubectl top pod manticore-5467cb5c74-4j7nz
NAME                         CPU(cores)   MEMORY(bytes)
manticore-5467cb5c74-4j7nz   5m           8683Mi

But it went up again and finally crashed…

kubectl top pod manticore-5467cb5c74-4j7nz
NAME                         CPU(cores)   MEMORY(bytes)
manticore-5467cb5c74-4j7nz   77m          11814Mi
kubectl top pod manticore-5467cb5c74-4j7nz
NAME                         CPU(cores)   MEMORY(bytes)
manticore-5467cb5c74-4j7nz   1m           20172Mi
kubectl top pod manticore-5467cb5c74-4j7nz
NAME                         CPU(cores)   MEMORY(bytes)
manticore-5467cb5c74-4j7nz   110m         26104Mi

Luckily, I had attached a PersistentVolumeClaim, so it restarts from where it left off, but each restart causes downtime during which the index can’t be updated.

So it would probably be good to be able to cap global memory usage to avoid this kind of issue.

JeffPY · December 30, 2020, 6:26pm

In case it helps: my index files on disk are only 243 MB, and manticoresearch is the only container running.

Here is my config file:

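#!/bin/sh
# Look up this pod's IP address so the listeners below can bind to it.
ip=$(hostname -i)
# Everything up to EOF is printed as the generated config.
cat << EOF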

searchd
{
        listen                  = 9306:mysql41
        listen                  = $ip:9312
        listen                  = $ip:9315-9325:replication
        log                     = /dev/stdout
        query_log               = /var/log/manticore/searchd.log
        pid_file                = /var/run/manticore/searchd.pid
        preopen_indexes         = 0
        binlog_path             = /var/lib/manticore
        data_dir                = /var/lib/manticore
        collation_server        = utf8_general_ci
        max_packet_size         = 128M
        max_open_files          = max
        rt_mem_limit            = 8192M

}
EOF

JeffPY · December 31, 2020, 10:55am

I think I found the issue:

preopen_indexes was in fact set to 1, because my latest config changes had not been synced.

Setting preopen_indexes to 0 seems to stop the server’s memory usage from constantly growing!
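
Since the root cause was a config that did not match what I thought was deployed, a quick check of the rendered config text can catch this kind of drift. Below is a minimal sketch with a hypothetical helper, `searchd_settings`. It is a simplified reader for the plain `key = value` layout used in my script, not Manticore's own config parser, and it assumes the opening brace sits on its own line as in the config above.

```python
def searchd_settings(config_text):
    """Collect `key = value` pairs from the `searchd { ... }` block.

    Every key maps to a list, because keys such as `listen` can repeat.
    """
    settings = {}
    in_searchd = False
    previous = ""
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        if line == "{":
            # The block belongs to searchd only if that was the previous line.
            in_searchd = previous == "searchd"
        elif line == "}":
            in_searchd = False
        elif in_searchd and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            settings.setdefault(key, []).append(value)
        previous = line
    return settings


# Example input: a trimmed copy of the config above, with the value that
# was really in effect (preopen_indexes = 1).
rendered = """
searchd
{
        listen                  = 9306:mysql41
        preopen_indexes         = 1
        max_packet_size         = 128M
}
"""

settings = searchd_settings(rendered)
expected = "0"
actual = settings.get("preopen_indexes", ["<unset>"])[-1]
if actual != expected:
    print(f"preopen_indexes is {actual}, expected {expected}")
```

Feeding it the text the config script actually prints inside the running container, rather than the copy in the repository, is what would have revealed the mismatch here.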