Hi everyone,
I’m running Manticore in a Docker container, and when I load more than 700k records the container shuts down (Exited (137)) because it uses up the 8 GB of RAM I allocated to it. The records are news articles, so the “Cuerpo” field contains texts of 20k characters, and the two JSON fields each contain an array of objects with no more than 3 or 4 properties that store integer values.
My goal is to load 100M records and keep adding more every day. I understand that I will need more RAM for this, but how much should I estimate? Could anything in my table definition be improved?
I’m using the image pulled with: docker pull manticoresearch/manticore
I’m using an RT (real-time) table, which I created as follows from a C# application:
create table noticias
(
NoticiaId bigint,
Titulo text engine='columnar',
Cuerpo text indexed engine='columnar',
FechaAlta timestamp,
FechaPublicacion timestamp,
Audiencia int,
TierId int,
TipoMedioId int,
Ave float,
SoporteId int,
DivisionId int,
PaisId int,
Empresas json,
Temas json,
embed_vector float_vector knn_type='hnsw' knn_dims='4096' hnsw_similarity='cosine' engine='columnar'
)
morphology = 'libstemmer_es'
stopwords = 'es'
min_word_len = '4'
html_strip = '1'
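
To frame the sizing question, here is a rough back-of-the-envelope sketch in Python of the raw storage for embed_vector alone. It assumes each of the 4096 components is stored as a 4-byte float, and it ignores the HNSW graph, the text fields, the JSON attributes and any engine overhead, so the real footprint may well differ. The function names are hypothetical, and I don't know how much of this data Manticore actually needs to keep resident in RAM.

```python
# Rough estimate of raw float_vector storage only.
# Assumption: 4 bytes per component (32-bit float); index overhead is not included.

BYTES_PER_COMPONENT = 4  # assumed size of one float component
GIB = 1024 ** 3


def raw_vector_bytes(num_records: int, dims: int) -> int:
    """Bytes needed to store one dense vector of `dims` floats per record."""
    return num_records * dims * BYTES_PER_COMPONENT


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return num_bytes / GIB


if __name__ == "__main__":
    dims = 4096  # knn_dims from the table definition above
    for records in (700_000, 100_000_000):
        size = to_gib(raw_vector_bytes(records, dims))
        print(f"{records:>11,} records: {size:,.2f} GiB of raw vector data")
```

Under these assumptions, the vectors for 700k records already come to roughly 10.7 GiB, and 100M records would come to roughly 1.5 TiB, before counting anything else in the table.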