FATAL: out of memory (unable to allocate 9007199254740992 bytes)

Приветствую!

Версия Мантикоры 3.6.0. Запись русского текста (размер записей от 2 до 500 КБ, общим объёмом менее 500 МБ) приводит к раздуванию searchd.exe на несколько десятков гигабайт (на 32 ГБ ОЗУ и с уходом в подкачку), после чего обычно Мантикора либо начинает тормозить, либо подвисает, либо не даёт подключаться - по разному, но начинается всегда с взрывного захвата всей памяти и строчки в логе

FATAL: out of memory (unable to allocate 9007199254740992 bytes)

при этом число байт как правило это или ещё бывает 137438953472.
При попытке остановить сервис после этого - Мантикора обычно крашится и создаёт минидамп, а иногда не стопится и приходится её убивать.

Касательно конфигурации - у меня два индекса - полнотекстовый для записи текста и номерной для записи только чисел, FATAL из сабжа бывает и при записи только текста, но чаще - при записи и текста и номеров (т.е. в два индекса). Плюс это поведение чаще проявляется при записи в несколько потоков, но также случается и при записи одним потоком.

Посмотрел похожий тред

  • но таких параметров, которые вызывали проблему там - у меня в конфиге нет, также видел похожий тред
    FATAL: out of memory
  • и по реккомендации оттуда сделал запись лога с --logdebug

Лог:

[Thu Jun 17 22:01:42.484 2021] [1588] DEBUG: StartGlobalWorkpool
[Thu Jun 17 22:01:42.487 2021] [1588] starting daemon version '3.6.0 96d61d8bf@210504 release' ...
[Thu Jun 17 22:01:42.488 2021] [1588] listening on 127.0.0.1:9312 for sphinx and http(s)
[Thu Jun 17 22:01:42.489 2021] [1588] listening on all interfaces for mysql, port=9306
[Thu Jun 17 22:01:42.491 2021] [1588] listening on 127.0.0.1:9308 for sphinx and http(s)
[Thu Jun 17 22:01:42.492 2021] [1588] DEBUG: 'rt_flush_period' - nothing specified, using default value 36000000000
[Thu Jun 17 22:01:42.494 2021] [6100] DEBUG: 'embedded_limit' - nothing specified, using default value 16384
[Thu Jun 17 22:01:42.494 2021] [6100] DEBUG: 'attr_update_reserve' - nothing specified, using default value 131072
[Thu Jun 17 22:01:42.496 2021] [6100] DEBUG: 'docstore_block_size' - nothing specified, using default value 16384
[Thu Jun 17 22:01:42.688 2021] [6100] DEBUG: 'embedded_limit' - nothing specified, using default value 16384
[Thu Jun 17 22:01:42.689 2021] [6100] DEBUG: 'attr_update_reserve' - nothing specified, using default value 131072
[Thu Jun 17 22:01:42.689 2021] [6100] DEBUG: 'docstore_block_size' - nothing specified, using default value 16384
[Thu Jun 17 22:01:42.703 2021] [1588] DEBUG: expression stack for creation 0
[Thu Jun 17 22:01:42.706 2021] [1588] DEBUG: expression stack for eval/deletion 80
[Thu Jun 17 22:01:42.740 2021] [1588] DEBUG: filter stack delta 240
[Thu Jun 17 22:01:42.741 2021] [1588] DEBUG: 'binlog_max_log_size' - nothing specified, using default value 268435456
[Thu Jun 17 22:01:42.748 2021] [1588] DEBUG: MAC address 00:25:90:e8:38:7f for uuid-short server_id
[Thu Jun 17 22:01:42.748 2021] [1588] DEBUG: uid-short server_id 26, started 67287702, seed 1874626345296723968
[Thu Jun 17 22:01:42.751 2021] [1588] DEBUG: SaveMeta: Done.
[Thu Jun 17 22:01:42.752 2021] [4472] prereading 2 indexes
[Thu Jun 17 22:01:42.752 2021] [4472] DEBUG: prereading index 'main_index'
[Thu Jun 17 22:01:42.752 2021] [1588] accepting connections
[Thu Jun 17 22:01:42.753 2021] [4472] DEBUG: prereaded index 'main_index' in 0.001 sec
[Thu Jun 17 22:01:42.755 2021] [4472] DEBUG: prereading index 'numeric_index'
[Thu Jun 17 22:01:42.756 2021] [4472] DEBUG: prereaded index 'numeric_index' in 0.001 sec
[Thu Jun 17 22:01:42.756 2021] [4472] prereaded 2 indexes in 0.005 sec
[Thu Jun 17 22:01:51.849 2021] [4216] DEBUG: RemoveEvent()
[Thu Jun 17 22:01:55.855 2021] [1588] DEBUG: last message repeated 30 times
[Thu Jun 17 22:01:59.367 2021] [4216] DEBUG: RemoveEvent()
[Thu Jun 17 22:02:01.400 2021] [1588] DEBUG: last message repeated 10 times
[Thu Jun 17 22:02:12.943 2021] [4216] DEBUG: RemoveEvent()
[Thu Jun 17 22:02:16.917 2021] [1588] DEBUG: last message repeated 29 times
[Thu Jun 17 22:02:38.633 2021] [4216] DEBUG: RemoveEvent()
[Thu Jun 17 22:02:38.634 2021] [2108] DEBUG: AsyncNetInputBuffer_c::AppendData: error 0 (No error) return -1
[Thu Jun 17 22:02:38.634 2021] [4216] DEBUG: RemoveEvent()
[Thu Jun 17 22:02:38.635 2021] [2184] DEBUG: last message repeated 1 times
[Thu Jun 17 22:02:38.635 2021] [5024] DEBUG: last message repeated 1 times
[Thu Jun 17 22:02:38.635 2021] [2184] DEBUG: AsyncNetInputBuffer_c::AppendData: error 0 (No error) return -1
[Thu Jun 17 22:02:38.635 2021] [4216] DEBUG: RemoveEvent()
[Thu Jun 17 22:02:38.635 2021] [5024] DEBUG: AsyncNetInputBuffer_c::AppendData: error 0 (No error) return -1
[Thu Jun 17 22:02:38.637 2021] [4216] DEBUG: RemoveEvent()
[Thu Jun 17 22:02:38.637 2021] [5072] DEBUG: AsyncNetInputBuffer_c::AppendData: error 0 (No error) return -1
[Thu Jun 17 22:02:38.638 2021] [4216] DEBUG: RemoveEvent()
[Thu Jun 17 22:02:38.638 2021] [1188] DEBUG: AsyncNetInputBuffer_c::AppendData: error 0 (No error) return -1
[Thu Jun 17 22:02:38.638 2021] [4216] DEBUG: RemoveEvent()
[Thu Jun 17 22:02:38.638 2021] [4876] DEBUG: AsyncNetInputBuffer_c::AppendData: error 0 (No error) return -1
[Thu Jun 17 22:02:39.707 2021] [1588] DEBUG: last message repeated 1 times
[Thu Jun 17 22:02:40.406 2021] [4876] FATAL: out of memory (unable to allocate 137438953472 bytes)
[Thu Jun 17 22:03:31.101 2021] [1588] caught SIGTERM, shutting down
[Thu Jun 17 22:03:31.186 2021] [1588] DEBUG: StopNetLoop()
[Thu Jun 17 22:03:31.190 2021] [1588] DEBUG: TerminateSessions() (000000081A5AEF20) invoked
[Thu Jun 17 22:03:31.192 2021] [1588] DEBUG: StopNetLoop() succeeded
[Thu Jun 17 22:03:31.200 2021] [1588] DEBUG: ReplicateClustersDelete invoked
[Thu Jun 17 22:03:34.111 2021] [1588] WARNING: still 10 alive tasks during shutdown, after 3.005 sec
[Thu Jun 17 22:03:34.143 2021] [1588] DEBUG: SaveMeta: Done.
[Thu Jun 17 22:03:34.144 2021] [1588] rt: index numeric_index: ramchunk saved in 0.029 sec
------- FATAL: CRASH DUMP -------
[Thu Jun 17 22:03:31.101 2021] [ 2924]
[Thu Jun 17 22:03:34.439 2021] [1588] WARNING: GlobalCrashQueryGetRef: thread-local info is not set! Use ad-hoc

--- crashed invalid query ---

--- request dump end ---
--- local index:
Manticore 3.6.0 96d61d8bf@210504 release
minidump located at:  C:/Manticore/log/searchd.pid.2924.0000000000000000.mdmp
--- active threads ---
--- Totally 0 threads, and 0 client-working threads ---
------- CRASH DUMP END -------

Конфиг:

indexer
{
	mem_limit		= 512M
	lemmatizer_cache 	= 512M
}

common {
    plugin_dir = C:/Manticore/usr/local/manticore/lib
    lemmatizer_base = C:/Manticore/usr/dicts/
}

searchd {
    listen = 127.0.0.1:9312
    listen = 9306:mysql41
    listen = 127.0.0.1:9308:http
    log = C:/Manticore/log/searchd.log
    query_log = C:/Manticore/log/query.log
    binlog_path = C:/Manticore/binlog
    pid_file = C:/Manticore/log/searchd.pid
    query_log_format = sphinxql

	seamless_rotate		= 1
	preopen_indexes		= 1
	unlink_old		= 1
	workers			= threads 
	
	thread_stack 		= 2M 
	
	mysql_version_string 	= 5.0.37

}

index main_index
{
	path                     = C:/Manticore/main_index/main
	rt_mem_limit             = 2048m
	type                     = rt
	morphology               = lemmatize_ru_all, lemmatize_en, lemmatize_de
	charset_table            = non_cjk
	stored_fields 		 = content  
	
	rt_field		= content
	rt_attr_string		= guid
	rt_attr_uint		= rubric_id
	rt_attr_string		= fileName
	rt_attr_uint		= size
	rt_attr_uint		= format
	rt_attr_string		= suffix
	rt_attr_timestamp	= dt_inserted
	rt_attr_string		= identifier
	rt_attr_string		= source
	rt_attr_string		= root
	rt_attr_string		=  sreserve_1
	rt_attr_string      	=  sreserve_2
	rt_attr_string      	=  sreserve_3
	rt_attr_uint		=  ireserve_1
	rt_attr_uint		=  ireserve_2
	rt_attr_uint      	=  ireserve_3
	rt_attr_timestamp	=  dtreserve_1
	rt_attr_timestamp	=  dtreserve_2
	rt_attr_timestamp   	=  dtreserve_3
}

index numeric_index
{
	path                     = C:/Manticore/numeric_index/numeric

	rt_mem_limit             = 2048m
	type                     = rt
	morphology               = none
	charset_table            = non_cjk
	stored_fields 		 = number
	
	rt_field		= number
	
	rt_attr_string		= record_guid
	rt_attr_string		= original_number
	rt_attr_uint		= rubric_id
	rt_attr_string		= fileName
	rt_attr_uint		= size
	rt_attr_uint		= format
	rt_attr_string		= suffix
	rt_attr_timestamp 	= dt_inserted
	rt_attr_string		= identifier
	rt_attr_string		= source
	rt_attr_string		= root
	rt_attr_string		=  sreserve_1
	rt_attr_string      	=  sreserve_2
	rt_attr_string      	=  sreserve_3
	rt_attr_uint		=  ireserve_1
	rt_attr_uint		=  ireserve_2
	rt_attr_uint      	=  ireserve_3
	rt_attr_timestamp	=  dtreserve_1
	rt_attr_timestamp	=  dtreserve_2
	rt_attr_timestamp   	=  dtreserve_3
}

В чём может быть причина?

общим объёмом менее 500 МБ) приводит к раздуванию searchd.exe на несколько десятков гигабайт

Похоже на мемлик.

FATAL: out of memory (unable to allocate 9007199254740992 bytes)

Похоже на баг, т.к. сильно уж много bytes.

Можете завести issue на гитхабе и предоставить воспроизводимый кейс? Можете заслать на наш ftp - https://mnt.cr/ftp

Кстати, а в линуксе воспроизводится?

1 Like