Запрос с ORDER BY и WHERE по id падает с OOM

Доброго времени суток.

Есть Manticore:

Server version: 7.0.0 92c650401@25013002 (columnar 4.0.0 5aa8e43@25012409) (secondary 4.0.0 5aa8e43@25012409) (knn 4.0.0 5aa8e43@250124
09) git branch manticore-7.0.0...origin/manticore-7.0.0

Сервера VM:
manticore-libra-stage-proxy - Manticore proxy с настроенным agents
manticore-libra-stage-01 и manticore-libra-stage-02 - поисковые сервера с plain-индексами
manticore-libra-stage-indexer - на сервере выполняется индексация с indexer и последующая синхронизация на поисковые node

manticore-libra-stage-01 и manticore-libra-stage-02:
На поисковых node (VM) CPU: 4, RAM: 20G (из них под хранение индексов выделено 14G в /dev/shm/, остается чуть менее 6G на ОС и сам Manticore).

Есть индекс poducts, который сконфигурирован по схеме main + delta.

конфигурация:

Summary
source products_main
{
    type = pgsql
    sql_host = XXX
    sql_user = XXX
    sql_pass = XXX
    sql_db = XXX
    sql_port = XXX
    sql_query_pre = \
    INSERT INTO manticore_indexer (index_name, created_at) \
      VALUES ('products_main', NOW()) \
      ON CONFLICT (index_name) DO UPDATE SET created_at = EXCLUDED.created_at

    sql_query = \
    SELECT id, name, seo_slug as slug, main_image as image, price_retail, marketing_statuses as marketing_status_list, \
      availability_status, availability_quantity, vendor_id, vat as nds, description, selling_text, \
      reasons_to_buy as reasons_to_buy_list, literature_work_publishing_year, EXTRACT(EPOCH FROM preorder_available_at) AS preorder_available_at, EXTRACT(EPOCH FROM released_at) AS released_at, \
      EXTRACT(EPOCH FROM created_at) AS created_at, EXTRACT(EPOCH FROM updated_at) AS updated_at, isbns, printing_page_count, printing_page_format, printing_copy_count, weight, height, width, length, \
      excerpts, additional_images, rating_average, rating_weight, rating_star, rating_count, review_count, \
      sales_stat_half_year_day_avg_quantity as purchase_stats_day_avg_count, \
      sales_stat_year_turnover_amount_without_vat, sales_stat_year_turnover_quantity, sales_stat_year_markup, \
      sales_stat_half_year_turnover_amount_without_vat, sales_stat_half_year_turnover_quantity, sales_stat_half_year_markup, \
      sales_stat_quarter_turnover_amount_without_vat, sales_stat_quarter_turnover_quantity, sales_stat_quarter_markup, \
      sales_stat_month_turnover_amount_without_vat, sales_stat_month_turnover_quantity, sales_stat_month_markup, \
      school_grade_id_list, school_subject_id_list, school_material_type_id_list, school_exam_id_list, \
      school_education_system_id_list, school_umk_id_list, school_umk_title_list as school_umk_title, school_exam_year_id_list, \
      school_purpose_id_list, seo_title, seo_description, main_category_id_list, category_id_list, tbk_id_list, ekn_id_list, \
      author_id_list, translator_id_list, illustrator_id_list, publisher_series_id, publisher_id, publisher_brand_id, \
      manufacturer_brand_id, literature_work_cycle_id, literature_work_cycle_volume_id, age_restriction, \
      printing_binding_id as binding_id, tag_id_list, product_set_id_list, author_full_name_list, publisher_series_name, \
      publisher_name, publisher_brand_name, manufacturer_brand_name, product_type_id, article_number_id, \
      stationery_brush_shape_id_list, stationery_brush_material_id_list, stationery_brush_number_id_list, \
      stationery_painting_technique_id_list, stationery_lead_hardness_id_list, stationery_format_id_list, \
      stationery_line_type_id_list, stationery_ink_color_id_list, stationery_lead_diameter_id_list, \
      stationery_case_shape_id_list, stationery_mechanism_type_id_list, stationery_diameter_id_list, \
      stationery_feature_id_list, stationery_colors_quantity_id_list, stationery_gender_id_list, stationery_length_id_list, \
      stationery_staple_number_id_list, stationery_stapler_number_id_list, stationery_material_id_list, \
      stationery_punched_sheets_quantity_id_list, stationery_pen_thickness_id_list, stationery_mounting_type_id_list, \
      stationery_pen_tip_shape_id_list, stationery_ink_base_id_list, \
      stationery_calculator_capacity_id_list, stationery_calendar_year_id_list, stationery_calendar_type_id_list, \
      stationery_calendar_subject_id_list, stationery_clasp_type_id_list, stationery_compartments_quantity_id_list, \
      stationery_cover_binding_id_list, stationery_package_type_id_list, stationery_cover_surface_id_list, \
      stationery_universal_id_list, stationery_sheets_quantity_id_list, stationery_color_id_list, stationery_volume_id_list, \
      comic_universal_id_list, comic_character_id_list, comic_genre_id_list, comic_series_id_list, comic_type_id_list, \
      comic_line_id_list, comic_section_id_list, comic_subject_id_list, game_player_quantity_id_list, \
      game_use_case_id_list, game_skill_id_list, game_audience_id_list, game_child_age_id_list, game_series_id_list, \
      game_duration_id_list, constructor_detail_quantity_id_list, constructor_nation_id_list, \
      constructor_equipment_type_id_list, souvenir_reason_id_list, souvenir_format_id_list, souvenir_set_quantity_id_list, \
      toy_type_id_list, toy_height_id_list, gift_hobby_id_list, gift_for_children_idea_id_list as gift_for_children_id_list, \
      gift_for_new_year_idea_id_list as gift_new_year_id_list, gift_section_id_list, gift_books_on_interest_id_list, \
      product_collection_id_list, active_shops_id_list, active_cities_id_list, active_shop_brands_id_list, all_shops_id_list, \
      all_cities_id_list, all_shop_brands_id_list, videos, attended_foreign_agents, \
      EXTRACT(EPOCH FROM synchronized_at) AS synchronized_at \
    FROM product \
    WHERE is_removed = 'f' AND EXTRACT(EPOCH FROM synchronized_at) >= $start AND EXTRACT(EPOCH FROM synchronized_at) <= $end
    sql_query_range = \
    SELECT (SELECT EXTRACT(EPOCH FROM MIN(synchronized_at)) FROM product) min, \
      (SELECT EXTRACT(EPOCH FROM created_at)-1 FROM manticore_indexer WHERE index_name='products_main') max

    sql_range_step = 1000
    sql_query_post_index = \
    INSERT INTO manticore_indexer (index_name, created_at) \
      VALUES ('products_delta', (SELECT created_at FROM manticore_indexer WHERE index_name='products_main')) \
      ON CONFLICT (index_name) DO UPDATE SET created_at = EXCLUDED.created_at

    # attributes
    sql_attr_string = name
    sql_attr_string = slug
    sql_attr_string = description
    sql_attr_string = selling_text
    sql_attr_string = image
    sql_attr_string = additional_images
    sql_attr_string = seo_title
    sql_attr_string = seo_description
    sql_attr_string = isbns
    sql_attr_string = printing_page_format
    sql_field_string = school_umk_title
    sql_field_string = author_full_name_list
    sql_field_string = publisher_series_name
    sql_field_string = publisher_name
    sql_field_string = publisher_brand_name
    sql_field_string = manufacturer_brand_name
    sql_attr_float = price_retail
    sql_attr_float = rating_average
    sql_attr_float = sales_stat_year_turnover_amount_without_vat
    sql_attr_float = sales_stat_year_markup
    sql_attr_float = sales_stat_half_year_turnover_amount_without_vat
    sql_attr_float = sales_stat_half_year_markup
    sql_attr_float = sales_stat_quarter_turnover_amount_without_vat
    sql_attr_float = sales_stat_quarter_markup
    sql_attr_float = sales_stat_month_turnover_amount_without_vat
    sql_attr_float = sales_stat_month_markup
    sql_attr_uint = availability_status
    sql_attr_uint = availability_quantity
    sql_attr_uint = vendor_id
    sql_attr_uint = nds
    sql_attr_uint = literature_work_publishing_year
    sql_attr_uint = printing_page_count
    sql_attr_uint = printing_copy_count
    sql_attr_uint = weight
    sql_attr_uint = height
    sql_attr_uint = width
    sql_attr_uint = length
    sql_attr_uint = rating_weight
    sql_attr_uint = rating_star
    sql_attr_uint = rating_count
    sql_attr_uint = review_count
    sql_attr_uint = purchase_stats_day_avg_count
    sql_attr_uint = sales_stat_year_turnover_quantity
    sql_attr_uint = sales_stat_half_year_turnover_quantity
    sql_attr_uint = sales_stat_quarter_turnover_quantity
    sql_attr_uint = sales_stat_month_turnover_quantity
    sql_attr_uint = publisher_series_id
    sql_attr_uint = publisher_id
    sql_attr_uint = publisher_brand_id
    sql_attr_uint = manufacturer_brand_id
    sql_attr_uint = literature_work_cycle_id
    sql_attr_uint = literature_work_cycle_volume_id
    sql_attr_uint = binding_id
    sql_attr_uint = product_type_id
    sql_attr_uint = article_number_id
    sql_attr_bigint = age_restriction
    sql_attr_timestamp = preorder_available_at
    sql_attr_timestamp = released_at
    sql_attr_timestamp = created_at
    sql_attr_timestamp = updated_at
    sql_attr_timestamp = synchronized_at
    sql_attr_json = reasons_to_buy_list
    sql_attr_json = excerpts
    sql_attr_json = videos
    sql_attr_json = attended_foreign_agents
    sql_attr_multi = uint marketing_status_list from field
    sql_attr_multi = uint school_grade_id_list from field
    sql_attr_multi = uint school_subject_id_list from field
    sql_attr_multi = uint school_material_type_id_list from field
    sql_attr_multi = uint school_exam_id_list from field
    sql_attr_multi = uint school_education_system_id_list from field
    sql_attr_multi = uint school_umk_id_list from field
    sql_attr_multi = uint school_exam_year_id_list from field
    sql_attr_multi = uint school_purpose_id_list from field
    sql_attr_multi = uint main_category_id_list from field
    sql_attr_multi = uint category_id_list from field
    sql_attr_multi = uint tbk_id_list from field
    sql_attr_multi = uint ekn_id_list from field
    sql_attr_multi = uint author_id_list from field
    sql_attr_multi = uint translator_id_list from field
    sql_attr_multi = uint illustrator_id_list from field
    sql_attr_multi = uint tag_id_list from field
    sql_attr_multi = uint product_set_id_list from field
    sql_attr_multi = uint stationery_brush_shape_id_list from field
    sql_attr_multi = uint stationery_brush_material_id_list from field
    sql_attr_multi = uint stationery_brush_number_id_list from field
    sql_attr_multi = uint stationery_painting_technique_id_list from field
    sql_attr_multi = uint stationery_lead_hardness_id_list from field
    sql_attr_multi = uint stationery_format_id_list from field
    sql_attr_multi = uint stationery_line_type_id_list from field
    sql_attr_multi = uint stationery_ink_color_id_list from field
    sql_attr_multi = uint stationery_lead_diameter_id_list from field
    sql_attr_multi = uint stationery_case_shape_id_list from field
    sql_attr_multi = uint stationery_mechanism_type_id_list from field
    sql_attr_multi = uint stationery_diameter_id_list from field
    sql_attr_multi = uint stationery_feature_id_list from field
    sql_attr_multi = uint stationery_colors_quantity_id_list from field
    sql_attr_multi = uint stationery_gender_id_list from field
    sql_attr_multi = uint stationery_length_id_list from field
    sql_attr_multi = uint stationery_staple_number_id_list from field
    sql_attr_multi = uint stationery_stapler_number_id_list from field
    sql_attr_multi = uint stationery_material_id_list from field
    sql_attr_multi = uint stationery_punched_sheets_quantity_id_list from field
    sql_attr_multi = uint stationery_pen_thickness_id_list from field
    sql_attr_multi = uint stationery_mounting_type_id_list from field
    sql_attr_multi = uint stationery_pen_tip_shape_id_list from field
    sql_attr_multi = uint stationery_ink_base_id_list from field
    sql_attr_multi = uint stationery_calculator_capacity_id_list from field
    sql_attr_multi = uint stationery_calendar_year_id_list from field
    sql_attr_multi = uint stationery_calendar_type_id_list from field
    sql_attr_multi = uint stationery_calendar_subject_id_list from field
    sql_attr_multi = uint stationery_clasp_type_id_list from field
    sql_attr_multi = uint stationery_compartments_quantity_id_list from field
    sql_attr_multi = uint stationery_cover_binding_id_list from field
    sql_attr_multi = uint stationery_package_type_id_list from field
    sql_attr_multi = uint stationery_cover_surface_id_list from field
    sql_attr_multi = uint stationery_universal_id_list from field
    sql_attr_multi = uint stationery_sheets_quantity_id_list from field
    sql_attr_multi = uint stationery_color_id_list from field
    sql_attr_multi = uint stationery_volume_id_list from field
    sql_attr_multi = uint comic_universal_id_list from field
    sql_attr_multi = uint comic_character_id_list from field
    sql_attr_multi = uint comic_genre_id_list from field
    sql_attr_multi = uint comic_series_id_list from field
    sql_attr_multi = uint comic_type_id_list from field
    sql_attr_multi = uint comic_line_id_list from field
    sql_attr_multi = uint comic_section_id_list from field
    sql_attr_multi = uint comic_subject_id_list from field
    sql_attr_multi = uint game_player_quantity_id_list from field
    sql_attr_multi = uint game_use_case_id_list from field
    sql_attr_multi = uint game_skill_id_list from field
    sql_attr_multi = uint game_audience_id_list from field
    sql_attr_multi = uint game_child_age_id_list from field
    sql_attr_multi = uint game_series_id_list from field
    sql_attr_multi = uint game_duration_id_list from field
    sql_attr_multi = uint constructor_detail_quantity_id_list from field
    sql_attr_multi = uint constructor_nation_id_list from field
    sql_attr_multi = uint constructor_equipment_type_id_list from field
    sql_attr_multi = uint souvenir_reason_id_list from field
    sql_attr_multi = uint souvenir_format_id_list from field
    sql_attr_multi = uint souvenir_set_quantity_id_list from field
    sql_attr_multi = uint toy_type_id_list from field
    sql_attr_multi = uint toy_height_id_list from field
    sql_attr_multi = uint gift_hobby_id_list from field
    sql_attr_multi = uint gift_for_children_id_list from field
    sql_attr_multi = uint gift_new_year_id_list from field
    sql_attr_multi = uint gift_section_id_list from field
    sql_attr_multi = uint gift_books_on_interest_id_list from field
    sql_attr_multi = uint product_collection_id_list from field
    sql_attr_multi = uint active_shops_id_list from field
    sql_attr_multi = uint active_cities_id_list from field
    sql_attr_multi = uint active_shop_brands_id_list from field
    sql_attr_multi = uint all_shops_id_list from field
    sql_attr_multi = uint all_cities_id_list from field
    sql_attr_multi = uint all_shop_brands_id_list from field
    # attributes end
        
}

source products_delta: products_main
{
    sql_query_pre =
    sql_query_range = \
    SELECT (SELECT EXTRACT(EPOCH FROM created_at) FROM manticore_indexer WHERE index_name='products_delta') min, \
      EXTRACT(EPOCH FROM NOW() AT TIME ZONE 'Europe/Moscow') max

    sql_query_killlist = \
    SELECT id FROM product \
    WHERE synchronized_at >= (SELECT created_at FROM manticore_indexer WHERE index_name='products_delta')
    
}

index products_main
{
    type = plain
    source = products_main
    path  = /dev/shm/manticore/indexes/products/products_main

    # options
    min_prefix_len = 3
    index_exact_words = 1
    charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+0401->U+0435, U+0451->U+0435, U+401->U+0435, U+451->U+0435
    ignore_chars = U+002E, U+002D, U+005C
    morphology = lemmatize_ru_all, lemmatize_en_all
    min_stemming_len = 4
    expand_keywords = 1
    # options end
        
}

index products_delta
{
    type = plain
    source = products_delta
    path  = /dev/shm/manticore/indexes/products/products_delta
    killlist_target = products_main:kl
    
}

на proxy конфигурация для него:

index products
{
    type = distributed
    agent = node1|node1:products_main,products_delta[conn=pconn,ha_strategy=nodeads]

}

Конфигурация поисковых node:

Summary
indexer
{
    mem_limit = 2040M
    write_buffer = 128M
}
common {
    plugin_dir = /usr/local/lib/manticore
}
searchd {
    listen = XXX:9312
    listen = XXX:9306:mysql
    listen = XXX:9308:http
    listen = XXX:9322-9325:replication
    log = /var/log/manticore/searchd.log
    query_log = /var/log/manticore/query.log
    pid_file = /var/run/manticore/searchd.pid
    query_log_format = sphinxql
    query_log_min_msec = 500
    query_log_mode = 644
    max_batch_queries = 0
    max_packet_size = 128M
    thread_stack = 16M
    binlog_flush = 0
    binlog_max_log_size = 16M
    binlog_path =  # disable logging
    threads = 3
    auto_optimize = 0
    collation_server = utf8_ci
    network_timeout = 300s
    sphinxql_timeout = 30m
    client_timeout = 1h

    secondary_indexes = 1

    telemetry = 0
}

Есть запрос:

Summary
SELECT id,
    name,
    slug,
    image,
    price_retail,
    marketing_status_list,
    IF (
        availability_status != 40 AND availability_status != 50 AND GREATEST(active_shops_id_list) > 0,
        20,
        availability_status
    ) as availability_status,
    availability_status as availability_status_online,
    availability_quantity,
    GREATEST(active_shops_id_list) > 0 as is_available_offline,
    IN(active_cities_id_list, -1) as is_available_in_customer_city,
    IN(active_shops_id_list, -1) as is_available_in_customer_shop,
    vendor_id,
    nds,
    main_category_id_list,
    author_id_list,
    product_type_id,
    article_number_id,
    product_collection_id_list,
    videos
    , IF (
    availability_status IN (40, 50),
    100,
    IF (
        IN(active_shops_id_list, -1),
        35,
        IF (
            IN(active_cities_id_list, -1),
            30,
            IF (
                GREATEST(active_shops_id_list) > 0,
                20,
                availability_status
            )
        )
    )
) as availability  /* статус для сортировки */
, 1000000000000 * literature_work_publishing_year + IF(released_at > 0, released_at, created_at) as newness
FROM products
WHERE id IN (2057837,2200199,2405917,2418104,2447550,2454296,2459100,2467382,2467384,2480417,2496668,2560805,2562533,2592760,2609975,2628067,2631217,2632042,2653784,2655789)
ORDER BY availability DESC, purchase_stats_day_avg_count DESC, newness DESC, price_retail asc
limit 20
OPTION max_matches=20

При запросах через proxy или даже на прямую на поисковые node он приводит к жору всей доступной ОЗУ и падению с OOM:

2025-04-11T08:27:52.759909+00:00 manticore-libra-stage-01 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_all
owed=0,global_oom,task_memcg=/system.slice/manticore.service,task=searchd,pid=2153594,uid=111
2025-04-11T08:27:52.759910+00:00 manticore-libra-stage-01 kernel: Out of memory: Killed process 2153594 (searchd) total-vm:25580524kB, 
anon-rss:16978828kB, file-rss:2688kB, shmem-rss:2371152kB, UID:111 pgtables:45848kB oom_score_adj:0
2025-04-11T08:27:52.781066+00:00 manticore-libra-stage-01 systemd[1]: manticore.service: A process of this unit has been killed by the OOM killer.

Из Zabbix:
CPU:


ОЗУ:

Но, если:

  1. Заменить сортировку:
ORDER BY availability DESC, purchase_stats_day_avg_count DESC, newness DESC, price_retail asc

на

ORDER BY purchase_stats_day_avg_count DESC, price_retail asc

Этот запрос работает без проблем.
2. Убрать из WHERE выборку по id и оставить ту же сортировку:

Summary
SELECT id,
    name,
    slug,
    image,
    price_retail,
    marketing_status_list,
    IF (
        availability_status != 40 AND availability_status != 50 AND GREATEST(active_shops_id_list) > 0,
        20,
        availability_status
    ) as availability_status,
    availability_status as availability_status_online,
    availability_quantity,
    GREATEST(active_shops_id_list) > 0 as is_available_offline,
    IN(active_cities_id_list, -1) as is_available_in_customer_city,
    IN(active_shops_id_list, -1) as is_available_in_customer_shop,
    vendor_id,
    nds,
    main_category_id_list,
    author_id_list,
    product_type_id,
    article_number_id,
    product_collection_id_list,
    videos
    , IF (
    availability_status IN (40, 50),
    100,
    IF (
        IN(active_shops_id_list, -1),
        35,
        IF (
            IN(active_cities_id_list, -1),
            30,
            IF (
                GREATEST(active_shops_id_list) > 0,
                20,
                availability_status
            )
        )
    )
) as availability  /* статус для сортировки */
, 1000000000000 * literature_work_publishing_year + IF(released_at > 0, released_at, created_at) as newness
FROM products
WHERE (((ALL(marketing_status_list)!=50 AND ALL(tbk_id_list) NOT IN (30,32,33)) AND ANY(active_shops_id_list)=0) AND availability_status>0) AND (availability_status!=15 OR ALL(marketing_status_list)!=50)
ORDER BY availability DESC, purchase_stats_day_avg_count DESC, newness DESC, price_retail asc
limit 20
OPTION max_matches=20

Тоже работает и быстро.
3. Совсем простая версия:

SELECT *
FROM products
WHERE id IN (2057837,2200199,2405917,2418104,2447550,2454296,2459100,2467382,2467384,2480417,2496668,2560805,2562533,2592760,2609975,2628067,2631217,2632042,2653784,2655789)

Так же работает.

Для тестирования на node1 увеличили ОЗУ на 8G до 28G, выполнил 1 запрос и так же жор всей доступной памяти и падение с OOM.
Причем такой же запрос работает на RT индексе (таком же), но не работает на plain (мы пытаемся перейти с RT на plain).

Подскажите, в чем проблема и как устранить?

можете показать show meta после всех успешно отработаных запросов?

Результаты:

  1. Через proxy

По запросу с заменой ORDER BY:

Variable_name   Value
total   20
total_found     20
total_relation  eq
time    0.003

Запрос без WHERE id:

Variable_name   Value
total   20
total_found     1083207
total_relation  eq
time    0.323
  1. Через поисковые node:

По запросу с заменой ORDER BY:

Variable_name   Value
total   20
total_found     20
total_relation  eq
time    0.001
index   id:DocidIndex (50%), id:SecondaryIndex (50%)

Запрос без WHERE id:

Variable_name   Value
total   20
total_found     1083926
total_relation  eq
time    0.427

Есть мысли, почему так происходит?

пока что в одном случае включается SI и не используются все аттрибуты

в другом случае идет скан аттрибутов и они поднимаются в память

Без данных это не проверить

Да могу подготовить архив с этим индексом и загрузить его в S3.

давайте

1 Like

Загрузил в S3 под issue: Запрос с ORDER BY и WHERE по id падает с OOM · Issue #3301 · manticoresoftware/manticoresearch · GitHub