Enable morphology

index indexnews {
  type   = plain
  source = sourcename
  morphology = lemmatize_ru
  charset_table = rus,latin,digits,underscore
  path   = /var/lib/manticore/indexnews
  min_word_len = 3
}
sudo indexer --all --rotate
Manticore 7.4.6 b2ff82920@25022808 (columnar 4.1.1 25f4706@25022806) (secondary 4.1.1 25f4706@25022806) (knn 4.1.1 25f4706@25022806)
Copyright (c) 2001-2016, Andrew Aksyonoff
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Copyright (c) 2017-2024, Manticore Software LTD (https://manticoresearch.com)

using config file '/etc/manticoresearch/manticore.conf'...
indexing table 'indexnews'...
FATAL: table 'indexnews': 'charset_table': syntax error near 'us,latin,digits,underscore'

A row contains the word ‘кот’ ("cat"), and I'm trying to get that record to match the query ‘Коту’ (an inflected form of the same word). That requires enabling morphology, so I add:

  morphology = lemmatize_ru
  charset_table = rus,latin,digits,underscore

Reindexing then fails with the error shown above. How do I fix this?

How do I fix this?

Use valid values instead of rus,latin,digits,underscore. See Creating a table > NLP and tokenization > Low-level tokenization | Manticore Search Manual
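
For illustration, here is a minimal sketch of the table definition with a charset_table built from aliases the manual lists (english, russian) plus an explicit digit range and the underscore. The exact value is an assumption about what rus,latin,digits,underscore was meant to express; I have not run it against this setup. Dropping the charset_table line entirely may also be enough, since the default charset (non_cjk) should already cover Cyrillic and Latin letters.

```
index indexnews {
  type          = plain
  source        = sourcename
  path          = /var/lib/manticore/indexnews
  morphology    = lemmatize_ru
  # Assumed replacement for "rus,latin,digits,underscore":
  # built-in aliases plus an explicit digit range and underscore.
  charset_table = 0..9, english, russian, _
  min_word_len  = 3
}
```

After changing the config, rebuild the table the same way as before (sudo indexer --all --rotate).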

I'm still doing something wrong…

 apt install manticore-language-packs
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages were automatically installed and are no longer required:
  python3-mutagen python3-pycryptodome python3-websockets
Use 'apt autoremove' to remove them.
The following NEW packages will be installed:
  manticore-language-packs
0 upgraded, 1 newly installed, 0 to remove and 13 not upgraded.
Need to get 14,0 MB of archives.
After this operation, 33,7 MB of additional disk space will be used.
Get:1 http://repo.manticoresearch.com/repository/manticoresearch_jammy jammy/main arm64 manticore-language-packs all 1.0.13-250708-1e9c2cd [14,0 MB]
Fetched 14,0 MB in 2s (7 215 kB/s)
Selecting previously unselected package manticore-language-packs.
(Reading database ... 173640 files and directories currently installed.)
Preparing to unpack .../manticore-language-packs_1.0.13-250708-1e9c2cd_all.deb ...
Unpacking manticore-language-packs (1.0.13-250708-1e9c2cd) ...
Setting up manticore-language-packs (1.0.13-250708-1e9c2cd) ...

I installed the package, and /usr/share/manticore now contains ru.pak, en.pak, and de.pak.

In the config file I put:

lemmatizer_base = /usr/share/manticore/
index indexnews {
  type   = plain
  source = sourcename
  path   = /var/lib/manticore/indexnews
  morphology = lemmatize_ru
  # wordforms = /etc/manticoresearch/wordforms/indexname.wfs
}

Then I run:

:~# sudo systemctl stop manticore
:~# sudo systemctl start manticore
Job for manticore.service failed because the control process exited with error code.
See "systemctl status manticore.service" and "journalctl -xeu manticore.service" for details.

As I understand it, lemmatizer_base = /usr/share/manticore/ is specified incorrectly, even though it sits outside any block.

What exactly is the problem? Could you share the searchd log or the searchd output from journalctl?

 sudo systemctl start manticore
Job for manticore.service failed because the control process exited with error code.
See "systemctl status manticore.service" and "journalctl -xeu manticore.service" for details.
root@desktop:~# systemctl status manticore.service
× manticore.service - Manticore Search Engine
     Loaded: loaded (/etc/systemd/system/manticore.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Fri 2025-07-25 20:11:31 MSK; 7s ago
   Duration: 3h 55min 19.295s
       Docs: https://manual.manticoresearch.com/,
             man:searchd(1)
    Process: 30073 ExecStart=/usr/bin/searchd --config /etc/manticoresearch/manticore.conf $_ADDITIONAL_SEARCHD_PARAMS (code=exited, status=1/FAILURE)
        CPU: 82ms

июл 25 20:11:31 desktop systemd[1]: manticore.service: Scheduled restart job, restart counter is at 5.
июл 25 20:11:31 desktop systemd[1]: manticore.service: Start request repeated too quickly.
июл 25 20:11:31 desktop systemd[1]: manticore.service: Failed with result 'exit-code'.
июл 25 20:11:31 desktop systemd[1]: Failed to start manticore.service - Manticore Search Engine.
root@desktop:~# journalctl -xeu manticore.service
июл 25 20:11:31 desktop searchd[30073]: [Fri Jul 25 20:11:31.812 2025] [30073] FATAL: failed to parse config file '/etc/manticoresearch/manticore.conf': ERROR: invalid section type 'lemmatizer_base' in /etc/manticoresearch/manticore.conf line 44 col 17.
июл 25 20:11:31 desktop searchd[30073]: Manticore 13.2.3 bf96368a6@25070806 (columnar 8.0.0 1f68681@25070110) (secondary 8.0.0 1f68681@25070110) (knn 8.0.0 1f68681@25070110) (embeddings 1.0.0)
июл 25 20:11:31 desktop searchd[30073]: Copyright (c) 2001-2016, Andrew Aksyonoff
июл 25 20:11:31 lena-desktop searchd[30073]: Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
июл 25 20:11:31 desktop searchd[30073]: Copyright (c) 2017-2025, Manticore Software LTD (https://manticoresearch.com)
июл 25 20:11:31 desktop systemd[1]: manticore.service: Control process exited, code=exited, status=1/FAILURE
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ An ExecStart= process belonging to unit manticore.service has exited.
░░
░░ The process' exit code is 'exited' and its exit status is 1.
июл 25 20:11:31 desktop systemd[1]: manticore.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ The unit manticore.service has entered the 'failed' state with result 'exit-code'.
июл 25 20:11:31 desktop systemd[1]: Failed to start manticore.service - Manticore Search Engine.
░░ Subject: A start job for unit manticore.service has failed
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ A start job for unit manticore.service has finished with a failure.
░░
░░ The job identifier is 31315 and the job result is failed.
июл 25 20:11:31 desktop systemd[1]: manticore.service: Scheduled restart job, restart counter is at 5.
░░ Subject: Automatic restarting of a unit has been scheduled
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ Automatic restarting of the unit manticore.service has been scheduled, as the result for
░░ the configured Restart= setting for the unit.
июл 25 20:11:31 desktop systemd[1]: manticore.service: Start request repeated too quickly.
июл 25 20:11:31 desktop systemd[1]: manticore.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ The unit manticore.service has entered the 'failed' state with result 'exit-code'.
июл 25 20:11:31 desktop systemd[1]: Failed to start manticore.service - Manticore Search Engine.
░░ Subject: A start job for unit manticore.service has failed
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ A start job for unit manticore.service has finished with a failure.
░░
░░ The job identifier is 31409 and the job result is failed.

I add lemmatizer_base = /usr/share/manticore/, then stop the service, then start it again. I don't understand what's wrong; as far as I can tell I'm doing everything correctly.

You don't need to set it at all: its default value is already correct. If you do set it, it has to go in the common section. See Server settings > Common | Manticore Search Manual

common {
  lemmatizer_base = ...
}
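
Putting the pieces from this thread together, a rough sketch of where each setting goes, assuming the paths and names used above. The journalctl error ("invalid section type 'lemmatizer_base'") suggests that a bare setting at the top level of the file is parsed as a section header, which is why it needs to be wrapped in common. The common block is shown only to illustrate placement; per the reply it can be left out.

```
# Optional: the reply above says the default value is already correct.
common {
  lemmatizer_base = /usr/share/manticore/
}

index indexnews {
  type       = plain
  source     = sourcename
  path       = /var/lib/manticore/indexnews
  morphology = lemmatize_ru
}
```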

It works now; I tested it with Russian.

I set it like this:
morphology = lemmatize_ru,lemmatize_en,lemmatize_de

The /usr/share/manticore folder contains only de.pak, en.pak, and ru.pak.

The documentation also mentions lemmatize_uk, but there is no matching file in the folder yet. How can I add it? And what is in /usr/share/manticore/stopwords, what are those stop words?

Is it possible to make morphology work with every language available in the folder?

what are those stop words?

Is it possible to make morphology work with every language available in the folder?

Yes, specify several morphology values separated by commas. There is an example in Creating a table > NLP and tokenization > Morphology | Manticore Search Manual