Hi.
First of all, thank you very much for your efforts in keeping this awesome software up to date, and for the excellent documentation.
I am considering switching from Sphinx Search to Manticore. In fact, I would have posted this thread earlier, but there is a user called barryhunter on the Sphinx Search forums who is awesome and is the person keeping Sphinx Search alive. Without his help, we would never have discovered the potential of Sphinx Search.
We are using Sphinx Search on a production server with the Foolz\SphinxQL\SphinxQL package (PHP). Now we have to do a migration, and I have to decide whether we move to Manticore. Sphinx Search is giving us some problems with Ubuntu and the systemd scripts.
But I need to know whether our current sphinx.config is compatible, and whether the way we proceed is still valid in Manticore.
Summary: we use a dedicated server with 16 cores. Every day we fill a table called products_tmp (100GB). When this table is complete, we index it with Sphinx Search and rename products_tmp to products. Each *.sps is about 705MB, *.spa about 100MB and *.spd and *.spp about 360MB.
As you can see in the details, the sphinx.config may look a bit weird.
The thing is that our dedicated server has 16 cores, so we use PHP scripting to shard the data into 16 indexes and query them through a distributed index for faster reads.
mem_limit may seem low, but that is because we index 8 at a time and we are at the limit of RAM usage.
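To make the RAM reasoning concrete, here is a rough back-of-the-envelope sketch in shell. It assumes that each of the 8 parallel indexer processes gets its own mem_limit buffer; real usage will be higher (dictionaries, MySQL client buffers, the MySQL server itself), so treat the number as a lower bound, not a measurement. The variable names are only illustrative.

```shell
#!/bin/sh
# Rough lower bound on indexing buffers; variable names are illustrative.
parallel_indexers=8     # matches xargs -P8 in the commands we run
mem_limit_mb=128        # matches mem_limit = 128M in the indexer section

# Assumption: every indexer process may use up to its own mem_limit.
total_mb=$((parallel_indexers * mem_limit_mb))
echo "indexing buffers: at least ${total_mb} MB"
```

Raising mem_limit therefore multiplies by 8 in our setup, which is why we keep it small.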
echo pp0 pp1 pp2 pp3 pp4 pp5 pp6 pp7 | xargs -n1 -P8 /usr/bin/indexer --nohup --rotate --config /etc/sphinxsearch/sphinx.conf
rename 's/tmp/new/' /mnt/disk/sphinxsearch_data/*.tmp.* -v
echo pp8 pp9 pp10 pp11 pp12 pp13 pp14 pp15 | xargs -n1 -P8 /usr/bin/indexer --rotate --config /etc/sphinxsearch/sphinx.conf
With the above lines we index the products_tmp table, and we only rotate once the last index has been processed. That is why we have to use the “hack” of --nohup + rename, in order to “wait” until all the work has been done for pp15 (see the line below with if ($i == ($cores - 1))).
We currently can’t use seamless_rotate = 1 because of the high RAM peak, which crashes the mysql service.
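The same two-batch flow can be sketched as a small script. This is only a dry-run sketch of what we do by hand: batch_names, run_batch and the INDEXER variable are hypothetical helpers, not part of Sphinx or Manticore, and INDEXER defaults to echo so that nothing is actually indexed unless it is overridden.

```shell
#!/bin/sh
# Dry-run sketch of the two-batch run. INDEXER defaults to "echo" so the
# commands are only printed; set INDEXER=/usr/bin/indexer to run for real.
INDEXER="${INDEXER:-echo}"
CONF=/etc/sphinxsearch/sphinx.conf

# Print the index names pp<first> .. pp<last>, space separated.
batch_names() {
    i=$1; out=""
    while [ "$i" -le "$2" ]; do
        out="${out:+$out }pp$i"
        i=$((i + 1))
    done
    echo "$out"
}

# Run one batch with 8 processes in parallel. xargs returns only after
# all of them have finished, which is what gives us the "wait".
run_batch() {
    names=$1; shift
    echo "$names" | xargs -n1 -P8 $INDEXER "$@" --config "$CONF"
}

first=$(batch_names 0 7)
second=$(batch_names 8 15)

# Batch 1: build the files but do not signal searchd yet.
run_batch "$first" --nohup --rotate

# Make the batch 1 files look like pending rotations (skipped in dry run).
if [ "$INDEXER" != "echo" ]; then
    rename 's/tmp/new/' /mnt/disk/sphinxsearch_data/*.tmp.* -v
fi

# Batch 2: the rotate here picks up all 16 indexes at once.
run_batch "$second" --rotate
```

Run as is, it prints the 16 indexer command lines (in no particular order within a batch); with INDEXER=/usr/bin/indexer it should behave like the three commands above.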
sphinx.config (irrelevant parts omitted):
#!/usr/bin/php -q
<?php
$table = "products_tmp";
$cores = 16;
?>
source theshop
{
type = mysql
[...]
sql_query_range = \
SELECT MIN(id),MAX(id) FROM <?php echo $table.PHP_EOL;?>
sql_query = \
SELECT id, name,price,brand,category,sku FROM <?php echo $table;?> WHERE id>=$start AND id<=$end
[...]
sql_query_pre = SET CHARACTER_SET_RESULTS=utf8
sql_query_pre = SET NAMES utf8
sql_query_pre = SET SESSION query_cache_type=OFF
}
<?php for ($i=0; $i<$cores; $i++):?>
source ptp<?=$i?>: theshop
{
sql_query = \
SELECT id, name,price,brand,category,sku FROM <?php echo $table;?> WHERE id>=$start AND id<=$end AND id % <?=$cores?> = <?=$i.PHP_EOL?>
<?php
if ($i == ($cores - 1) && $table == "products_tmp") echo "sql_query_post_index = RENAME TABLE `products` TO `products_old`, `products_tmp` TO `products`".PHP_EOL;
?>
}
<?php endfor; ?>
index products_template
{
type = plain
source = theshop
path = /mnt/disk/sphinxsearch_data/products
stopwords = /home/ubuntu/sphinx_stopwords.txt
wordforms = /home/ubuntu/sphinx_wordform.txt
stopword_step = 0
stopwords_unstemmed = 1
html_strip = 1
blend_chars = -,&,U+002E
blend_mode = trim_none,trim_both
docinfo = extern
#mlock = 1
morphology = libstemmer_es
min_stemming_len = 4
dict = keywords
min_infix_len = 2
#min_prefix_len = 2
min_word_len = 1
index_exact_words=1
expand_keywords= 0
charset_table = # omitted for brevity
}
<?php for ($i=0; $i<$cores; $i++) { ?>
index pp<?=$i?>: products_template
{
source = ptp<?=$i.PHP_EOL?>
path = /mnt/disk/sphinxsearch_data/products_<?=$i.PHP_EOL?>
}
<?php } ?>
index products
{
type = distributed
<?php for ($i=0; $i<$cores; $i++) { ?>
local = pp<?=$i.PHP_EOL?>
<?php } ?>
}
indexer
{
mem_limit = 128M
write_buffer = 4M
lemmatizer_cache = 1024M
}
searchd
{
listen = 127.0.0.1:9306:mysql41
log = /var/log/sphinxsearch/searchd.log
#query_log_format = sphinxql
#query_log = /var/log/sphinxsearch/query.log
read_timeout = 5
max_children = 30
pid_file = /var/run/sphinxsearch/searchd.pid
seamless_rotate = 0
preopen_indexes = 1
unlink_old = 1
binlog_path = /var/lib/sphinxsearch/data
dist_threads = <?php echo $cores.PHP_EOL;?>
#expansion_limit = 5
}
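
For anyone reading the config, the id % <cores> = <i> filter in the ptp sources is what splits the rows between the 16 indexes. A minimal sketch of that mapping in shell, assuming ids are non-negative integers (shard_for_id is a hypothetical helper, only for illustration):

```shell
#!/bin/sh
# Illustrative check of the "id % cores = i" filter used in the ptp<i> sources.
cores=16

# Print the shard number (the i in ptp<i> / pp<i>) that picks up a given id.
shard_for_id() {
    echo $(( $1 % cores ))
}

# Every id lands in exactly one shard, so the 16 indexes do not overlap.
for id in 1 15 16 17 100; do
    echo "id $id -> pp$(shard_for_id "$id")"
done
```

Because the modulo result is always between 0 and 15, each row is indexed by exactly one of pp0 to pp15, and the distributed index products sees every row once.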