Build and rotate indexes on remote servers

I have the following server setup:

  • web_server_1
  • web_server_2
  • manticore_server_1
  • manticore_server_2

How do I run the indexers of the Manticore servers from any web server? The web servers will have a script of some kind that manages the building of indexes. I’m kind of lost on how to run the indexers without scripting some kind of SSH access. Is there any way other than creating an SSH script?

There’s no HTTP, SQL, or any other TCP-based interface to run indexer remotely, so you need to run it on the server itself.

ssh manticore_server_1 indexer ... doesn’t seem to be too difficult.
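For instance, a minimal sketch of that SSH route (the index name "products" is a made-up example; `--rotate` is indexer’s standard flag for hot-swapping the rebuilt files into a running searchd):

```shell
# Hypothetical helper: run Manticore's indexer on a remote box over SSH.
# $1 = host, $2 = index name (or --all to rebuild everything).
# --rotate makes the running searchd pick up the new files without a restart.
rotate_remote() {
  ssh "$1" "indexer --rotate $2"
}

# From either web server:
#   rotate_remote manticore_server_1 products
#   rotate_remote manticore_server_2 --all
```

This assumes key-based SSH auth from the web servers and that the remote user may run indexer against the Manticore config.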

Alternatively, you can use RT indexes and just do INSERT INTO ... remotely via SQL over the MySQL protocol, SQL over HTTP, JSON over HTTP, or our new PHP client (https://github.com/manticoresoftware/manticoresearch-php)
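A hedged sketch of two of those remote-insert options, assuming an RT index named "products" with a title field and Manticore’s default ports (9306 for SQL, 9308 for HTTP):

```shell
# Insert via SQL over the MySQL protocol (any mysql client will do):
insert_via_sql() {
  mysql -h manticore_server_1 -P 9306 -e \
    "INSERT INTO products (id, title) VALUES (1, 'first item')"
}

# Insert via JSON over HTTP:
insert_via_http() {
  curl -s http://manticore_server_1:9308/insert \
    -d '{"index":"products","id":2,"doc":{"title":"second item"}}'
}
```

With RT indexes there is no indexer step and nothing to rotate: documents become searchable as soon as they are inserted.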

@Sergey Hi! Thanks for your work!

Still no option to call indexer via TCP? Old-style Sphinx-ish indexes inherently depend on a separate Cron process updating them. This is especially problematic in containerized environments like Docker/Kubernetes. Running both Cron and searchd in one container is problematic because running two processes is problematic by itself. Normally I could run Cron in another container, but without a remote indexer that’s not possible. And running sshd instead of Cron won’t help either.

It also makes writing automated integration tests troublesome, because the test needs a way to trigger reindexing after the arrange phase, before asserting that search works.

I was considering switching to Manticore’s new “RT” indexes, but that’s really a completely different design: I would have to rewrite all my infra code. Now I’m wondering whether I should switch to RT indexes or write a good Cron+searchd runner.

I feel rather unintelligent and obtrusive talking to myself here, but maybe other Googlers will find it useful, so…

I spent several days on xmlpipe/csvpipe, a sophisticated main+delta scheme, trying to generate ids, solving duplicate-merging issues and dealing with the Cron setup. And then I just migrated to Manticore-specific RT indexes using only HTTP, very small config file and ready-made Docker image. Took me a couple of hours maybe overall. I was coming from Sphinx and trying the old Sphinx approach, but this new approach seems quite a lot easier.


Still no option to call indexer via TCP?

Unfortunately no.

Running both Cron and searchd in one container is problematic because running two processes is problematic by itself.

I’ve just tried the approach from “How to run a cron job inside a docker container?” on Stack Overflow, and it worked fine in a Manticore container (official image):

root@7656cbf91f92:/var/lib/manticore# ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
mantico+       1  0.0  0.0 768404  9224 ?        Ssl  03:34   0:00 searchd --nodetach
root          37  0.0  0.0  40480  8004 pts/0    Ss+  03:34   0:00 mysql
root          42  0.0  0.0  20256  3852 pts/1    Ss   03:35   0:00 bash
root         644  0.0  0.0  30104  2844 ?        Ss   03:38   0:00 cron
root         648  0.0  0.0  36152  3208 pts/1    R+   03:39   0:00 ps aux

root@7656cbf91f92:/var/lib/manticore# tail -f /var/log/cron.log
Hello world

So you can build your own image FROM manticoresearch/manticore and add cron to it.
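For example, a sketch of such an image (a sketch only — the cron package name and crontab path assume a Debian-based image, the cron file and schedule are up to you, and the official image’s entrypoint may need adjusting):

```dockerfile
# Sketch: extend the official image with cron (assumes a Debian base).
FROM manticoresearch/manticore

RUN apt-get update && apt-get install -y cron && rm -rf /var/lib/apt/lists/*

# Your schedule, e.g. a line like:
#   */5 * * * * root indexer --rotate --all >> /var/log/cron.log 2>&1
COPY manticore-cron /etc/cron.d/manticore-cron
RUN chmod 0644 /etc/cron.d/manticore-cron

# Start cron in the background, then searchd in the foreground as before
CMD cron && searchd --nodetach
```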

And then I just migrated to Manticore-specific RT indexes using only HTTP, very small config file and ready-made Docker image. Took me a couple of hours maybe overall

That’s exactly why we’ve always been focusing more on the RT part of the project since the beginning and added the HTTP JSON interface. Plain indexation is fine, but it made more sense back in the day, when Sphinx was more of an extension to mysql/postgres/whatever.

For one of my projects, I use a ‘wrapper’ script on the Manticore servers (well, containers) that just fetches a list of indexes to build and calls indexer.

(just using PHP, as I’m most familiar with that)

I simplified the code to make it easier to follow.

Cron just calls the wrapper every 5 minutes:

*/5 * * * *	    /usr/local/bin/indexer-wrapper.php

In the container setup, supercronic runs in a sidecar container next to searchd.

The list of indexes to build comes from a MySQL database.

The application can then set the schedule of individual indexes, dynamically add new schedules, or pause indexing. It also manages main+delta indexes.
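A rough shell sketch of that wrapper idea (the original is PHP; the `index_schedule` table and its columns are invented here for illustration):

```shell
# indexer-wrapper, simplified: fetch the list of indexes that are due
# for a rebuild from MySQL, then rebuild each one with --rotate.
# The schema (index_schedule with name/next_run columns) is made up.
due_indexes() {
  mysql -N -e "SELECT name FROM index_schedule WHERE next_run <= NOW()"
}

rebuild_due() {
  for idx in $(due_indexes); do
    indexer --rotate "$idx"
  done
}
```

The application then controls scheduling just by updating rows in that table; the wrapper itself stays dumb.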