Is there any benchmark or other information regarding the performance of plain versus real-time indexes?
I remember that in early Sphinx versions, real-time index performance was lower than that of plain indexes.
I am curious how plain and real-time indexes compare performance-wise in current Manticore versions.
This may be imperfect in terms of the returned results (that’s why the “plus/minus” icon is shown for a few records), but it shows the essence: RT tables are on par with plain tables in terms of response time. This is because of what has changed since the Sphinx days (a quick way to check it yourself follows the list):
- RT tables can parallelize queries among all their chunks
- Plain tables use pseudo-sharding to achieve the same effect
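You can observe the effect of this parallelization by pinning the per-query thread count and comparing timings. A minimal sketch, assuming a table named `mytable` (either plain or RT; the table name and query are hypothetical):

```sql
-- Same full-text query with different degrees of parallelism
SELECT id FROM mytable WHERE MATCH('some query') OPTION threads=1;
SHOW META; -- note the reported query time

SELECT id FROM mytable WHERE MATCH('some query') OPTION threads=16;
SHOW META; -- with pseudo-sharding or multiple RT chunks this should be faster
```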
First of all, thank you for your informative response!
I am currently testing Manticore on a beefy server with 128 cores (256 threads with hyperthreading enabled), an NVMe drive, and 512 GB of RAM. I have split the data into 20 plain-index shards, since I was using Sphinx until a few days ago.
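A sharded plain setup like this is typically wired together through a distributed index. A minimal sketch with hypothetical names and paths, showing only two of the 20 shards:

```ini
index shard_01
{
    type   = plain
    source = src_shard_01
    path   = /var/lib/manticore/data/shard_01
}

index shard_02
{
    type   = plain
    source = src_shard_02
    path   = /var/lib/manticore/data/shard_02
}

# ... shards 03 through 20 defined the same way ...

# distributed index that fans each query out to all local shards
index main
{
    type  = distributed
    local = shard_01
    local = shard_02
    # ... local = shard_03 through shard_20 ...
}
```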
Although I have only briefly watched the changes in top, I tested with real traffic and noticed that Manticore's response time can easily be improved by increasing max_threads_per_query. However, server load increases significantly for a relatively small performance gain.
Do you think RT indexes and plain indexes have similar CPU usage?
As for performance, it seems that simply increasing max_threads_per_query improves response time easily, but at the cost of much higher CPU usage.
Do you think the automatic parallelization you get from a high max_threads_per_query is less, similarly, or more efficient than physically sharding the index on the same local hardware?
Manticore now has pseudo-sharding enabled by default, and in theory search performance shouldn’t differ much between plain and real-time tables. The goal is to minimize response times by maximizing CPU utilization through the parallelization of search queries. This applies to both plain and real-time tables and can be controlled via the following settings:
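A minimal searchd section with the relevant knobs might look like the sketch below; the values are examples for illustration, not recommendations:

```ini
searchd
{
    # on by default: lets a single query be split across several threads
    # even for a non-sharded plain or RT table
    pseudo_sharding = 1

    # size of the worker thread pool; defaults to the number of CPU cores
    threads = 128

    # upper bound on how many threads a single query may occupy;
    # raising it trades higher CPU load for lower latency
    max_threads_per_query = 16
}
```

The per-query limit can also be overridden for an individual query with OPTION threads=N.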