RT query gives wrong count(*)

I am currently inserting documents into an RT table. I am only doing insertions, no deletions. I have inserted about 12 million rows. I am running parallel insertions. The disk chunk count is currently at 5 and I’m running optimize. rt_mem_limit is 10 GB.

My issue is that queries return inconsistent counts. For example, if I simply do:

select count(*) from my_index;

then it may return 12423110 one time and then, immediately afterwards, a lower number like 12390350 for the same query. The count cannot be going down because I am not doing any deletions.

This also happens if I include a match term. It may say the count is 343093 and then say it’s 341946 next time.
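For anyone trying to capture this symptom for a bug report, here is a minimal Python sketch that checks a series of count readings for drops. It assumes the readings were already collected by re-running the same `select count(*)` query while only insertions were happening. `find_count_drops` is a hypothetical helper name, not part of Sphinx or Manticore.

```python
from typing import List, Tuple


def find_count_drops(samples: List[int]) -> List[Tuple[int, int, int]]:
    """Return (position, previous, current) for every reading that is
    lower than the reading before it.

    With insert-only traffic the count should never decrease, so any
    entry in the result is worth attaching to a bug report.
    """
    drops = []
    for i in range(1, len(samples)):
        if samples[i] < samples[i - 1]:
            drops.append((i, samples[i - 1], samples[i]))
    return drops


if __name__ == "__main__":
    # The two readings quoted in the post above.
    readings = [12423110, 12390350]
    for position, previous, current in find_count_drops(readings):
        print(f"reading {position}: {previous} -> {current} "
              f"(dropped by {previous - current})")
```

Logging the readings with timestamps alongside the index status at the same moments would make the report easier to act on.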

I am currently filling up the index but this is always going to be a very active table with a lot of insertions and deletions.

Obviously, at this point I can’t trust or use it at all. Is there some explanation or solution?

Index status:

| index_type        | rt          |
| indexed_documents | 13857274    |
| indexed_bytes     | 22188888803 |
| ram_bytes         | 1497320863  |
| disk_bytes        | 28351974628 |
| ram_chunk         | 25179915    |
| ram_chunks_count  | 23          |
| disk_chunks       | 5           |
| mem_limit         | 134217728   |
| ram_bytes_retired | 0           |
| tid               | 66798       |
| tid_saved         | 66763       |

Do you use REPLACE statements or only INSERT?

I do it as a replace statement, but each document is only being inserted once.

I have inserted 14378441 unique records (according to my insertion script), but the Sphinx count(*) just now told me 12412139 (then 12342585).

I stopped the inserts and restarted Sphinx. I then did a more controlled insert: two insertions, one with 215 records and one with 181 records (396 total).

When they completed, a count(*) showed that the count had gone up by 373. I waited a bit and did another count(*), and the remaining 23 were there. That means the Sphinx insert query is returning OK before Sphinx has actually finished its work. If Sphinx returns OK before the records have actually been inserted, how can I know if something went wrong? Millions weren’t inserted.
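A controlled test like the one above can be scripted so that the lag is measured rather than eyeballed. The following is only a sketch under stated assumptions: `count_fn` is a hypothetical callable that runs `select count(*)` against the index and returns the integer result (how it connects is up to you), and `wait_for_count` is an illustrative name. It does not explain why the count lags; it only records whether the expected total is reached within a timeout.

```python
import time
from typing import Callable, Tuple


def wait_for_count(
    count_fn: Callable[[], int],
    expected: int,
    timeout_s: float = 30.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, int]:
    """Poll count_fn until it reports at least `expected` rows or the
    timeout expires. Returns (reached, last_count_seen)."""
    deadline = time.monotonic() + timeout_s
    last = count_fn()
    while last < expected and time.monotonic() < deadline:
        sleep(interval_s)
        last = count_fn()
    return last >= expected, last


if __name__ == "__main__":
    # Simulated readings mirroring the post: the baseline is made up,
    # the increments (373 first, then all 396) come from the test above.
    baseline = 1000
    readings = iter([baseline + 373, baseline + 396])
    reached, last = wait_for_count(
        lambda: next(readings),
        expected=baseline + 215 + 181,
        sleep=lambda _seconds: None,  # skip real waiting in the demo
    )
    print(reached, last)
```

If the expected total is never reached, the difference between `expected` and the last count seen gives a concrete number of missing rows to include in the report.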

This seems like a bug. Could you create a ticket on GitHub with a reproducible example: config files, insert queries, and search queries that show the issue?

I created an issue: https://github.com/manticoresoftware/manticoresearch/issues/345. I don’t know whether it’s reproducible.

I need a reproducible case to investigate the issue further.

Try it and see. Clearly there is a serious issue. I told you what I did that produced these results.

I tried but was unable to reproduce the issue. I still need a complete example, as everything works fine for me.