I am currently inserting documents into an RT table. I am only doing insertions, no deletions. I have inserted about 12 million rows. I am running parallel insertions. The disk chunk count is currently at 5 and I’m running optimize. rt_mem_limit is 10 GB.
My issue is that when I run a query, the count is wrong. For example, if I simply do:
select count(*) from my_index;
then it may tell me 12423110 one time and then immediately afterwards say a lower number like 12390350 to the same query. The count cannot be going down because I am not doing any deletions.
This also happens if I include a match term. It may say the count is 343093 and then say it’s 341946 next time.
I am currently filling up the index but this is always going to be a very active table with a lot of insertions and deletions.
Obviously, at this point I can’t trust or use it at all. Is there some explanation or solution?
I stopped the inserts and restarted Sphinx. I then did a more controlled insert: two insertions, one with 215 records and one with 181 records (396 total).
When they completed, a count() showed that the count had gone up by 373. I waited a bit and did another count() and the remaining 23 were there. That means the Sphinx insert query is returning OK before Sphinx has actually finished processing the rows. If Sphinx returns OK before the records have actually been inserted, how can I know if something went wrong? Millions weren’t inserted.
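To make the controlled test above repeatable, the check could be scripted roughly as follows. This is only a sketch: `poll_count` and `drops` are hypothetical helper names, not part of any Sphinx API, and `get_count` is assumed to be a caller-supplied function that runs `select count(*) from my_index;` over whatever SphinxQL connection is in use and returns the number.

```python
import time
from typing import Callable, List, Tuple


def poll_count(get_count: Callable[[], int],
               expected: int,
               timeout_s: float = 30.0,
               interval_s: float = 0.5,
               sleep: Callable[[float], None] = time.sleep,
               ) -> Tuple[bool, List[int]]:
    """Poll get_count() until it reaches `expected` or the timeout expires.

    get_count is an assumption: any zero-argument callable that returns
    the current result of `select count(*) from my_index;`.
    Returns (reached, samples), where samples holds every count observed.
    """
    samples: List[int] = []
    deadline = time.monotonic() + timeout_s
    while True:
        samples.append(get_count())
        if samples[-1] >= expected:
            return True, samples
        if time.monotonic() >= deadline:
            return False, samples
        sleep(interval_s)


def drops(samples: List[int]) -> List[Tuple[int, int]]:
    """Return (previous, current) pairs where the observed count went down."""
    return [(a, b) for a, b in zip(samples, samples[1:]) if b < a]
```

With the count taken before the two inserts as a baseline, `poll_count(get_count, baseline + 396)` records how the count approaches the expected total, and `drops(samples)` lists any point where it went down between two consecutive reads, which should never happen with insert-only traffic.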