is count(*) probabilistic?

environment:

  • manticore 6.3.2-24062606-c296dc7c8
  • Debian 12
  • amd64

i’m puzzled by this situation that happens on non-RT indexes.

on server A:

mysql> select count(*) from entity_textsearch where id>210000007 and id<220000007;
+----------+
| count(*) |
+----------+
|  1649556 |
+----------+
1 row in set (0.00 sec)
--- 1 out of 1 results in 3ms ---

mysql> select count(*) from entity_textsearch where id>=210000007 and id<220000007;
+----------+
| count(*) |
+----------+
|  1649556 |
+----------+
1 row in set (0.00 sec)
--- 1 out of 1 results in 3ms ---

mysql> select count(*) from entity_textsearch where id=210000007;
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.00 sec)
--- 1 out of 1 results in 0ms ---

on server B:

mysql> select count(*) from entity_textsearch where id>210000007 and id<220000007;
+----------+
| count(*) |
+----------+
|  1649556 |
+----------+
1 row in set (0.02 sec)
--- 1 out of 1 results in 14ms ---

mysql> select count(*) from entity_textsearch where id>=210000007 and id<220000007;
+----------+
| count(*) |
+----------+
|  1649557 |
+----------+
1 row in set (0.02 sec)
--- 1 out of 1 results in 15ms ---

mysql> select count(*) from entity_textsearch where id=210000007;
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.01 sec)
--- 1 out of 1 results in 0ms ---

data did not change in any way during the query time.

how is that possible - looks like row with id=210000007 is not counted by server 1 when covering id>=210000007 and id<220000007

here is a topic at the manual about Grouping accuracy

You could also check for SHOW META total_relation to make sure about the precision

thank you. interestingly total_relation is eq which suggests that the result is exact.

there’s parallel processing involved via locally distributed index:

index entity_textsearch{
        type = distributed
        local=entity_textsearch_a
        local=entity_textsearch_b
        local=entity_textsearch_c
        local=entity_textsearch_d
}

server 1:

mysql> select count(*) from entity_textsearch where id>210000007 and id<220000007;
+----------+
| count(*) |
+----------+
|  1649556 |
+----------+
1 row in set (0.01 sec)
--- 1 out of 1 results in 3ms ---

mysql> show meta;
+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| total          | 1     |
| total_found    | 1     |
| total_relation | eq    |
| time           | 0.003 |
+----------------+-------+
4 rows in set (0.00 sec)

mysql> select count(*) from entity_textsearch where id>=210000007 and id<220000007;
+----------+
| count(*) |
+----------+
|  1649556 |
+----------+
1 row in set (0.01 sec)
--- 1 out of 1 results in 3ms ---

mysql> show meta;
+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| total          | 1     |
| total_found    | 1     |
| total_relation | eq    |
| time           | 0.003 |
+----------------+-------+
4 rows in set (0.00 sec)

mysql> select count(*) from entity_textsearch where id=210000007;
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.01 sec)
--- 1 out of 1 results in 0ms ---

mysql> show meta;
+----------------+----------------------+
| Variable_name  | Value                |
+----------------+----------------------+
| total          | 1                    |
| total_found    | 1                    |
| total_relation | eq                   |
| time           | 0.000                |
| index          | id:DocidIndex (100%) |
+----------------+----------------------+
5 rows in set (0.18 sec)

server 2:

mysql> select count(*) from entity_textsearch where id>210000007 and id<220000007;
+----------+
| count(*) |
+----------+
|  1649556 |
+----------+
1 row in set (0.02 sec)
--- 1 out of 1 results in 18ms ---

mysql> show meta;
+----------------+----------------------+
| Variable_name  | Value                |
+----------------+----------------------+
| total          | 1                    |
| total_found    | 1                    |
| total_relation | eq                   |
| time           | 0.018                |
| index          | id:DocidIndex (100%) |
+----------------+----------------------+
5 rows in set (0.00 sec)

mysql> select count(*) from entity_textsearch where id>=210000007 and id<220000007;
+----------+
| count(*) |
+----------+
|  1649557 |
+----------+
1 row in set (0.03 sec)
--- 1 out of 1 results in 22ms ---

mysql> show meta;
+----------------+----------------------+
| Variable_name  | Value                |
+----------------+----------------------+
| total          | 1                    |
| total_found    | 1                    |
| total_relation | eq                   |
| time           | 0.022                |
| index          | id:DocidIndex (100%) |
+----------------+----------------------+
5 rows in set (0.00 sec)

mysql> select count(*) from entity_textsearch where id=210000007 ;
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.00 sec)
--- 1 out of 1 results in 0ms ---

mysql> show meta;
+----------------+----------------------+
| Variable_name  | Value                |
+----------------+----------------------+
| total          | 1                    |
| total_found    | 1                    |
| total_relation | eq                   |
| time           | 0.000                |
| index          | id:DocidIndex (100%) |
+----------------+----------------------+
5 rows in set (0.00 sec)

second query select count(*) from entity_textsearch where id>=210000007 and id<220000007; is returning results which are inconsistent with the 1st and 2nd, but only on server 1.

you could also try to use count distinct as described Manticore Search Manual: Searching > Grouping

May be a bug similar to Incorrect total_found in "show meta;" for a "group X by". Can repeat for 1 index or distributed index across two servers. · Issue #2488 · manticoresoftware/manticoresearch · GitHub

We’ll appreciate if you create an issue with a minimal reproducible example.

sadly i’m unable to reproduce this issue [ underlying index was rebuilt few times since the report ]. if i spot it happening again - i’ll try to get as much details as possible and write a bug report.

thanks for your help!