Hi, I recently came across this amazing DB. It has a lot of great features that make me want to try it out in my next projects, but I have some questions.
Distributed tables look cool, but sending something like "LIMIT n, m" to the shards in parallel seems to cause 1000 items to be fetched and recalculated. A distributed table seems quite heavy when it is not really necessary, so my questions are:
1. How much data can a non-distributed columnar table hold before response times and latency start to degrade? Say, for a billion rows, do I have to use a distributed table?
2. Since distributed shards have this "LIMIT n, m" drawback, is a distributed table a bad choice for deep paging?
3. When I start small, a non-distributed table is fine. But as the data grows, can I alter it into a distributed table with a few more shards? And as the data keeps growing, can I dynamically add more shards to it, or do I have to define a new table and migrate the data into it?
4. For distributed tables there are two terms, shards and mirrors. As I understand it, different shards hold different parts of one logical table and do not duplicate each other's data, while mirrors are just copies of particular shards kept for data redundancy. Am I right?
5. How can I tell shards and mirrors apart? Only by table name? Say a distributed table has child tables "table1" and "table2" as shards of the parent table; if a remote agent also has a child table "table1", is that a mirror of table1?
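To make the "LIMIT n, m" concern from question 2 concrete, here is a rough Python sketch of what I assume happens. This is only my mental model, not the engine's actual implementation: the function names `shard_top` and `distributed_page` are made up, and I am assuming the coordinator has to merge per-shard result lists before it can cut out the requested page.

```python
import heapq
from itertools import islice


def shard_top(rows, offset, count):
    """Rows one shard has to return (hypothetical model).

    A shard cannot know which of its rows end up on the requested
    page, so it returns its best offset + count rows, already sorted.
    """
    return heapq.nsmallest(offset + count, rows)


def distributed_page(shards, offset, count):
    """Merge the per-shard results and cut out the requested page."""
    partials = [shard_top(rows, offset, count) for rows in shards]
    merged = heapq.merge(*partials)  # inputs are already sorted
    page = list(islice(merged, offset, offset + count))
    fetched = sum(len(p) for p in partials)  # rows shipped to the coordinator
    return page, fetched


if __name__ == "__main__":
    # Three shards, each holding every third value of 0..29999.
    shards = [list(range(i, 30000, 3)) for i in range(3)]
    page, fetched = distributed_page(shards, offset=990, count=10)
    print(len(page), "rows returned,", fetched, "rows fetched from shards")
```

In this model, a page of 10 rows at offset 990 over 3 shards means 3000 rows are pulled and merged, and the cost keeps growing with the offset. Is that roughly what happens in practice?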
Thanks so much in advance for any answers.