In our business, we need alter table column frequently, but we also need distributed to storage our data about ten billion level.Currently we use elastic search, we can use ‘put mapping’ command to alter field.I wanna know whether manticore search realize similar functions?
In elastic search, we can use command like this “curl --location --request GET ‘http://localhost:8080/item_comment_*/_mappings/_doc’
–header ‘Content-Type: application/json’
–data ‘{
“properties”: {
“attr_is_highlight_comment”: {
“type”: “boolean”
}
}
}’”.
Then “attr_is_highlight_comment” filed will be added into all indexes and can be searched, filtered and aggerated. Can manticore search do this?
Manticore doesn’t have this functionality yet.
That’s means, if I want to use ‘ALTER’ command, I have to give up distributed table and storage all data into one table?
No, it means you have to alter each table the distributed table includes separately.
That’s good. If I alter one table, its replica will also be altered automatically right?
No. You have to detach a table from a cluster, alter it and then add it back. Otherwise you’ll get this error:
mysql> alter table t add column a int;
ERROR 1064 (42000): table t: is part of cluster c, ALTER is not supported for tables in cluster
How can I keep the cluster is still working for read and write data when I detach one table from a cluster?
In general you can’t do direct ‘online’ schema updates. Would have to take teh index offline to perform such updates.
But you might consider a JSON attribute.
https://manual.manticoresearch.com/Data_creation_and_modification/Adding_data_from_external_storages/Fetching_from_databases/Processing_fetched_data#sql_attr_json
You can add arbitrary ‘sub’ attributes to each document, without having to alter the overall table.
You setup the single JSON attribute on the table, and then the properties inside the attribute are schemaless.
Subject to some limitations can still filter or sort by properties inside a JSON attribute. ( I think grouping too, not tested)
The table will be still readable. Writing should be anyway suspended whlie the table is being altered. Since you have to write to cluster_name:table_name
when the table is in a cluster, you won’t be able to do it once it’s detached as a protection from unexpected writes.
Does that mean when one table is detached from cluster, read request will be assigned to read its replica?And I need to alter all tables and their replica tables, right?
then you detach the table from the cluster - you can not write into that table and replicate writes across the whole cluster.
You could read or search that table. You could write the table locally at the node. Then you attach the table again into cluster the cluster uses SST to replicate the table content across the nodes. That is why after ALTER cluster ADD table
all nodes will see the same state of the added table.
However you can not add or drop attributes for table that is part of the cluster as that operation is not cluster wide and works only for local tables. That operation is also lock local table for write that is why it is not possible to write data into table and add or drop attributes at the same time.
If you want to add attribute you should
- drop table from the cluster (table remains local at all nodes and can be used as local table at any node)
- issue
alter table t add column a int;
to change local table at one of the node - add local table into the cluster (cluster replicate the new table files across the nodes and if the different nodes have different table content it will be replaced by the content of the table from the node there table got added into cluster)
So just to confirm, whether I only need to alter one table, and the whole cluster(including its replica and all other shards) will synchronize the attributes I changed?
yes the cluster will see the changes after it finished adding table into cluster back. However it might take time if the table has huge size.
Will you support this feature in the future that alter table attributes in distributed table in the future like elasticsearch “curl --location --request GET ‘http://localhost:8080/item_comment_*/_mappings/_doc’
–header ‘Content-Type: application/json’
–data ‘{
“properties”: {
“attr_is_highlight_comment”: {
“type”: “boolean”
}
}
}’”?