Any chance to see this feature in the future? We want to re-sync our indexes daily, but without optimizing, that drains our disk space significantly. And we cannot remove indexes from the cluster in our production environment because of the downtime.
Do you mean calling ‘OPTIMIZE INDEX’?
That shouldn't need a ‘remove from cluster’; the optimization happens in a background thread, with no downtime. (Sometimes there is a short pause when the final switch happens, but it's just a momentary ‘slow down’ and queries don't fail.)
No, you cannot issue OPTIMIZE for an index that is in a cluster.
Because optimize runs separately on each node, it can produce different disk chunks while it works, and it can also keep a different number of disk chunks depending on the core count of each node.
That could lead to SST issues:
- the joiner starts fetching the index via SST from one donor with one set of disk chunks, then stops
- the joiner reconnects, but the cluster selects another donor with a different set of disk chunks, and SST starts transferring the whole set of disk chunks again from the new donor node
For now the recommendation is to remove the index from the cluster and issue OPTIMIZE on one node:
- all other nodes still have the local index available and can handle read queries, but the index on these nodes holds an old snapshot of the data
- the node where optimize runs can handle both write and read queries, so you could route all write queries to this node
After optimize has finished, you can add the index back into the cluster, and SST will transfer the current index to all nodes in the cluster.
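As a rough sketch of that sequence, here is a small Python helper that only builds the statements in the order described above; it does not connect to anything. The names `my_cluster` and `my_index` are made up for illustration, and the `ALTER CLUSTER ... DROP` / `ALTER CLUSTER ... ADD` syntax is what I would expect for removing and re-adding a cluster index, so please check it against the documentation for your version before relying on it.

```python
from typing import List


def cluster_optimize_statements(cluster: str, index: str) -> List[str]:
    """Return the statements for the manual workaround, in order.

    Assumption: all three statements are issued on the single node
    that is chosen to run the optimize.
    """
    return [
        # 1. Take the index out of the cluster; each node keeps a local copy.
        f"ALTER CLUSTER {cluster} DROP {index}",
        # 2. Optimize on this one node only. This runs in the background,
        #    so wait until it has finished before the next step.
        f"OPTIMIZE INDEX {index}",
        # 3. Put the index back; SST then ships it to the other nodes.
        f"ALTER CLUSTER {cluster} ADD {index}",
    ]


if __name__ == "__main__":
    # Hypothetical cluster and index names, for illustration only.
    for statement in cluster_optimize_statements("my_cluster", "my_index"):
        print(statement + ";")
```

While the index is out of the cluster, writes are not replicated, so they should all go to the node that runs the optimize.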
Right now we are implementing auto-optimize functionality, where different optimize strategies could be added. That will let users avoid issuing optimize by hand.
However, it will not apply to clusters, and cluster indexes will still need to be optimized by hand.
But after we refactor the optimize code for auto-optimize, we plan to work on cluster-wide optimize too.
Ah ok, I didn't think of ‘replication clusters’; that might well be different. I don't really use them, so I'm not experienced with them.
Sorry for the confusion.