I’ve found a weird issue. I have an RT index on a 3-node cluster, using the official Manticore Helm chart (v10.1.0),
but KNN queries return different results on each node:
$ php scripts/runsphrt.php --namespace=dev "select id, user_id, realname, title, 1 as reference_index, knn_dist() from gridimage_embedding where knn(image_vector, 100, 181780) limit 10"
0: id, user_id, realname, title, reference_index, knn_dist()
0: 875248, 19979, Michael Trolove, House and Barn Conversion, Lingham Lane, 1, 0.08535016
0: 3662336, 4330, Dave Hitchborne, High Street, West Wycombe, 1, 0.08751625
0: 877002, 19979, Michael Trolove, Rectory Farm, Bullock Road, 1, 0.08935702
0: 1495549, 11627, andrew auger, Salisbury Road, Blandford Forum, 1, 0.08986270
0: 3635586, 4330, Dave Hitchborne, Main Road, Dowsby, 1, 0.09074509
0: 73190, 2215, Phil Williams, Lissett, 1, 0.09110427
0: 1724480, 4330, Dave Hitchborne, Cagthorpe, Horncastle, 1, 0.09228814
0: 437560, 4330, Dave Hitchborne, Road Junction at Halton Fenside, 1, 0.09278101
0: 1092160, 19979, Michael Trolove, Main Street Great Gidding, 1, 0.09309936
1: id, user_id, realname, title, reference_index, knn_dist()
1: 90948, 3141, Michael Graham, Tarn Pike o Blisco, 1, -0.00000024
1: 875248, 19979, Michael Trolove, House and Barn Conversion, Lingham Lane, 1, 0.08535016
1: 3662336, 4330, Dave Hitchborne, High Street, West Wycombe, 1, 0.08751625
1: 877002, 19979, Michael Trolove, Rectory Farm, Bullock Road, 1, 0.08935702
1: 1495549, 11627, andrew auger, Salisbury Road, Blandford Forum, 1, 0.08986270
1: 3635586, 4330, Dave Hitchborne, Main Road, Dowsby, 1, 0.09074509
1: 73190, 2215, Phil Williams, Lissett, 1, 0.09110427
1: 1724480, 4330, Dave Hitchborne, Cagthorpe, Horncastle, 1, 0.09228814
1: 437560, 4330, Dave Hitchborne, Road Junction at Halton Fenside, 1, 0.09278101
1: 1092160, 19979, Michael Trolove, Main Street Great Gidding, 1, 0.09309936
2: id, user_id, realname, title, reference_index, knn_dist()
2: 178386, 2215, Phil Williams, Craigmark, 1, -0.00000024
2: 875248, 19979, Michael Trolove, House and Barn Conversion, Lingham Lane, 1, 0.08534968
2: 3662336, 4330, Dave Hitchborne, High Street, West Wycombe, 1, 0.08751625
2: 877002, 19979, Michael Trolove, Rectory Farm, Bullock Road, 1, 0.08935738
2: 1495549, 11627, andrew auger, Salisbury Road, Blandford Forum, 1, 0.08986270
2: 3635586, 4330, Dave Hitchborne, Main Road, Dowsby, 1, 0.09074509
2: 64977, 2215, Phil Williams, The Post Office at Bathford, 1, 0.09110427
2: 1724480, 4330, Dave Hitchborne, Cagthorpe, Horncastle, 1, 0.09228814
2: 437560, 4330, Dave Hitchborne, Road Junction at Halton Fenside, 1, 0.09278101
2: 1092160, 19979, Michael Trolove, Main Street Great Gidding, 1, 0.09309953
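This script just runs the query on each node separately and prints the result, prefixed with the node number.
The results are very similar, but they are not identical. Document 875248 even has a different distance (0.08535016 on nodes 0 and 1, 0.08534968 on node 2).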
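To make the differences easier to spot than by eyeballing the lists, the per-node output can be diffed by id. This is a rough Python sketch using a subset of rows copied from the output above. The `results` dict and the `diff_nodes` helper are hypothetical names for illustration and are not part of runsphrt.php.

```python
# Hypothetical helper: compare KNN output per node, keyed by document id.
# Values are knn_dist() for a subset of rows copied from the output above.
results = {
    0: {875248: 0.08535016, 3662336: 0.08751625, 877002: 0.08935702},
    1: {90948: -0.00000024, 875248: 0.08535016, 3662336: 0.08751625, 877002: 0.08935702},
    2: {178386: -0.00000024, 875248: 0.08534968, 3662336: 0.08751625, 877002: 0.08935738},
}


def diff_nodes(results):
    """Return (ids not returned by every node, ids whose distance differs)."""
    all_ids = set().union(*(rows.keys() for rows in results.values()))
    missing = {}
    mismatched = {}
    for doc_id in sorted(all_ids):
        nodes_with = sorted(n for n, rows in results.items() if doc_id in rows)
        if len(nodes_with) < len(results):
            # Record which nodes did return this id.
            missing[doc_id] = nodes_with
        dists = {results[n][doc_id] for n in nodes_with}
        if len(dists) > 1:
            mismatched[doc_id] = {n: results[n][doc_id] for n in nodes_with}
    return missing, mismatched


missing, mismatched = diff_nodes(results)
print("returned by only some nodes:", missing)
print("distance differs between nodes:", mismatched)
```

On this subset it flags 90948 and 178386 as each returned by a single node, and 875248 and 877002 as having different distances on node 2.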
As far as I can see, the nodes are all in sync, and even the vector attributes appear to be identical when queried on each node.
The data was simply inserted via one worker and allowed to replicate naturally. gridimage_embedding was added to the cluster before it was populated.
$ php scripts/runsphrt.php --namespace=dev "select id, user_id, realname, title, image_vector from gridimage_embedding where id = 90948 limit 10"
0: id: 90948
0: image_vector: -0.03217115,0.03704010,-0.03127992,0.02554078,-0.00850291,-0.03710286,-0.01276904,0.02601549,0.00510677,0.04268274,0.01847675,0.00375635,0.02503659,0.01960007,0.00948665,-0.04172490,0.07599994
1: image_vector: -0.03217115,0.03704010,-0.03127992,0.02554078,-0.00850291,-0.03710286,-0.01276904,0.02601549,0.00510677,0.04268274,0.01847675,0.00375635,0.02503659,0.01960007,0.00948665,-0.04172490,0.07599994
2: image_vector: -0.03217115,0.03704010,-0.03127992,0.02554078,-0.00850291,-0.03710286,-0.01276904,0.02601549,0.00510677,0.04268274,0.01847675,0.00375635,0.02503659,0.01960007,0.00948665,-0.04172490,0.07599994
I have checked further, and they are identical right to the end! I checked other docs too, and the attribute output is identical on every node.
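Rather than comparing 512 values by eye, the full attribute text from each node can be hashed and the digests compared. This is a minimal Python sketch. The `vectors` dict and the `fingerprint` helper are hypothetical, and the sample strings are only the first three values shown above, so in practice the complete vector string would be passed in. One caveat: the output is printed to 8 decimal places, so matching text may not strictly guarantee that the stored floats are bit-identical.

```python
import hashlib

# Hypothetical input: image_vector text per node.
# Only the first three values from the output above; use the full string in practice.
vectors = {
    0: "-0.03217115,0.03704010,-0.03127992",
    1: "-0.03217115,0.03704010,-0.03127992",
    2: "-0.03217115,0.03704010,-0.03127992",
}


def fingerprint(vector_text):
    """Hash a comma-separated vector, ignoring whitespace around values."""
    values = [v.strip() for v in vector_text.split(",")]
    return hashlib.sha256(",".join(values).encode("ascii")).hexdigest()


fingerprints = {node: fingerprint(text) for node, text in vectors.items()}
identical = len(set(fingerprints.values())) == 1
print("identical on all nodes:", identical)
```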
… My only assumption is that the HNSW ‘small world’ network has somehow been built differently on each node, so the links between items form a different graph.
I wouldn’t mind if the results were comparable (just swapping similar results), but the results are of vastly different quality. That is hard to see in list form.
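One way to put a number on the quality difference would be to compute the exact nearest neighbours by brute force and measure each node's recall against them. The sketch below is Python with toy 3-dimensional vectors made up for illustration (the real table uses 512 dimensions). It assumes knn_dist() for COSINE is 1 minus the cosine similarity, which I have not confirmed, though the near-zero distance for the reference document above is consistent with it. All function names are hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity (assumed to match knn_dist() for COSINE)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def exact_top_k(query, docs, k):
    """Brute-force scan: ids of the k documents closest to the query."""
    ranked = sorted(docs, key=lambda doc_id: cosine_distance(query, docs[doc_id]))
    return ranked[:k]


def recall(returned_ids, exact_ids):
    """Fraction of the exact neighbours that a node actually returned."""
    exact = set(exact_ids)
    return len(exact & set(returned_ids)) / len(exact)


# Toy vectors, invented for illustration only.
docs = {
    1: [1.0, 0.0, 0.0],
    2: [0.9, 0.1, 0.0],
    3: [0.0, 1.0, 0.0],
    4: [0.0, 0.0, 1.0],
}
query = docs[1]
exact = exact_top_k(query, docs, k=2)
print("exact neighbours:", exact)
print("recall of a node that returned [1, 3]:", recall([1, 3], exact))
```

Running the same comparison for each node's returned ids would show whether one node's graph is consistently worse, or whether all three just miss different neighbours.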
CREATE TABLE gridimage_embedding (
id bigint,
title text,
grid_reference text,
realname text,
user_id integer,
title_vector float_vector knn_type='hnsw' knn_dims='512' hnsw_similarity='COSINE',
image_vector float_vector knn_type='hnsw' knn_dims='512' hnsw_similarity='COSINE'
)
The same issue happens with title_vector: the results are different from searching on image_vector, but they still differ between nodes.