Search on multiple indexes


#1

I have two indexes:
city (id, name, fulltext_field)
street (id, city, name, fulltext_field)

Now i call
select * from city, street where ...

Questions:

  1. Is there posibility to know which index belongs each found row?
  2. In result i see only id and name fields. As i understand result have only common fields for all indexes. But is there posibility to get non-common fields? I think they should contain NULL for other indexes. But if field is present in both indexes and have different type… Maybe alias to index should be used…
  3. If there are rows with same ID in both indexes - only one row will present in result. But i want to get rows from both indexes. Can i?

I tried to find answers in documentation - but found nothing. So if it is impossible now to do all this - can it be feature request?

Also i think to use multi-query instead of multi-index query. But as i understand queries from multi-query request will execute consistently. And multi-index query i can run in parallel - multi-core or multi-server if indexes are on different servers.


#2

Right now results from indexes are intersected by document ID and only common columns are retained. For this, only one document is output in result set. For 1. you need to have an attribute that tells the index to which it belongs. Going with 2 indexes with different that can have overlapping ids is not a good idea. Either query then independent or make one structure that could fit any of the types you have there (even if spans to multiple indexes).


#3

Ok, so you think it is no reason to create feature request for this?
And what about run in parallel multiple queries? Is it possible to develop something for this in future releases?

If possible, add this to documentation, please. I think it is useful information.


#4

for 2nd point you could use EXIST expression described at docs

There is also example that says “This function is handy when you are searching through several indexes with different schemas”.

For 3 could you provide example that show that you need? As from my point of view it already takes advantage of dist_threads and process queries in parallel.


#5

Oh, i missed EXIST. Thanks.
But for my task i need use it on string attributes, but it does not support them.

You mean i can create distributed index that contain all indexes i need? I’ll think about this. It seems can resolve my task. But can i combine indexes with different structure?
But as for now, i cannot use this aproach, because of EXISTS does not support string attributes.

And i’ll explain my task.
I have 2 indexes - city and street. I have string, that i want to search in both. And then sort result on expression: if row from city indexed - by attribute from city index, if row from street index - by attribute from street index.

And problems that i should to resolve:
A. i need fields that present only in city and only in street indexes.
B. in result i need to know what is row - city or street.
C. i need rows from city index and street index, even is ID is the same.

A i can resolve using EXISTS, but my attributes is string. so not now.
B i can resolve adding some attribute with index name to each index. or using EXISTS for unique attributes. But i will be better to have some function to get index name.
C i cannot resolve in any way.

I need result from both indexes at same time - i will always process them simultaneously. So i can use multi-query syntax: select * from city; select * from street;. This will resolve all problems (a,b,c), but i sould make final sorting outside of sphinx, but it is not significant.
But significant is that i cannot call select * from city; select * from street; in parallel. But i want…)