Unexpected results from nested bool query

Hi, I am trying to get a bool query to return the correct results.

I have this:

"query": {
 "bool": {
  "must": [
   {
    "equals": { "catid": 123 }
   },
   {
    "bool": {
     "should": [
      {
       "bool": {
        "must": [
         {
          "match": { "tags": "red" }
         },
         {
          "equals": { "type": "Car" }
         }
        ]
       }
      },
      {
       "equals": { "type": "Bus" }
      }
     ]
    }
   }
  ]
 }
},

As I understand it, this translates to SQL-like syntax:

WHERE catid=123 AND ((tags LIKE '%red%' AND type='Car') OR type='Bus')

The problem is, the results are not respecting the tags LIKE '%red%' part. I am getting:

catid | type | tags
--------------------------
123   | Bus  | red      // ok
123   | Bus  | yellow   // ok
123   | Car  | green    // should not be getting this result
123   | Car  |          // should not be getting this result
123   | Car  | red      // ok

What am I doing wrong with my JSON?

Any help appreciated.

Hi. Can you please provide the CREATE TABLE used to create the table and the inserts to populate it, so I can reproduce and dig into that?

Hi, thanks for getting back to me.

Here is the syntax:

/cli
DROP table tags_test
/cli
CREATE TABLE tags_test (model_name string, itemid integer, catid integer, tags text)
/insert
{
	"index": "tags_test",
	"doc": {
		"model_name": "Bus",
		"itemid": 1,
		"catid": 31,
		"tags": ""
	}
}
/insert
{
	"index": "tags_test",
	"doc": {
		"model_name": "Car",
		"itemid": 2,
		"catid": 31,
		"tags": "color:red"
	}
}
/insert
{
	"index": "tags_test",
	"doc": {
		"model_name": "Car",
		"itemid": 3,
		"catid": 31,
		"tags": "color:blue"
	}
}
/insert
{
	"index": "tags_test",
	"doc": {
		"model_name": "Car",
		"itemid": 4,
		"catid": 31,
		"tags": ""
	}
}
/insert
{
	"index": "tags_test",
	"doc": {
		"model_name": "Bus",
		"itemid": 5,
		"catid": 31,
		"tags": "color:blue"
	}
}
/insert
{
	"index": "tags_test",
	"doc": {
		"model_name": "Bus",
		"itemid": 6,
		"catid": 12,
		"tags": ""
	}
}
/insert
{
	"index": "tags_test",
	"doc": {
		"model_name": "Bus",
		"itemid": 7,
		"catid": 88,
		"tags": ""
	}
}
/search
{
    "index": "tags_test",
    "_source": [
        "model_name",
        "itemid",
        "catid",
        "tags"
    ],
    "limit": 1000,
    "query": {
        "bool": {
            "must": [
                {
                    "equals": {
                        "catid": 31
                    }
                },
                {
                    "bool": {
                        "should": [
                            {
                                "bool": {
                                    "must": [
                                        {
                                            "in": {
                                                "model_name": [
                                                    "Car"
                                                ]
                                            }
                                        },
                                        {
                                            "match_phrase": {
                                                "tags": "red"
                                            }
                                        }
                                    ]
                                }
                            },
                            {
                                "in": {
                                    "model_name": [
                                        "Bus"
                                    ]
                                }
                            }
                        ]
                    }
                }
            ]
        }
    }
}

For me, this returns:

{
    "took": 0,
    "timed_out": false,
    "hits": {
        "total": 5,
        "total_relation": "eq",
        "hits": [
            {
                "_id": "4685708705916623284",
                "_score": 1,
                "_source": {
                    "itemid": 2,
                    "catid": 31,
                    "model_name": "Car",
                    "tags": "color:red"
                }
            },
            {
                "_id": "4685708705916623285",
                "_score": 1,
                "_source": {
                    "itemid": 3,
                    "catid": 31,
                    "model_name": "Car",
                    "tags": "color:blue"
                }
            },
            {
                "_id": "4685708705916623286",
                "_score": 1,
                "_source": {
                    "itemid": 4,
                    "catid": 31,
                    "model_name": "Car",
                    "tags": ""
                }
            },
            {
                "_id": "4685708705916623287",
                "_score": 1,
                "_source": {
                    "itemid": 5,
                    "catid": 31,
                    "model_name": "Bus",
                    "tags": "color:blue"
                }
            },
            {
                "_id": "4685708705916623283",
                "_score": 1,
                "_source": {
                    "itemid": 1,
                    "catid": 31,
                    "model_name": "Bus",
                    "tags": ""
                }
            }
        ]
    }
}
searchd -v
Manticore 5.0.2 348514c86@220530 dev (columnar 1.15.4 2fef34e@220522) (secondary 1.15.4 2fef34e@220522)
Copyright (c) 2001-2016, Andrew Aksyonoff
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Copyright (c) 2017-2022, Manticore Software LTD (https://manticoresearch.com)

from the debug code I see the following filter tree catid=31 AND (model_name IN ('Car') OR model_name IN ('Bus'))

and it matches result set well

the issue that you mix match_phrase or match the full text operator with a filters and that will not work

the right query should be

WHERE ('@tags red') AND catid=31 AND (model_name IN ('Car') OR model_name IN ('Bus'))

Thanks for your response. That query is not what I wanted though. What I want is:

  • Return any Bus
  • Return Cars only in red
  • Both Cars and Buses only in category 31

It would translate to this pseudo SQL:

WHERE 
  (model_name='Bus' 
    OR (model_name='Car' AND tags LIKE '%red%'))
  AND catid=123;

As you say though, there is an issue with the match/match_phrase component. The result is the same whether this part is included or not, so why is it being excluded?

Thanks

there is no way to combine full text field tags with filter via OR operator or even put the full text matching deep inside the filter bool tree.
Any complex full text operators could work along with filter tree but they can not be nested inside each other and could only relate with AND, ie WHERE MATCH ( ' ...' ) AND ( filters )