blend chars in search input

Maybe I am misunderstanding how blend chars work.

config:

blend_chars = U+002C, U+002E
blend_mode = trim_tail, skip_pure

German search input:

Serviceportal, ändern

The word ändern is in my stopword list.

I expected that if I search for “Serviceportal, ändern”, Manticore would internally search only for “serviceportal”, but SHOW META shows that Manticore searches for “serviceportal,” with the comma at the end.

Is that a bug, or does blend_chars apply only to indexing, so that I have to remove the comma from my input myself?

Thanks, Nic
I use Manticore 3.5.4.

blend_chars affects only indexing. When it is enabled and the indexer meets a character that is defined as a blend char, it can produce several tokens from the word containing it, depending on blend_mode. You can use CALL KEYWORDS to see how it works:

mysql> drop table if exists t; create table t(f text) blend_chars = 'U+002E' blend_mode='trim_tail, skip_pure'; call keywords('.a.', 't');
Query OK, 0 rows affected (0.01 sec)

Query OK, 0 rows affected (0.00 sec)

+------+-----------+------------+
| qpos | tokenized | normalized |
+------+-----------+------------+
| 1    | .a        | .a         |
| 1    | a         | a          |
+------+-----------+------------+
2 rows in set (0.00 sec)
mysql> drop table if exists t; create table t(f text) blend_chars = 'U+002E' blend_mode='trim_head, trim_tail, trim_both, trim_none'; call keywords('.a.', 't');
Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.01 sec)

+------+-----------+------------+
| qpos | tokenized | normalized |
+------+-----------+------------+
| 1    | .a.       | .a.        |
| 1    | a         | a          |
| 1    | a.        | a.         |
| 1    | .a        | .a         |
| 1    | a         | a          |
+------+-----------+------------+
5 rows in set (0.00 sec)
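
The variants in the two outputs above can be reproduced with a small Python model. This is only a rough sketch for reasoning about blend_mode, not Manticore's tokenizer: the function name blend_variants is made up for illustration, and it ignores charset_table folding (apart from lowercasing), positions, and duplicate rows.

```python
def blend_variants(chunk, blend_chars, modes):
    """Model the distinct tokens one blended chunk yields at indexing time.

    Hypothetical helper for illustration; a simplified model only.
    """
    chunk = chunk.lower()
    # skip_pure: a chunk made only of blend chars yields nothing.
    if "skip_pure" in modes and all(c in blend_chars for c in chunk):
        return set()
    # Plain token(s) left when blend chars are treated as separators
    # (the bare "a" rows in the CALL KEYWORDS output).
    separated = "".join(" " if c in blend_chars else c for c in chunk)
    tokens = set(separated.split())
    # Blended variants, one per trim_* mode that is enabled.
    trimmers = {
        "trim_none": lambda s: s,
        "trim_head": lambda s: s.lstrip(blend_chars),
        "trim_tail": lambda s: s.rstrip(blend_chars),
        "trim_both": lambda s: s.strip(blend_chars),
    }
    for mode in modes:
        if mode in trimmers:
            variant = trimmers[mode](chunk)
            if variant:
                tokens.add(variant)
    return tokens


print(sorted(blend_variants(".a.", ".", ["trim_tail", "skip_pure"])))  # ['.a', 'a']

# With the configuration from the question, the form with the comma is never indexed:
print("serviceportal," in blend_variants("Serviceportal,", ",.", ["trim_tail", "skip_pure"]))  # False
```

Under this model, trim_tail alone never stores a token that ends in a blend char, which is why a query word typed with a trailing comma has nothing to match.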

The important things to remember are:

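  1. during search, no blend-char transformations are applied, so a query word keeps its blend chars exactly as typed (“serviceportal,” stays “serviceportal,”)
  2. as soon as a character is defined as a blend char, it is no longer treated as a plain separator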

Therefore you should choose blend_mode so that the indexed variants include the form users will actually type in their queries.

In your case it might make sense to add trim_none or trim_head to the list, so that the token “serviceportal,” is indexed and the query can match it:

mysql> drop table if exists t; create table t(f text) blend_chars = 'U+002C, U+002E' blend_mode='trim_tail, skip_pure, trim_head'; call keywords('Serviceportal, ändern', 't'); insert into t values(0, 'Serviceportal, ändern'); select * from t where match('Serviceportal, ändern'); show meta;
Query OK, 0 rows affected (0.01 sec)

Query OK, 0 rows affected (0.00 sec)

+------+----------------+----------------+
| qpos | tokenized      | normalized     |
+------+----------------+----------------+
| 1    | serviceportal, | serviceportal, |
| 1    | serviceportal  | serviceportal  |
| 1    | serviceportal  | serviceportal  |
| 2    | andern         | andern         |
+------+----------------+----------------+
4 rows in set (0.00 sec)

Query OK, 1 row affected (0.00 sec)

+---------------------+------------------------+
| id                  | f                      |
+---------------------+------------------------+
| 1514145039905718312 | Serviceportal, ändern  |
+---------------------+------------------------+
1 row in set (0.00 sec)

+---------------+----------------+
| Variable_name | Value          |
+---------------+----------------+
| total         | 1              |
| total_found   | 1              |
| time          | 0.000          |
| keyword[0]    | andern         |
| docs[0]       | 1              |
| hits[0]       | 1              |
| keyword[1]    | serviceportal, |
| docs[1]       | 1              |
| hits[1]       | 1              |
+---------------+----------------+
9 rows in set (0.00 sec)
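
If changing blend_mode is not an option, the other route mentioned in the question is to remove the comma on the client before sending the query. A minimal Python sketch of that idea follows. The function name strip_edge_blend_chars is hypothetical, it assumes the index keeps a trimmed variant (as trim_tail does here), and it only touches blend chars at the edges of each whitespace-separated word, so something like 3.5.4 is left alone.

```python
def strip_edge_blend_chars(query, blend_chars=",."):
    """Strip leading/trailing blend chars from each whitespace-separated word.

    Hypothetical client-side helper; blend_chars must mirror the table's
    blend_chars setting (U+002C and U+002E in this thread).
    """
    words = (word.strip(blend_chars) for word in query.split())
    # Drop words that consisted only of blend chars.
    return " ".join(word for word in words if word)


print(strip_edge_blend_chars("Serviceportal, ändern"))  # Serviceportal ändern
```

This does not try to understand full-text query syntax such as quoted phrases or operators, so treat it as a starting point only.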
