How to support Chinese search, and other national languages such as Japanese …
By default, only English and Russian characters are indexed.
You need to create a custom charset_table to include Chinese or other language characters you need.
For example, to also include Swedish characters, the charset table should look like this:
charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+401->U+451, U+451,U+C5->U+E5, U+E5, U+C4->U+E4, U+E4, U+D6->U+F6, U+F6
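To make the pattern in the Swedish part of that line easier to see: each uppercase letter is mapped to its lowercase form (U+C5->U+E5), and the lowercase form is then listed on its own (U+E5). Here is a minimal Python sketch that generates such entries. The helper name charset_entries is hypothetical and not part of Sphinx or Manticore; it only formats text that you would paste into the config yourself.

```python
def charset_entries(upper_chars: str) -> str:
    """Build charset_table entries for the given uppercase letters.

    Hypothetical helper for illustration: for every letter it emits
    'U+UPPER->U+LOWER' followed by 'U+LOWER', the same pattern used
    for the Swedish characters in the example above.
    """
    parts = []
    for ch in upper_chars:
        lower = ch.lower()
        parts.append(f"U+{ord(ch):X}->U+{ord(lower):X}")  # fold upper to lower
        parts.append(f"U+{ord(lower):X}")                 # index the lowercase form itself
    return ", ".join(parts)


# The three Swedish letters, written as escapes to avoid encoding surprises.
print(charset_entries("\u00C5\u00C4\u00D6"))
# prints: U+C5->U+E5, U+E5, U+C4->U+E4, U+E4, U+D6->U+F6, U+F6
```

The output can be appended to the rest of the charset_table value after a comma.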
On Sphinx Wiki there is a page with lists for various languages: http://sphinxsearch.com/wiki/doku.php?id=charset_tables
For CJK languages you might want to use the ngram feature (for unsegmented texts).
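As a rough picture of what the ngram feature does with ngram_len = 1: a run of CJK characters without spaces is indexed as if every character were a separate token. The Python sketch below only illustrates that idea. It is not the engine's tokenizer, and the function name char_ngrams is made up for this example.

```python
def char_ngrams(text: str, n: int = 1) -> list[str]:
    """Split unsegmented text into overlapping character n-grams.

    Conceptual illustration only: with n = 1 every character
    becomes its own token, which is roughly how a run of CJK
    characters is treated when ngram_len = 1.
    """
    return [text[i:i + n] for i in range(len(text) - n + 1)]


print(char_ngrams("网站"))  # prints: ['网', '站']
```

So a query for 网站 is effectively a search for the two tokens 网 and 站 next to each other, which is why no word segmentation is required for it to match.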
I tried to import the CJK charset table, but it does not work:
select * from test1 where match('@title =网站');
source = src1
path = manticore/data/test1
ngram_len = 1
charset_table = 0..9, english, U+F900->U+8C48, U+F901->U+66F4, U+F902->U+8ECA, U+F903->U+8CC8, U+F904->U+6ED1, U+F905->U+4E32, \
    U+F906->U+53E5, U+F907->U+9F9C, U+F908->U+9F9C, U+F909->U+5951, U+F90A->U+91D1, U+F90B->U+5587, U+F90C->U+5948, U+F90D->U+61F6, \
    U+F90E->U+7669, U+F90F->U+7F85, U+F910->U+863F, U+F911->U+87BA, U+F912->U+88F8, U+F913->U+908F, U+F914->U+6A02, U+F915->U+6D1B, \
    U+F916->U+70D9, U+F917->U+73DE, U+F918->U+843D, U+F919->U+916A, U+F91A->U+99F1, U+F91B->U+4E82, U+F91C->U+5375, U+F91D->U+6B04, \
    ... ..
I don't need word segmentation; I only need the search to find the text.
Hi, just FYI: we have a new article about using Manticore with CJK languages. I hope it will be useful: https://bit.ly/2Ll9cyJ