Custom Morphology and Metaphone


#1

We use Morphology when users search for a person in our person database. But as a lot of the names are non-English, the Metaphone and Soundex algorithms do not always produce the required result.

Is there a way to add additional or custom Morphology algorithms, e.g. a customized version of Soundex?


#2

We have released a new morphology processor, ICU. Could you look at the custom collation feature of the ICU library to see whether it fits your needs?
http://userguide.icu-project.org/collation/customization


#3

I am not sure I understand your reply. Are you saying ICU can be used to create a plugin or UDF for Manticore?

I have read some of the ICU documentation, and it seems to me that ICU is designed mainly to deal with the internationalization aspects of text.

The problem I am having is not with character sets, but with the algorithms used in the standard Soundex and Metaphone, which are optimised for the English language, not for African names.
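
To make the kind of customization concrete, here is a minimal Python sketch of Soundex with a pluggable letter-to-digit table. It is only an illustration of the idea, not Manticore code and not an existing plugin API: the `soundex` function, its parameters, and `CUSTOM_GROUPS` are hypothetical names, and the custom table (which places "l" and "r" in the same group) is an invented example rather than a linguistically validated mapping.

```python
# Standard American Soundex consonant groups (tuned for English).
ENGLISH_GROUPS = {
    "bfpv": "1",
    "cgjkqsxz": "2",
    "dt": "3",
    "l": "4",
    "mn": "5",
    "r": "6",
}

# Hypothetical custom table, for illustration only: "l" and "r" share a code.
CUSTOM_GROUPS = {
    "bfpv": "1",
    "cgjkqsxz": "2",
    "dt": "3",
    "lr": "4",
    "mn": "5",
}


def build_table(groups):
    """Expand {"bfpv": "1", ...} into a per-letter lookup {"b": "1", ...}."""
    return {letter: digit for letters, digit in groups.items() for letter in letters}


def soundex(name, groups=ENGLISH_GROUPS, transparent="hw", length=4):
    """Soundex-style code for `name`, using the given consonant groups.

    Letters in `transparent` are skipped and do not separate equal codes.
    Any other letter missing from `groups` is treated like a vowel: it is
    not coded, but it does separate equal codes.
    """
    table = build_table(groups)
    letters = [ch for ch in name.lower() if ch.isalpha()]
    if not letters:
        return ""

    code = letters[0].upper()           # the first letter is kept as-is
    prev = table.get(letters[0], "")    # its digit still suppresses a repeat
    for ch in letters[1:]:
        if ch in transparent:
            continue
        digit = table.get(ch, "")
        if digit and digit != prev:
            code += digit
        prev = digit
        if len(code) == length:
            break
    return code.ljust(length, "0")      # pad short codes with zeros


print(soundex("Robert"))                 # R163
print(soundex("Mbali"))                  # M140
print(soundex("Mbari"))                  # M160
print(soundex("Mbari", CUSTOM_GROUPS))   # M140, now matches "Mbali"
```

With the default table, "Mbali" and "Mbari" get different codes; with the custom table they collapse to the same one. Being able to supply this kind of per-language grouping is what I am looking for.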


#4

You could also use ICU for custom segmentation, or create a custom UDF that segments the text on your own at index time or query time.


#5

Can the logic you want be written with the help of https://snowballstem.org/ ? It can then be integrated with Manticore Search.


#6

Thank you for the feedback. Will investigate the options proposed.