Machine learning - lexical-level similarity word clustering tool

- September 15, 2014

Is there an open-source software toolkit that compares lexical-level similarities among words and groups similar words together? For example, "blue jean", "blue jeans", and "blue jea" (misspelled) should be grouped together. I don't need semantic similarity here.

Try the Natural Language Toolkit (NLTK): http://nltk.org/

Here's a rather abstract treatment of the Brown clustering algorithm: http://www.cs.columbia.edu/~cs4705/lectures/brown.pdf

A standard similarity metric between words is the Levenshtein distance: http://en.wikipedia.org/wiki/damerau%e2%80%93levenshtein_distance
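As a rough illustration of how an edit-distance metric could drive the grouping asked about, here is a minimal pure-Python sketch. The function names `levenshtein` and `group_similar`, the greedy "compare against the first member of each group" strategy, and the threshold `max_distance=2` are all assumptions made for this example, not part of any toolkit mentioned above. It implements plain Levenshtein distance (insertions, deletions, substitutions), not the Damerau variant that also counts transpositions as a single edit.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def group_similar(phrases, max_distance=2):
    """Greedy grouping (hypothetical helper): each phrase joins the first
    group whose first member is within max_distance edits, otherwise it
    starts a new group. The threshold is an assumed value to tune."""
    groups = []
    for phrase in phrases:
        for group in groups:
            if levenshtein(phrase, group[0]) <= max_distance:
                group.append(phrase)
                break
        else:
            groups.append([phrase])
    return groups


if __name__ == "__main__":
    print(group_similar(["blue jean", "blue jeans", "blue jea", "red shirt"]))
```

The greedy pass depends on input order and on which phrase ends up representing each group, so for larger vocabularies a proper clustering method over the pairwise distance matrix may give more stable groups.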
