Crate tokenizer

A tokenizer implementation, currently targeting the Thai language.

The crate root re-exports two main modules:

  • en - A whitespace-based tokenizer.
  • th - A dictionary-based tokenizer.
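The difference between the two approaches can be illustrated with a minimal, self-contained sketch. This is not the crate's code: the function names `split_on_whitespace` and `longest_match` are hypothetical, and greedy longest matching over a `HashSet` is only one simple way to do dictionary-based segmentation. The `th` module may use a different data structure or strategy.

```rust
use std::collections::HashSet;

/// Whitespace-based splitting: the general idea behind a space-based tokenizer.
/// (Hypothetical helper, not the crate's API.)
fn split_on_whitespace(text: &str) -> Vec<&str> {
    text.split_whitespace().collect()
}

/// Greedy longest-match segmentation against a dictionary.
/// (Hypothetical helper; assumes unknown text falls back to single characters.)
fn longest_match<'a>(text: &'a str, dict: &HashSet<&str>) -> Vec<&'a str> {
    // Byte offsets of every char boundary, including the end of the text,
    // so that slicing never lands inside a multi-byte character.
    let bounds: Vec<usize> = text
        .char_indices()
        .map(|(i, _)| i)
        .chain(std::iter::once(text.len()))
        .collect();

    let mut tokens = Vec::new();
    let mut start = 0; // index into `bounds`
    while start + 1 < bounds.len() {
        // Try the longest candidate first and shrink until the dictionary matches.
        let mut end = bounds.len() - 1;
        while end > start + 1 && !dict.contains(&text[bounds[start]..bounds[end]]) {
            end -= 1;
        }
        // If nothing matched, `end == start + 1` and a single char is emitted.
        tokens.push(&text[bounds[start]..bounds[end]]);
        start = end;
    }
    tokens
}
```

Thai is written without spaces between words, so whitespace splitting alone cannot recover word boundaries. With a dictionary containing "กิน" and "ข้าว", `longest_match("กินข้าว", &dict)` splits the text into those two words, whereas `split_on_whitespace` would return the whole string as one token.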

Modules

en
A whitespace-based word tokenizer.
th
A dictionary-based Thai word tokenizer.

Traits

Tokenizer
A trait that all tokenizers should implement.
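As a rough illustration of why a shared trait is useful, the sketch below defines a stand-in trait and one implementor. The method name `tokenize`, its signature, and the `WhitespaceTokenizer` and `count_tokens` names are assumptions made for this example; the crate's actual `Tokenizer` trait may declare different methods or return types.

```rust
/// Hypothetical trait shape, assumed for illustration only.
trait Tokenizer {
    /// Split `text` into tokens that borrow from the input.
    fn tokenize<'a>(&self, text: &'a str) -> Vec<&'a str>;
}

/// Hypothetical whitespace-based implementor.
struct WhitespaceTokenizer;

impl Tokenizer for WhitespaceTokenizer {
    fn tokenize<'a>(&self, text: &'a str) -> Vec<&'a str> {
        text.split_whitespace().collect()
    }
}

/// Generic code depends only on the trait, so any tokenizer can be passed in.
fn count_tokens<T: Tokenizer>(tokenizer: &T, text: &str) -> usize {
    tokenizer.tokenize(text).len()
}
```

Under this design, a dictionary-based Thai tokenizer and a whitespace-based one could be swapped without changing callers such as `count_tokens`.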