tokenizer
A simple multilingual tokenizer for NLP tasks. It provides a CLI and a library for linguistic tokenization, an unavoidable preprocessing step for many HLT (human language technology) tasks that precedes syntactic, semantic, and other higher-level processing. Use it to tokenize German, English, and French texts.
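To illustrate what tokenization means in this preprocessing sense, here is a minimal, self-contained sketch in plain Ruby. It does not use the gem and does not reflect its API or its rules; the helper name `simple_tokenize` and the splitting rules are assumptions made only for illustration.

```ruby
# Hypothetical helper for illustration only; NOT part of the tokenizer gem.
# Splits text into word tokens and single punctuation tokens.
def simple_tokenize(text)
  # A word is a run of Unicode letters/digits, optionally joined by
  # internal apostrophes or hyphens (e.g. "Don't", "well-known").
  # Any other non-whitespace character becomes its own token.
  text.scan(/[\p{L}\p{N}]+(?:['-][\p{L}\p{N}]+)*|[^\s\p{L}\p{N}]/)
end

puts simple_tokenize("Ich gehe in die Schule!").inspect
# => ["Ich", "gehe", "in", "die", "Schule", "!"]
```

A real linguistic tokenizer usually needs language-specific handling (abbreviations, clitics, number formats) that this sketch leaves out.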
Ecosystem
gem.coop
Latest Release
over 10 years ago
0.3.0
Versions
6
Downloads
283,049 total
Links
| Registry | gem.coop |
| Source | Repository |
| Docs | Documentation |
| JSON API | View JSON |
| CodeMeta | codemeta.json |
Package Details
Repository
| Stars | 46 on GitHub |
| Forks | 11 on GitHub |
Rankings on gem.coop
Overall
Top 1.9%
Downloads
Top 5.6%