proxy.golang.org : github.com/proycon/python-ucto
This is a Python binding to the tokenizer Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial as it appears. This binding makes the power of the Ucto tokeniser available to Python. Ucto itself is a regular-expression-based, extensible, and advanced tokeniser written in C++ (http://ilk.uvt.nl/ucto).
purl: pkg:golang/github.com/proycon/python-ucto
Keywords: computational-linguistics, folia, nlp, nlp-library, python, text-processing, tokenizer
License:
Latest release: 10 months ago
First release: almost 10 years ago
Stars: 29 on GitHub
Forks: 5 on GitHub
Total Commits: 138
Committers: 1
Average commits per author: 138.0
Development Distribution Score (DDS): 0.0
More commit stats: commits.ecosyste.ms
See more repository details: repos.ecosyste.ms
Last synced: 3 days ago