An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

Top 7.5% by dependent repos on pypi.org

pypi.org : python-ucto

This is a Python binding to the tokenizer Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial as it appears. This binding makes the power of the ucto tokeniser available to Python. Ucto itself is a regular-expression-based, extensible, and advanced tokeniser written in C++ (https://languagemachines.github.io/ucto).
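
To illustrate what regular-expression-based tokenisation means in general, here is a minimal sketch that uses only Python's standard `re` module. It does not call python-ucto or reproduce Ucto's rules. The pattern and the `tokenise` function are assumptions made up for this example, and a real tokeniser handles far more cases (abbreviations, URLs, language-specific punctuation).

```python
import re

# Illustrative pattern only (an assumption, NOT Ucto's actual rule set):
# either a word, optionally joined by internal apostrophes or hyphens,
# or a single non-space, non-word character such as punctuation.
TOKEN_PATTERN = re.compile(r"\w+(?:['-]\w+)*|[^\w\s]")


def tokenise(text):
    """Split text into word and punctuation tokens using one regex."""
    return TOKEN_PATTERN.findall(text)


if __name__ == "__main__":
    for token in tokenise("It's a well-known task, isn't it?"):
        print(token)
```

Even this toy pattern shows why the task is less trivial than it looks: whether "It's" or "well-known" should be one token or several is a design decision, which is the kind of choice a configurable tokeniser makes explicit.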

purl: pkg:pypi/python-ucto
Keywords: tokenizer, tokenization, tokeniser, tokenisation, nlp, computational_linguistics, ucto, computational-linguistics, folia, nlp-library, python, text-processing
License: GPL-3.0
Latest release: 4 months ago
First release: about 10 years ago
Dependent packages: 1
Dependent repositories: 4
Downloads: 3,588 last month
Stars: 29 on GitHub
Forks: 5 on GitHub
Total Commits: 138
Committers: 1
Average commits per author: 138.0
Development Distribution Score (DDS): 0.0
More commit stats: commits.ecosyste.ms
See more repository details: repos.ecosyste.ms
Last synced: 6 days ago
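
The commit figures above are consistent with each other, as the following sketch shows. It assumes the Development Distribution Score is defined as 1 minus the top committer's share of all commits. That definition and both function names are assumptions for this example and are not stated on this page.

```python
def average_commits_per_author(commits_per_author):
    """Mean number of commits across all committers."""
    return sum(commits_per_author) / len(commits_per_author)


def development_distribution_score(commits_per_author):
    """Assumed definition: 1 - (commits by top committer / total commits)."""
    total = sum(commits_per_author)
    if total == 0:
        raise ValueError("no commits to score")
    return 1 - max(commits_per_author) / total


if __name__ == "__main__":
    # One committer with all 138 commits, as listed above.
    commits = [138]
    print(average_commits_per_author(commits))      # 138.0
    print(development_distribution_score(commits))  # 0.0
```

Under this assumed definition, a score of 0.0 means a single committer authored every commit, and the score rises as commits are spread across more authors.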