Ecosyste.ms: Packages

An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

pypi.org : utoken

utoken is a universal tokenizer (multilingual word segmenter) that divides text into words, punctuation, and special tokens such as numbers, URLs, XML tags, email addresses, and hashtags. It comes with a companion detokenizer.
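
To make the description above concrete, here is a minimal sketch of this kind of segmentation using only Python's standard `re` module. It is not utoken's implementation or API: the pattern, the alternative ordering, and the function name `simple_tokenize` are assumptions made purely for illustration, and a real multilingual tokenizer handles far more cases.

```python
import re

# Hypothetical pattern (not taken from utoken). Alternatives are ordered so
# that special tokens are tried before plain words and punctuation.
TOKEN_PATTERN = re.compile(
    r"""
    https?://[^\s<>]+[^\s<>.,;:!?)]   # URLs, excluding trailing punctuation
  | [\w.+-]+@[\w-]+(?:\.[\w-]+)+      # email addresses
  | </?\w+[^<>]*>                     # XML tags
  | \#\w+                             # hashtags
  | \d+(?:[.,]\d+)*                   # numbers such as 3.50 or 1,000
  | \w+(?:['-]\w+)*                   # words, allowing inner apostrophes or hyphens
  | [^\w\s]                           # any other single non-space symbol
    """,
    re.VERBOSE,
)


def simple_tokenize(text):
    """Split text into words, punctuation, and special tokens (illustrative only)."""
    return TOKEN_PATTERN.findall(text)


if __name__ == "__main__":
    sample = "Visit https://example.org/page, or mail joe@example.com! #nlp <b>3.50</b>"
    print(" ".join(simple_tokenize(sample)))
```

Trying the special-token alternatives first keeps a URL or email address from being broken apart at its internal punctuation, which is the main design point such a tokenizer has to get right.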

purl: pkg:pypi/utoken
Keywords: machine translation, datasets, NLP, natural language processing, computational linguistics
License: Apache-2.0
Latest release: over 2 years ago
First release: over 2 years ago
Dependent repositories: 1
Downloads: 78 last month
Stars: 12 on GitHub
Forks: 1 on GitHub
See more repository details: repos.ecosyste.ms
Last synced: about 1 month ago
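
The purl listed above (`pkg:pypi/utoken`) follows the package URL layout `pkg:type/namespace/name@version?qualifiers#subpath`. The following is a simplified sketch of splitting such a string into its parts. The function name `parse_purl` is hypothetical, and the sketch skips the percent-decoding and per-type normalization that a complete parser would need.

```python
def parse_purl(purl):
    """Split a package URL into its main components (simplified sketch)."""
    scheme, _, rest = purl.partition(":")
    if scheme != "pkg" or not rest:
        raise ValueError(f"not a package URL: {purl!r}")

    # Strip the optional trailing parts from right to left.
    rest, _, subpath = rest.partition("#")
    rest, _, qualifiers = rest.partition("?")
    version = None
    if "@" in rest:
        rest, _, version = rest.rpartition("@")

    # What remains is type/namespace.../name; the namespace is optional.
    parts = rest.strip("/").split("/")
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"package URL needs a type and a name: {purl!r}")

    return {
        "type": parts[0],
        "namespace": "/".join(parts[1:-1]) or None,
        "name": parts[-1],
        "version": version,
        "qualifiers": qualifiers or None,
        "subpath": subpath or None,
    }


if __name__ == "__main__":
    print(parse_purl("pkg:pypi/utoken"))
    print(parse_purl("pkg:pypi/utoken@0.1.8"))
```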

0.1.8
Published: over 2 years ago
Download sha256-c2b6181a502cb...

0.1.7
Published: over 2 years ago
Download sha256-2450aef66d203...

0.1.6
Published: over 2 years ago
Download sha256-8b43046627217...

0.1.3
Published: over 2 years ago
Download sha256-c2db002bd9c14...

0.1.2
Published: over 2 years ago
Download sha256-0fd9a502e108f...

0.1.1
Published: over 2 years ago
Download sha256-3f1aa62efe703...