Ecosyste.ms: Packages

An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.
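
The package metadata shown on this page is also available as JSON through the service's REST API. The sketch below (Java 11+, standard library only) assumes the endpoint pattern /api/v1/registries/{registry}/packages/{name} exposed by packages.ecosyste.ms; treat the exact URL and response fields as assumptions to verify against the API documentation.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class PackageLookup {
        public static void main(String[] args) throws Exception {
            // Assumed endpoint pattern: /api/v1/registries/{registry}/packages/{name}
            String url = "https://packages.ecosyste.ms/api/v1/registries/nuget.org/packages/stanford.nlp.segmenter";
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            // Prints the raw JSON document (name, versions, downloads, and so on).
            System.out.println(response.body());
        }
    }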

nuget.org: stanford.nlp.segmenter

Tokenization of raw text is a standard pre-processing step for many NLP tasks. For English, tokenization usually involves punctuation splitting and separation of some affixes, such as possessives. Other languages require more extensive token pre-processing, usually called segmentation; Chinese, for example, is written without spaces between words.
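
This package is a .NET recompilation of the Stanford Word Segmenter (via IKVM, per the keywords below), so its API mirrors the upstream Java library. As a minimal sketch, the Java code below loads the Chinese CRF model and segments one sentence; the data directory and model file names (ctb.gz, dict-chris6.ser.gz) follow Stanford's segmenter distribution and must be adjusted to your local setup.

    import edu.stanford.nlp.ie.crf.CRFClassifier;
    import edu.stanford.nlp.ling.CoreLabel;
    import java.util.List;
    import java.util.Properties;

    public class SegDemo {
        public static void main(String[] args) {
            String basedir = "data"; // local path to the segmenter's models (assumption)
            Properties props = new Properties();
            props.setProperty("sighanCorporaDict", basedir);
            props.setProperty("serDictionary", basedir + "/dict-chris6.ser.gz");
            props.setProperty("inputEncoding", "UTF-8");
            props.setProperty("sighanPostProcessing", "true");

            CRFClassifier<CoreLabel> segmenter = new CRFClassifier<>(props);
            segmenter.loadClassifierNoExceptions(basedir + "/ctb.gz", props);

            // Chinese is written without spaces, so the CRF model predicts word boundaries.
            String sentence = "我住在美国。";
            List<String> words = segmenter.segmentString(sentence);
            System.out.println(words); // e.g. [我, 住在, 美国, 。]
        }
    }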

purl: pkg:nuget/stanford.nlp.segmenter
Keywords: nlp, stanford, segmenter, tokenization, splitting, IKVM
License:
Latest release: over 3 years ago
First release: over 10 years ago
Downloads: 22,841 total
Last synced: 14 days ago
