Ecosyste.ms: Packages

An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

crates.io "commoncrawl" keyword

ungoliant 2.0.0
The pipeline for the OSCAR corpus.
5 versions - Latest release: over 1 year ago - 2.01 thousand downloads total - 147 stars on GitHub - 2 maintainers
amadeus-commoncrawl 0.4.3
Harmonious distributed data analysis in Rust.
25 versions - Latest release: about 3 years ago - 1 dependent package - 1 dependent repository - 10.7 thousand downloads total - 466 stars on GitHub - 1 maintainer
tantivy_warc_indexer 0.2.0
Builds a tantivy index from common crawl warc.wet files
1 version - Latest release: almost 3 years ago - 494 downloads total - 8 stars on GitHub - 1 maintainer