Ecosyste.ms: Packages

An open API service providing package, version and dependency metadata for many open source software ecosystems and registries.

repo1.maven.org "deduplication" keyword

org.zouzias:spark-lucenerdd_2.11 0.3.10
spark-lucenerdd
39 versions - Latest release: almost 3 years ago - 128 stars on GitHub
io.github.phymbert:spark-search-sql_2.11 0.2.0
Spark Search - high performance advanced search features based on Apache Lucene
5 versions - Latest release: over 2 years ago - 22 stars on GitHub
io.github.phymbert:spark-search_2.12 0.2.0
Spark Search - high performance advanced search features based on Apache Lucene
9 versions - Latest release: over 2 years ago - 1 dependent package - 1 dependent repository - 22 stars on GitHub
io.github.phymbert:spark-search_2.11 0.2.0
Spark Search - high performance advanced search features based on Apache Lucene
9 versions - Latest release: over 2 years ago - 1 dependent package - 22 stars on GitHub
io.github.phymbert:spark-search-sql_2.12 0.2.0
Spark Search - high performance advanced search features based on Apache Lucene
5 versions - Latest release: over 2 years ago - 1 dependent repository - 22 stars on GitHub
io.github.phymbert:spark-search-parent_2.11 0.2.0
Spark Search Parent - high performance advanced search features based on Apache Lucene
2 versions - Latest release: over 2 years ago - 22 stars on GitHub
io.github.phymbert:spark-search-parent_2.12 0.2.0
Spark Search Parent - high performance advanced search features based on Apache Lucene
3 versions - Latest release: over 2 years ago - 22 stars on GitHub
org.bradfordmiller:deduper 0.0.40
General deduping engine for JDBC sources with output to JDBC/csv targets
19 versions - Latest release: over 3 years ago - 5 stars on GitHub
com.bakdata.dedupe:examples 2.1.1
Concise examples of using the deduplication DSL.
8 versions - Latest release: 4 months ago - 19 stars on GitHub
com.bakdata.dedupe:common 2.1.1
Typical implementation of similarity measures, duplicate detection strategies, cluster algorithms...
8 versions - Latest release: 4 months ago - 1 dependent package - 19 stars on GitHub
com.bakdata.dedupe:core 2.1.1
Base interfaces and data structures for defining a deduplication workflow.
8 versions - Latest release: 4 months ago - 1 dependent package - 19 stars on GitHub
io.github.phymbert:spark-search-parent 0.1.8
Spark Search Parent - high performance advanced search features based on Apache Lucene
3 versions - Latest release: over 2 years ago - 22 stars on GitHub