Ecosyste.ms: Packages
An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.
repo1.maven.org "web-crawler" keyword
de.hs-heilbronn.mi:crawler4j-examples 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-commons 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 4 dependent packages - 1 dependent repository - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-parent 5.0.2
Open Source Web Crawler for Java
21 versions - Latest release: about 1 year ago - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-frontier-hsqldb 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 1 dependent package - 1 dependent repository - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-with-urlfrontier 5.0.2
Open Source Web Crawler for Java
11 versions - Latest release: about 1 year ago - 1 dependent package - 2 dependent repositories - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-frontier 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-archetype 5.0.2 💰
Open Source Web Crawler for Java
2 versions - Latest release: about 1 year ago - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-core 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 3 dependent packages - 1 dependent repository - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-with-sleepycat 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 2 dependent packages - 2 dependent repositories - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-boms 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-examples-base 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-frontier-sleepycat 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 2 dependent packages - 1 dependent repository - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-with-hsqldb 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 1 dependent package - 2 dependent repositories - 17 stars on GitHub
com.digitalpebble:storm-crawler-external 0.5 💰
Storm-Crawler Java API with external dependencies.
2 versions - Latest release: about 9 years ago - 789 stars on GitHub
com.digitalpebble:storm-crawler-tika 0.7 💰
Tika-based parser bolt for StormCrawler
2 versions - Latest release: over 8 years ago - 1 dependent repository - 789 stars on GitHub
org.apache.nutch:nutch 2.3.1
Apache Nutch is an extensible and scalable web crawler
27 versions - Latest release: over 8 years ago - 1 dependent package - 75 dependent repositories - 2,545 stars on GitHub
com.github.crawler-commons:crawler-commons 1.4
crawler-commons is a set of reusable Java components that implement functionality common to any w...
10 versions - Latest release: 11 months ago - 10 dependent packages - 46 dependent repositories - 218 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-aws 1.12.1 💰
AWS resources for StormCrawler
37 versions - Latest release: over 5 years ago - 789 stars on GitHub
com.digitalpebble:storm-crawler-core 0.7 💰
Storm-Crawler core Java API.
4 versions - Latest release: over 8 years ago - 6 dependent packages - 4 dependent repositories - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-archetype 1.12.1 💰
A collection of resources for building low-latency, scalable web crawlers on Apache Storm.
37 versions - Latest release: over 5 years ago - 789 stars on GitHub
com.digitalpebble:storm-crawler-elasticsearch 0.7 💰
Elasticsearch resources for StormCrawler
2 versions - Latest release: over 8 years ago - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-opensearch 2.11 💰
Opensearch resources for StormCrawler
5 versions - Latest release: 5 months ago - 2 dependent repositories - 791 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-langid 1.12.1 💰
Language Identification for StormCrawler
30 versions - Latest release: over 5 years ago - 1 dependent repository - 789 stars on GitHub
com.digitalpebble:storm-crawler-sql 0.7 💰
SQL-based resources for StormCrawler
2 versions - Latest release: over 8 years ago - 1 dependent repository - 789 stars on GitHub
com.digitalpebble:storm-crawler-solr 0.7 💰
Solr resources for StormCrawler
2 versions - Latest release: over 8 years ago - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-urlfrontier 2.11 💰
URL Frontier resources for StormCrawler
12 versions - Latest release: 5 months ago - 1 dependent repository - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-elasticsearch 1.12.1 💰
Elasticsearch resources for StormCrawler
37 versions - Latest release: over 5 years ago - 7 dependent repositories - 789 stars on GitHub
com.digitalpebble:storm-crawler-aws 0.7 💰
AWS resources for StormCrawler
2 versions - Latest release: over 8 years ago - 1 dependent repository - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-tika 1.12.1 💰
Tika-based parser bolt for StormCrawler
37 versions - Latest release: over 5 years ago - 4 dependent repositories - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-opensearch-archetype 2.11 💰
A collection of resources for building low-latency, scalable web crawlers on Apache Storm.
5 versions - Latest release: 5 months ago - 771 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-external 1.12.1 💰
A collection of resources for building low-latency, scalable web crawlers on Apache Storm.
29 versions - Latest release: over 5 years ago - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-core 1.12.1 💰
Storm-Crawler core Java API.
Top 9.2% on repo1.maven.org
37 versions - Latest release: over 5 years ago - 12 dependent packages - 14 dependent repositories - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-elasticsearch-archetype 2.11 💰
A collection of resources for building low-latency, scalable web crawlers on Apache Storm.
15 versions - Latest release: 5 months ago - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-warc 1.12.1 💰
WARC resources for StormCrawler
31 versions - Latest release: over 5 years ago - 1 dependent repository - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-sql 1.12.1 💰
SQL-based resources for StormCrawler
37 versions - Latest release: over 5 years ago - 1 dependent repository - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-solr 1.12.1 💰
Solr resources for StormCrawler
37 versions - Latest release: over 5 years ago - 1 dependent repository - 789 stars on GitHub
com.digitalpebble:storm-crawler 0.7 💰
A collection of resources for building low-latency, scalable web crawlers on Apache Storm.
7 versions - Latest release: over 8 years ago - 1 dependent repository - 789 stars on GitHub
ai.platon.pulsar:pulsar-resources 1.12.4
Pulsar Resources
61 versions - Latest release: about 2 months ago - 4 dependent packages - 2 dependent repositories - 200 stars on GitHub
ai.platon.pulsar:pulsar-persist 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 10 dependent packages - 4 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-beans 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
62 versions - Latest release: about 2 months ago - 9 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-index 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 6 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-client 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
62 versions - Latest release: about 2 months ago - 402 stars on GitHub
ai.platon.pulsar:pulsar-protocol 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 7 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-qa 1.10.15
Scrape Web data at scale completely and accurately with high performance, distributed RPA.
48 versions - Latest release: 12 months ago - 3 dependent packages - 2 dependent repositories - 291 stars on GitHub
com.norconex.collectors:norconex-collector-http 3.0.2
Norconex HTTP Collector is a web spider, or crawler that aims to be very flexible, easy to extend...
25 versions - Latest release: 11 months ago - 4 dependent repositories - 153 stars on GitHub
ai.platon.pulsar:pulsar-site-amazon 1.9.19
The easy way to crawl and scrape the web: turn large web sites into tables and charts using simpl...
32 versions - Latest release: about 1 year ago - 1 dependent package - 3 dependent repositories - 200 stars on GitHub
ai.platon.pulsar:pulsar-third 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 402 stars on GitHub
ai.platon.pulsar:pulsar-scoring 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 6 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-ql-common 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 4 dependent packages - 4 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-site-cn-gov 1.9.19
The easy way to crawl and scrape the web: turn large web sites into tables and charts using simpl...
32 versions - Latest release: about 1 year ago - 200 stars on GitHub
ai.platon.pulsar:pulsar-master 1.10.15
Scrape Web data at scale completely and accurately with high performance, distributed RPA.
48 versions - Latest release: 12 months ago - 291 stars on GitHub
ai.platon.pulsar:pulsar-sites-support 1.9.19
The easy way to crawl and scrape the web: turn large web sites into tables and charts using simpl...
32 versions - Latest release: about 1 year ago - 121 stars on GitHub
ai.platon.pulsar:pulsar-boilerpipe 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 5 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 402 stars on GitHub
ai.platon.pulsar:pulsar-dom 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 4 dependent packages - 4 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-common 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 14 dependent packages - 4 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-tools 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 402 stars on GitHub
ai.platon.pulsar:gora-shaded-mongodb 0.8
Since gora-0.x requires mongo-java-driver while spring-boot-2.4.x does not support it, we make a ...
2 versions - Latest release: over 2 years ago - 2 dependent packages - 4 dependent repositories - 1 star on GitHub
ai.platon.pulsar:pulsar-schedule 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 4 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-boot 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
62 versions - Latest release: about 2 months ago - 4 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-plugins 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 3 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-browser 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 3 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-app 1.10.15
Scrape Web data at scale completely and accurately with high performance, distributed RPA.
48 versions - Latest release: 12 months ago - 291 stars on GitHub
ai.platon.pulsar:pulsar-ql 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
62 versions - Latest release: about 2 months ago - 10 dependent packages - 4 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-app-resources 1.9.19
Common application resources
32 versions - Latest release: about 1 year ago - 200 stars on GitHub
ai.platon.pulsar:pulsar-parse 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 6 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-rest 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
62 versions - Latest release: about 2 months ago - 3 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-app-common-resources 1.9.19
Pulsar App Common Resources
32 versions - Latest release: about 1 year ago - 3 dependent packages - 1 dependent repository - 200 stars on GitHub
ai.platon.pulsar:pulsar-all 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
52 versions - Latest release: about 2 months ago - 3 dependent packages - 7 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-examples 1.10.14
Scrape Web data at scale completely and accurately with high performance, distributed RPA.
47 versions - Latest release: 12 months ago - 291 stars on GitHub
ai.platon.pulsar:pulsar-filter 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 7 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-spring-support 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 402 stars on GitHub
com.frogfront:web-crawler 0.1.2
Web Crawler Library (Domain BOT)
2 versions - Latest release: about 4 years ago - 1 dependent repository - 4 stars on GitHub
ai.platon.pulsar:pulsar-skeleton 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 15 dependent packages - 2 dependent repositories - 402 stars on GitHub
de.hs-heilbronn.mi:crawler4j-examples-postgres 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-frontier-urlfrontier 5.0.2
Open Source Web Crawler for Java
11 versions - Latest release: about 1 year ago - 1 dependent package - 1 dependent repository - 17 stars on GitHub
ai.platon.pulsar:pulsar-tests 1.10.15
Scrape Web data at scale completely and accurately with high performance, distributed RPA.
48 versions - Latest release: 12 months ago - 5 dependent packages - 4 dependent repositories - 291 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler 1.12.1 💰
A collection of resources for building low-latency, scalable web crawlers on Apache Storm.
37 versions - Latest release: over 5 years ago - 789 stars on GitHub
Related Keywords
crawler (74)
java (42)
web-sql (36)
web-scraping (36)
web-mining (36)
scraping (36)
scraper (36)
data-science (36)
data-mining (36)
web-automation (29)
rpa (29)
stormcrawler (23)
distributed (23)
apache-storm (23)
spider (16)
web-spider (15)
crawler4j (15)
bot (1)
search-engine (1)
norconex-http-collector (1)
flexible (1)
collector-http (1)
webcrawler (1)
sitemaps (1)
robots-txt (1)
open-source (1)
library (1)
nutch (1)
hadoop (1)
crawling (1)
apache (1)