Ecosyste.ms: Packages

An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

repo1.maven.org "web-crawler" keyword

de.hs-heilbronn.mi:crawler4j-examples 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-commons 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 4 dependent packages - 1 dependent repository - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-parent 5.0.2
Open Source Web Crawler for Java
21 versions - Latest release: about 1 year ago - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-frontier-hsqldb 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 1 dependent package - 1 dependent repository - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-with-urlfrontier 5.0.2
Open Source Web Crawler for Java
11 versions - Latest release: about 1 year ago - 1 dependent package - 2 dependent repositories - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-frontier 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-archetype 5.0.2 💰
Open Source Web Crawler for Java
2 versions - Latest release: about 1 year ago - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-core 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 3 dependent packages - 1 dependent repository - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-with-sleepycat 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 2 dependent packages - 2 dependent repositories - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-boms 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-examples-base 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-frontier-sleepycat 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 2 dependent packages - 1 dependent repository - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-with-hsqldb 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 1 dependent package - 2 dependent repositories - 17 stars on GitHub
com.digitalpebble:storm-crawler-external 0.5 💰
Storm-Crawler Java API with external dependencies.
2 versions - Latest release: about 9 years ago - 789 stars on GitHub
com.digitalpebble:storm-crawler-tika 0.7 💰
Tika-based parser bolt for StormCrawler
2 versions - Latest release: over 8 years ago - 1 dependent repository - 789 stars on GitHub
org.apache.nutch:nutch 2.3.1
Apache Nutch is an extensible and scalable web crawler
27 versions - Latest release: over 8 years ago - 1 dependent package - 75 dependent repositories - 2,545 stars on GitHub
com.github.crawler-commons:crawler-commons 1.4
crawler-commons is a set of reusable Java components that implement functionality common to any w...
10 versions - Latest release: 11 months ago - 10 dependent packages - 46 dependent repositories - 218 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-aws 1.12.1 💰
AWS resources for StormCrawler
37 versions - Latest release: over 5 years ago - 789 stars on GitHub
com.digitalpebble:storm-crawler-core 0.7 💰
Storm-Crawler core Java API.
4 versions - Latest release: over 8 years ago - 6 dependent packages - 4 dependent repositories - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-archetype 1.12.1 💰
A collection of resources for building low-latency, scalable web crawlers on Apache Storm.
37 versions - Latest release: over 5 years ago - 789 stars on GitHub
com.digitalpebble:storm-crawler-elasticsearch 0.7 💰
Elasticsearch resources for StormCrawler
2 versions - Latest release: over 8 years ago - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-opensearch 2.11 💰
Opensearch resources for StormCrawler
5 versions - Latest release: 5 months ago - 2 dependent repositories - 791 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-langid 1.12.1 💰
Language Identification for StormCrawler
30 versions - Latest release: over 5 years ago - 1 dependent repository - 789 stars on GitHub
com.digitalpebble:storm-crawler-sql 0.7 💰
SQL-based resources for StormCrawler
2 versions - Latest release: over 8 years ago - 1 dependent repository - 789 stars on GitHub
com.digitalpebble:storm-crawler-solr 0.7 💰
Solr resources for StormCrawler
2 versions - Latest release: over 8 years ago - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-urlfrontier 2.11 💰
URL Frontier resources for StormCrawler
12 versions - Latest release: 5 months ago - 1 dependent repository - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-elasticsearch 1.12.1 💰
Elasticsearch resources for StormCrawler
37 versions - Latest release: over 5 years ago - 7 dependent repositories - 789 stars on GitHub
com.digitalpebble:storm-crawler-aws 0.7 💰
AWS resources for StormCrawler
2 versions - Latest release: over 8 years ago - 1 dependent repository - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-tika 1.12.1 💰
Tika-based parser bolt for StormCrawler
37 versions - Latest release: over 5 years ago - 4 dependent repositories - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-opensearch-archetype 2.11 💰
A collection of resources for building low-latency, scalable web crawlers on Apache Storm.
5 versions - Latest release: 5 months ago - 771 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-external 1.12.1 💰
A collection of resources for building low-latency, scalable web crawlers on Apache Storm.
29 versions - Latest release: over 5 years ago - 789 stars on GitHub
Top 9.2% on repo1.maven.org
com.digitalpebble.stormcrawler:storm-crawler-core 1.12.1 💰
Storm-Crawler core Java API.
37 versions - Latest release: over 5 years ago - 12 dependent packages - 14 dependent repositories - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-elasticsearch-archetype 2.11 💰
A collection of resources for building low-latency, scalable web crawlers on Apache Storm.
15 versions - Latest release: 5 months ago - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-warc 1.12.1 💰
WARC resources for StormCrawler
31 versions - Latest release: over 5 years ago - 1 dependent repository - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-sql 1.12.1 💰
SQL-based resources for StormCrawler
37 versions - Latest release: over 5 years ago - 1 dependent repository - 789 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler-solr 1.12.1 💰
Solr resources for StormCrawler
37 versions - Latest release: over 5 years ago - 1 dependent repository - 789 stars on GitHub
com.digitalpebble:storm-crawler 0.7 💰
A collection of resources for building low-latency, scalable web crawlers on Apache Storm.
7 versions - Latest release: over 8 years ago - 1 dependent repository - 789 stars on GitHub
ai.platon.pulsar:pulsar-resources 1.12.4
Pulsar Resources
61 versions - Latest release: about 2 months ago - 4 dependent packages - 2 dependent repositories - 200 stars on GitHub
ai.platon.pulsar:pulsar-persist 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 10 dependent packages - 4 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-beans 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
62 versions - Latest release: about 2 months ago - 9 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-index 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 6 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-client 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
62 versions - Latest release: about 2 months ago - 402 stars on GitHub
ai.platon.pulsar:pulsar-protocol 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 7 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-qa 1.10.15
Scrape Web data at scale completely and accurately with high performance, distributed RPA.
48 versions - Latest release: 12 months ago - 3 dependent packages - 2 dependent repositories - 291 stars on GitHub
com.norconex.collectors:norconex-collector-http 3.0.2
Norconex HTTP Collector is a web spider, or crawler that aims to be very flexible, easy to extend...
25 versions - Latest release: 11 months ago - 4 dependent repositories - 153 stars on GitHub
ai.platon.pulsar:pulsar-site-amazon 1.9.19
The easy way to crawl and scrape the web: turn large web sites into tables and charts using simpl...
32 versions - Latest release: about 1 year ago - 1 dependent package - 3 dependent repositories - 200 stars on GitHub
ai.platon.pulsar:pulsar-third 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 402 stars on GitHub
ai.platon.pulsar:pulsar-scoring 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 6 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-ql-common 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 4 dependent packages - 4 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-site-cn-gov 1.9.19
The easy way to crawl and scrape the web: turn large web sites into tables and charts using simpl...
32 versions - Latest release: about 1 year ago - 200 stars on GitHub
ai.platon.pulsar:pulsar-master 1.10.15
Scrape Web data at scale completely and accurately with high performance, distributed RPA.
48 versions - Latest release: 12 months ago - 291 stars on GitHub
ai.platon.pulsar:pulsar-sites-support 1.9.19
The easy way to crawl and scrape the web: turn large web sites into tables and charts using simpl...
32 versions - Latest release: about 1 year ago - 121 stars on GitHub
ai.platon.pulsar:pulsar-boilerpipe 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 5 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 402 stars on GitHub
ai.platon.pulsar:pulsar-dom 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 4 dependent packages - 4 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-common 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 14 dependent packages - 4 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-tools 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 402 stars on GitHub
ai.platon.pulsar:gora-shaded-mongodb 0.8
Since gora-0.x requires mongo-java-driver while spring-boot-2.4.x does not support it, we make a ...
2 versions - Latest release: over 2 years ago - 2 dependent packages - 4 dependent repositories - 1 star on GitHub
ai.platon.pulsar:pulsar-schedule 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 4 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-boot 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
62 versions - Latest release: about 2 months ago - 4 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-plugins 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 3 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-browser 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 3 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-app 1.10.15
Scrape Web data at scale completely and accurately with high performance, distributed RPA.
48 versions - Latest release: 12 months ago - 291 stars on GitHub
ai.platon.pulsar:pulsar-ql 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
62 versions - Latest release: about 2 months ago - 10 dependent packages - 4 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-app-resources 1.9.19
Common application resources
32 versions - Latest release: about 1 year ago - 200 stars on GitHub
ai.platon.pulsar:pulsar-parse 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 6 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-rest 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
62 versions - Latest release: about 2 months ago - 3 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-app-common-resources 1.9.19
Pulsar App Common Resources
32 versions - Latest release: about 1 year ago - 3 dependent packages - 1 dependent repository - 200 stars on GitHub
ai.platon.pulsar:pulsar-all 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
52 versions - Latest release: about 2 months ago - 3 dependent packages - 7 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-examples 1.10.14
Scrape Web data at scale completely and accurately with high performance, distributed RPA.
47 versions - Latest release: 12 months ago - 291 stars on GitHub
ai.platon.pulsar:pulsar-filter 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 7 dependent packages - 2 dependent repositories - 402 stars on GitHub
ai.platon.pulsar:pulsar-spring-support 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 402 stars on GitHub
com.frogfront:web-crawler 0.1.2
Web Crawler Library (Domain BOT)
2 versions - Latest release: about 4 years ago - 1 dependent repository - 4 stars on GitHub
ai.platon.pulsar:pulsar-skeleton 1.12.4
Scrape Web data at scale completely and accurately with high performance, distributed AI-RPA.
61 versions - Latest release: about 2 months ago - 15 dependent packages - 2 dependent repositories - 402 stars on GitHub
de.hs-heilbronn.mi:crawler4j-examples-postgres 5.0.2
Open Source Web Crawler for Java
22 versions - Latest release: about 1 year ago - 17 stars on GitHub
de.hs-heilbronn.mi:crawler4j-frontier-urlfrontier 5.0.2
Open Source Web Crawler for Java
11 versions - Latest release: about 1 year ago - 1 dependent package - 1 dependent repository - 17 stars on GitHub
ai.platon.pulsar:pulsar-tests 1.10.15
Scrape Web data at scale completely and accurately with high performance, distributed RPA.
48 versions - Latest release: 12 months ago - 5 dependent packages - 4 dependent repositories - 291 stars on GitHub
com.digitalpebble.stormcrawler:storm-crawler 1.12.1 💰
A collection of resources for building low-latency, scalable web crawlers on Apache Storm.
37 versions - Latest release: over 5 years ago - 789 stars on GitHub