StormCrawler
StormCrawler is an open source SDK for building distributed web crawlers with Apache Storm.
StormCrawler Alternatives & Competitors
The best StormCrawler alternatives based on verified products, community votes, reviews and other factors.
-
Scrapy | A Fast and Powerful Scraping and Web Crawling Framework.
Key Scrapy features:
Efficiency, Built-in Tooling, Customization, Python Integration
-
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web...
Key Heritrix features:
Flexibility, Scalability, Open Source, Robust Documentation
-
As the only API powered by the Prince HTML-to-PDF engine, DocRaptor provides the best support for complex PDFs with powerful support for headers, page breaks, page numbers, flexbox, watermarks, accessible PDFs, and much more.
Key DocRaptor features:
Ease of Use, Customizable, High-Quality Rendering, Compliance
-
Apache Nutch is a highly extensible and scalable open source web crawler software project.
-
A Platform for Data Crawling and Scraping For Business Developers.
Key Crawlbase features:
Scalability, Ease of Use, Data Quality, Customer Support
-
Content Grabber is an automated web scraping tool.
Key Content Grabber features:
User-Friendly Interface, Powerful Automation, Integration Capabilities, Cloud Services
-
Apify is a web scraping and automation platform that can turn any website into an API.
Key Apify features:
Ease of Use, Scalability, Integration and API Support, Customizability
-
ParseHub is a free web scraping tool. With our advanced web scraper, extracting data is as easy as clicking the data you need.
Key ParseHub features:
User-friendly Interface, Advanced Features, Cross-platform Compatibility, API Integration
-
A cloud-based web scraping platform. Extract data from websites and automate workflows on the web.
-
Import.io helps its users find the internet data they need, organize and store it, and transform it into a format that provides them with the context they need.
Key import.io features:
Ease of Use, Data Integration, Scalability, Speed
-
Octoparse provides easy web scraping for anyone. Our advanced web crawler allows users to turn web pages into structured spreadsheets in a few clicks.
Key Octoparse features:
User-Friendly Interface, Customizable Workflows, Cloud-Based Platform, Automatic IP Rotation
-
ScrapeBox: the ultimate link harvester and mass WordPress and Movable Type blog comment poster, complete with PR Storm Mode and a bulk PageRank checker.
Key ScrapeBox features:
Versatility, Automation, Customization, Cost-effective
-
Common Crawl.
Key CommonCrawl features:
Comprehensive Coverage, Open Access, Regular Updates, Format and Compatibility
-
Scale Data Collection with a Simple API.
Key Scraper API features:
Proxy API for Web Scraping, Automatic CAPTCHA Handling, Headless Browser JavaScript Rendering, Highly Scalable Infrastructure