DataGrab is a no-code web scraping platform that allows extracting data at scale for a variety of uses, such as lead generation, machine learning, research, content aggregation, and more.
We offer a free Chrome extension for setting up scrapers, and a cloud service for running them at scale.
We handle everything from proxies and IP rotation to data storage, so you can focus on defining the data you need.
Based on our records, DataGrab.io appears to be more popular than Apache Nutch. It has been mentioned 7 times since March 2021. We track product recommendations and mentions on various public social media platforms and blogs. These can help you identify which product is more popular and what people think of it.
If you prefer a point-and-click solution, check out https://datagrab.io. It offers a Chrome extension for setting up scrapers visually and you can then run them in your local browser or in the cloud. Disclaimer: I'm building it. Source: 12 months ago
If you're OK with a no-code solution, I'm building a tool called DataGrab (https://datagrab.io) that allows you to set up scrapers visually, then run them either in your browser or on the cloud. Source: about 1 year ago
If you prefer a no-code alternative, check out https://datagrab.io. It offers a Chrome extension for setting up scrapers visually and a cloud service for running them in the background at scale. For pagination, it also supports the infinite scrolling technique (which I see is used by this site). Source: over 1 year ago
There are many tools that allow you to build scrapers visually (without coding). I'm building https://datagrab.io, which is one of them. Check it out! Source: almost 2 years ago
I'm building https://datagrab.io, a no-code web scraping tool. I offer a Chrome extension for setting up your scrapers visually and a cloud service for running them in the background. Source: almost 2 years ago
Hi, I have read a few comments under the post; there are great suggestions, and your questions regarding the task are on point. But I believe handling this with a script might not be easy. If I were you, I would use Apache Nutch or similar open source software/library. I used Nutch for my thesis for a similar task, where I had to scrape a lot of blog pages and the other pages they referenced. You can configure... Source: over 1 year ago
I've never used it, but I was on a project where we considered Apache Nutch: https://nutch.apache.org/. Source: over 1 year ago
Data Miner - Data Miner is a Google Chrome extension that helps you scrape data from web pages and into a CSV file or Excel spreadsheet.
Scrapy - A fast and powerful scraping and web crawling framework
Outscraper - Scrape Google Maps Places, Business Reviews, Photos, Play Market Reviews, and more. Get any public data from the internet by applying cutting-edge technologies. Web Tools & APIs.
StormCrawler - StormCrawler is an open source SDK for building distributed web crawlers with Apache Storm.
Web Scraper - Web site data extraction tool ⚒️
CommonCrawl - Common Crawl