Web Content Mining, Screen Scraping
commercial
-
Automation Anywhere, intelligent automation software to automate business & IT processes, including web data extraction and screen scraping.
-
Bixolabs, an elastic web mining platform built with Bixo, Cascading, and Hadoop for Amazon's cloud (EC2).
-
Extractiv, transforms unstructured web content into highly-structured semantic data.
-
Ficstar, customized web extraction, automated data management, and business intelligence.
-
FMiner, visual web scraping software with a diagram designer.
-
Helium Scraper, a web page scraper and web data extractor that can be set up to extract virtually anything on the web that you can point your mouse at.
-
iWebScraping, web scraping, data extraction, and data mining services; scrapes data from YellowPages, directories, Amazon, eBay, business listings, and Google Maps.
-
Metafy Anthracite Web Mining Software, visually construct spiders and scrapers without scripts (requires Mac OS X 10.4 or newer).
-
Mozenda, More-Zenful-Data, web content mining.
-
Screen Scraper, allows users to scrape structured and unstructured data from websites and format it (free download).
-
Visual Web Ripper, a powerful visual tool used for automated web scraping, web harvesting and content extraction from the web.
-
Web Data Extraction Services, provides solutions and services for data extraction from websites.
-
WebQL, for creating turnkey web extraction applications, such as price collectors and patent information aggregators.
-
XML Miner, a system and class library for mining data and text expressed in XML, extracting knowledge, and reusing that knowledge in products and applications in the form of fuzzy logic expert system rules.
free and open source
-
Bixo, an open source web mining toolkit that runs as a series of Cascading pipes on top of Hadoop.
-
DEiXTo, a powerful tool for creating "extraction rules" (wrappers) that describe what pieces of data to scrape from a web page; consists of GUI and a stand-alone extraction rule executor.
-
GNU Wget, a command-line tool for retrieving files using HTTP, HTTPS, and FTP.
-
Pattern, a web mining module for Python; bundles tools for data retrieval (Google + Twitter + Wikipedia API, web spider), text analysis (rule-based shallow parser, WordNet interface, tf-idf, ...) and data visualization (graph networks).
-
ScraperWiki, a collaborative platform for web-scraping and screen-scraping code and views.
-
Scrapy, a fast high-level screen scraping and web crawling framework in Python.
-
Trapit, a system for personalizing content based on keywords, URLs, and reading habits.
-
Web Mining Services, provides free, customized web extracts to meet your needs.
-
WebSundew, a web scraping and web data extraction tool that extracts data from web pages with high productivity and speed.