Web Content Mining, Screen Scraping

commercial
  • Automation Anywhere, intelligent automation software to automate business & IT processes, including web data extraction and screen scraping.
  • Bixolabs, an elastic web mining platform built with Bixo, Cascading, and Hadoop for Amazon's cloud (EC2).
  • Extractiv, transforms unstructured web content into highly-structured semantic data.
  • Ficstar, customized web extraction, automated data management, and business intelligence.
  • FMiner, a visual web scraping software with a diagram designer.
  • Helium Scraper, a powerful Web Page Scraper / Web Data Extractor that can be set up to extract from the web virtually anything you can point your mouse at.
  • iWebScraping, web scraping, data extraction, and data mining services; scrapes data from YellowPages, directories, Amazon, eBay, business listings, and Google Maps.
  • Metafy Anthracite Web Mining Software, visually construct spiders and scrapers without scripts (requires Mac OS X 10.4 or newer).
  • Mozenda, More-Zenful-Data, web content mining.
  • Screen Scraper, allows users to scrape structured and unstructured data from websites and format it (free download).
  • Visual Web Ripper, a powerful visual tool used for automated web scraping, web harvesting and content extraction from the web.
  • Web Data Extraction Services provides robust, cutting-edge solutions and services for data extraction from websites.
  • WebQL, for creating turnkey web extraction applications, such as price collector, patent information aggregator, etc.
  • XML Miner, a system and class library for mining data and text expressed in XML, extracting knowledge, and reusing that knowledge in products and applications in the form of fuzzy logic expert system rules.
free and open source
  • Bixo, an open source web mining toolkit that runs as a series of Cascading pipes on top of Hadoop.
  • DEiXTo, a powerful tool for creating "extraction rules" (wrappers) that describe what pieces of data to scrape from a web page; consists of GUI and a stand-alone extraction rule executor.
  • GNU Wget, command line tool for retrieving files using HTTP, HTTPS and FTP.
  • Pattern, a web mining module for Python; bundles tools for data retrieval (Google + Twitter + Wikipedia API, web spider), text analysis (rule-based shallow parser, WordNet interface, tf-idf, ...) and data visualization (graph networks).
  • ScraperWiki, a collaborative platform for web-scraping and screen-scraping code and views.
  • Scrapy, a fast high-level screen scraping and web crawling framework in Python.
  • Trapit, system for personalizing content based on keywords, URLs and reading habits.
  • Web Mining Services, provides free, customized web extracts to meet your needs.
  • WebSundew, a web scraping and web data extraction tool that extracts data from web pages with high productivity and speed.
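
For readers new to the topic, the kind of "extraction rule" that several tools above describe can be illustrated with a minimal sketch. It uses only Python's standard library and is not taken from any listed product; the `LinkExtractor` class, the `extract_links` helper, and the sample page with its example.com URLs are hypothetical names and data chosen for illustration. The rule here is simply "collect the target and visible text of every link on a page".

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects (href, text) pairs for every <a href=...> element in a page."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs; links without href are skipped
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Hypothetical helper: parse an HTML string and return its links."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample page; a real scraper would first download the HTML.
SAMPLE_PAGE = """
<ul>
  <li><a href="https://example.com/a">Tool A</a>, a made-up entry.</li>
  <li><a href="https://example.com/b">Tool <b>B</b></a>, another one.</li>
</ul>
"""

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE_PAGE):
        print(href, text)
```

The dedicated frameworks and visual tools in this list typically add what this sketch leaves out, such as fetching pages, following links, handling malformed markup, and exporting the results.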

Related
Web Usage Mining and Web Log analysis software
Text Mining software

Copyright © 2013 KDnuggets.