KDnuggets Home » News » 2010 » Mar » Software » DEiXTo: free web data extraction  ( < Prev | 10:n07 | Next > )

DEiXTo: a free, DOM-based web data extraction tool


 
  
A powerful tool for creating "extraction rules" (wrappers) that describe which pieces of data to scrape from a web page; it consists of a GUI and a stand-alone extraction rule executor.


DEiXTo is a free, DOM-based web data extraction tool. It consists of two standalone components:

a) GUI DEiXTo, an MS Windows™ application implementing a graphical user interface that is used to manage extraction rules (build, test, fine-tune, save and modify), and

b) DEiXTo Executor, a stand-alone extraction rule executor (command line utility) that automatically applies extraction rules to target HTML pages in bulk and produces structured output in a variety of formats.
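To make the idea of an extraction rule more concrete, here is a minimal Python sketch of the general technique: a rule names the elements that hold each record and its fields, and a small executor walks the HTML and emits structured output. This is not DEiXTo's wrapper format or its Executor; the `RULE` layout, the `RuleExecutor` class, the `apply_rule` helper and the sample `PAGE` are all hypothetical names invented for illustration, and the sketch uses only the Python standard library.

```python
import json
from html.parser import HTMLParser

# Hypothetical rule format invented for this sketch (NOT DEiXTo's wrapper format):
# "record" is the (tag, class) of the element wrapping one record, and each
# field name maps to the (tag, class) of the element whose text is captured.
RULE = {
    "record": ("li", "item"),
    "fields": {"title": ("span", "title"), "price": ("span", "price")},
}


class RuleExecutor(HTMLParser):
    """Applies one rule to an HTML page and collects matching records."""

    def __init__(self, rule):
        super().__init__()
        self.rule = rule
        self.records = []
        self.current = None  # record currently being filled, if any
        self.field = None    # field whose text is currently being captured

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        rec_tag, rec_cls = self.rule["record"]
        if tag == rec_tag and rec_cls in classes:
            self.current = {}
            return
        if self.current is not None:
            for name, (f_tag, f_cls) in self.rule["fields"].items():
                if tag == f_tag and f_cls in classes:
                    self.field = name
                    self.current[name] = ""

    def handle_data(self, data):
        if self.current is not None and self.field is not None:
            self.current[self.field] += data

    def handle_endtag(self, tag):
        if self.field is not None and tag == self.rule["fields"][self.field][0]:
            # Closing the field element: finish capturing its text.
            self.current[self.field] = self.current[self.field].strip()
            self.field = None
        elif self.current is not None and tag == self.rule["record"][0]:
            # Closing the record element: store the finished record.
            self.records.append(self.current)
            self.current = None
            self.field = None


def apply_rule(rule, html):
    """Run the rule over one page and return a list of dict records."""
    executor = RuleExecutor(rule)
    executor.feed(html)
    executor.close()
    return executor.records


# Made-up sample page, used only to exercise the sketch.
PAGE = (
    '<ul>'
    '<li class="item"><span class="title">Widget</span> '
    '<span class="price">9.99</span></li>'
    '<li class="item"><span class="title">Gadget</span>'
    '<span class="price">19.50</span></li>'
    '</ul>'
)

if __name__ == "__main__":
    print(json.dumps(apply_rule(RULE, PAGE), indent=2))
```

Keeping the rule as plain data, separate from the code that executes it, mirrors the split described above between a tool that builds rules and a tool that applies them. A real wrapper would also have to handle nested elements, optional fields and malformed markup, which this sketch ignores.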

DEiXTo can cope with a wide range of web sites with high precision and recall, since it gives the user an arsenal of features for constructing well-engineered extraction rules. Wrappers built with GUI DEiXTo can be scheduled to run automatically, providing periodic, automated access to resources of interest and saving users a lot of time, energy and repetitive effort.

DEiXTo is provided free of charge. You can find more details at deixto.com.

