Are there programming frameworks for web content mining?


Bing Liu answers:

Although many Web content mining problems share the same general framework of extraction followed by integration, the current techniques for dealing with them are very different. One does not deal with structured data in the same way as unstructured text.

I am not aware of any common programming framework for Web content mining, or even for each specific task. Our research was done mainly in C and C++. However, for structured data extraction, there are tools on the market that either help you extract data or make it easy for you to write extraction rules. For opinion mining, there are natural language processing packages that are helpful, e.g., part-of-speech taggers, parsers, etc.
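
To make the idea of writing rules for structured data extraction a little more concrete, here is a minimal sketch in Python using only the standard library's html.parser module. It is not any particular commercial tool and not the method used in our research; the class name TableRowExtractor, the helper extract_rows, and the sample HTML are all hypothetical, chosen only for illustration. The "rule" it encodes is simply: every table row that contains data cells is one record.

```python
from html.parser import HTMLParser


class TableRowExtractor(HTMLParser):
    """Hypothetical rule: each <tr> containing <td> cells is one data record."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished records, one list of cell strings per row
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while inside a <td>; header cells (<th>) are ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows with no <td> cells, such as header rows
                self.rows.append(self._row)
            self._row = None


def extract_rows(html):
    """Return the data records found in the given HTML string."""
    parser = TableRowExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample page fragment, for illustration only.
SAMPLE_HTML = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Camera A</td><td>$199</td></tr>
  <tr><td>Camera B</td><td>$349</td></tr>
</table>
"""

if __name__ == "__main__":
    for name, price in extract_rows(SAMPLE_HTML):
        print(name, price)
```

A hand-written rule like this works only for pages that follow the assumed layout. Real pages vary widely, which is part of why there is no single framework that covers every extraction task.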