KDnuggets : News : 2005 : n02 : item14

Publications


Subject: Text Data Mining in Business Intelligence

DM Review Magazine -- January 2005 -- William McKnight

Free-form text data has long been a bane of our existence in data warehousing and business intelligence (BI). If only the operational systems that provide data to the warehouse required the data to be codified at the point of entry, receiving and further processing that data would be much easier. At the least, if pop-ups with suggested contextual data intervened at the point of entry and promoted consistent, quality data, we wouldn't get such a mixed-quality bag of data for the warehouse.

...

  • Merrill Lynch and Gartner studies found that 85 to 90 percent of all corporate data is stored as text.
  • The data size resulting from this high level of text is becoming less of a problem as processing capacity continues to double every 18 months.
  • It makes sense for some text, such as e-mails, documents and call center logs, to be free-form, so that codification does not take away actual value from the data. ...


Copyright © 2005 KDnuggets.