KDnuggets » Forums

Do you use a database for storing data for data mining?

 
www.kdnuggets.com Forum Index -> Data Mining Open Forum
editor
Site Admin


Joined: 04 Oct 2005
Posts: 120
Location: Boston, MA

Posted: Sun Jun 11, 2006 6:11 pm    Post subject: Do you use a database for storing data for data mining?

Does the power of a database system outweigh the extra complexity?
What is your experience?
gabrielac
Contributor


Joined: 16 Feb 2006
Posts: 8

Posted: Tue Jul 04, 2006 8:26 am

I often use a database for preprocessing data before actually using it for data mining. Depending on the data mining software, I finally end up using either a text file, or a table with the final preprocessing results.
TimManns
Data Mining Guru


Joined: 25 Sep 2006
Posts: 37
Location: Sydney

Posted: Sun Oct 08, 2006 8:22 pm    Post subject: database used for storing and pre-processing

I guess it depends largely on the amount of data, the available time frame and the subsequent use of the results.

I work for a telecommunications company, and we routinely analyse a few million customers. Our analysis is quite ad hoc, so having a fixed datamart is not practical. For scalability reasons we must do the preprocessing within the database, making use of the indexed and organised nature of the database. We process and summarise billions of rows of data into a single row per customer/account (with a few hundred columns), then sample those rows and use the sample for model building and testing.
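To make the "summarise into one row per customer, then sample" step concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the poster's actual setup: the call_events table, its columns and the tiny data set are all hypothetical, and a production system would use a server database and far more summary columns.

```python
import random
import sqlite3

# Hypothetical schema: one row per call event (names are illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE call_events ("
    "customer_id INTEGER, duration_sec INTEGER, is_international INTEGER)"
)
# An index on the grouping key lets the database organise the scan.
conn.execute("CREATE INDEX idx_events_customer ON call_events (customer_id)")

events = [
    (1, 60, 0), (1, 120, 1),
    (2, 30, 0),
    (3, 300, 0), (3, 45, 1), (3, 15, 0),
]
conn.executemany("INSERT INTO call_events VALUES (?, ?, ?)", events)

# Summarise inside the database: many event rows become one row per customer.
conn.execute(
    """
    CREATE TABLE customer_summary AS
    SELECT customer_id,
           COUNT(*)              AS n_calls,
           SUM(duration_sec)     AS total_duration_sec,
           SUM(is_international) AS n_international
    FROM call_events
    GROUP BY customer_id
    """
)

summary = conn.execute(
    "SELECT * FROM customer_summary ORDER BY customer_id"
).fetchall()

# Only the small summary table leaves the database; draw a reproducible
# sample of customers from it for model building and testing.
rng = random.Random(0)
sample = rng.sample(summary, k=2)
```

The point of the design is that the heavy GROUP BY runs where the data lives, and only the per-customer rows (or a sample of them) are pulled out for modelling.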

Model building on smaller samples is conducted via flat text files. Often the whole customer base is then scored using a predictive model converted into SQL. Occasionally the customer base data is extracted from the database, scored to a text file and then loaded back into the database.
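One way a "predictive model converted into SQL" might look is sketched below: a small decision tree is translated into a nested CASE expression so the whole table can be scored inside the database. The tree, its thresholds, the tree_to_sql helper and the customer_summary columns are all made up for illustration; the post does not say which model type or conversion tool was used.

```python
import sqlite3

# Hypothetical, hand-written decision tree; thresholds and columns are made up.
TREE = {
    "column": "n_calls", "threshold": 2,
    "low": {"score": 0.1},                      # n_calls <= 2
    "high": {                                   # n_calls > 2
        "column": "total_duration_sec", "threshold": 200,
        "low": {"score": 0.4},
        "high": {"score": 0.8},
    },
}


def tree_to_sql(node):
    """Translate a tree node into a (possibly nested) SQL CASE expression."""
    if "score" in node:
        return repr(float(node["score"]))
    # Column names come from the trusted model definition, not from user input.
    return (
        f"CASE WHEN {node['column']} <= {node['threshold']} "
        f"THEN {tree_to_sql(node['low'])} "
        f"ELSE {tree_to_sql(node['high'])} END"
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_summary ("
    "customer_id INTEGER, n_calls INTEGER, total_duration_sec INTEGER)"
)
conn.executemany(
    "INSERT INTO customer_summary VALUES (?, ?, ?)",
    [(1, 2, 180), (2, 5, 150), (3, 3, 360)],
)

# Score every customer in one statement, without extracting any rows.
score_sql = tree_to_sql(TREE)
conn.execute(
    "CREATE TABLE customer_scores AS "
    f"SELECT customer_id, {score_sql} AS score FROM customer_summary"
)
scores = dict(conn.execute("SELECT customer_id, score FROM customer_scores"))
```

The resulting customer_scores table stays in the database, which matches the workflow where downstream systems read the scores directly from there.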

The final database scored table is used by numerous CRM systems and marketing campaigns, which rely on the data being within the database.

If we didn't use the database for preprocessing we wouldn't be able to analyse the data within a single working day (it would take weeks rather than hours).

Cheers

Tim