KDnuggets » Forums

Help with some definitions

 
www.kdnuggets.com Forum Index -> Data Mining Beginners
carnifex



Joined: 04 Oct 2008
Posts: 2

Posted: Sat Oct 04, 2008 11:42 am    Post subject: Help with some definitions

Hi everybody.
I'm a university student and, as part of my thesis, I need to define some KDD concepts and compare the different interpretations (where there is more than one definition).

I'd very much appreciate help defining the following concepts.

1) What is a model?
2) How would you classify the models (I mean, in what subclasses/subsets would you organize all models)?
3) Do you agree with this definition of a "classification model": a model that is produced as output by a classification algorithm?
4) What is a data mining method (used by a data mining algorithm)?
5) When it comes to data mining, do you think mostly in terms of data or rather in terms of task?
I mean, do you prefer to define an algorithm from a functional point of view (input/output) or in some other way, for example by its purpose or its task?

Thank you in advance,
Emanuele Storti
TimManns
Data Mining Guru


Joined: 25 Sep 2006
Posts: 37
Location: Sydney

Posted: Tue Oct 07, 2008 5:21 pm    Post subject:

Hello Emanuele,

Here's my feedback from an industry (telco) marketing point of view. I'm probably old-fashioned: I typically use only neural nets, decision trees, and k-means. I will be trying some SVM later.

1) A 'model' is any calculation or formula that produces an output (column/field) that generates or can be used to generate new information.
Typically we use neural nets or decision trees, but sometimes simple business rules or math to create a score.

2) I personally classify based upon how I use the model output score.

I think of any model as a 'classification model' when it is used to determine one of a predetermined selection of outcomes (e.g., I could predict churn yes/no, sales $, fraud 0/1, etc.). To keep things simple for the business, this includes numeric outputs and scores (as long as the model classifies a predetermined outcome such as churn).

The term 'segmentation model' is used when the model score is used to group customers together into homogeneous groups (commonly behavioural segmentation).

3) Depends what it's used for, but probably 'yes'.

4) Can't answer; the question is too vague. Needs clarification.

5) Purpose. The type of data used and the type of data/score output by the model are also very important in determining if and how it would be used. Scalability with large datasets determines whether it can be used more than anything else.

In our business we (even data miners) talk of a 'churn model', 'credit risk model', or 'up-sell model'. Yes, 'we' the analysts choose what data and models to use to accomplish this, but the wider community within the business knows nothing of 'neural nets' or 'classification models' etc.
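To make the distinction above concrete, here is a minimal Python sketch of a "model" in this broad sense: a simple business rule that produces a new score column, which is then used either to classify (churn yes/no) or to segment customers. The field names, point values, thresholds, and helper names are all hypothetical, and the fixed spend bands are only a stand-in for a real segmentation technique such as k-means.

```python
# Hypothetical customer records; all field names and values are made up.
customers = [
    {"id": 1, "monthly_spend": 80.0, "complaints": 0, "months_active": 36},
    {"id": 2, "monthly_spend": 15.0, "complaints": 3, "months_active": 4},
    {"id": 3, "monthly_spend": 45.0, "complaints": 1, "months_active": 12},
]


def churn_score(c):
    """A 'model' as a plain business rule: returns risk points from 0 to 10."""
    points = 0
    if c["complaints"] >= 2:
        points += 5
    if c["months_active"] < 6:
        points += 3
    if c["monthly_spend"] < 20:
        points += 2
    return points


def classify_churn(c, threshold=5):
    """Classification use: map the score to a predetermined outcome."""
    return "yes" if churn_score(c) >= threshold else "no"


def spend_segment(c):
    """Segmentation use: group customers by behaviour (fixed bands here)."""
    if c["monthly_spend"] >= 60:
        return "high"
    if c["monthly_spend"] >= 30:
        return "medium"
    return "low"


# Each model adds a new output column to the customer record.
scored = [
    dict(c, churn=classify_churn(c), segment=spend_segment(c))
    for c in customers
]

for row in scored:
    print(row["id"], row["churn"], row["segment"])
```

The same score could feed either use; what makes it a 'classification' or 'segmentation' model in this view is how the output is consumed, not the technique that produced it.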

Hope that helps

Tim
carnifex



Joined: 04 Oct 2008
Posts: 2

Posted: Wed Oct 08, 2008 3:44 am    Post subject:

Thank you, Tim. Your reply helped me better understand the main differences between the academic and the business points of view on data mining.

About question four:
4) What is a data mining method (used by a data mining algorithm)?

I often see authors using the term "method" with different meanings: some data miners use it as a synonym for "algorithm" (tree methods, neural net methods, etc.), while others use it with a different meaning (for example, as shorthand for a computational method or a search method, like "greedy search" or "steepest descent", i.e. the way an algorithm works to generate the best model by optimizing some score function).

This makes the term quite ambiguous. In sum, I think a common interpretation of this term can hardly be found, unlike the terms "task" and "algorithm", whose definitions are generally accepted.
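As a rough illustration of the second meaning, the Python sketch below keeps the model structure (y = w * x) and the score function (sum of squared errors) fixed and swaps only the search method. Steepest descent is the one named above; the exhaustive grid search is my own addition, chosen purely as a simple contrast. The data, learning rate, and function names are assumptions made up for this example.

```python
# Toy data, roughly y = 2x; the values are invented for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]


def score(w):
    """Score function: sum of squared errors of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys))


def steepest_descent(w=0.0, lr=0.01, steps=500):
    """Search method 1: repeatedly step against the gradient of the score."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys))
        w -= lr * grad
    return w


def grid_search(lo=0.0, hi=4.0, n=4001):
    """Search method 2: evaluate the score on a fixed grid, keep the best."""
    candidates = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return min(candidates, key=score)


w_descent = steepest_descent()
w_grid = grid_search()
print(w_descent, w_grid)
```

Both searches end up at about the same parameter value, which suggests that "method" in this sense is a separate component from both the model and the score function.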

Emanuele