carnifex
Joined: 04 Oct 2008 Posts: 2
|
Posted: Sat Oct 04, 2008 11:42 am Post subject: an help with some definitions |
|
|
Hi everybody.
I'm a university student and, as part of my thesis, I have to define some KDD concepts and compare the different interpretations (where there is more than one definition).
I'd very much appreciate any help in defining the following concepts.
1) What is a model?
2) How would you classify the models (I mean, in what subclasses/subsets would you organize all models)?
3) Do you agree with this definition of a "classification model": a model that is produced as output by a classification algorithm?
4) What is a data mining method (used by a data mining algorithm)?
5) Regarding data mining, do you think mostly in terms of data or rather in terms of tasks?
I mean, do you prefer to define an algorithm from a functional point of view (input/output) or in some other way, for example by its purpose or its task?
Thank you in advance,
Emanuele Storti |
|
|
 |
TimManns Data Mining Guru
Joined: 25 Sep 2006 Posts: 37 Location: Sydney
|
Posted: Tue Oct 07, 2008 5:21 pm Post subject: |
|
|
Hello Emanuele,
Here's my feedback from an industry (telco) marketing point of view. I'm probably old-fashioned: I typically use only neural nets, decision trees, and k-means. I will be trying some SVMs later.
1) A 'Model' is any calculation or formula that produces an output (column/field) that generates or can be used to generate new information.
Typically we use Neural nets or decision trees, but sometimes simple business rules or math to create a score.
2) I personally classify based upon how I use the model output score.
I think of any model as a 'classification model' when it is used to determine one of a pre-determined selection of outcomes (e.g., I could predict churn yes/no, sales $, fraud 0/1, etc.). To keep things simple for the business, this includes numeric outputs and scores (as long as the model classifies a predetermined outcome such as churn).
The term 'segmentation model' is used when the model score is used to group customers together into homogeneous groups (commonly behavioural segmentation).
3) It depends on what it's used for, but probably 'yes'.
4) I can't answer; the question is too vague. It needs clarification.
5) Purpose. The type of data used and the type of data/score output by the model are also very important in determining if and how it would be used. More than anything else, scalability to large datasets determines whether it can be used.
In our business we (even data miners) talk of a 'churn model', a 'credit risk model', an 'up-sell model'. Yes, 'we' the analysts choose what data and models to use to accomplish this, but the wider community within the business knows nothing of 'neural nets' or 'classification models' etc.
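As a rough illustration of answers 1 and 2 (not code from this thread), the sketch below treats a "model" as any calculation that adds a new output column to the data. The customer fields, thresholds, and function names are all invented for the example. One simple business rule is used as a 'classification model' (it picks one of a pre-determined set of outcomes) and another as a 'segmentation model' (its output groups similar customers).

```python
# Hypothetical customer records; field names and values are invented.
customers = [
    {"id": 1, "monthly_calls": 5, "months_active": 2},
    {"id": 2, "monthly_calls": 120, "months_active": 36},
    {"id": 3, "monthly_calls": 8, "months_active": 30},
    {"id": 4, "monthly_calls": 95, "months_active": 4},
]


def churn_rule(customer):
    """'Classification model' in the thread's sense: a simple business rule
    that maps a record to one of a pre-determined set of outcomes."""
    low_usage = customer["monthly_calls"] < 10    # assumed threshold
    new_customer = customer["months_active"] < 6  # assumed threshold
    return "yes" if (low_usage and new_customer) else "no"


def usage_segment(customer):
    """'Segmentation model': the output is used to group similar customers."""
    return "heavy" if customer["monthly_calls"] >= 50 else "light"


def score(records):
    """Apply both models, adding one new output column per model."""
    return [
        {**r, "churn": churn_rule(r), "segment": usage_segment(r)}
        for r in records
    ]


scored = score(customers)
for row in scored:
    print(row["id"], row["churn"], row["segment"])
```

In practice the rule functions would be replaced by a trained neural net, decision tree, or k-means assignment; the point of the sketch is only that the business sees the output columns ('churn', 'segment'), not the technique behind them.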
Hope that helps
Tim |
|
|
 |
carnifex
Joined: 04 Oct 2008 Posts: 2
|
Posted: Wed Oct 08, 2008 3:44 am Post subject: |
|
|
Thank you, Tim. Your reply helped me to better understand the main differences between the academic and the business points of view on data mining.
About question four:
4) What is a data mining method (used by a data mining algorithm)?
I often read authors using "method" with different meanings: some data miners use the term as a synonym for "algorithm" (tree methods, neural-net methods, etc.), while others use it as a shortcut for "computational method" or "search method" (such as "greedy search" or "steepest descent"), that is, the way an algorithm works to generate the best model by optimizing some score function.
This makes the term quite ambiguous. In sum, I think a common interpretation of this term can hardly be found, unlike the terms Task or Algorithm, whose definitions are generally accepted.
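To illustrate the second reading of "method", here is a minimal sketch, under assumptions of my own, that separates the pieces: the model structure (a straight line), the score function (mean squared error), and the search method (steepest descent) that optimizes the score. The data, learning rate, and step count are invented for the example.

```python
def score(w, b, xs, ys):
    """Score function: mean squared error of the line y = w*x + b."""
    n = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n


def steepest_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Search method: repeatedly step against the gradient of the score."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the mean squared error w.r.t. w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (assumed for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = steepest_descent(xs, ys)
print(round(w, 3), round(b, 3), score(w, b, xs, ys))
```

Swapping `steepest_descent` for a different search (for example a greedy one) would change the method but not the model structure or the score function, which is one way to keep the terms apart.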
Emanuele |
|
|
 |