KDnuggets » Forums

Use of Unsupervised models for fraud detection

 
www.kdnuggets.com Forum Index -> Classification & Clustering
prakash.sridharan
Contributor


Joined: 16 Jan 2008
Posts: 5
Location: Mumbai

Posted: Thu Jan 15, 2009 9:58 am    Post subject: Use of Unsupervised models for fraud detection

Hi,

Has anybody come across the use of cluster analysis or other unsupervised models for fraud detection? I'm facing a specific challenge: I need to use an unsupervised model to detect fraud, because the dataset does not have a target/dependent variable.

This is in the field of health insurance. I'm unable to divulge too many details, but the kind of variables we have are similar to those encountered in banking (credit card fraud, etc.).

Thanks in advance for your help.

Prakash

TimManns
Data Mining Guru


Joined: 25 Sep 2006
Posts: 37
Location: Sydney

Posted: Sun Jan 18, 2009 4:24 pm    Post subject: yes, try searching for info on 'network intrusion detection'

Over time or at a specific point in time?

- If over a period of time, you are looking for changing/new events.
Some of the early virus detection programs worked in this way (i.e., anything new on the system was considered a virus/problem).

I've done a couple of projects using Kohonen networks (self-organizing maps) to observe a large sample and build a model, then observe changes that occur in any individual. Kohonen was useful because you can fit the clusters on a 2D grid map and use flexible criteria about how many X or Y points an individual moves over time from their original cluster classification.

If an individual changes clusters, this may suggest a change in behaviour and/or fraudulent behaviour (a stolen credit card, etc.).
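
A rough sketch of the idea above, in plain Python: train a small self-organizing map on a reference sample, then compare the grid cell of an individual's earlier and later feature vectors. The function names (train_som, bmu, grid_shift, has_moved), the training schedule, and the default thresholds are all assumptions made for illustration and are not taken from the projects described in the post.

```python
import math
import random


def bmu(weights, x):
    """Return the (row, col) of the grid node whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for node, w in weights.items():
        d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        if d < best_d:
            best, best_d = node, d
    return best


def train_som(data, rows, cols, epochs=30, lr0=0.5, sigma0=None, seed=0):
    """Train a small rows x cols self-organizing map on a list of feature vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # Initialise every node with a copy of a randomly chosen sample.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                      # learning rate decays to 0
            sigma = sigma0 * (1.0 - frac) + 0.5 * frac   # neighbourhood shrinks to 0.5
            br, bc = bmu(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def grid_shift(weights, before, after):
    """Rows and columns an individual moved on the map between two observations."""
    (r0, c0), (r1, c1) = bmu(weights, before), bmu(weights, after)
    return abs(r1 - r0), abs(c1 - c0)


def has_moved(weights, before, after, max_rows=1, max_cols=1):
    """Flag a move of more than max_rows or max_cols cells (assumed thresholds)."""
    dr, dc = grid_shift(weights, before, after)
    return dr > max_rows or dc > max_cols


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic reference sample: two behaviour groups in two features.
    sample = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
    sample += [[rng.gauss(8, 1), rng.gauss(8, 1)] for _ in range(50)]
    som = train_som(sample, rows=4, cols=4)
    print(has_moved(som, [0.1, -0.2], [8.3, 7.9]))
```

The max_rows and max_cols arguments stand in for the "flexible criteria" mentioned above: allowing a move of one cell treats drift to a neighbouring cluster as normal and only flags larger jumps. Features would normally be scaled to comparable ranges before training.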

- If at a snapshot (a fixed point in time).
Just cluster the data. Any rows that are not easily clustered may be fraudulent or exceptional for some reason. Simple checking of outliers often helps too.
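
One possible reading of "rows that are not easily clustered", sketched in plain Python: the post does not name a clustering algorithm, so this borrows the notion of noise from density-based clustering (DBSCAN). A row is treated as noise if it is neither a dense "core" point nor within reach of one. The function name dbscan_noise, the eps and min_pts values, and the sample rows are assumptions for illustration, and the features are assumed to be on comparable scales already.

```python
import math


def dbscan_noise(rows, eps, min_pts):
    """Return the indices of rows that DBSCAN would label as noise.

    A row is a core point if at least min_pts rows (itself included) lie
    within distance eps of it. A row is noise if it is not a core point
    and has no core point within eps.
    """
    n = len(rows)
    # neighbours[i] holds the indices within eps of row i, including i itself.
    neighbours = [
        [j for j in range(n) if math.dist(rows[i], rows[j]) <= eps]
        for i in range(n)
    ]
    core = {i for i in range(n) if len(neighbours[i]) >= min_pts}
    noise = []
    for i in range(n):
        if i in core:
            continue
        if not any(j in core for j in neighbours[i]):
            noise.append(i)
    return noise


# Hypothetical snapshot: two dense groups of rows plus one isolated row.
rows = [
    (1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),
    (9.0, 1.0),
]
suspects = dbscan_noise(rows, eps=0.5, min_pts=3)
print(suspects)  # prints [8]
```

Rows flagged this way are only candidates for review; as the post says, they may be exceptional for reasons other than fraud.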

Cheers

Tim

r_bhatt



Joined: 20 Jan 2009
Posts: 1

Posted: Tue Jan 20, 2009 7:50 am    Post subject: Unsupervised models for fraud detection

Prakash:

One could use fuzzy logic and clustering to solve the problem.

First, you should create variables that measure potentially abnormal patterns in claims behavior. Examples of such variables could be:
- Number of claims filed through the physician/service provider in the past month/3 months/6 months
- Number of claims filed by medical condition code divided by the historical average number of claims for that medical condition code
- Number of claims filed by customer ID
- Disparity features that measure the disparity between age, gender, occupation and medical condition code (e.g., osteoporosis claims filed by a 20-year-old male student) -- this has to be done by medical condition code
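
As an illustration of the disparity idea in the last bullet, here is a minimal sketch for one attribute only (age): each claimant's age is compared with the ages of other claimants under the same medical condition code. The field names (condition_code, age), the code "OSTEO", the sample ages, and the use of an absolute z-score are assumptions for illustration; categorical attributes such as gender or occupation would need a different measure.

```python
from collections import defaultdict
from statistics import mean, pstdev


def age_disparity(claims):
    """Absolute z-score of each claimant's age within their condition code."""
    ages = defaultdict(list)
    for claim in claims:
        ages[claim["condition_code"]].append(claim["age"])
    # Mean and population standard deviation of age per condition code.
    stats = {code: (mean(a), pstdev(a)) for code, a in ages.items()}
    scores = []
    for claim in claims:
        mu, sd = stats[claim["condition_code"]]
        # A code with identical ages has no spread, so no disparity is reported.
        scores.append(abs(claim["age"] - mu) / sd if sd > 0 else 0.0)
    return scores


# Hypothetical claims under one condition code; the last claimant is unusually young.
claims = [
    {"condition_code": "OSTEO", "age": 70},
    {"condition_code": "OSTEO", "age": 72},
    {"condition_code": "OSTEO", "age": 68},
    {"condition_code": "OSTEO", "age": 74},
    {"condition_code": "OSTEO", "age": 66},
    {"condition_code": "OSTEO", "age": 20},
]
disparity = age_disparity(claims)
most_unusual = max(range(len(claims)), key=lambda i: disparity[i])
print(claims[most_unusual])
```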

Each claim could then be scored on a percentile basis on each variable (looking back 3-6 months). You could then cluster the claims on these percentile scores to find outliers, or use simple scoring algorithms based on business understanding.
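
A minimal sketch of the percentile scoring step, assuming the variables have already been computed per claim. The field names (provider_claims_3m, condition_ratio, customer_claims), the sample values, and the choice of a plain average as the "simple scoring algorithm" are assumptions for illustration; a real scheme would weight the variables according to business understanding.

```python
from bisect import bisect_right


def percentile_ranks(values):
    """Percentile rank (0-100) of each value: the share of values <= it."""
    ordered = sorted(values)
    n = len(ordered)
    return [100.0 * bisect_right(ordered, v) / n for v in values]


def score_claims(claims, features):
    """Attach per-feature percentiles and their plain average to each claim."""
    columns = {f: percentile_ranks([c[f] for c in claims]) for f in features}
    scored = []
    for i, claim in enumerate(claims):
        pct = {f: columns[f][i] for f in features}
        scored.append({
            "claim_id": claim["claim_id"],
            "percentiles": pct,
            "score": sum(pct.values()) / len(pct),
        })
    return scored


# Hypothetical claims with three of the variables described above.
FEATURES = ["provider_claims_3m", "condition_ratio", "customer_claims"]
claims = [
    {"claim_id": "C1", "provider_claims_3m": 12, "condition_ratio": 0.9, "customer_claims": 1},
    {"claim_id": "C2", "provider_claims_3m": 15, "condition_ratio": 1.1, "customer_claims": 2},
    {"claim_id": "C3", "provider_claims_3m": 14, "condition_ratio": 1.0, "customer_claims": 3},
    {"claim_id": "C4", "provider_claims_3m": 90, "condition_ratio": 4.2, "customer_claims": 9},
]
scored = score_claims(claims, FEATURES)
top = max(scored, key=lambda s: s["score"])
print(top["claim_id"], top["score"])  # prints C4 100.0
```

The per-feature percentiles could also be fed to a clustering step instead of being averaged, since they put every variable on the same 0-100 scale.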

You can contact me (raj at knowledgefoundry.net) if you need more details.

cheers

clifton.phua



Joined: 10 Jul 2007
Posts: 3

Posted: Tue Jan 20, 2009 8:27 am    Post subject: peer group analysis, spike detection, anomaly detection

If your purpose is to detect fraud in real time, you can probably try peer group analysis, spike detection, or anomaly detection approaches (they have been applied to various kinds of fraud detection before). However, you still need class labels to evaluate your models/algorithms.

prakash.sridharan
Contributor


Joined: 16 Jan 2008
Posts: 5
Location: Mumbai

Posted: Fri Jan 23, 2009 1:31 am

Raj,

Thank you very much for your suggestion. It's a very interesting idea. I would like to learn more about it. I'll contact you.

Clifton,
This is the approach we are exploring at the moment.

Thanks all for your suggestions.

All times are GMT - 5 Hours

Copyright © 2012 KDnuggets.