| Author |
Message |
Topic: Question about data processing |
editor
Replies: 2
Views: 85
|
Forum: Data Mining Beginners Posted: Mon Jun 03, 2013 6:24 am Subject: Converting categorical (nominal) features to boolean |
This conversion is standard when applying many types of machine learning algorithms, such as neural nets, which cannot easily handle nominal values.
For example, say you have a feature COLOR, and it ... |
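A minimal sketch of the nominal-to-boolean conversion, in plain Python. The COLOR values used here ("red", "green", "blue") and the helper name `one_hot` are assumptions for illustration only; the original post is truncated and does not list them.

```python
def one_hot(feature_name, values):
    """Turn a list of nominal values into one boolean column per category."""
    # Sort the distinct categories so the column order is deterministic.
    categories = sorted(set(values))
    rows = [
        {f"{feature_name}={category}": (value == category) for category in categories}
        for value in values
    ]
    return categories, rows


# Hypothetical sample data: the actual COLOR values are not given in the post.
colors = ["red", "green", "blue", "green"]
categories, rows = one_hot("COLOR", colors)

for row in rows:
    print(row)
```

Each record ends up with exactly one true indicator, which is the form most numeric learners expect.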
Topic: data mining on POI data |
editor
Replies: 1
Views: 100
|
Forum: Data Mining Open Forum Posted: Tue May 21, 2013 4:18 pm Subject: POIs (point of interest) clustering |
Weka clustering is not the right tool here; you need some form of geographic clustering,
see
http://stackoverflow.com/questions/10108368/detecting-geographic-clusters
http://www.rise-group.org/risem/clusterpy/ |
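As a rough illustration of what "geographic" clustering means, here is a sketch that groups POIs by great-circle (haversine) distance using simple single-linkage with a distance threshold. This is not the method from either link above, only a small stand-in; the coordinates, the 5 km threshold, and the function names are all assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def cluster_pois(points, max_km):
    """Single-linkage clustering: points within max_km end up in one cluster."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if haversine_km(points[i], points[j]) <= max_km:
                parent[find(i)] = find(j)

    # Relabel cluster roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]


# Hypothetical POIs: two in central London, two in central Paris.
pois = [(51.5074, -0.1278), (51.5155, -0.0922), (48.8566, 2.3522), (48.8606, 2.3376)]
labels = cluster_pois(pois, max_km=5.0)
print(labels)
```

The pairwise loop is quadratic, so for a large POI set a spatial index or a dedicated library would be the better choice.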
Topic: Please advise how I can obtain data |
editor
Replies: 2
Views: 92
|
Forum: Data Mining Beginners Posted: Tue May 21, 2013 4:12 pm Subject: data mining to find ethnic groups |
It is not easy, but it is technically possible - one can identify many Jewish and Muslim family names, and you can probably buy a spam email list covering the UK. But there would be many false positives.
M ... |
Topic: Creating a corpus of online news |
editor
Replies: 2
Views: 165
|
Forum: Text Mining, Web Mining, Association Rules, and Other Algorithms Posted: Fri May 03, 2013 4:25 pm Subject: Creating a corpus of online news |
Check the GDELT database: Global Data on Events, Location and Tone, which is an amazing tool for data journalists.
http://www.kdnuggets.com/2013/04/gdelt-global-data-on-events-location-tone.html |
Topic: Application Recommendation |
editor
Replies: 1
Views: 150
|
Forum: Data Mining Applications Posted: Fri May 03, 2013 4:24 pm Subject: Data mining answer to Wolfram Alpha |
| What you are looking for - automatically show basic visualizations, display significant correlations, and run some basic tests of significance between different variable interactions - you can get som ... |
Topic: An up to date keyword set on global news |
editor
Replies: 1
Views: 314
|
Forum: Data Mining Beginners Posted: Fri May 03, 2013 4:20 pm Subject: English keyword set |
Have you tried Google N-grams? You can download their data from
http://storage.googleapis.com/books/ngrams/books/datasetsv2.html |
Topic: Theoretical aspects of DM? HELP! |
editor
Replies: 2
Views: 477
|
Forum: Data Mining Beginners Posted: Tue Apr 23, 2013 6:52 am Subject: Books to learn data mining / data science |
It does not matter which book you start with.
Elements of Statistical Learning is a very good book, but more theoretical than the others.
I also like the Weka 3 book and
Learning from Data - see http://wor ... |
Topic: Which course is more important for data mining? |
editor
Replies: 3
Views: 382
|
Forum: Data Mining Beginners Posted: Sun Apr 21, 2013 9:37 am Subject: statistics courses |
Statistics courses are more important for data mining than analysis of algorithms. Also check the free courses on machine learning from Coursera
and Caltech |
Topic: Pursuing an education in data mining/analytics |
editor
Replies: 1
Views: 493
|
Forum: Data Mining Open Forum Posted: Tue Apr 16, 2013 7:05 pm Subject: Education in Analytics / Data Mining |
There are many free courses on analytics / data mining - see
http://www.kdnuggets.com/2013/02/online-courses-statistics-machine-learning-analytics.html
http://www.kdnuggets.com/2013/04/caltech-fr ... |
Topic: Using Association Rules to predict future sales (income) |
editor
Replies: 4
Views: 1445
|
Forum: Data Mining Open Forum Posted: Tue Apr 09, 2013 8:06 pm Subject: Predicting future sales values |
| Yes, there is a huge body of work on predicting time series values, and there are lots of textbooks and courses on it. Neural networks and other specialized methods can work well. |
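Before trying neural networks, it can help to have a naive baseline to compare against. The sketch below is a plain moving-average forecast, not one of the specialized methods mentioned above; the sales figures, the window size, and the function name are assumptions for illustration.

```python
def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window


# Hypothetical monthly sales figures.
sales = [100.0, 120.0, 110.0, 130.0]
forecast = moving_average_forecast(sales, window=3)
print(forecast)
```

Any more elaborate model should at least beat this kind of baseline on held-out data.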
| |