KDnuggets™ News 13:n03, Feb 13
Features (9) | Software (6) | Webcasts (2) | Courses, Events (3) | Jobs (6) | Meetings (5) | Academic (1) | Competitions (4) | Publications (10) | NewsBriefs (4) | CFP (14) | Quote
Features
- Data Scientist Valentine's Day Adjustment - Feb 12, 2013.
Data Scientists may have the sexiest profession of the 21st century, but how do they solve their Valentine's Day problem? New KDnuggets Cartoon examines a solution.
- New Poll: Analytics/DM Salary? - Feb 11, 2013.
The new KDnuggets Poll asks: if you are in the Analytics/Data Science/Data Mining field, what is your annual salary/income? See past results and trends and vote, and we will publish the latest results and trends.
- Gartner Magic Quadrant for Business Intelligence and Analytics Platforms - Feb 7, 2013.
In 2012 Data Discovery became a mainstream part of BI and analytic architecture. The market also saw increased activity in real time, content and predictive analytics. The leaders in the market include Microsoft, IBM, Tableau, QlikTech, Oracle, SAS, MicroStrategy, Tibco Spotfire, Information Builders, and SAP.
- Data Science Top Influencers - Jan 30, 2013.
Data Science A-list - latest update: @analyticbridge, @derrickharris, @kdnuggets, @rwang0, @gilpress, @cloud_aware, @swgoof, @data_nerd, and @al3xandru.
- PAW San Francisco 2013 Features Speakers from Leading Companies - Feb 12, 2013.
Predictive Analytics World is returning to San Francisco, April 14-19, 2013. This year's agenda is packed with sessions by industry luminaries from top performing companies. Early bird rate until Feb 22 and special KDnuggets discount!
- Top news for Feb 3-9: KDD-2013 Call for Papers; Microsoft F# for Big Data programming - Feb 10, 2013.
KDD-2013 Call for Papers; Microsoft F# for Big Data programming; Salford: The Evolution of Regression
Top jobs: Data Analytics Position at Crum and Forster; Senior Communications Officer at Bill and Melinda Gates Foundation.
- Top news for Jan 27 - Feb 2: 5 years to become a data scientist; Particle Physics and Recommendation Engines - Feb 3, 2013.
Poll Results: How long to become a good data scientist; How Particle Physics Is Improving Recommendation Engines; Free Online Data Science course
Top jobs: Data Scientist at LivingSocial; Data Mining Researcher at Stevens Capital Management
- Top news, jobs in January: Exclusive: Interview with Chief Scientist Obama 2012 Campaign; 5 years to data scientist - Feb 1, 2013.
KDnuggets Exclusive: Interview with Rayid Ghani, Chief Scientist Obama 2012 Campaign; 5 years to become a good data scientist? The Big Data Landscape
Top jobs: Statistical Modeler/Data Mining Consultant at Objectifi; Senior Data Scientist - Algorithms/Analytics (Recent Grad) at Netflix
- Additions to KDnuggets in January - Feb 1, 2013.
Many new education options for analytics and data mining; data mining cartoons; new companies, datasets, education, meetings, publications, software, and solutions.
Software
- Provalis Research - a Leader in Text Analytics; Hurwitz Text Analytics Victory Index - Feb 8, 2013.
Hurwitz and Associates Victory Index recognized Provalis as a leader in Text Analytics, excelling in product value and ROI. The victors include IBM, SAS, OpenText, Clarabridge, and Attensity.
- DEAP 0.9: Distributed Evolutionary Algorithms in Python - Feb 7, 2013.
DEAP (Distributed Evolutionary Algorithms in Python) is a novel evolutionary computation framework for rapid prototyping and testing of ideas. DEAP 0.9.0 has many improvements, including SCOOP, a distributed task module allowing concurrent parallel programming.
- Lavastorm Analytics Survey - Win an iPad Mini, see results - Feb 6, 2013.
Some of the results so far: The biggest struggle these folks are having is being able to manipulate and integrate the data; Over 60% of these respondents will be increasing their investment in analytics in 2013.
- 3 Generations of Machine Learning and Data Mining Tools - Feb 6, 2013.
Three different paradigms are available for implementing Machine Learning (ML) algorithms, both from the literature and from the open source community.
- BabelNet 1.1: an encyclopedic dictionary and a multilingual ontology - Feb 5, 2013.
BabelNet was created by mapping Wikipedia to the most popular computational lexicon of English (WordNet), producing an "encyclopedic dictionary" that provides babel synsets - concepts and named entities in many languages, connected with semantic relations.
- Microsoft F# for Big Data programming - Feb 4, 2013.
Microsoft's F# language is geared to parallel programming and data-oriented problem-solving. The F# language was carefully designed to facilitate data-oriented problem-solving and reduce bugs in data manipulation.
Webcasts
- On-Demand Webcast: Analytically Speaking featuring John Sall - Feb 12, 2013.
SAS co-founder and executive VP John Sall talks about the discipline of statistics as a framework for uncovering phenomena, on the joy of discovery, and the impact of statistics on our lives.
- Salford: The Evolution of Regression: Hands-on Webinar Series, Mar 1, 15, 29, Apr 12 - Feb 6, 2013.
Regression is one of the most popular modeling methods, but the classical approach has significant problems. This free webinar series will cover improvements to conventional and logistic regression, including modern ensemble and data mining approaches, and will be of value to any classically trained statistician or modeler.
Courses, Events
- FREE VIDEO COURSE: Accounts Receivable Recovery and Collections Analytics - Feb 12, 2013.
Enroll in the Accounts Receivable Recovery and Collections Analytics 4-part video course and learn how to use collections analytics to improve your A/R recovery.
- Discover the power of business analytics - Feb 11, 2013.
The SAS Business Knowledge Series offers courses for all levels of data miners and data scientists, where you can learn advanced processes and state-of-the-art techniques from top experts. Spring classes are filling quickly; register today!
- SFBayACM Course: Practical Data Visualization with R, Mar 9, San Jose, CA - Feb 5, 2013.
This day-long workshop will provide a practical review of R's major graphing capabilities, including base graphics and the newer lattice and ggplot2 packages. The speaker is a leading researcher and practitioner, and author of the book "R in Action".
Jobs
- Research Staff at NEC Labs, Princeton, NJ - Feb 12, 2013.
NEC Labs America conducts research in the area of large-scale complex systems, creating innovative analytics from big data to simplify and automate the management of physical systems (automobiles, power plants, etc.), as well as large-scale IT systems and services.
- Manager, Data Analytics at Central Michigan U, Mount Pleasant, MI - Feb 11, 2013.
Design, analyze, and implement Institutional Research projects and provide analytical support for assessing academic and administrative operations.
- Data Analytics and Optimization at ExxonMobil, Clinton, NJ - Feb 7, 2013.
Join our research team to develop machine learning, statistical signal processing and optimization algorithms to solve challenging problems involving real-world physics, chemical and engineering data sets and models.
- Senior Communications Officer at Bill & Melinda Gates Foundation, Seattle, WA - Feb 5, 2013.
Guided by the belief that every life has equal value, the Gates Foundation works to help all people lead healthy, productive lives. This role will produce the next version of gatesfoundation.org website, and help develop a digital analytics approach.
- Data Analytics Position at Crum & Forster, Morristown, NJ (Regional office possible) - Feb 4, 2013.
Crum & Forster, an insurance products & services company, is building a start-up Data Analytics team that will lead and conduct research & development initiatives.
- Lead Analytic Scientist, Pharma at Verisk Health, Salt Lake City area, UT - Feb 4, 2013.
Join a dynamic, growing company providing important tools to control fraud, waste and abuse in health care and experience the exciting growth of Utah's "Silicon Slopes" along the Wasatch Range.
Meetings
- Sentiment Analysis Symposium, New York, May 8, 2013 - Feb 6, 2013.
Attend the premier event covering social media and social intelligence for customer experience, finance, healthcare, market research, and media applications. Special discount for KDnuggets subscribers.
- PAW: Predictive Analytics World Toronto, March 18-21, 2013 - Feb 6, 2013.
Predictive Analytics World offers an unparalleled perspective on analytics, with 4 days of 20+ sessions, intensive workshops and extensive networking events for predictive analytics practitioners at every level of expertise. Special KDnuggets discount.
- KDD-2013 Call for Papers - Feb 5, 2013.
KDD-2013, 19th ACM SIGKDD Knowledge Discovery and Data Mining Conference, is the premier conference on data mining and data science. KDD is a dual track conference hosting both a research track and an industrial/government track. Abstracts due Feb 15.
- Big Data TechCon, Boston, Apr 8-10 - Feb 4, 2013.
Attend Big Data TechCon, April 8-10 in Boston, the HOW-TO conference for Big Data. Practical tutorials. Technical classes. Special KDnuggets discount. Early reg by Feb 22.
- Hadoop Innovation Summit, Feb 20-21, San Diego: Top Companies, KDnuggets discount - Feb 1, 2013.
The Hadoop Innovation Summit, February 20-21 in San Diego, is the ideal platform to share challenges and best practices with engineers and executives from top companies and pioneering Hadoop technologies. Exclusive offer for KDnuggets subscribers.
Academic/Research positions
- Doctoral scholarships in knowledge discovery / machine learning at DIPF, Frankfurt am Main, Germany - Feb 7, 2013.
Scholarships are granted for completing a doctoral thesis in CS with a strong focus on machine learning and knowledge discovery, applied to scientific and educational research publications. Applications due Feb 28, 2013.
Competitions
- Kaggle: Predicting Parkinson's Disease Progression with Smartphone Data - Feb 11, 2013.
Can we objectively measure the symptoms of Parkinson's disease with a smartphone? The Michael J. Fox Foundation developed a data collection app that uses smartphone sensors, and collected data from a group of Parkinson's patients and control subjects. Enter by Mar 26 and help millions of Parkinson's patients.
- KDD-2013 Cup competition proposals - Feb 9, 2013.
A good competition task should be practically useful and challenging, should not require extensive domain knowledge, and should have an objective evaluation function. Of particular interest are non-traditional tasks/data that require novel techniques and/or thoughtful feature construction.
- ACM SIGSPATIAL GISCUP Geo-Fencing Contest - Feb 4, 2013.
The contest will focus on geo-fencing - a virtual perimeter for a real-world geographic area, widely used in location-based services, such as advertisements and child location services.
- CrowdAnalytix: Predicting likelihood of Online purchases - Jan 30, 2013.
The goal of this modeling contest is to predict the likelihood of online purchases using visits and purchases ecommerce data. Prizes for the Top 5 contestants on the private test set leaderboard.
Publications
- NoSQL and Big Data Just Hype? Interview with MySQL Creator - Feb 10, 2013.
The whole thing with the "new NoSQL movement" started with a blog post from a Twitter employee that said MySQL was not good enough and they needed "something better", like Cassandra. 3 years later, Twitter is still using MySQL.
- Lufthansa and Pentaho: Big Data Analytics Interview - Feb 8, 2013.
Lufthansa is now able to aggregate and feed data into a management cockpit to analyze collected data for key decision-making purposes in the future, enabling the company to detect patterns in large amounts of data at a rapid speed.
- Big Data Myth and Big Data Velocity - Jan 31, 2013.
"There is only so much static data in the world as of today. The vast majority of new data is arriving from a high velocity source", says VoltDB CTO Scott Jarr in an interview with Roberto Zicari.
- Big Data: Trends, Strategies, and SAP Technology - Jan 30, 2013.
IDC white paper discusses the emerging technologies of the Big Data movement, breaks them out by roles and use cases, and identifies the relevant SAP technology.
- Top KDnuggets tweets, Feb 8-10: 53 Billion anon requests available; When SVM trump other classification methods - Feb 11, 2013.
Big Click Data available: 53 Billion anon HTTP reqs; When SVM trump other classification methods - good explanation; Report on Coursera online course in data analysis; Twitter can be used to predict if the user is likely to become sick, track flu spread
- Top KDnuggets tweets, Feb 6-7: Why Facebook Graph Search really matters; Can Kaggle competition be automated? - Feb 8, 2013.
Why Facebook Graph Search really matters: it combines #BigData and NLP; Can Kaggle data mining competition winner be automated? 3 generations of tools for Machine Learning/Data Mining algorithms; interesting history in this paper "Big Data, the Phenomenon, the Term ..."
- Top KDnuggets tweets, Feb 4-5: A data scientist collection of useful and open datasets; Big Data For Dummies - Feb 6, 2013.
A data scientist collection of useful and open datasets; Never thought will see this, but ... "Big Data For Dummies"; David Brooks finds small anecdotes in Big Data - basketball players don't have hot streaks; "House of Cards" gives viewers exactly what #BigData says we want. This won't end well
- Top KDnuggets tweets, Feb 1-3: Postdoc training from academia to data science; What Nate Silver Gets Wrong - Feb 4, 2013.
6-wk postdoc training, bridges the gap between academia and a career in data; What Nate Silver Gets Wrong; Data journalism: 22 key links - free visualization tools, timelines; Astronomy vs. Data Science: Jobs: 335 vs 140K
- Top KDnuggets tweets, Jan 30-31: What separates a Great Data Scientist from a good one; SQL is not dead - New databases - Feb 1, 2013.
What separates a Great Data Scientist from a good one; SQL is not dead! New databases create new analytics opps for SQL pros; 8 brilliant minds on the future of online education and MOOCs; More Data or Better Algorithms? Better Data beats them both (but Better Questions beats all 3)
- Top KDnuggets tweets, Jan 28-29: KNIME 2.7 open-source analytics adds R; Machine Learning Cheat Sheet for scikit-learn - Jan 30, 2013.
KNIME 2.7, a leading open-source data/text mining suite, adds R; Machine Learning Cheat Sheet for python scikit-learn; Lessons from transition from Astronomer to a Data Scientist; What your statistical software says about you: R, Python, Julia, Stata, SPSS; Internet Explorer causes murders?
News Briefs
- MasterCard invests and partners with Mu Sigma - Feb 7, 2013.
MasterCard Advisors, a division of MasterCard, made an equity investment in Mu Sigma, a leading analytics firm. The firms are also partnering to develop and sell Big Data analytics solutions.
- Causata Customer Experience Platform offers Omni-Channel Offer Management, uses Machine Learning, Predictive Modeling - Feb 6, 2013.
New machine learning and improved predictive modeling capabilities help analysts and marketers act on terabytes of data for customer segmentation and real-time decisioning.
- Best Analytics Companies to work for - Feb 1, 2013.
The list of "best companies to work for" is led by Google and SAS, but there are relatively few other analytics-related companies on the list. Why is that?
- SiSense Big Data Cloud offering, support for Windows Azure - Jan 30, 2013.
Microsoft and SiSense team up to offer Big Data Analytics in the cloud.
CFP - Calls for Papers
- KDD-2013 W: KDD-2013: Knowledge Discovery and Data Mining, Workshop Proposals, due Feb 15
- KDD-2013: Knowledge Discovery and Data Mining, due Feb 15
- ISI: Intelligence and Security Informatics, focus on Big Data, due Feb 15
- KDD-2013 I/G: KDD-2013: Knowledge Discovery and Data Mining, Industry/Government track, due Feb 15
- LACRO 2013: Learning Objects Analytics for Collections, Repositories and Federations, due Feb 15
- IADIS-13: European Conf. on Data Mining 2013, due Feb 28
- KDBI - EPIA2013: Knowledge Discovery and Business Intelligence, due Mar 15
- ECML-PKDD 2013 T: ECML-PKDD 2013 tutorials, due Mar 29
- TIR-13: Text-Based Information Retrieval, due Mar 30
- GTM 2013: Global TechMining Conference, due Apr 1
- BSIC 2013: Behavior and Social Informatics and Computing, due Apr 6
- SOCIALCOM 2013: 2013 ASE/IEEE International Conference on Social Computing, due Jun 1
- UDSM-13: Uncovering Deception in Social Media, due Jun 15
- GISCUP 2013: GIS-focused algorithm competition, due Aug 1
Quote
Give me six hours to chop down a tree and I will spend the first four sharpening the axe. Abraham Lincoln, born Feb 12, 1809.
via Brainy Quote. Also applicable to the data preparation step in data mining.