KDnuggets™ News 13:n09, Apr 10
Features (11) | Software (6) | Webcasts (1) | Courses, Events (3) | Jobs (11) | Competitions (3) | Publications (6) | Tweets (6) | NewsBriefs (4) | CFP (16) | Quote
Features
- Webcast/Debate: Should Data Science Become a Profession and Self-regulate? - Apr 8, 2013.
Join/Watch on Google Hangout, Apr 10, 1 pm PT, 4 pm ET, with Gregory Piatetsky-Shapiro (KDnuggets), Eric Siegel (Predictive Analytics World) and Michael Walker (Rose Business Technologies) for a spirited discussion on data science as an independent profession with a code of conduct and self-regulation. Tag #DataScienceProfession.
- New Poll: Largest Dataset Analyzed / Data Mined? - Apr 6, 2013.
The new KDnuggets Poll asks: what was the largest dataset you analyzed / data mined? Please vote on www.kdnuggets.com and see the trends.
- KDnuggets Cartoon looks at IRS, Big Data, and Taxes - Apr 6, 2013.
It is tax paying season! This KDnuggets cartoon uses Big Data to help reduce the pain of paying taxes.
- Caltech Prof. Abu-Mostafa on what he learned from his MOOC course "Learning from Data" - Apr 5, 2013.
Caltech professor Yaser Abu-Mostafa on the goals for his MOOC "Learning from Data", and how online courses are transforming education.
- Caltech Prof. Abu-Mostafa on his MOOC course "Learning from Data" and Machine Learning - Apr 5, 2013.
KDnuggets talks with top Caltech professor Yaser Abu-Mostafa about his current MOOC "Learning from Data", Machine Learning, and Big Data.
- PAW: See Who Is Attending Predictive Analytics World San Francisco, Apr 14-19, 2013 - Apr 9, 2013.
Join us and network with predictive analytics experts and thought leaders who come to PAW to share their latest best practices, techniques and insight.
- Data Scientist Hat - Apr 1, 2013.
Many professions have a traditional hat and a logo. What should a self-respecting data scientist wear? We review several proposals. Send us your suggestions.
- Top news for Mar 31 - Apr 6: Learning from "Learning from Data"; IBM Accelerates Big Data; Cartoon: Big Data and Taxes; Data Scientist Hat - Apr 7, 2013.
Caltech Prof. Abu-Mostafa on his MOOC course "Learning from Data" and Machine Learning; IBM Accelerates Big Data; Cartoon: IRS, Big Data, and Taxes; Data Scientist Hat;
Top jobs: Data Scientist at Science Exchange; Data Mining Research Engineer - Algorithms at Bosch RTC
- Additions to KDnuggets in March - Apr 1, 2013.
SPMF Sequential Pattern Mining Framework, Open Data Census, and new companies, datasets, education, meetings, publications, software, and solutions
- Top news in March: Data Science Code of Conduct; Top LinkedIn Groups; Data Driven Journalism Tutorials - Apr 1, 2013.
Data Science Code of Professional Conduct and "Golden Rule"; Top LinkedIn Groups for Analytics, Big Data, Data Mining, and Data Science; NICAR13 Data Driven Journalism Tutorials, Presentations
Top jobs: Data Science for Social Good Summer Fellowship at U. Chicago; Data Scientist at Qubit
- Top news for Mar 24-30: Data Scientists against a Pledge; Top LinkedIn Groups; Klout Big Data Influencers - Mar 31, 2013.
Poll Results: Data Scientists against a Pledge; Klout Top Big Data Influencers; Top LinkedIn Groups for Analytics, Big Data, Data Mining, and Data Science;
Top jobs: Data Analytics, Sr. Manager at Accenture Interactive; Data Scientist at Autodesk
Software
- IBM Accelerates Big Data - KDnuggets Special Report - Apr 4, 2013.
IBM announced several related technologies in a bid to lead the Big Data Market, including a dramatic 8-25x BLU Acceleration for DB2, an easy-to-use Big Data Platform, and a system for Hadoop.
- Quadrigram platform for customized data visualizations - Apr 8, 2013.
Quadrigram is a highly flexible platform for creating data visualizations and solutions. It uses a visual language, with modules for importing/exporting many data types, performing all kinds of operations, and controlling the flow of data, plus a large catalogue of visualizations and visual metaphors.
- Rexer Analytics 2013 Data Miner Survey - Apr 8, 2013.
Data Analysts, Predictive Modelers, Data Scientists, Data Miners, and all other types of analytic professionals, students, and academics: Please participate in the Rexer Analytics 2013 Data Miner Survey. The survey closes on April 22, so please participate now!
- 11 segments of Big Data Ecosystem, according to Sqrrl - Mar 30, 2013.
Sqrrl, a Big Data startup founded by ex-NSA engineers, breaks down the Big Data market into 11 segments.
- Need Data Mining Consultant for bioinformatics project to diagnose schizophrenia - Mar 29, 2013.
Kahn Technologies has developed an algorithm for detecting an EEG signal for diagnosing schizophrenia. We are looking for a data mining expert, preferably with experience in oscillatory dynamics, to help in writing grants and to implement a data-mining system.
- Big Data is a Software and Services Business - Mar 28, 2013.
All companies whose main focus was Big Data got all their revenue from software and services, none from hardware. The top 10 companies in Big Data overall are big companies, like IBM, HP, Teradata, Dell, and Oracle, which derived only 2.4% of their revenue from Big Data.
Webcasts
- Webinar: Data Mining: Failure to Launch [Apr 25] - Apr 2, 2013.
Learn how to get started with predictive modeling and overcome strategic and tactical limitations that cause data mining projects to fall short of their potential. Next webinar is April 25.
Courses, Events
- Get results with SAS data mining courses - Apr 9, 2013.
Learn advanced processes and state-of-the-art techniques from leading industry experts and get more from your data. Spring classes are filling quickly, so register today!
- TMA Courses in Data Analytics [May: Washington, DC; June: Denver, CO] - Apr 2, 2013.
Get up to speed in data mining faster and more effectively than with any other training program available. Next courses in Washington, DC and Denver, CO.
- Caltech free online course: Learning from Data - Apr 2, 2013.
Free, introductory Machine Learning online course, taught by a top-rated Caltech professor. Lectures recorded from a live broadcast, including Q&A.
Jobs
- Data Analytics Scientist at Archimedes, San Francisco, CA - Apr 8, 2013.
Join a world-class team developing, validating, and applying a mathematical model of disease progression and treatment, based on biology, physiology, medicine, epidemiology and cutting edge analytic techniques.
- Data Mining Research Engineer - Algorithms at Bosch RTC, Palo Alto, CA - Apr 3, 2013.
Research, develop and apply advanced statistical algorithms for analysis of large-scale, high-dimensional data across Bosch business areas: automotive, healthcare, industrial.
- Data Mining Research Engineer - Active Learning at Bosch RTC, Palo Alto, CA - Apr 3, 2013.
For over a century the name "Bosch" has been associated with forward-looking technology and trailblazing inventions that have made history. The focus of this job is to research, develop and apply active learning methods in applications across Bosch business areas: automotive, healthcare, industrial.
- Data Mining Engineer - Big Data/HPC at Bosch RTC, Palo Alto, CA - Apr 3, 2013.
The Bosch Group manufactures and markets many automotive, industrial, power, and security products. Your goal is to develop and implement algorithms for distributed and parallel predictive analytics.
- Data Scientist at Science Exchange, Palo Alto, CA - Apr 2, 2013.
Science Exchange wants to organize public scientific research data and make scientific research more efficient. The goal is to collect and analyze data to help drive marketing, product decisions, and new features of Science Exchange. Great benefits and perks!
- Sr. Data Scientist at Apple, Cupertino, CA - Mar 29, 2013.
Changing the world is all in a day's work at Apple. If you love innovation, here's your chance to make a career of it. Design, develop, and field analyses that have direct and measurable impact on the management of the Apple Online Store.
- Sr. Database Analyst, Applied Analytics at Nike, Portland, OR - Mar 29, 2013.
Employ advanced data modeling techniques to find trends that suggest new marketing and business opportunities and follow through with the recommendation of key strategies. Exceptionally strong, hands-on analytics experience, a blend of "traditional" CRM direct targeting with a modern "big data" mindset.
- Global Director, Applied Analytics at Nike, Portland, OR - Mar 29, 2013.
Lead a team of analytics professionals in the development of holistic customer profiles, business-driving analysis / insight and CRM targeting strategies.
- Data Analytics at Schwab, Englewood, CO - Mar 28, 2013.
The Analytics Insight and Loyalty group provides quantitative-based decision support and market research to many business units across Schwab. This position provides decision and analytical support for Schwab Affluent and Branch business teams.
- Senior Performance Scientist at Quantcast, San Francisco, CA - Mar 27, 2013.
Have the opportunity to make a very large business impact by understanding how the final advertising performance is affected by the individual statistical and engineering components.
- Modeling Scientist/Engineer at Quantcast, San Francisco, CA - Mar 27, 2013.
Focus on creatively tackling Quantcast's most complex quantitative modeling problems and advancing the company's core statistical inference and algorithmic technology for audience targeting.
Competitions
- CHALEARN: Cause-effect pairs challenge - Apr 9, 2013.
Bring your own data and use the challenge to discover causal relationships! The goal of this event is to evaluate for a large number of pairs of variables whether one is a cause of the other. You can contribute algorithms or contribute data.
- DATA MINING CUP 2013 Student competition launched - Apr 4, 2013.
The tasks for the 14th Data Mining Cup, a leading competition for students and young data miners, have been published. This year the focus is to forecast orders in an online shop. The winners will be announced during the prudsys User Days, 2-3 June 2013.
- Lumina: Determine the Complete Economic Impact of Post-Secondary Education - Mar 28, 2013.
Propose novel and unexpected correlations between post-secondary education attainment and outcomes with a clear economic impact, such as, but not limited to, increased income. Deadline Apr 25, 2013.
Publications
- Picking Winners In Big Data - Apr 8, 2013.
The value in the data world is usually not just in the software, and building on top of Hadoop is not a strategy for long-term advantage.
- Book: Getting Started with Business Analytics - Apr 4, 2013.
Making no assumptions about your knowledge or technical skills, this book guides you through a journey into the world of business analytics, exploring its contents, capabilities, and applications.
- Big Data on Books: Decline of Emotional Expression in 20th Century - Apr 2, 2013.
A study of millions of English language books (using Google Ngram data) finds distinct historical periods of positive and negative moods. Overall, emotion-related words have decreased, except for fear, which increased towards the end of the 20th century.
- IIA Analytics 3.0 Framework: The Era of Impact - Mar 30, 2013.
Analytics 3.0 represents a new phase in the evolution of enterprise adoption of business analytics, bringing together Analytics 1.0 - Traditional analytics and Analytics 2.0 - Big Data analytics.
- Big Data Download: Employee Performance and People Analytics - Mar 29, 2013.
The new CNBC/Yahoo online segment "Big Data Download" looks at collecting employee data: is it good for productivity, or is it Big Brother? It covers what happens when employee badges are replaced with sociometric badges, and how that improved productivity by 23% at a BoA call center.
- Klout Top Big Data Influencers - Mar 28, 2013.
Top Big Data influencers, according to Klout: @cloudera, @hmason, @dpatil, @acroll, @dhinchcliffe, @kevinweil, @kdnuggets, @edd
Top Tweets
- Top KDnuggets tweets, Apr 5-7: Need to know both sides: Intro to SAS for R programmers; The importance of stupidity in research - Apr 8, 2013.
You need to know both sides: an introduction to SAS for R Programmers; The (unexpected) importance of stupidity in scientific research; R version 3 released - what is new, how to upgrade; The Bubble in Bitcoin, the internet secret currency - can you predict when it will burst?
- Top KDnuggets tweets, Apr 3-4: 100 Savvy Sites on Statistics and Quantitative Analysis; Great Tutorial: Intro to scikit-learn: ML with Python - Apr 5, 2013.
100 Savvy Sites on Statistics and Quantitative Analysis; Great Tutorial: Introduction to scikit-learn: Machine Learning in Python; Graph-Based Recommendation Systems at eBay: modeling taste with Cassandra; DATA MINING CUP 2013 Student competition launched - forecast online orders
- Top KDnuggets tweets, Apr 1-2: People often reveal more online; Caltech free online course: Learning from Data - Apr 3, 2013.
People often reveal more online than they want; Caltech free online course: Learning from Data, Apr 2 - Jun 11; What should a self-respecting Data Scientist wear? #DataScienceHat; Healthcare analytics market to exceed $10B by 2017, will grow 24%/year
- Top KDnuggets tweets, Mar 29-31: Getting Started with Python for Data Scientists; US Computer science enrollments rise astonishing 29 pct - Apr 1, 2013.
Getting Started with Python for Data Scientists; US CS enrollments rise astonishing 29% in 2011-12; 11 segments of Big Data Ecosystem, according to Sqrrl; Doug Cutting, creator of #Hadoop and Lucene, Apache Chair, chief architect of Cloudera
- Top KDnuggets tweets, Mar 27-28: #FastData: R package for massive online data mining; Study shows: only 2% ecommerce visits came from social - Mar 29, 2013.
Besides Big Data, there is #FastData: stream R package is for massive online stream mining; Stop Hyping Social: study shows 2% of ecommerce visits came from social; #BigData for fighting poverty and fraud - what 150 data scientists/hackers did over the weekend; For Math lovers: The Improbable Life of Paul Erdos
- Top KDnuggets tweets, Mar 25-26: rminer: R package, simplifies use of data mining algorithms; Google Universal Analytics beta open to all - Mar 27, 2013.
rminer: R package, simplifies use of data mining algorithms in classification/regression; Google Universal Analytics beta open to all; new universal tracking, more custom analytics; Are data scientists overpaid? They show up on a "top 10 overpaid jobs"; This paper examines the risks of using spreadsheets for statistical analysis
News Briefs
- KNIME User Group Meeting Highlights - Apr 4, 2013.
The 6th KNIME User Group Meeting attracted about 150 attendees from many industries and from around the world, who came to see what's new in KNIME.
- Pivotal - new Big Data spinoff from EMC, VMware - Mar 30, 2013.
The Pivotal unit, set to launch April 1, will integrate an unusual and powerful set of software components into a single big data platform.
- Angoss adds Real-Time Analytics Capability - Mar 28, 2013.
Angoss real-time scoring cloud service delivers intelligent, real-time scores or recommendations, and integrates with operational systems.
- DataKind/WorldBank Big Data Exploration against poverty and fraud - Mar 27, 2013.
About 150 data scientists, civic hackers, visual analytics savants, poverty specialists, and fraud/anti-corruption experts made the DataKind/WorldBank DataDive in Washington, DC a success. See some highlights.
CFP - Calls for Papers
- ASE Big Data: 2013 ASE/IEEE Int. Conf. on Social Computing, due Apr 15
- IWCDM-2013: The 1st Int. Workshop on Cloud Data Mining, due Apr 15
- TIR'13: Text-Based Information Retrieval, due Apr 20
- UDM: Ubiquitous Data Mining Workshop, due Apr 20
- IJSNM13: Clustering & evolution mining of complex networks, due Apr 22
- IADIS-DM: IADIS EUROPEAN CONFERENCE ON DATA MINING 2013, due May 1
- SENTIC 13: KDD13 Workshop on Issues of Sentiment Discovery and Opinion Mining, due May 8
- DICTAP2013: Digital Information and Communication Technology and Applications, due May 10
- BIOKDD '13: 12th Int. Workshop on Data Mining in Bioinformatics, due May 10
- DS 2013: DISCOVERY SCIENCE 2013, due May 11
- KDIR 2013: 5th Int. Conf. on Knowledge Discovery and Information Retrieval, due May 16
- MultiClust 2013: Multiple Clusterings, Multi-view Data, and Multi-source Knowledge-driven Clustering, due May 28
- IEEE BigData 2013: 2013 IEEE Int. Conf. on Big Data, due Jun 2
- DeMiMoP'13: Decision Mining & Modeling for Business Processes, due Jun 25
- IMMM 2013: Advances in Information Mining and Management, due Jun 26
- AusDM 2013: 11th Australasian Data Mining Conf., due Jul 15
Quote
"If I had only one hour to save the world, I would spend fifty-five minutes defining the problem, and only five minutes finding the solution."
Albert Einstein