KDnuggets™ News 13:n18, Jul 31
Features (8) | Software (4) | Webcasts (2) | Courses, Events (5) | Meetings (1) | Jobs (14) | Academic (2) | Competitions (2) | Publications (6) | Tweets (6) | NewsBriefs (4) | CFP (11) | Quote
Features
- Poll Results: Online Analytics certificates and MS degrees are popular - Jul 30, 2013.
Latest KDnuggets Poll finds strong interest in online certificates and MS degrees in Analytics, Big Data, Data Mining, and Data Science, even among those who already have graduate degrees.
- Data: Portals, Government, State, City, Local, and Public - Jul 29, 2013.
KDnuggets' updated list of government, state, city, local, and public datasets, directories, and portals, both global and for the USA, Canada, Europe, Asia, Australia, and other regions.
- Data Mining and Data Science "Nobel Prize": ACM SIGKDD 2013 Innovation Award to Prof. Jon Kleinberg - Jul 22, 2013.
The KDD Innovation Award, the highest award for technical excellence in the field of Data Mining and Data Science, was awarded to Prof. Jon Kleinberg for his seminal contributions to the analysis of social and information networks, mining the web graph, the study of cascading behaviors in networks, and the development of algorithmic models of human behavior.
- KDnuggets Interview with Jon Kleinberg, revisited - Jul 22, 2013.
Exclusive: Jon Kleinberg on his first interesting program, explaining landmark HITS algorithm, social network analysis and small-world networks, Rebel King, key insights from KDD Best Papers, Facebook and other social networking sites, and advice to students and young researchers.
- ACM SIGKDD 2013 Service Award to Gabor Melli - Jul 22, 2013.
Dr. Gabor Melli is recognized for his substantial technical contributions to the practice and application of data mining and for his outstanding service to the global KDD community.
- IEEE ICDM Research Contributions and Outstanding Service Awards, Nominations due Aug 22 - Jul 23, 2013.
The IEEE ICDM Research Contributions Award recognizes influential research contributions to the field of data mining. The Outstanding Service Award is for major service contributions that have promoted data mining as a field and ICDM as the world's premier research conference in data mining.
- Top news for Jul 21-27: CIO 10 Top Big Data Startups; Data Mining and Data Science "Nobel Prize" to Jon Kleinberg - Jul 28, 2013.
CIO 10 Top Big Data Startups; Data Mining and Data Science "Nobel Prize": SIGKDD 2013 Innovation Award to Prof. Jon Kleinberg; 5 Roles You Need on Your Big Data Team
Top jobs: Research Scientist at HP Labs, Analytics; Researchers at Hitachi America Big Data Lab.
- Top news for Jul 14-20: DataMind: FREE Online Interactive Learning Platform for R; KDnuggets Big Data Science Summer Reading List - Jul 21, 2013.
DataMind: FREE Online Interactive Learning Platform for R; Cartoon: NSA, cat videos, UFO reports, and Pizza connection? KDnuggets Big Data Science Summer Reading List
Top jobs: Real Time Data Mining UX designer at Adtheorent; PhD Student, Mixing Meta-Modeling and Data-Mining, France.
Software
- Alteryx Strategic Analytics, free version - Jul 18, 2013.
Download Alteryx Strategic Analytics, Project Edition (free version) and get instant analytics, including statistical, predictive, and spatial - with any data, from spreadsheets to Big Data, with an easy to use visual workflow.
- Big collection of data sites, services, marketplaces and more - Jul 25, 2013.
Here is a big collection of data services, data marketplaces, data search tools, social data sources, portals, platforms, sources for Government, NGO, local, and news data, and more.
- Provalis Research QDA Miner 4.1 and SimStat 2.6 Text Mining Software - Jul 23, 2013.
The unique integration of QDA Miner (for easy-to-use qualitative analysis), SimStat (for statistical analysis and bootstrapping) and WordStat (for text-mining and quantitative content-analysis), allows researchers to integrate numerical and text data into a single project.
- ILNumerics: High Performance Math Library for C# and .NET - Jul 25, 2013.
ILNumerics is a numerical library for .NET that turns C# into a 1st class mathematical language, with a Matlab-like high-level syntax, high performance, and 2D/3D visualization features. Free Community Edition (GPL).
- Textalytics Industry Specific Text Mining APIs - Jul 24, 2013.
The Core API filters words based on syntax (noun, verb, article) to extract the key words of a document, while the Media Analysis API pinpoints buying signals in social conversations and identifies customer sentiment.
Webcasts
- Webinar (Aug 6), Text Analytics Case Studies from LinkedIn, BoA, and Serendio - Jul 30, 2013.
LinkedIn, Bank of America and Serendio will share case studies on how they use insights from text analytics to save time and money and build competitive strategies - join us on Aug 6.
- Machine Learning Online Roundtable: How to Make it Work, July 25 - Jul 18, 2013.
In this roundtable moderated by Ismail Parsa of Amazon, experts from Twitter, Skytree, Uber, and Adconion will discuss how to apply machine learning to practical problems in real organizations. Live on July 25 or on-demand afterwards.
Courses, Events
- Elder Research Course: Tools for Discovering Patterns in Data, Sep 9-10, Charlottesville, VA - Jul 27, 2013.
Drawing on 20 years of experience, Dr. John Elder will explain powerful analytic methods for classification and estimation, compare the leading algorithms, and demonstrate their effectiveness on practical applications. Attendees will receive the award-winning Handbook of Statistical Analysis and Data Mining Applications, and fully functional (limited time) data mining software from SAS, IBM/SPSS, and StatSoft.
- Supercomputer Data Mining Boot Camps, San Diego, Sep 12-13, Oct 17-18 - Jul 30, 2013.
The Power to Predict: The Sexiest Job in the 21st Century. Register for UCSD Data Mining Boot Camps scheduled to be held at the San Diego Supercomputer Center on Sep 12-13 and Oct 17-18.
- Northwestern Online MS in Predictive Analytics - Jul 25, 2013.
Learn from distinguished faculty and industry experts, build statistical and analytic expertise, and prepare for leadership-level career opportunities - build in-demand skills for the growing analytics field.
- Applied Predictive Analytics Training with Statistics.com - Jul 19, 2013.
Learn inside tricks and methods in a new online training course developed by CrowdAnalytix (a Kaggle competitor), in partnership with Statistics.com, the leading provider of online education in statistics and analytics.
- INSOFE: Master Big Data Analytics Online - Jul 18, 2013.
Taught by experts who are Carnegie Mellon, JHU, and Stanford alumni, INSOFE programs have helped many become data scientists and earn industry certifications, at lower cost than similar programs.
Meetings
- Data Marketing 2013, Toronto, Dec 9-10 - Jul 28, 2013.
Technology and data enable marketers to deliver communications that are much more relevant through effective micro-segmentation, sentiment analysis, behavior prediction and personalization. DATA MARKETING 2013 will address these challenges with a unique approach.
Jobs
- Data Scientist, Strategic at Groupon, Palo Alto, CA - Jul 30, 2013.
Looking for creative and innovative minds to join the team to focus on strategic data analysis ideas that will propel the future growth of Groupon.
- Data Scientist at Groupon, Seattle, WA - Jul 30, 2013.
Analyze data to develop business and marketing operations, strategies, and tactics that promote site traffic, increase subscriber and customer acquisition by improving conversion, and increase customer lifetime value.
- Data Scientist at PURE Entertainment, Montreal, Canada - Jul 26, 2013.
PURE Entertainment is a newly formed television production and distribution company in search of a Data Scientist who will be responsible for building and leading the business's analytics operations.
- Senior Applied Researcher, Ads at Microsoft, Sunnyvale, CA - Jul 26, 2013.
Our team does Click Prediction of Ads, which is one of the fundamental problems in Sponsored Search. A successful candidate should be passionate about machine learning and mining patterns from big data, and creative with feature and data engineering.
- Data Mining Scientist at Apple, Austin, TX - Jul 25, 2013.
Apple Data Mining Lab is looking for an outstanding data mining scientist to work with business managers to design, develop, and field data mining solutions that have direct and measurable impact on Apple.
- Jr. Data Mining Architect at Objectifi, Toronto, Canada - Jul 24, 2013.
Be a key member of the Objectifi team, work directly with and learn from our Professional Services team, and participate actively in client engagements.
- Research Scientist at HP Labs, Analytics Lab, Palo Alto, CA - Jul 24, 2013.
Research scientist with skills in big data, data management systems, cloud computing, in-memory databases, data modeling, or data integration. Entry-level position most appropriate for a new/recent PhD.
- Senior Hadoop Engineer, Machine Learning at LocalResponse, New York, NY - Jul 23, 2013.
Enhance and extend data capture and analytics systems and build systems based on machine learning to build and improve user models based on social signals to predict behavior for advertising.
- Know CUDA? Join Accelerate Diagnostics in saving lives as Sr. Software Engineer at Accelerate Diagnostics, Tucson, AZ - Jul 22, 2013.
Be a key member of the team developing our clinical diagnostics system, and work closely with both instrument engineering and assay/algorithm development groups.
- Researchers at Hitachi America Big Data Lab, Santa Clara, CA - Jul 19, 2013.
The lab will focus on technology innovation and business development for novel big data solutions, contribute to the realization of Hitachi's vision of "Social Innovation", and establish Hitachi as a leader in big data.
- Data Scientist at Visually, San Francisco, CA - Jul 19, 2013.
Work with the Director of Analytics to develop analytics frameworks that will help grow the business by providing the Sales, Marketing and Operations teams with strategic direction.
- Risk Modeling Analyst at Paychex, Rochester, NY - Jul 18, 2013.
Analyze internal and external data; develop and implement scorecards/data driven predictive models for Enterprise Risk Management, Marketing, Sales and Field Operations, and more.
- Senior Data Analyst at Crico, Cambridge, MA - Jul 18, 2013.
Perform BI and statistical modeling using medical data and other data for CRICO, the patient safety and medical insurance company serving the Harvard medical community.
- Analytics Business Analyst at Morgan Stanley, New York, NY - Jul 17, 2013.
Not your typical financial services job - take a leadership role in helping Morgan Stanley Research drive strategic data mining initiatives that are at the cornerstone of our business strategy. Serve as the interface between senior stakeholders and modelers and be hands-on validating, visualizing, deploying, and evangelizing new models.
Academic/Research positions
- Postdoc researcher, Applied Data Mining at U. Antwerpen, Belgium - Jul 26, 2013.
Work on the "New opportunities in online advertising for publishers" project, conducted with several partners including a large Belgian publisher, and help develop and apply new predictive modeling techniques for such big data.
- Postdoctoral Fellow in Bioinformatics/Biostatistics at U. Alberta and Cross Cancer Institute, Edmonton, Alberta, Canada - Jul 17, 2013.
Develop a diagnostic model utilizing genomic (SNP), clinical and dosimetric parameters that identifies patients at risk for toxicity from radiation therapy for their prostate cancer.
Competitions
- Kaggle Belkin Energy Disaggregation Competition - Jul 20, 2013.
Use machine learning on EMI signatures and other data to understand what appliances are used as a step for providing personalized and cost-effective energy saving recommendations.
- Large Scale Hierarchical Text Classification Challenge - Jul 19, 2013.
This challenge is based on two large datasets created from the ODP web directory (DMOZ) and Wikipedia. It comprises three tracks: Very Large Scale Supervised Learning; Multi-task learning; and Refinement-learning.
Publications
- Book: Data Clustering: Algorithms and Applications - Jul 29, 2013.
The chapters are carefully constructed to cover the area of clustering comprehensively with up-to-date surveys, making this book accessible to beginning data scientists and analysts.
- McKinsey eBook (free): Big Data, Analytics, and the Future of Marketing and Sales - Jul 29, 2013.
This ebook from McKinsey explores the business opportunities, company examples, and organizational implications of Big Data and advanced analytics.
- 5 Roles You Need on Your Big Data Team - Jul 27, 2013.
Getting value from Big Data requires paying enough attention to people, and that is not just about hiring the best talent. It is also very important to identify the roles the company really needs.
- LionBook Chapter 5: Mastering generalized linear least-squares - Jul 24, 2013.
After reading this chapter you are expected to improve from a casual modeler to a professional least-squares guru. Losing accuracy is not a weakness but a strength, an opportunity to create more powerful models by simplifying the analysis.
- CIO 10 Top Big Data Startups - Jul 21, 2013.
The final ranking is based on reader votes, but also big-name end users, VC funding, the management team and market positioning.
- Getting Started with Amazon Redshift - Jul 17, 2013.
Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service. This step-by-step, practical guide to the world of Redshift teaches you how to load, manage, and query data on Redshift.
Top Tweets
- Top KDnuggets tweets, July 26-28: Statistical Data Analysis in Python and Pandas;Julia - a high-level scientific language - Jul 29, 2013.
Statistical Data Analysis in Python and Pandas, SciPy2013 Tutorial; A Beginner look at Julia - a high-level scientific language; Statisticians are envious, asking "Aren't We Data Science?" No; Intuitive Classification and Clustering using kNN and Python, well explained
- Top KDnuggets tweets, July 24-25: Big collection of data sites, services; July 31 webinar: Collaborative Filtering - how to with R - Jul 26, 2013.
Must fave! Excellent collection of sites, services, marketplaces, and APIs for data; July 31 webinar: Collaborative Filtering, turn your visitors into customers; Want many #BigData infographics in one place? Here is Big board on Pinterest; Kaggle and NLP Logix Chief Scientist resolve dispute from flight prediction contest
- Top KDnuggets tweets, July 22-23: The Rise of DIY Data Scientist: most of Kaggle competition w; An excellent introduction to MapReduce and Hadoop in Courser - Jul 24, 2013.
The Rise of DIY Data Scientist: most of Kaggle competition winners took Machine Learning on Coursera; An excellent introd to MapReduce and Hadoop; How Text Analysis found that J. K. Rowling was the author of "The Cuckoo's Calling"; Overview: Data Science with Hadoop, by Hortonworks
- Top KDnuggets tweets, July 19-21: All R packages and manuals, searchable; Data Science + Gamification + Online training = DataMind - Jul 22, 2013.
R documentation - all R packages and manuals, searchable; Data Science + Gamification + Online training = DataMind; How to meaningfully use Twitter Analytics, Facebook Insights; Tableau Online, now offers Visual Analytics in the Cloud, connects live to 150 sources
- Top KDnuggets tweets, July 17-18: Good tutorial: Machine Learning on Big Data; The Amazing 3D Topography of Tweets - Jul 19, 2013.
Good tutorial (93 slides): Machine Learning on #BigData; Very cool: The Topography of Tweets: Amazing 3D interactive viz; What do data scientists and data miners listen to?; Data Science 101 - Five data preparation mistakes to avoid
- Top KDnuggets tweets, July 15-16: Unix commands for data mining; SW engineering principles for data science - Jul 18, 2013.
Useful Unix commands for data mining; SW engineering principles every data scientist should know; The Big Data Job Boom; The more confident an expert is in prediction ...
News Briefs
- Notre Dame CARE: Collaborative Assessment Recommendation Engine personalized disease risk predictions - Jul 27, 2013.
U. of Notre Dame researchers have developed a computer-aided method that uses electronic medical records to offer the promise of rapid advances toward personalized health care, disease management and wellness.
- Clustify 3.2 Adds Graphical Visualization of Document Clusters - Jul 23, 2013.
Clustify software does conceptual clustering, near-duplicate detection, content-based email threading, and automatic categorization / predictive coding.
- US has one third of world data - Jul 20, 2013.
The US stores 898 exabytes (898 billion gigabytes) of data, nearly a third of the global total. Western Europe has 19% and China has 13%.
- Anderson Analytics OdinText Patent for Powerful New Text Analytics Process - Jul 17, 2013.
US Patent 8,473,498 leverages contextual data and provides a process for filtering out the noise which is so common in unstructured data. Both of these important benefits have been deficient in text analytics software until now.
CFP - Calls for Papers
- ICDM Tutorials: ICDM '13: The 13th IEEE Int. Conf. on Data Mining - Tutorial Proposals, due Aug 3
- ICDM Workshop papers: ICDM '13: The 13th IEEE Int. Conf. on Data Mining - Workshop papers, due Aug 3
- ICDM Demos: ICDM '13: The 13th IEEE Int. Conf. on Data Mining - Demo proposals, due Aug 9
- COMAD: 19th Int. Conf. on Management of Data, due Aug 12
- BigSpatial-2013: The 2nd ACM SIGSPATIAL Int. Workshop on Analytics for Big Geospatial Data, due Aug 13
- BioASQ: Workshop on biomedical semantic indexing and question answering, due Aug 15
- WSDM2014: Seventh ACM Int. Conf. on Web Search and Data Mining, due Aug 21
- TNNLS-LNG: IEEE TNNLS Special Issue on Learning in Non-(geo)metric Spaces, due Oct 1
- BIWA 2014: Oracle BIWA Summit 2014, due Oct 25
- CIBD: Computational Intelligence in Big Data, due Nov 1
- FLAIRS-27: The 27th Int. Conf. of the Florida Artificial Intelligence Research Society, due Nov 18
Quote
@kdnuggets: Big data is not magic - laws of human behavior are very imprecise and have a lot of randomness #cxo
@marmarlade: #cxo @IBMbigdata A8 "From one conversation with a million, to a million conversations with one."
from IBM July 29, 2013 Tweetchat on Big Data and Customer Segmentation