Avoiding Complexity of Machine Learning Problems
Sometimes machine learning is the perfect tool for a task. Sometimes it is unnecessary overkill. Here are important lessons learned from the Quora engineering team.
on Mar 31, 2016 in Complexity, Machine Learning, Quora, Xavier Amatriain
Pattern Curators of the Cognitive Era
Machine learning has a critical dependency on human learning: not just on Data Scientists, but on legions of individuals who prepare training data to guide algorithms.
on Mar 31, 2016 in Curation, Data Curation, IBM Watson, Machine Learning
Academic/Research positions in Business Analytics, Data Science, Machine Learning in March 2016
Academic/Research positions in Analytics and Data Science in Ningbo (China), Budapest, Wisconsin, Helsinki, Porto, Bonn, and San Diego.
on Mar 31, 2016 in Bonn, Budapest, CA, China, Helsinki, Porto, Research Positions, San Diego, WI
The Shortest Path to Behavioral Analytics
Behavioral analytics provides the tools to answer complex business questions, such as retention and churn trends and their causes, multi-dimensional funnel analysis, and more, including intuitive querying and delightful behavioral reports. We can set you up in no time.
on Mar 31, 2016 in Behavioral Analytics, Cooladata, Customer Behavior
The Rise of Dark Data and How It Can Be Harnessed
Dark data isn’t just a small portion of big data; it is the biggest and fastest-growing portion. It holds massive potential for those who can harness it successfully.
on Mar 31, 2016 in Dark Data, Excel, Surveys
Don’t be afraid to Fail – Start Now with Data Science
An argument for why aspiring data scientists should stop waiting for permission and start doing data science.
on Mar 30, 2016 in Advice, Data Science, Failure, Success
Top KDnuggets tweets, Mar 22-29: If Hollywood Made Movies About MachineLearning; Data Scientist on Every @AirBNB Leadership Team
If Hollywood Made Movies About Machine Learning; Why Airbnb Has a Data Scientist on Every Leadership Team; Very useful guide for Data Cleaning in Python; Data scientist Hilary Mason wants to show you the (near) future.
on Mar 30, 2016 in AirBnB, Data Cleaning, Hilary Mason, Movies, Python, Top tweets
How to Compute the Statistical Significance of Two Classifiers Performance Difference
To determine whether a result is statistically significant, a researcher has to calculate a p-value, which is the probability of observing an effect at least as extreme as the one measured, given that the null hypothesis is true. Here we demonstrate how to use it to assess the performance difference between two models.
on Mar 30, 2016 in Classifier, Cross-validation, Model Performance, Statistical Significance
IDC/KDnuggets Advanced Analytics Survey, 2016
What are the use cases for machine learning? What's the typical analytics workflow? Should we have a data science team? Should we outsource our analytics? Help IDC and KDnuggets answer these questions and read the results on KDnuggets and in other places.
on Mar 29, 2016 in Advanced Analytics, IDC, Survey
HPE Haven OnDemand and Microsoft Azure Machine Learning: Power Tools for Developers and Data Scientists
While both HPE and Microsoft machine learning platforms offer numerous possibilities for developers and data scientists, HPE Haven OnDemand is a diverse collection of APIs for interacting with data designed with flexibility in mind, allowing developers to quickly perform data tasks in the cloud.
on Mar 29, 2016 in Azure ML, Haven OnDemand, HPE, Microsoft, Prediction
How To Become A Machine Learning Expert In One Simple Step
This post looks at perhaps the most important, and often overlooked, step in learning machine learning, an aspect which can make the biggest difference in one's skill set.
on Mar 29, 2016 in Advice, Kaggle, Machine Learning
100 Active Blogs on Analytics, Big Data, Data Mining, Data Science, Machine Learning
Stay on top of your data science skills game! Here’s a list of about 100 of the most active and interesting blogs on Big Data, Data Science, Data Mining, Machine Learning, and Artificial Intelligence.
on Mar 29, 2016 in Big Data, Blogs, Data Science, Deep Learning, Hadoop, Machine Learning
Spring dream offer for PAW Business, PAW Manufacturing, TAW in Chicago – Reg. by Apr 2
Join the analytics community in Chicago at PAW Business, Manufacturing, or Text Analytics events this June, and save with code SPRINGDREAM if you register by April 2.
on Mar 29, 2016 in Chicago, IL, Manufacturing, PAW, Predictive Analytics World, Text Analytics
PASS Hands-on Analytics Training on Power BI, R, Dataviz, Excel, May 2-4, San Jose
Get first-rate business analytics training at the PASS Business Analytics Conference, May 2-4, San Jose. Check highlights from our unique tracks that follow the Analyst Journey and cover topics like Power BI, R, Data Visualization, Advanced Analytics for Excel and more.
on Mar 28, 2016 in CA, Data Visualization, Excel, PASS, Power BI, R, San Jose, Training
Strata + Hadoop World San Jose, Keynote Live Streaming, Mar 30-31
Watch Strata + Hadoop San Jose 2016 Conference Keynotes live on March 30 and March 31. Topics include Hadoop at 10, Predictive Analytics for on-demand economy, Real-Time, Summoning the demon of AI, Cybersecurity, The theorem that wouldn't die, and Nonsense science by comedian Paula Poundstone.
on Mar 28, 2016 in Doug Cutting, Hadoop, Keynote Speech, San Jose, Strata
Engineers Shouldn’t Write ETL: A Guide to Building a High Functioning Data Science Department
An exploration of data science team building, with insight into why engineers should not write ETL, and other not-so-subtle pieces of advice.
on Mar 28, 2016 in Advice, Data Engineering, Data Scientist, ETL, Stitch Fix
HR Analytics Starter Kit – Intro to R
We review tools to help you start performing HR analytics, with a focus on the R platform, and provide useful examples of HR and workforce analytics using R.
on Mar 28, 2016 in HR, R, Use Cases, Workforce Analytics
Don’t Buy Machine Learning
In many projects, the amount of effort spent on R&D on Machine Learning is usually a small fraction of the total effort, or it’s not even there because we plan it for a future phase after building the application first.
on Mar 28, 2016 in Advice, Industry, Machine Learning
Top stories for Mar 20-26: R Learning Path: From beginner to expert in 7 steps; Top 10 Data Science Resources on Github
R Learning Path: From beginner to expert in R in 7 steps; Top 10 Data Science Resources on Github; 7 Steps to Mastering Machine Learning With Python; 21 Must-Know Data Science Interview Questions and Answers.
on Mar 27, 2016 in Top stories
Cartoon: Citizen Data Scientist At Work
KDnuggets Cartoon examines Citizen Data Scientist at work and his previous career as a citizen dentist and a citizen pilot.
on Mar 26, 2016 in Cartoon, Citizen Data Scientist, Humor
IU to offer one of the first data science courses to use real clinical trial data
Indiana University and Eli Lilly will offer one of the first data science courses employing real-world clinical trial data. The course, part of the IU Data Science MS program, will give students a rare opportunity to practice the advanced analysis of clinical trial data using anonymized data collected during the safe testing of potential new drugs.
on Mar 25, 2016 in Data Science Education, Healthcare, Indiana University, Pharma
How to combat financial fraud by using big data?
Financial fraud methods are becoming more sophisticated and the techniques to combat such attacks also need to evolve. Big data has brought with it novel fraud detection and prevention techniques such as behavioral analysis and real-time detection to give fraud fighting techniques a new perspective.
on Mar 25, 2016 in Alibaba, Banking, Big Data, Fraud, Fraud Detection, Fraud Prevention
Whether Evolution or Revolution, The Internet of Things is Here to Stay
The Internet of Things (IoT) is the next boom you need to know about, and insurance provider AIG has recently released a no-nonsense whitepaper providing an overview of the landscape in the space.
on Mar 25, 2016 in AIG, Internet of Things, IoT
Data Science Tools – Are Proprietary Vendors Still Relevant?
We examine and quantify the dramatic impact of open source tools like R and Python on SAS, IBM, Microsoft, and other proprietary Data Science vendors. We also investigate how open source tools are faring against each other, which are growing and which are falling, and look at the R versus Python debate.
on Mar 25, 2016 in Data Science Tools, IBM, Microsoft, Open Source, Python, R, SAS
Ethics In Machine Learning: What we learned from Tay chatbot fiasco?
As Microsoft chatbot Tay showed, Machine Learning brings us into a new world where our views on ethics and political correctness will be challenged. ML learns from us. In both good and bad ways, it reflects what we really are.
on Mar 25, 2016 in AI, Bots, Chatbot, Ethics, Microsoft, MLconf, Seattle, Twitter, WA
CrowdSignals.io, Building Big Mobile Social Sensor dataset
CrowdSignals.io is a crowdfunding campaign to generate the largest mobile and sensor dataset available to the Data Science community for use in research and product development.
on Mar 25, 2016 in Big Data, Crowdsourcing, Datasets, IoT, Mobile, Sensors
XGBoost: Implementing the Winningest Kaggle Algorithm in Spark and Flink
An overview of XGBoost4J, a JVM-based implementation of XGBoost, one of the most successful recent machine learning algorithms in Kaggle competitions, with distributed support for Spark and Flink.
on Mar 24, 2016 in Apache Spark, Distributed Systems, Flink, Kaggle, XGBoost
“Citizen Data Scientist” Revolution
The naysayers are on the wrong side of the "citizen" Data Scientist debate. Business users already have self-service BI capabilities and make decisions whether they are statistically sound or not. We can’t stop them from making decisions but should make statistically sound decisions easier. This new approach is called Smart Data Discovery.
on Mar 24, 2016 in BeyondCore, Citizen Data Scientist, Gartner
Top 10 Data Science Resources on Github
The top 10 data science projects on Github are chiefly composed of a number of tutorials and educational resources for learning and doing data science. Have a look at the resources others are using and learning from.
on Mar 24, 2016 in Coursera, GitHub, IPython, Johns Hopkins, Open Source, Top 10
Training a Computer to Recognize Your Handwriting
The remarkable system of neurons is the inspiration behind a widely used machine learning technique called Artificial Neural Networks (ANN), used for image recognition. Learn how you can use this to recognize handwriting.
on Mar 24, 2016 in Image Recognition, Neural Networks
Doing Data Science: A Kaggle Walkthrough – Cleaning Data
Gain insight into the process of cleaning data for a specific Kaggle competition, including a step by step overview.
on Mar 23, 2016 in Data Cleaning, Data Preparation, Kaggle, Pandas, Python
R Learning Path: From beginner to expert in R in 7 steps
This learning path is mainly for novice R users that are just getting started but it will also cover some of the latest changes in the language that might appeal to more advanced R users.
on Mar 23, 2016 in 7 Steps, Data Preparation, Data Science Education, Data Visualization, DataCamp, Hadley Wickham, Learning Path, Maps, R
Top KDnuggets tweets, Mar 16-21: After 150 Years, ASA Says “NO” to p-values; Free Resources to Learn #MachineLearning
After 150 Years, ASA Says "NO" to p-values; Using Deep Q-Network to Learn to Play Flappy Bird; Why we work so hard: The problem is overworked professionals are NOT miserable; Free Resources to Learn #MachineLearning.
on Mar 22, 2016 in Deep Learning, Online Education, P-value, Statistics, Video Games
Lift Analysis – A Data Scientist’s Secret Weapon
Gain insight into using lift analysis as a metric for doing data science. Understand how to use it for evaluating the performance and quality of a machine learning model.
on Mar 22, 2016 in Data Science, Lift charts, Metrics
Must Know Tips for Deep Learning Neural Networks
Deep learning is a white-hot research topic. Add some solid deep learning neural network tips and tricks from a PhD researcher.
on Mar 22, 2016 in Convolutional Neural Networks, Deep Learning
Big Data, Data Visualization, Internet of Things, San Francisco, April 21-22
What do Big Data, the Internet of Things and Data Visualization have in common? See all of them at the upcoming Data Festival in San Francisco on Apr 21-22. Get special access with code KDACCESS.
on Mar 22, 2016 in Big Data Summit, CA, Data Visualization, IE Group, Innovation, IoT, San Francisco
When Big Data Means Bad Analytics
When analytics delivers disappointing results, it is often because there is not enough analytic expertise and/or a lack of understanding of the business objectives for using Big Data in the first place. To avoid failure, insist on high standards.
on Mar 21, 2016 in Big Data, Data Science Education, Failure, FICO, Hiring
AlphaGo is not the solution to AI
The field will be better off without a bust cycle, so it is important for people to keep and inform a balanced view of successes and their extent. AlphaGo might be a step forward for the AI community, but it is still nowhere close to true AI.
on Mar 21, 2016 in AI, AlphaGo, Deep Learning, DeepMind, Go, John Langford
Analytics Hiring Strong, Staying In One Job Is Weak
With more companies jumping on the data-driven bandwagon, companies have been creating new roles and new data science and analytics teams. It is the right time to make your move and land your dream job.
on Mar 21, 2016 in Analytics, Burtch Works, Career, Hiring
Top stories for Mar 13-19: After 150 Years, the ASA Says No to p-values; New KDnuggets Tutorials Page
After 150 Years, the ASA Says No to p-values; New KDnuggets Tutorials Page: Learn R, Python, Data Visualization, Data Science, and more. R vs Python for Data Science: The Winner is ...
on Mar 20, 2016 in Top stories
Netflix Prize Analyzed: Movie Ratings and Recommender Systems
A 195-page monograph by a top-1% Netflix Prize contestant. Learn about the famous machine learning competition. Improve your machine learning skills. Learn how to build recommender systems.
on Mar 18, 2016 in Free ebook, Netflix, Recommender Systems
3 Telecom Developments Which impact IoT Analytics
Highlights and developments to watch from Mobile World Congress 2016 that will impact IoT analytics in the future.
on Mar 18, 2016 in Analytics, IoT, Telecom
Big Data Will Rule Your Home
The "connected home" is the next frontier for Big Data, and soon our lives may be significantly impacted by the analytical firepower from the IoT. Would the benefits outweigh the risks, and how would you feel if your fridge locked you out because your scales and wearables had sounded the warning signs?
on Mar 18, 2016 in Big Data, Connected Home, IoT, Privacy
Watson Developer Challenge: build conversational apps using Watson language APIs
Watson Developer Challenge is an online hackathon to build conversational apps using Watson's new language service APIs for NLP, document conversion, and speech and machine learning algorithms. Coders have until Apr 15 to build software that lets users interact with Watson through a natural conversational interface.
on Mar 18, 2016 in API, Cognitive Computing, Competition, Hackathon, IBM Watson, Natural Language Processing
The Data Science Game – Student Competition
The Data Science Game returns this year, with university students competing for dominance. Details for this iteration and further information are provided here.
on Mar 17, 2016 in Competition, Data Science, France, Kaggle, Paris, Student Competition
Exclusive Interview with Alexander Gray, Skytree CEO: Fast, Automated, Machine Learning Software for Free?
We discuss how Skytree compares with the competition, how it performs relative to expert Data Scientists, how Skytree Automodel compares to Deep Learning, and more.
on Mar 17, 2016 in Alexander Gray, Automated, Data Science Platform, Deep Learning, Skytree
Open Data with GraphDB & GraphDB Fundamentals – Upcoming Ontotext Webinars
Get guidance through the gigantic sea of freely available Open Data and learn how it can empower your analysis (Mar 24), then learn how to use GraphDB to its full potential and meet your analytics goals (Apr 7). Meet Ontotext in April in London or San Diego.
on Mar 17, 2016 in Graph Analytics, GraphDB, London, Ontotext, Open Data, San Diego
Data is the New Everything
Data gets a lot of mainstream attention these days, and has been compared to all sorts of different things. This is a lighthearted look at some of the top suggestions from Google autocomplete when searching for the phrase "data is the new" something.
on Mar 17, 2016 in Data, Google, Oil & Gas, Search Engine
Last chance – Join the analytics community this April [$150 discount]
Last chance to sign up for 3 analytics events in San Francisco, Apr 3-7, 2016 - Predictive Analytics World for Business, PAW Workforce, and the eMetrics Summit. Use code KDN150 for an additional discount (until Apr 2).
on Mar 16, 2016 in CA, PAW, Predictive Analytics World, San Francisco, Workforce Analytics
Big Data Bootcamp, Austin, April 8-10
This extensive 3-day Bootcamp provides a fast-paced, vendor-agnostic, technical overview of the Big Data landscape. No prior knowledge of databases or programming is assumed. Use code KDNUGGETS to save.
on Mar 16, 2016 in 3Vs of Big Data, Austin, Bootcamp, Global Big Data Conference, TX
New KDnuggets Tutorials Page: Learn R, Python, Data Visualization, Data Science, and more
Introducing the new KDnuggets Tutorials page with useful resources for learning about Business Analytics, Big Data, Data Science, Data Mining, R, Python, Data Visualization, Spark, Deep Learning and more.
on Mar 16, 2016 in Data Science Education, Online Education, Python, R
The Evolution of the Data Scientist
We trace the evolution of Data Science from ancient mathematics to statistics and early neural networks, to present successes like AlphaGo and self-driving cars, and look into the future.
on Mar 16, 2016 in Automated, Data Scientist, Demis Hassabis, Evolution, Mathematics, Statistics
Career Advice to Data Scientists – Go Make More Money
Data Scientists should offer the enterprise more than the ability (and cost) of doing analysis; they should behave as executives with expertise in analysis and help lead the enterprise on decisions, investments, and operations.
on Mar 16, 2016 in Advice, Career, Data Science Skills, Skills
Top KDnuggets tweets, Mar 07-15: Great collection of Must Know Tips/Tricks in #DeepLearning
Great collection of Must Know Tips/Tricks in #DeepLearning; 4 Lessons for Brilliant #DataVisualization; 12 Apps with a Billion+ active Users - less time now to get to 1B users; Timeline of Artificial Intelligence #AI victories, 1997-3041, Chess, Jeopardy, Go.
on Mar 16, 2016 in AI, Data Visualization, Deep Learning, DeepMind, Go
After 150 Years, the ASA Says No to p-values
The ASA has recently taken a position against p-values. Read the overview and opinion of a well-respected statistician to gain additional insight.
on Mar 15, 2016 in ASA, P-value, Statistics
Journey to Open Data Science, March 23 Webinar
Learn how to drive collaboration and teamwork through open data science; mitigate legal risk through indemnification and appropriate package selection; bring advanced analytics to Excel-loving analysts with AnacondaXL.
on Mar 15, 2016 in Continuum Analytics, Data Science, Excel, Open Source, Python, R
Wind and Weather – Open Text Data Digest
It’s soothing to watch the wind flows cycle and clouds form and dissipate. Now an app called Windyty lets you navigate real-time and predictive views of the weather yourself, controlling the area, altitude, and variables such as temperature, air pressure, humidity, clouds, or precipitation.
on Mar 15, 2016 in Data Visualization, OpenText, Weather
Self-Paced E-learning course: Advanced Analytics in a Big Data World.
The course covers the entire analytics process, from data preprocessing to advanced modeling, including ensemble methods (bagging, boosting, random forests), neural networks, SVMs, Bayesian networks, social networks, monitoring and more.
on Mar 15, 2016 in Advanced Analytics, Bart Baesens, Big Data, Online Education
How to tell a great analyst from a good analyst
A good analyst helps a business stay in the competition, but a great analyst sets the business apart from its competition. Learn more about how to be a great analyst by going the extra mile.
on Mar 15, 2016 in Analyst, Data Science Skills, Quandl
BIG Data Analytics for Industrial Process Improvement April 18-20, Niagara Falls, Canada
Using multivariate data analysis, the course will teach you how to improve product quality and increase yield, and how to monitor, optimize, and control your process.
on Mar 14, 2016 in Big Data Analytics, Canada, Industry, Niagara Falls, Process, ProSensus
TMA Predictive Analytics Data Mining Training, [Washington, DC, May]
Successful analytics in the big data era does not start with data and software, but with hands-on, immersive training and goal-driven strategy - get it from The Modeling Agency in Washington, DC, May 16-20, 23-24.
on Mar 14, 2016 in Data Mining Training, DC, The Modeling Agency, TMA, Washington
What Should Data Scientists Know About Psychology?
Due to their training in the scientific method, data management, statistics/data analysis, subject matter expertise, and turning results into substantive knowledge, psychology researchers must have a solid understanding of data science, and vice versa.
on Mar 14, 2016 in Data Scientist, Methodology, Psychology
The Anchors of Trust in Data Analytics
An exploration of some of the critical questions and challenges emerging around trust in data and analytics. The four anchors of trust that will shape public confidence in D&A in the age of the analytical enterprise are highlighted.
on Mar 14, 2016 in Analytics, Big Data, Data Analytics, KPMG, Trust
When Good Advice Goes Bad
Consider these 4 examples of good statistical advice which, when misused, can go bad.
on Mar 14, 2016 in Andrew Gelman, Bayesian, Overfitting, P-value, Statistics
What is the influence of Big Data in Medicine?
The 360-degree customer view is the idea that companies can get a complete view of customers by aggregating data from a user's various touch points. Big data is helping to materialize this idea, which will revolutionize healthcare.
on Mar 14, 2016 in Big Data, Customer Analytics, Healthcare
Top stories for Mar 6-12: R or Python? Consider learning both; The Data Science Process, Rediscovered
R or Python? Consider learning both; The Data Science Process, Rediscovered; R vs Python for Data Science: The Winner is ...; 7 Steps to Mastering Machine Learning With Python.
on Mar 13, 2016 in Top stories
Fraud Bots Mess Up Your Big Data
The bots that cause digital ad fraud also mess up analytics. When they create fake visits, pageviews, ad impressions, clicks, etc., those metrics are not real and should be corrected for.
on Mar 11, 2016 in Bots, Fraud Detection
3 Viable Ways to Extract Data from the Open Web
We look at 3 main ways to handle data extraction from the open web, along with some tips on when each one makes the most sense as a solution.
on Mar 11, 2016 in Crawler, import.io, Web Mining, Web services, Webhose.io
4 Lessons for Brilliant Data Visualization
Get some pointers on data visualization from a noted expert in the field, and gain some insight into creating your own brilliant visualizations by following these 4 lessons.
on Mar 11, 2016 in Data Visualization, Healthcare, Visualization
How to get structured data from the web without crawling
When you need data from the web, you don't have to build a crawler. Webhose.io does the heavy lifting for you. Its crawlers download and structure millions of posts a day, and store and index the data so all you have to do is to define what data you need.
on Mar 10, 2016 in Crawler, Unstructured data, Web services, Webhose.io
How to Use Cohort Data to Analyze User Behavior
In the world of data analysis, cohorts are often pushed aside due to their seemingly complex nature. Learn what this analysis can offer and how to do it.
on Mar 10, 2016 in CleverTap, Market Analytics, Predictive Analytics, Trends
The Data Science Puzzle, Explained
The puzzle of data science is examined through the relationship between several key concepts in the data science realm. As we will see, far from being concrete concepts etched in stone, divergent opinions are inevitable; this is but another opinion to consider.
on Mar 10, 2016 in Artificial Intelligence, Data Mining, Data Science, Deep Learning, Explained, Machine Learning
Practical Career Advice and Best Practices in Analytics
Being an analyst is not only a technical job; it also has a people side to it. Given that many MBAs, engineers, and even non-quantitative graduates are interested in Analytics careers, we are sharing some advice on best practices for excelling with Analytics in your career.
on Mar 10, 2016 in Analytics Consultant, Career, Data Science Skills
Deriving Better Insights from Time Series Data with Cycle Plots
Visualization plays a key role in the analysis of time series data, helping to reveal underlying trends. Here we demonstrate the cycle plot, which shows both the cycle or trend and the day-of-the-week or month-of-the-year effect.
on Mar 9, 2016 in CleverTap, Data Visualization, Time Series
Top February stories: 21 Must-Know Data Science Interview Q&A; Gartner 2016 MQ for Advanced Analytics: gainers and losers
21 Must-Know Data Science Interview Questions and Answers; Top 10 TED Talks for the Data Scientists; Gartner 2016 Magic Quadrant for Advanced Analytics Platforms: gainers and losers.
on Mar 8, 2016 in Top stories
Simplilearn disrupts Big Data Industry with Masters and Flexi Pass Programs
Simplilearn, the largest online certification training company, offers 3 separate Big Data Masters Programs, courses on Hadoop and Spark, its unique CloudLab, and certification.
on Mar 8, 2016 in Big Data, Certification, Hadoop, Master of Science, Online Education, Simplilearn
Technically Speaking: Watch the webcast series
These real-world case studies deliver key insights on overcoming the challenges in data collection, preparation, and analysis. Find the webcast that fits your current challenge.
on Mar 8, 2016 in Data Visualization, JMP, Statistical Modeling
AI and Machine Learning: Top Influencers and Brands
Onalytica gives us a new list of the top 100 Artificial Intelligence and Machine Learning influencers and brands, and provides some insight into the relationships between them.
on Mar 8, 2016 in About Gregory Piatetsky, AI, Influencers, Kirk D. Borne, Machine Learning, Onalytica, Top list
Watch the Geek Rap Video – Predictive Analytics Song
“PREDICT THIS!” is the first pop song to present analytics content with Gangnam Style humor, and media-blending 80’s throwback visuals. The rapper, formerly known as Dr. Eric Siegel (co-founder of Predictive Analytics World) said, “I only answer to ‘Dr. Data’ now.”
on Mar 8, 2016 in Eric Siegel, Humor, Music, Predictive Analytics
R or Python? Consider learning both
The key to becoming a data science professional is understanding the underlying data science concepts and working towards expanding your programming toolbox as much as you can. Hence, one should understand when to use Python and when to pick R, rather than mastering just one language.
on Mar 8, 2016 in DataCamp, Dataiku, Jupyter, Python, Python vs R, R
Online Master of Science in Predictive Analytics, Apply by Apr 15
Build in-demand skills for the growing analytics field: learn statistical concepts and practical applications from distinguished Northwestern faculty and from industry experts, and learn the management and leadership skills necessary to implement high-level, data-driven decisions. Summer application deadline April 15.
on Mar 8, 2016 in Master of Science, Northwestern, Online Education, Predictive Analytics
Self-Paced E-Learning course: Credit Risk Modeling
The course covers basic and advanced modeling, including stress testing Probability of Default (PD), Loss Given Default (LGD) and Exposure At Default (EAD) models.
on Mar 8, 2016 in Bart Baesens, Credit Risk, Online Education, Risk Modeling
Deep Learning: an Interview with Yoshua Bengio
Yoshua Bengio is a renowned figure in machine learning and specifically deep learning. Here is an interview with Yoshua about his thoughts on media interest in the field, future developments, and more.
on Mar 8, 2016 in Deep Learning, RE.WORK, Yoshua Bengio
Top KDnuggets tweets, Feb 29 – Mar 06: Data Science Process; Wisdom of Crowds fails to solve this simple puzzle
Wisdom of Crowds fails to solve this simple #math #puzzle; #DataScience Process - the workflow of a data scientist; R is the fastest-growing language on #StackOverflow; @DeepDrumpf #DeepLearning #Twitterbot imitates #DTrump, more plausible than real one.
on Mar 7, 2016 in Data Science, Donald Trump, Methodology, Process, R
KDD Cup 2016: measuring the impact of research institutions
KDD Cup 2016 competition aims at addressing a long-standing puzzle in academic society: how to rank an academic institution? Develop models to rank academic institutions based on their potential paper acceptance in the upcoming top-tier conferences.
on Mar 7, 2016 in CA, KDD Cup, KDD-2016, Research, San Francisco, SIGKDD
Introducing GraphFrames, a Graph Processing Library for Apache Spark
An overview of Spark's new GraphFrames, a graph processing library based on DataFrames, built in a collaboration between Databricks, UC Berkeley's AMPLab, and MIT.
on Mar 7, 2016 in Apache Spark, Databricks, Graph Analytics
[Super Early Bird Reminder] Quintuple Analytics Events, Chicago
The super early bird deadline ends March 11 for the gathering of 5 analytics events, including PAW Business, PAW Manufacturing, and Text Analytics World, Chicago June 20-23, 2016. Register with code KDN150 and save.
on Mar 7, 2016 in Chicago, IL, Manufacturing, PAW, Predictive Analytics World, Text Analytics
Fastest Growing Programming Languages and Computing Frameworks
A new model for ranking programming languages and predicting the growth of user adoption. Includes current language rankings and predictions.
on Mar 7, 2016 in Data Science, Javascript, Programming Languages, SQL, Trends
Trump vs Clinton – What are the Odds?
Even with a 5% advantage for Clinton, based on statistical analysis and an examination of how undecided voters break towards these candidates, we estimate a 25%-30% chance that Trump would be elected president.
on Mar 7, 2016 in Donald Trump, Elections, Hillary Clinton, Politics
Top stories for Feb 28 – Mar 5: The Data Science Process; Why Spark Reached the Tipping Point in 2015
R vs Python for Data Science: The Winner is ...; 7 Steps to Mastering Machine Learning With Python; The Data Science Process; Why Spark Reached the Tipping Point in 2015.
on Mar 6, 2016 in Top stories
Nurture by Numbers – Big Data and Children
Driven by rising healthcare costs and competition for top schools, more organisations and individuals are turning to Big Data and Analytics to try and give their children the upper hand.
on Mar 5, 2016 in Big Data, Children, Education, Healthcare, Privacy, Speech Recognition
Webinar: Driving Data Democracy: Hadoop and Redshift, Mar 16
The Hadoop ecosystem has improved markedly over the past few years. MPP databases allow analytics teams to easily query massive structured data sets. Learn how these pipelines work on March 16.
on Mar 4, 2016 in Amazon Redshift, Hadoop, Looker, MPP Database, SQL
Apache Big Data, Vancouver, May 9-12, KDnuggets Discount, Early bird ends Mar 6
Apache Big Data brings together the full suite of Big Data open source projects - check the amazing lineup of keynotes and breakout sessions and save with code APBD16KDN20.
on Mar 4, 2016 in Apache, Apache Spark, Big Data, Canada, Doug Cutting, Hadoop, Matei Zaharia, Vancouver
Automated Data Science and Data Mining
Automated Data Science is becoming more popular. Here is our initial list of automated Data Science and Data Mining platforms.
on Mar 4, 2016 in Automated, Data Science Platform, DataRobot
OpenText Data Visualization – Red Carpet Edition
In this latest edition we present a handsome variation on the bubble chart, plotting numbers of nominations against Oscars won, and how many films fall into each category.
on Mar 4, 2016 in Data Visualization, Movies, OpenText
The Data Science Process
What does a day in the data science life look like? Here is a very helpful framework that is both a way to understand what data scientists do, and a cheat sheet to break down any data science problem.
on Mar 4, 2016 in CRISP-DM, Data Science, Methodology, Springboard
Sentiment Analysis Symposium, New York City, July 12, CFP, Early Bird
2016 Sentiment Analysis Symposium will examine the business value of sentiment, opinion, and emotion in our big data world. Submit your proposal to speak by March 15 or use early-bird rates until Apr 2.
on Mar 4, 2016 in Affective Computing, Emotion, New York City, NY, Sentiment Analysis
100 upcoming March – November Meetings in Analytics, Big Data, Data Mining, Data Science
Coming soon: Global Data Science Conf, Strata + Hadoop San Jose, PAW San Francisco, INFORMS Business Analytics, MLConf NYC, PAKDD, Big Data Innovation Summit West, and many more.
on Mar 3, 2016 in Boston, CA, Chicago, IL, London, MA, New York City, NY, San Francisco, UK
scikit-feature: Open-Source Feature Selection Repository in Python
scikit-feature is an open-source feature selection repository in Python, with around 40 popular algorithms in feature selection research. It is developed by the Data Mining and Machine Learning Lab at Arizona State University.
on Mar 3, 2016 in Data Mining, Data Science, Feature Extraction, Feature Selection, Machine Learning, Python
PAW Predictive Analytics San Francisco – 1 month before showtime
Just ONE month before analytics professionals take the stage at PAW San Francisco. Check the career-enhancing analytics events lined up to showcase the cutting edge and use code KDN150 to save.
on Mar 3, 2016 in CA, PAW, Predictive Analytics World, San Francisco
Top Big Data Processing Frameworks
A discussion of 5 Big Data processing frameworks: Hadoop, Spark, Flink, Storm, and Samza. An overview of each is given and comparative insights are provided, along with links to external resources on particular related topics.
on Mar 3, 2016 in Apache Samza, Apache Spark, Apache Storm, Flink, Hadoop
The Rise Of The Robot
Atlas, the latest robot from Google's Boston Dynamics, is a pretty resilient chap. He can trudge through uneven snow, be knocked off his feet and get up again, and do work that can take place in any warehouse. We examine what it means for our future.
on Mar 3, 2016 in Google, Humans vs Machines, Robots
Top /r/MachineLearning Posts, February: AlphaGo, Distributed TensorFlow, Neural Network Image Enhancement
In February on /r/MachineLearning, we get a run-down of the AlphaGo matches, Distributed TensorFlow is released, convolutional neural nets are cleaning Star Wars images, vintage science is on parade, military machine learning is criticized, and the overwhelmed researcher is given advice.
on Mar 2, 2016 in Advice, Convolutional Neural Networks, Deep Learning, Go, Google, Star Wars, TensorFlow
Top Spark Ecosystem Projects
Apache Spark has developed a rich ecosystem, including both official and third party tools. We have a look at 5 third party projects which complement Spark in 5 different ways.
on Mar 2, 2016 in Apache Mesos, Apache Spark, Cassandra, Databricks, Distributed Systems
2nd Annual Global Big Data For Executives Conference, March 7-9 2016
The conference emphasizes sharing real-world executive experiences, how to create a balanced big data team, and new methods used in Big Data across multiple industry verticals, with panel sessions, keynote sessions, and a workshop. Use KDNKEYNOTE to attend keynote sessions for free.
on Mar 1, 2016 in Big Data, CA, Global Big Data Conference, Santa Clara
New Salford Predictive Modeler 8
Salford Predictive Modeler software suite: Faster. More Comprehensive Machine Learning. More Automation. Better results. Take a giant step forward in your data science productivity with SPM 8. Download and try it today!
on Mar 1, 2016 in Data Science Platform, Decision Trees, Gradient Boosting, Predictive Modeler, Regression, Salford Systems
Learn to apply analytics to meet business objectives
Business Analytics skills are in high demand. Get your credentials with Penn State World Campus 9-credit online Graduate Certificate in Business Analytics.
on Mar 1, 2016 in Business Analytics, Certificate, Online Education, Penn State
Webinar: Building Data Products for Predictive Maintenance, Mar 9
Dr. Kirk Borne, Data Science thought leader, will address the interesting trends in the world of data science. DataRPM Co-founder Ruban Phukan will show how a cognitive data science platform like DataRPM helps companies to build data products.
on Mar 1, 2016 in Cognitive Computing, DataRPM, Kirk D. Borne, Predictive Maintenance
The Mirage of a Citizen Data Scientist
The term "citizen data scientist" has been irritating me recently. I explain why I think it is both a bad term and a bad idea, and what we need instead.
on Mar 1, 2016 in Citizen Data Scientist, Data Analyst, Data Scientist, Gartner, Overfitting
Dynamic Data Visualization with PHP and MySQL: Election Spending
Learn how to fetch data from a MySQL database using PHP and create dynamic charts with that data, using an interesting example of New Hampshire primary election spending.
on Mar 1, 2016 in Data Visualization, FusionCharts, MySQL, PHP
Distributed TensorFlow Has Arrived
Google has open sourced its distributed version of TensorFlow. Get the info on it here, and catch up on some other TensorFlow news at the same time.
on Mar 1, 2016 in Deep Learning, Distributed Systems, Google, Matthew Mayo, TensorFlow
Data Science and Disability
Data Science and Artificial Intelligence have come to the forefront of technology in the last few years. Learn how practitioners are taking a more philanthropic outlook on life, supporting people suffering with both physical and mental disabilities.
on Mar 1, 2016 in Data Science, Disability, Healthcare
Academic/Research positions in Business Analytics, Data Science, Machine Learning in February 2016
Academic/Research positions Analytics and Data Science at Oslo Research Lab, Eindhoven U. of Technology, NCSR Demokritos, U. Iowa, U. Helsinki and Aalto, NJCU, Manchester Metropolitan U., Umea University, Monash University.
on Mar 1, 2016 in Aalto, Finland, Greece, Jersey City, Netherlands, NJ, Norway, Research Positions, Sweden, UK