- Tips for Data Scientists: Think Like a Business Executive - May 18, 2016.
Thinking like a Data Scientist is important; it puts businesses and business leaders in an analytical frame of mind. But it is also important for Data Scientists to be able to think like business executives. Read on to find out why.
Advice, Analytics, Data Scientist
- The Amazing Power of Word Vectors - May 18, 2016.
A fantastic overview of several now-classic papers on word2vec, the work of Mikolov et al. at Google on efficient vector representations of words, and what you can do with them.
Distributed Representation, NLP, word2vec
- Embrace the Random: A Case for Randomizing Acceptance of Borderline Papers - May 16, 2016.
A case for using randomization in the selection of borderline academic papers, a particular use case which has parallels with many other possible scenarios.
Academics, ICML, NIPS, Random, Randomization
- Practical skills that practical data scientists need - May 13, 2016.
Long story short, a data scientist needs to be capable of solving business analytics problems. Learn more about the skill set you need to master to do so.
Business Context, Data Scientist, Mathematics, Skills, SQL
- Troubleshooting Neural Networks: What is Wrong When My Error Increases? - May 13, 2016.
An overview of some of the things that could lead to an increased error rate in neural network implementations.
Deep Learning, Neural Networks, Overfitting
- Are Deep Neural Networks Creative? - May 12, 2016.
Deep neural networks routinely generate images and synthesize text. But does this amount to creativity? Can we reasonably claim that deep learning produces art?
Artificial Intelligence, Deep Learning, Generative Adversarial Network, Generative Models, Recurrent Neural Networks, Reinforcement Learning, Zachary Lipton
- Deep Learning and Neuromorphic Chips - May 12, 2016.
The 3 main ingredients to creating artificial intelligence are hardware, software, and data, and while we have focused historically on improving software and data, what if, instead, the hardware was drastically changed?
AI, Brain, Deep Learning, Neural Networks
- Implementing Neural Networks in Javascript - May 12, 2016.
Javascript is one of the most prevalent and fastest growing languages in existence today. Get a quick introduction to implementing neural networks in the language, and direction on where to go from here.
Javascript, MNIST, Neural Networks
- Meet the 11 Big Data & Data Science Leaders on LinkedIn - May 6, 2016.
In this post, we present a list of popular data science leaders on LinkedIn. Follow these leaders who will keep you in touch with the latest Data Science happenings!
About Gregory Piatetsky, Bernard Marr, Big Data, Data Scientist, DJ Patil, Hilary Mason, Influencers, LinkedIn, Tom Davenport
- Why Implement Machine Learning Algorithms From Scratch? - May 6, 2016.
Even with machine learning libraries covering almost any algorithm implementation you could imagine, there are often still good reasons to write your own. Read on to find out what these reasons are.
Algorithms, Machine Learning
- How Much do Analytics Salaries Increase when Changing Jobs? - May 4, 2016.
A data-informed analysis of analytics career salaries and their increase when changing jobs.
Analytics, Burtch Works, Career, Salary
- A Data Science Approach to Writing a Good GitHub README - May 4, 2016.
The README is the first file every user looks for when checking out a code repository. Learn what you should write in your README files, and analyze the effectiveness of your existing ones.
Algorithmia, GitHub, Text Mining
- Datasets Over Algorithms - May 3, 2016.
The average elapsed time between key algorithm proposals and corresponding advances is about 18 years; the average elapsed time between key dataset availabilities and corresponding advances is less than 3 years, 6 times faster.
Algorithms, Datasets
- How to Network and Build a Personal Brand in Data Science - May 2, 2016.
SpringBoard shares some ideas on how to network and build a data career, as taken from a new guide they have put together on the topic.
Career, KDD, Mentorship, Strata
- How to Use Cohort Analysis to Improve Customer Retention - May 2, 2016.
Cohort analysis is a subset of behavioral analytics that takes user data and breaks it into related groups for analysis. Let’s see how cohort analysis works, using an example of daily cohorts of app users.
Churn, CleverTap, Customer Analytics, Customer Behavior
- Cartoon: When Automation Goes Too Far - Apr 30, 2016.
KDnuggets Cartoon looks into the future of Automated Data Science and Marketing - when will automation go too far?
Automated, Automation, Beer, Cartoon, Marketing
- Angoss 9.6 Data Science Software Suite - Apr 29, 2016.
Angoss software provides users with comprehensive scorecard building functionality that is fast, reliable, accurate, and business centric.
Angoss, Data Science Platform, Optimization, Tableau
- Data Scientist Survey: What Is An Interesting Result? - Apr 28, 2016.
A survey requesting feedback from data scientists on their opinion of what an interesting result is. The survey is anonymous, has only a single mandatory question, and takes only 5 minutes.
Analytics, Survey
- Machine Learning for Artists – Video lectures and notes - Apr 28, 2016.
Art has always been deep for those who appreciate it... but now, more than ever, deep learning is making a real impact on the art world. Check out this graduate course, and its freely-available resources, focusing on this very topic.
Art, Convolutional Neural Networks, Deep Learning, Machine Learning, Recurrent Neural Networks
- Eugenics – journey to the dark side at the dawn of statistics - Apr 27, 2016.
Today is the 80th anniversary of the death of Karl Pearson, one of the founding fathers of statistics (correlation coefficient, principal components, the p-value, and much more). He was also deeply involved with eugenics, a jarring reminder that truth often comes bundled with a measure of darkness.
Correlation, Eugenics, Karl Pearson, Statistics
- Three Pitfalls to Avoid When Building Data Science Into Your Business - Apr 27, 2016.
An overview of pitfalls to avoid when building data science into your business, how to avoid them, and what to do instead.
Advice, Data Science Team
- How to Remove Duplicates in Large Datasets - Apr 27, 2016.
Dealing with huge datasets can be tricky, especially during data cleaning. One such cleaning step is de-duplication; find out how you can solve it using statistical techniques.
CleverTap, Data Cleaning, Data Preparation
- Microsoft is Becoming M(ai)crosoft - Apr 25, 2016.
This post is an overview and discussion of Microsoft's increasing investment in, and approach to, artificial intelligence, and how their philosophy differs from that of their competitors.
AI, Artificial Intelligence, Computer Vision, Cortana, Machine Learning, Microsoft, Natural Language Processing, Speech Recognition
- Advantages of a Career in Data Science - Apr 23, 2016.
As the rampant growth of data science continues across industries, the opportunities are plenty for both aspiring and expert data scientists. Here is an overview of data science industries, opportunities and work locations.
Career, Data Science Education, Data Scientist, Industry
- When Does Deep Learning Work Better Than SVMs or Random Forests®? - Apr 22, 2016.
Some advice on when a deep neural network may or may not outperform Support Vector Machines or Random Forests.
Advice, Deep Learning, random forests algorithm, Support Vector Machines, SVM
- Top 10 IPython Notebook Tutorials for Data Science and Machine Learning - Apr 22, 2016.
A list of 10 useful Github repositories made up of IPython (Jupyter) notebooks, focused on teaching data science and machine learning. Python is the clear target here, but general principles are transferable.
Data Science, Deep Learning, GitHub, IPython, Machine Learning, Python, Sebastian Raschka, TensorFlow
- Comprehensive Guide to Learning Python for Data Analysis and Data Science - Apr 20, 2016.
Want to make a career change to Data Science using Python? Learning anything on your own can be a challenge, and a little guidance can be a great help. That is exactly what this article provides.
Data Analysis, Data Science Education, DataCamp, Python
- Does Your Company Need a Data Scientist? - Apr 19, 2016.
Your company needs a data scientist... doesn't it? It very well may not, but you need to know either way. Read on to determine whether or not your company could benefit from the skills of an on-board data scientist.
Advice, Data Scientist, Noise
- Deep Learning for Chatbots, Part 1 – Introduction - Apr 19, 2016.
The first in a series of tutorial posts on using Deep Learning for chatbots, this covers some of the techniques being used to build conversational agents, and goes from the current state of affairs through to what is and is not possible.
Chatbot, Deep Learning, Siri
- Top 15 Frameworks for Machine Learning Experts - Apr 19, 2016.
Whether you are a researcher, a start-up, or a big organization that wants to use machine learning, you will need the right tools to make it happen. Here is a list of the most popular frameworks for machine learning.
Data Science Tools, Deep Learning, Devendra Desale, Machine Learning, MLlib
- Using Big Data Analytics To Prevent Crimes The “Minority Report” Way - Apr 18, 2016.
The idea of using artificial intelligence for crime prevention has been around for more than a decade. In this post, we present four examples, including how analytics can help prevent a criminal from re-offending.
Big Data Analytics, Crime, Machine Learning, Surveillance
- 12 Inspiring Women In Data Science, Big Data - Apr 15, 2016.
It’s been well documented that women don’t come close to parity in STEM fields with their counterparts. Could the rise of big data and data science offer women a clearer path to success in technology? Here’s a list of 12 inspiring women who work in big data and data
Big Data, Data Science, InformationWeek, Women
- Recommender Systems: New Comprehensive Textbook by Charu Aggarwal - Apr 15, 2016.
Covers recommender systems comprehensively, both fundamentals and advanced topics, organized into: Algorithms and evaluation, recommendations in specific domains and contexts, and advanced topics and applications.
Book, Charu Aggarwal, Recommender Systems
- What Developers Actually Need to Know About Machine Learning - Apr 14, 2016.
Some guidance on what, exactly, it is that developers need to know to get up to speed with machine learning.
Advice, Developers, Machine Learning
- Association Rules and the Apriori Algorithm: A Tutorial - Apr 14, 2016.
A great and clearly-presented tutorial on the concepts of association rules and the Apriori algorithm, and their roles in market basket analysis.
Algobeans, Annalyn Ng, Apriori, Association Rules
- Regression & Correlation for Military Promotion: A Tutorial - Apr 13, 2016.
A clear and well-written tutorial covering the concepts of regression and correlation, focusing on military commander promotion as a use case.
Algobeans, Correlation, Military, Regression
- Advantages and Risks of Self-Service Analytics - Apr 13, 2016.
Self-service analytics is likely to spread across all business layers, and with proper care to avoid certain risks, a culture of self-service analytics will help all organizations.
Analytics, Citizen Data Scientist, Gartner, Risks, Self-service
- New Deep Learning Book Finished, Finalized Online Version Available - Apr 12, 2016.
What will likely become known as the seminal book on deep learning is finally finished, with the online version finalized and freely-accessible to all those interested in mastering deep neural networks.
Aaron Courville, Book, Deep Learning, Free ebook, Ian Goodfellow, Yoshua Bengio
- CrowdFlower 2016 Data Science Report - Apr 11, 2016.
A new data science report with survey results related to the success and challenges of data scientists, and where data science is going as a discipline.
CrowdFlower, Data Science, Report
- JSU Computational and Data-Enabled Science and Engineering Program - Apr 7, 2016.
JSU is among the first minority serving institutions to create a Big Data focused doctoral and graduate program for MS and PhD in Computational and Data-Enabled Science and Engineering - apply now.
Data Science Education, Jackson, Jackson State University, MS
- Basics of GPU Computing for Data Scientists - Apr 7, 2016.
With the rise of neural networks in data science, the demand for computationally intensive machines has led to GPUs. Learn how you can get started with GPUs and the algorithms that can leverage them.
Algorithms, CUDA, Data Science, GPU, NVIDIA
- Deep Learning for Internet of Things Using H2O - Apr 6, 2016.
H2O is a feature-rich open source machine learning platform known for its R and Spark integration and its ease of use. This is an overview of using H2O deep learning for data science with the Internet of Things.
Deep Learning, H2O, Internet of Things, IoT, R
- 10 Signs Of A Bad Data Scientist - Apr 6, 2016.
With the number of people claiming to be data scientists growing, the “true” data scientists are becoming hard to find. Here is your guide to identifying the clues that give away a bad data scientist.
Data Scientist, Skills
- Salford Predictive Modeler 8: Faster. More Machine Learning. Better results - Apr 4, 2016.
Take a giant step forward with SPM 8: download the just-released version 8, try it for yourself, and get better results.
Classification, Data Science Platform, Decision Trees, Regression, Salford Systems, TreeNet
- The Secret to a Perfect Data Science Interview - Apr 1, 2016.
How to interview a Data Scientist, in 5 steps. The secret to answering every question perfectly :).
Cartoon, Data Scientist, Humor, Interview Questions
- How to Compute the Statistical Significance of Two Classifiers Performance Difference - Mar 30, 2016.
To determine whether a result is statistically significant, a researcher calculates a p-value: the probability of observing an effect at least as large as the one measured, given that the null hypothesis is true. Here we demonstrate how to use it to assess the difference in performance between two models.
Classifier, Cross-validation, Model Performance, Statistical Significance
- 100 Active Blogs on Analytics, Big Data, Data Mining, Data Science, Machine Learning - Mar 29, 2016.
Stay on top of your data science skills game! Here’s a list of about 100 of the most active and interesting blogs on Big Data, Data Science, Data Mining, Machine Learning, and Artificial Intelligence.
Big Data, Blogs, Data Science, Deep Learning, Hadoop, Machine Learning
- Don’t Buy Machine Learning - Mar 28, 2016.
In many projects, the amount of effort spent on R&D on Machine Learning is usually a small fraction of the total effort, or it’s not even there because we plan it for a future phase after building the application first.
Advice, Industry, Machine Learning
- Cartoon: Citizen Data Scientist At Work - Mar 26, 2016.
KDnuggets Cartoon examines Citizen Data Scientist at work and his previous career as a citizen dentist and a citizen pilot.
Cartoon, Citizen Data Scientist, Humor
- How to combat financial fraud by using big data? - Mar 25, 2016.
Financial fraud methods are becoming more sophisticated and the techniques to combat such attacks also need to evolve. Big data has brought with it novel fraud detection and prevention techniques such as behavioral analysis and real-time detection to give fraud fighting techniques a new perspective.
Alibaba, Banking, Big Data, Fraud, Fraud Detection, Fraud Prevention
- XGBoost: Implementing the Winningest Kaggle Algorithm in Spark and Flink - Mar 24, 2016.
An overview of XGBoost4J, a JVM-based implementation of XGBoost, one of the most successful recent machine learning algorithms in Kaggle competitions, with distributed support for Spark and Flink.
Apache Spark, Distributed Systems, Flink, Kaggle, XGBoost
- Top 10 Data Science Resources on Github - Mar 24, 2016.
The top 10 data science projects on Github are chiefly composed of a number of tutorials and educational resources for learning and doing data science. Have a look at the resources others are using and learning from.
Coursera, GitHub, IPython, Johns Hopkins, Open Source, Top 10
- Doing Data Science: A Kaggle Walkthrough – Cleaning Data - Mar 23, 2016.
Gain insight into the process of cleaning data for a specific Kaggle competition, including a step by step overview.
Data Cleaning, Data Preparation, Kaggle, Pandas, Python
- R Learning Path: From beginner to expert in R in 7 steps - Mar 23, 2016.
This learning path is mainly for novice R users that are just getting started but it will also cover some of the latest changes in the language that might appeal to more advanced R users.
7 Steps, Data Preparation, Data Science Education, Data Visualization, DataCamp, Hadley Wickham, Learning Path, Maps, R
- Lift Analysis – A Data Scientist’s Secret Weapon - Mar 22, 2016.
Gain insight into using lift analysis as a metric for doing data science. Understand how to use it for evaluating the performance and quality of a machine learning model.
Data Science, Lift charts, Metrics
- Must Know Tips for Deep Learning Neural Networks - Mar 22, 2016.
Deep learning is a white-hot research topic. Pick up some solid deep learning neural network tips and tricks from a PhD researcher.
Convolutional Neural Networks, Deep Learning
- Netflix Prize Analyzed: Movie Ratings and Recommender Systems - Mar 18, 2016.
A 195-page monograph by a top-1% Netflix Prize contestant. Learn about the famous machine learning competition. Improve your machine learning skills. Learn how to build recommender systems.
Free ebook, Netflix, Recommender Systems
- The Data Science Game – Student Competition - Mar 17, 2016.
The Data Science Game returns this year, with university students competing for dominance. Details for this iteration and further information are provided here.
Competition, Data Science, France, Kaggle, Paris, Student Competition
- New KDnuggets Tutorials Page: Learn R, Python, Data Visualization, Data Science, and more - Mar 16, 2016.
Introducing new KDnuggets Tutorials page with useful resources for learning about Business Analytics, Big Data, Data Science, Data Mining, R, Python, Data Visualization, Spark, Deep Learning and more.
Data Science Education, Online Education, Python, R
- The Evolution of the Data Scientist - Mar 16, 2016.
We trace the evolution of Data Science from ancient mathematics to statistics and early neural networks, to present successes like AlphaGo and self-driving cars, and look into the future.
Automated, Data Scientist, Demis Hassabis, Evolution, Mathematics, Statistics
- How to tell a great analyst from a good analyst - Mar 15, 2016.
Good analysts help businesses stay in the competition, but great analysts set a business apart from its competition. Learn how to be a great analyst by going the extra mile.
Analyst, Data Science Skills, Quandl
- What Should Data Scientists Know About Psychology? - Mar 14, 2016.
Because of their training in the scientific method, data management, statistics and data analysis, subject matter expertise, and translating results into substantive knowledge, psychology researchers must have a solid understanding of data science, and vice versa.
Data Scientist, Methodology, Psychology
- What is the influence of Big Data in Medicine? - Mar 14, 2016.
The 360-degree customer view is the idea that companies can get a complete view of customers by aggregating data from their various touch points. Big data is helping to realize this idea, which will revolutionize healthcare.
Big Data, Customer Analytics, Healthcare
- 3 Viable Ways to Extract Data from the Open Web - Mar 11, 2016.
We look at 3 main ways to handle data extraction from the open web, along with some tips on when each one makes the most sense as a solution.
Crawler, import.io, Web Mining, Web services, Webhose.io
- The Data Science Puzzle, Explained - Mar 10, 2016.
The puzzle of data science is examined through the relationship between several key concepts in the data science realm. As we will see, far from being concrete concepts etched in stone, divergent opinions are inevitable; this is but another opinion to consider.
Artificial Intelligence, Data Mining, Data Science, Deep Learning, Explained, Machine Learning
- The Data Science Process, Rediscovered - Mar 9, 2016.
The Data Science Process is a relatively new framework for doing data science. It is compared to previous similar frameworks, and a discussion on process innovation versus repetition is then undertaken.
Data Science
- Deriving Better Insights from Time Series Data with Cycle Plots - Mar 9, 2016.
Visualization plays a key role in the analysis of time series data, helping to reveal underlying trends. Here we demonstrate the cycle plot, which shows both the cycle or trend and the day-of-the-week or month-of-the-year effect.
CleverTap, Data Visualization, Time Series
- Top February stories: 21 Must-Know Data Science Interview Q&A; Gartner 2016 MQ for Advanced Analytics: gainers and losers - Mar 8, 2016.
21 Must-Know Data Science Interview Questions and Answers; Top 10 TED Talks for the Data Scientists; Gartner 2016 Magic Quadrant for Advanced Analytics Platforms: gainers and losers.
Top stories
- AI and Machine Learning: Top Influencers and Brands - Mar 8, 2016.
Onalytica gives us a new list of the top 100 Artificial Intelligence and Machine Learning influencers and brands, and provides some insight into the relationships between them.
About Gregory Piatetsky, AI, Influencers, Kirk D. Borne, Machine Learning, Onalytica, Top list
- Watch the Geek Rap Video – Predictive Analytics Song - Mar 8, 2016.
“PREDICT THIS!” is the first pop song to present analytics content with Gangnam Style humor, and media-blending 80’s throwback visuals. The rapper, formerly known as Dr. Eric Siegel (co-founder of Predictive Analytics World) said, “I only answer to ‘Dr. Data’ now.”
Eric Siegel, Humor, Music, Predictive Analytics
- Self-Paced E-Learning course: Credit Risk Modeling - Mar 8, 2016.
The course covers basic and advanced modeling, including stress testing Probability of Default (PD), Loss Given Default (LGD) and Exposure At Default (EAD) models.
Bart Baesens, Credit Risk, Online Education, Risk Modeling
- Introducing GraphFrames, a Graph Processing Library for Apache Spark - Mar 7, 2016.
An overview of Spark's new GraphFrames, a graph processing library based on DataFrames, built in a collaboration between Databricks, UC Berkeley's AMPLab, and MIT.
Apache Spark, Databricks, Graph Analytics
- Fastest Growing Programming Languages and Computing Frameworks - Mar 7, 2016.
A new model for ranking programming languages and predicting the growth of user adoption. Includes current language rankings and predictions.
Data Science, Javascript, Programming Languages, SQL, Trends
- The Data Science Process - Mar 4, 2016.
What does a day in the data science life look like? Here is a very helpful framework that is both a way to understand what data scientists do, and a cheat sheet to break down any data science problem.
CRISP-DM, Data Science, Methodology, Springboard
- scikit-feature: Open-Source Feature Selection Repository in Python - Mar 3, 2016.
scikit-feature is an open-source feature selection repository in python, with around 40 popular algorithms in feature selection research. It is developed by Data Mining and Machine Learning Lab at Arizona State University.
Data Mining, Data Science, Feature Extraction, Feature Selection, Machine Learning, Python
- Top Big Data Processing Frameworks - Mar 3, 2016.
A discussion of 5 Big Data processing frameworks: Hadoop, Spark, Flink, Storm, and Samza. An overview of each is given and comparative insights are provided, along with links to external resources on particular related topics.
Apache Samza, Apache Spark, Apache Storm, Flink, Hadoop
- Top Spark Ecosystem Projects - Mar 2, 2016.
Apache Spark has developed a rich ecosystem, including both official and third party tools. We have a look at 5 third party projects which complement Spark in 5 different ways.
Apache Mesos, Apache Spark, Cassandra, Databricks, Distributed Systems
- New Salford Predictive Modeler 8 - Mar 1, 2016.
Salford Predictive Modeler software suite: Faster. More Comprehensive Machine Learning. More Automation. Better results. Take a giant step forward in your data science productivity with SPM 8. Download and try it today!
Data Science Platform, Decision Trees, Gradient Boosting, Predictive Modeler, Regression, Salford Systems
- The Mirage of a Citizen Data Scientist - Mar 1, 2016.
The term "citizen data scientist" has been irritating me recently. I explain why I think it both a bad term and a bad idea, and what we need instead.
Citizen Data Scientist, Data Analyst, Data Scientist, Gartner, Overfitting
- Dynamic Data Visualization with PHP and MySQL: Election Spending - Mar 1, 2016.
Learn how to fetch data from a MySQL database using PHP and create dynamic charts with that data, using an interesting example of New Hampshire primary election spending.
Data Visualization, FusionCharts, MySQL, PHP
- Distributed TensorFlow Has Arrived - Mar 1, 2016.
Google has open sourced its distributed version of TensorFlow. Get the info on it here, and catch up on some other TensorFlow news at the same time.
Deep Learning, Distributed Systems, Google, Matthew Mayo, TensorFlow
- Data Science and Disability - Mar 1, 2016.
Data Science and Artificial Intelligence have come to the forefront of technology in the last few years. Learn how practitioners are taking a more philanthropic outlook on life, supporting people living with both physical and mental disabilities.
Data Science, Disability, Healthcare
- Building Zoomable Line Charts in jQuery - Feb 25, 2016.
Learn how to build zoomable line charts using FusionCharts’ core JS library and its jQuery charts plugin, and get started making some beautiful data visualizations for the web.
Data Visualization, FusionCharts, Javascript
- Tree Kernels: Quantifying Similarity Among Tree-Structured Data - Feb 23, 2016.
An in-depth, informative overview of tree kernels, both theoretical and practical. Includes a use case and some code after the discussion.
Decision Trees, Graph Mining, Web Mining
- A comparison between PCA and hierarchical clustering - Feb 23, 2016.
Graphical representations of high-dimensional data sets are the backbone of exploratory data analysis. We examine 2 of the most commonly used methods: heatmaps combined with hierarchical clustering and principal component analysis (PCA).
Clustering, Data Visualization, Life Science, PCA, Qlucore
- How Small is the World, Really? - Feb 22, 2016.
Social network analysis is back in the news again, with a recent Facebook project which determined that there are an average of 3.5 intermediaries between any 2 Facebook users. But this is different than "6 degrees of separation." Read on to find out why, and how.
Duncan Watts, Facebook, Small World
- Top 10 Data Visualization Projects on Github - Feb 22, 2016.
Github provides a number of open source data visualization options for data scientists and application developers integrating quality visuals. This is a list and description of the top project offerings available, based on the number of stars.
D3.js, Data Visualization, GitHub, Matthew Mayo, Open Source, Top 10
- How Data Science is Fighting Disease - Feb 22, 2016.
Many organisations are starting to use Data Science as a method of tracking, diagnosing and curing some of the world’s most widespread diseases. We look at 3 common diseases, and how Data Science is used to save lives.
Ebola, Enlitic, Healthcare, MJFF
- 21 Must-Know Data Science Interview Questions and Answers, part 2 - Feb 20, 2016.
Second part of the answers to 20 Questions to Detect Fake Data Scientists, including controlling overfitting, experimental design, tall and wide data, understanding the validity of statistics in the media, and more.
Anomaly Detection, Data Science, Data Visualization, Overfitting, Recommender Systems
- Getting Started with Data Visualization - Feb 19, 2016.
Data visualization is on the rise nowadays. This step-by-step tutorial covers the process of creating your first data visualization using FusionCharts.
Data Visualization, FusionCharts, Javascript
- Opening Up Deep Learning For Everyone - Feb 19, 2016.
Opening deep learning up to everyone is a noble goal. But is it achievable? Should non-programmers and even non-technical people be able to implement deep neural models?
Caffe, Deep Learning, Feature Engineering, Open Source, TensorFlow
- Data Lake Plumbers: Operationalizing the Data Lake - Feb 18, 2016.
Gain insight into data lakes, their benefits, when they are appropriate, and how to operationalize them. How do they compare to the data warehouse?
Data Lake, Data Warehouse, ETL, Hadoop
- Big Data Is Driving Your Car - Feb 18, 2016.
Never mind driverless cars! Big Data is already hard at work in every aspect of the automotive industry, including safety, design, marketing and more. We look at where Big Data is having an impact on the cars that we are driving.
Big Data, Cars, IoT
- How IBM Watson is Taking on The World - Feb 18, 2016.
On the one hand, we have made tremendous progress in the field of data analysis; on the other, our technology is getting smarter. IBM has taken a solid stride in the direction of Artificial Intelligence by unveiling its supercomputer, IBM Watson. Learn what it can do, who its adopters are, and what it holds for the future.
Artificial Intelligence, DeZyre, IBM, Watson
- Amazon Machine Learning: Nice and Easy or Overly Simple? - Feb 17, 2016.
Amazon Machine Learning is a predictive analytics service with binary/multiclass classification and linear regression features. The service is fast, offers a simple workflow but lacks model selection features and has slow execution times.
Amazon, Classification, Machine Learning, MLaaS
- Gartner 2016 Magic Quadrant for Advanced Analytics Platforms: gainers and losers - Feb 16, 2016.
We compare Gartner 2016 Magic Quadrant Advanced Analytics Platforms vs its 2015 version and identify notable changes for leaders and challengers: SAS, IBM, RapidMiner, KNIME, Dell, Angoss, and Microsoft.
Advanced Analytics, Dell, Gartner, IBM, Knime, Magic Quadrant, RapidMiner, SAS
- The ICLR Experiment: Deep Learning Pioneers Take on Scientific Publishing - Feb 15, 2016.
Deep learning pioneers Yann LeCun and Yoshua Bengio have undertaken a grand experiment in academic publishing. Embracing a radical level of transparency and unprecedented public participation, they've created an opportunity not only to find and vet the best papers, but also to gather data about the publication process itself.
Academics, arXiv, Deep Learning, ICLR, Neural Networks, Yann LeCun, Yoshua Bengio, Zachary Lipton
- Data Scientist Valentine’s Day Collection - Feb 13, 2016.
We review Data Scientist Valentine's Day options with several topical cartoons, including Scarledoopython, Neural net predictions, and dating algorithm adjustments.
Cartoon, Humor, Valentine's Day
- Elementary, My Dear Watson! An Introduction to Text Analytics via Sherlock Holmes - Feb 12, 2016.
Want to learn about the field of text mining? Go on an adventure with Sherlock & Watson. Here you will find the different sub-domains of text mining, along with a practical example.
Dato, NLP, Sherlock Holmes, Text Analytics
- Scikit Flow: Easy Deep Learning with TensorFlow and Scikit-learn - Feb 12, 2016.
Scikit Flow is a new easy-to-use interface for TensorFlow from Google, based on the Scikit-learn fit/predict model. Does it succeed in making deep learning more accessible?
Deep Learning, Google, Matthew Mayo, Python, scikit-learn, TensorFlow
- Data Science Skills for 2016 - Feb 12, 2016.
As demand for the hottest job grows in the new year, so does the required skill set. Here we discuss the skills that will be in high demand for data scientists, including data visualization, Apache Spark, R, Python, and many more.
Apache Spark, CrowdFlower, Data Science, Python, Skills, SQL
- Does Machine Learning allow opposites to attract? - Feb 11, 2016.
Most online dating sites use 'Netflix-style' recommendations which match people based on their shared interests and likes. What about matches that work so well precisely because the people are so different? Here is my example.
Love, Machine Learning, Online Dating, Recommendations
- 21 Must-Know Data Science Interview Questions and Answers - Feb 11, 2016.
KDnuggets Editors bring you the answers to 20 Questions to Detect Fake Data Scientists, including what is regularization, Data Scientists we admire, model validation, and more.
Pages: 1 2 3
Bootstrap sampling, Data Science, Interview Questions, Kirk D. Borne, Precision, Recall, Regularization, Yann LeCun
- Auto-Scaling scikit-learn with Spark - Feb 11, 2016.
Databricks gives us an overview of the spark-sklearn library, which automatically and seamlessly distributes model tuning on a Spark cluster, without impacting workflow.
Apache Spark, Databricks, Open Source, scikit-learn
- 9 Must-Have Datasets for Investigating Recommender Systems - Feb 11, 2016.
Gain some insight into a variety of useful datasets for recommender systems, including data descriptions, appropriate uses, and some practical comparison.
Datasets, Lab41, Recommender Systems
- 4 Reasons Why We Need More Women In Big Data - Feb 10, 2016.
Gender imbalance in the workforce has drawn alarming attention in recent years. Here we offer four reasons to hire women data scientists, including an inherent advantage and the lack of a stereotype for the role.
Big Data, Hiring, Women
- Top 10 TED Talks for the Data Scientists - Feb 9, 2016.
TED Talks are a great platform for sharing ideas and inspiration. Here we have selected ten interesting talks for data scientists from the statistics, social media, and economics domains.
Data Science, Hans Rosling, Social Networks, Statistics, TED
- Avoid These Common Data Visualization Mistakes - Feb 8, 2016.
Data visualization is a handy tool that can lead to discoveries about the data that otherwise would not have been possible. But common mistakes can produce misleading results. Learn what they are and how you can avoid them.
Data Visualization, Mistakes
- Cartoon: Deeper Deep Learning - Feb 1, 2016.
The new KDnuggets cartoon looks at a creative new way of achieving even better results and breaking through Machine Learning barriers with an even "deeper" Deep Learning approach.
Cartoon, Deep Learning
- AI Supercomputers: Microsoft Oxford, IBM Watson, Google DeepMind, Baidu Minwa - Feb 1, 2016.
In the world of AI, this is the equivalent of the US and USSR competing to put their guy on the moon first. Here is a profile of some of the giants locked into the AI space race.
AI, Baidu, Chris Pearson, DeepMind, Google, IBM Watson, Microsoft, Minwa
- Python Data Science with Pandas vs Spark DataFrame: Key Differences - Jan 29, 2016.
A post describing the key differences between Pandas and Spark's DataFrame format, including specifics on important regular processing features, with code samples.
Apache Spark, Pandas, Python
- Is Deep Learning Overhyped? - Jan 29, 2016.
With all of the success that deep learning is experiencing, the detractors and cheerleaders can be seen coming out of the woodwork. What is the real validity of deep learning, and is it simply hype?
Deep Learning, Hype, Matthew Mayo, Quora, Yoshua Bengio
- Deep Learning with Spark and TensorFlow - Jan 28, 2016.
The integration of TensorFlow with Spark leverages the distributed framework for hyperparameter tuning and model deployment at scale. Both time savings and improved error rates are demonstrated.
Apache Spark, Deep Learning, Distributed Systems, TensorFlow
- Businesses Will Need One Million Data Scientists by 2018 - Jan 28, 2016.
Deepening shortage of Data Science talent and cybersecurity challenges are trends shaping business in 2016.
Challenges, Cybersecurity, Data Scientist, Deloitte, Internet of Things, Trends
- How to Check Hypotheses with Bootstrap and Apache Spark - Jan 28, 2016.
Learn how to leverage bootstrap sampling to test hypotheses, and how to implement it in Apache Spark and Scala with a complete code example.
Apache Spark, Bootstrap sampling, Dmitry Petrov, Statistical Analysis
- Useful Data Science: Feature Hashing - Jan 28, 2016.
Feature engineering plays a major role in solving data science problems. Here we learn about feature hashing, or the hashing trick, a method for turning arbitrary features into a sparse binary vector.
Feature Engineering, Hashing, Python, Will McGinnis
- Implementing Your Own k-Nearest Neighbor Algorithm Using Python - Jan 27, 2016.
A detailed explanation of one of the most used machine learning algorithms, k-Nearest Neighbors, and its implementation from scratch in Python. Enhance your algorithmic understanding with this hands-on coding exercise.
Pages: 1 2 3
K-nearest neighbors, Python, Python Tutorial
- How to Tackle a Lottery with Mathematics - Jan 27, 2016.
With mathematical rigor and narrative flair, Adam Kucharski reveals the tangled history of betting and science. The house can seem unbeatable. In this book, Kucharski shows us just why it isn't. Even better, he shows us how the search for the perfect bet has been crucial for the scientific pursuit of a better world.
Lottery, Mathematics
- Google Launches Deep Learning with TensorFlow MOOC - Jan 26, 2016.
Google and Udacity have partnered for a new self-paced course on deep learning and TensorFlow, starting immediately.
Google, Matthew Mayo, MOOC, TensorFlow, Udacity
- Top 2015 KDnuggets Stories on Analytics, Big Data, Data Science, Data Mining, Machine Learning, updated - Jan 21, 2016.
R vs Python for Data Science: The Winner is ...; 60+ Free Books on Big Data, Data Science, Data Mining, Machine Learning; Top 20 Python Machine Learning Open Source Projects; 50+ Data Science and Machine Learning Cheat Sheets.
Top stories
- Anthony Goldbloom gives you the Secret to winning Kaggle competitions - Jan 20, 2016.
Kaggle CEO shares insights on best approaches to win Kaggle competitions, along with a brief explanation of how Kaggle competitions work.
Anthony Goldbloom, Competition, Deep Learning, Feature Engineering, Kaggle, Neural Networks, Success
- Yahoo Releases the Largest-ever Machine Learning Dataset for Researchers - Jan 18, 2016.
Are you interested in massive amounts of data for research? Yahoo has just released the largest-ever machine learning dataset to the research community.
Anonymized, Dataset, Machine Learning, Yahoo
- Research Leaders on Data Mining, Data Science and Big Data key advances, top trends - Jan 18, 2016.
Research Leaders in Data Science and Big Data reflect on the most important research advances in 2015 and the key trends expected to dominate throughout 2016.
Pages: 1 2
Bing Liu, Charu Aggarwal, Deep Learning, Ingo Mierswa, Internet of Things, IoT, Michael Berthold, Mohammed Zaki, Neural Networks, Padhraic Smyth, Pedro Domingos, Research, Trends
- Data Science Humor: Google Analytics, if Applied in Real Life - Jan 16, 2016.
From the lighter side: how Google Analytics would look if applied in real life situations.
Google Analytics, Happy Data Scientist, Humor
- Top 100 Big Data Experts to Follow - Jan 15, 2016.
Maptive gives us another list of top Big Data Influencers to check out, including data-driven reasons as to why individuals are included.
Big Data Influencers, Maptive, Matthew Mayo, Top list
- Top 10 Deep Learning Projects on Github - Jan 13, 2016.
The top 10 deep learning projects on Github include a number of libraries, frameworks, and education resources. Have a look at the tools others are using, and the resources they are learning from.
Caffe, Deep Learning, GitHub, Open Source, Top 10, Tutorials
- Free Online Course: Statistical Learning - Jan 12, 2016.
With a free MOOC from Stanford, dive into statistical learning with the respected professors who literally wrote the book on it.
MOOC, Robert Tibshirani, Statistical Learning, Trevor Hastie
- Attention and Memory in Deep Learning and NLP - Jan 12, 2016.
An overview of attention mechanisms and memory in deep neural networks and why they work, including some specific applications in natural language processing and beyond.
Pages: 1 2
Deep Learning, Machine Translation, NLP, Recurrent Neural Networks
- 7 Steps to Understanding Deep Learning - Jan 11, 2016.
There are many deep learning resources freely available online, but it can be confusing knowing where to begin. Go from vague understanding of deep neural networks to knowledgeable practitioner in 7 steps!
Pages: 1 2
7 Steps, Caffe, Convolutional Neural Networks, Deep Learning, Matthew Mayo, Recurrent Neural Networks, TensorFlow, Theano
- Understanding Rare Events and Anomalies: Why streaks patterns change - Jan 8, 2016.
We often look back at the past year and the overall history of rare events, and then try to extrapolate the future odds of the same rare event from that. We illustrate here that rare past events are of no use in understanding the rarity of the same events in the future!
Pages: 1 2
Anomaly Detection, Predictions, S&P 500