Search results for "visualization"
-
Data Science at the Command Line: Exploring Data
See what's available in the freely-available book "Data Science at the Command Line" by digging into data exploration in the terminal.
https://www.kdnuggets.com/2018/02/data-science-command-line-book-exploring-data.html
-
Top 15 Scala Libraries for Data Science in 2018
For your convenience, we have prepared a comprehensive overview of the most important libraries used to perform machine learning and Data Science tasks in Scala.
https://www.kdnuggets.com/2018/02/top-15-scala-libraries-data-science-2018.html
-
5 Machine Learning Projects You Should Not Overlook
It's about that time again... 5 more machine learning or machine learning-related projects you may not yet have heard of, but may want to consider checking out!
https://www.kdnuggets.com/2018/02/5-machine-learning-projects-overlook-feb-2018.html
-
5 Fantastic Practical Machine Learning Resources
This post presents 5 fantastic practical machine learning resources, covering machine learning right from basics, as well as coding algorithms from scratch and using particular deep learning frameworks.
https://www.kdnuggets.com/2018/02/5-fantastic-practical-machine-learning-resources.html
-
My Journey into Deep Learning
In this post I’ll share how I’ve been studying Deep Learning and using it to solve data science problems. It’s an informal post but with interesting content (I hope).
https://www.kdnuggets.com/2018/01/journey-into-deep-learning.html
-
Want to Become a Data Scientist? Try Feynman Technique
Get over the impostor syndrome by developing a strong understanding of the various Data Science topics using the Feynman Technique.
https://www.kdnuggets.com/2018/01/data-scientist-feynman-technique.html
-
Comparing Machine Learning as a Service: Amazon, Microsoft Azure, Google Cloud AI
A complete and unbiased comparison of the three most common Cloud Technologies for Machine Learning as a Service.
https://www.kdnuggets.com/2018/01/mlaas-amazon-microsoft-azure-google-cloud-ai.html
-
Governance in Data Science
Governance roles for data science and analytics teams are becoming more common... One of the key functions of this role is to perform analysis and validation of data sets in order to build confidence in the underlying data sets.
https://www.kdnuggets.com/2018/01/governance-data-science.html
-
How Not To Lie With Statistics
Darrell Huff's classic How to Lie with Statistics is perhaps more relevant than ever. In this short article, I revisit this theme from some different angles.
https://www.kdnuggets.com/2018/01/how-not-lie-statistics.html
-
How Nonprofits Can Benefit from the Power of Data Science
Nonprofits can use analytics to boost their fundraising efforts, measure and monitor the impact of their activities, build predictive models, optimize allocation of funds, and more.
https://www.kdnuggets.com/2018/01/nonprofits-data-science.html
-
70 Amazing Free Data Sources You Should Know
70 free data sources for 2017 on government, crime, health, financial and economic data, marketing and social media, journalism and media, real estate, company directory and review, and more to start working on your data projects.
https://www.kdnuggets.com/2017/12/big-data-free-sources.html
-
Getting Started with TensorFlow: A Machine Learning Tutorial
A complete and rigorous introduction to Tensorflow. Code along with this tutorial to get started with hands-on examples.
https://www.kdnuggets.com/2017/12/getting-started-tensorflow.html
-
Transitioning to Data Science: How to become a data scientist, and how to create a data science team
"A good data scientist in my mind is the person that takes the science part in data science very seriously; a person who is able to find problems and solve them using statistics, machine learning, and distributed computing."
https://www.kdnuggets.com/2017/12/transitioning-data-science-become-data-scientist-data-science-team.html
-
How to Generate FiveThirtyEight Graphs in Python
In this post, we'll help you. Using Python's matplotlib and pandas, we'll see that it's rather easy to replicate the core parts of any FiveThirtyEight (FTE) visualization.
https://www.kdnuggets.com/2017/12/generate-fivethirtyeight-graphs-python.html
-
Data Science, Machine Learning: Main Developments in 2017 and Key Trends in 2018
The leading experts in the field on the main Data Science, Machine Learning, Predictive Analytics developments in 2017 and key trends in 2018.
https://www.kdnuggets.com/2017/12/data-science-machine-learning-main-developments-trends.html
-
Best Masters in Data Science and Analytics – Europe Edition
The third part of our comprehensive, unbiased survey of graduate programs in Data Science and Analytics, examining the programs from Europe.
https://www.kdnuggets.com/2017/12/best-masters-data-science-analytics-europe.html
-
Big Data: Main Developments in 2017 and Key Trends in 2018
As we bid farewell to one year and look to ring in another, KDnuggets has solicited opinions from numerous Big Data experts as to the most important developments of 2017 and their 2018 key trend predictions.
https://www.kdnuggets.com/2017/12/big-data-main-developments-2017-key-trends-2018.html
-
Graph Analytics Using Big Data
An overview and a small tutorial showing how to analyze a dataset using Apache Spark, graphframes, and Java.
https://www.kdnuggets.com/2017/12/graph-analytics-using-big-data.html
-
A General Approach to Preprocessing Text Data
Recently we had a look at a framework for textual data science tasks in their totality. Now we focus on putting together a generalized approach to attacking text data preprocessing, regardless of the specific textual data science task you have in mind.
https://www.kdnuggets.com/2017/12/general-approach-preprocessing-text-data.html
-
Evolutionary Algorithms for Feature Selection
Feature selection is a very important technique in machine learning. In this post we discuss one of the most common optimization algorithms for multi-modal fitness landscapes - evolutionary algorithms.
https://www.kdnuggets.com/2017/11/rapidminer-evolutionary-algorithms-feature-selection.html
-
Understanding Deep Convolutional Neural Networks with a practical use-case in Tensorflow and Keras
We show how to build a deep neural network that classifies images into many categories with an accuracy of 90%. This was a very hard problem before the rise of deep networks and especially Convolutional Neural Networks.
https://www.kdnuggets.com/2017/11/understanding-deep-convolutional-neural-networks-tensorflow-keras.html
-
A Framework for Approaching Textual Data Science Tasks
Although NLP and text mining are not the same thing, they are closely related, deal with the same raw data type, and have some crossover in their uses. Let's discuss the steps in approaching these types of tasks.
https://www.kdnuggets.com/2017/11/framework-approaching-textual-data-tasks.html
-
Best Masters in Data Science and Analytics in US/Canada
Second comprehensive list of master's degrees in the US and Canada with tuition information and duration.
https://www.kdnuggets.com/2017/11/best-masters-data-science-analytics-us-canada.html
-
8 Ways to Improve Your Data Science Skills in 2 Years
Two years. Two years is the maximum amount of time you should spend focused on your learning, education and training. That’s exactly why this guide is focused on honing the most beneficial skills in two years.
https://www.kdnuggets.com/2017/11/8-ways-improve-data-science-skills-2-years.html
-
The Python Graph Gallery
Welcome to the Python Graph Gallery, a website that displays hundreds of python charts with their reproducible code snippets.
https://www.kdnuggets.com/2017/11/python-graph-gallery.html
-
PySpark SQL Cheat Sheet: Big Data in Python
PySpark is a Spark Python API that exposes the Spark programming model to Python - With it, you can speed up analytic applications. With Spark, you can get started with big data processing, as it has built-in modules for streaming, SQL, machine learning and graph processing.
https://www.kdnuggets.com/2017/11/pyspark-sql-cheat-sheet-big-data-python.html
-
Best Online Masters in Data Science and Analytics – a comprehensive, unbiased survey
The first comprehensive and objective survey of online Masters in Analytics / Data Science, including rankings, tuition, and duration of the education program.
https://www.kdnuggets.com/2017/11/best-online-masters-analytics-data-science.html
-
A Day in the Life of a Data Scientist
Are you interested in what a data scientist does on a typical day of work? Each data science role may be different, but these five individuals provide insight to help those interested in figuring out what a day in the life of a data scientist actually looks like.
https://www.kdnuggets.com/2017/11/day-life-data-scientist.html
-
Tips for Getting Started with Text Mining in R and Python
This article opens up the world of text mining in a simple and intuitive way and provides great tips to get started with text mining.
https://www.kdnuggets.com/2017/11/getting-started-text-mining-r-python.html
-
Interpreting Machine Learning Models: An Overview
This post summarizes the contents of a recent O'Reilly article outlining a number of methods for interpreting machine learning models, beyond the usual go-to measures.
https://www.kdnuggets.com/2017/11/interpreting-machine-learning-models-overview.html
-
Getting Started with Machine Learning in One Hour!
Here is a machine learning getting started guide which grew out of the author's notes for a one hour talk on the subject. Hopefully you find the path helpful.
https://www.kdnuggets.com/2017/11/getting-started-machine-learning-one-hour.html
-
6 Books Every Data Scientist Should Keep Nearby
The best way to stay in touch is to continue brushing up on your knowledge while also maintaining experience. It’s the perfect storm or combination of skills to help you succeed in the industry.
https://www.kdnuggets.com/2017/10/6-books-every-data-scientist-should-keep-nearby.html
-
Top 6 errors novice machine learning engineers make
What common mistakes do beginners make when working on machine learning or data science projects? Here we present a list of the most common errors.
https://www.kdnuggets.com/2017/10/top-errors-novice-machine-learning-engineers.html
-
Top 10 Machine Learning with R Videos
A complete video guide to Machine Learning in R! This great compilation of tutorials and lectures is an amazing recipe to start developing your own Machine Learning projects.
https://www.kdnuggets.com/2017/10/top-10-machine-learning-r-videos.html
-
Ranking Popular Deep Learning Libraries for Data Science
We rank 23 open-source deep learning libraries that are useful for Data Science. The ranking is based on equally weighing its three components: Github and Stack Overflow activity, as well as Google search results.
https://www.kdnuggets.com/2017/10/ranking-popular-deep-learning-libraries-data-science.html
-
7 Types of Artificial Neural Networks for Natural Language Processing
What is an artificial neural network? How does it work? What types of artificial neural networks exist? How are different types of artificial neural networks used in natural language processing? We will discuss all these questions in the following article.
https://www.kdnuggets.com/2017/10/7-types-artificial-neural-networks-natural-language-processing.html
-
7 Techniques to Visualize Geospatial Data
In this article, we explore 7 interesting yet simple techniques to visualize geospatial data that will help you visualize your data better.
https://www.kdnuggets.com/2017/10/7-techniques-visualize-geospatial-data.html
-
Want to Become a Data Scientist? Read This Interview First
There’s been a lot of hype about Data Science... and probably just as much confusion about it.
https://www.kdnuggets.com/2017/10/become-data-scientist-read-interview-first.html
-
Data Science Bootcamp in Zurich, Switzerland, January 15 – April 6, 2018
Come to the land of chocolate and Data Science where the local tech scene is booming and the jobs are aplenty. Learn the most important concepts from top instructors by doing and through projects. Use code KDNUGGETS to save.
https://www.kdnuggets.com/2017/10/propulsion-data-science-bootcamp-zurich.html
-
An opinionated Data Science Toolbox in R from Hadley Wickham, tidyverse
Get your productivity boosted with Hadley Wickham's powerful R package, tidyverse. It has all you need to start developing your own data science workflows.
https://www.kdnuggets.com/2017/10/tidyverse-powerful-r-toolbox.html
-
Top 15 Master of Data Science Programs You May Want To Consider
Top MS of Data Science Programs in the US - on-campus and online that teach you how to humanize data and what you can do to make a difference in your company.
https://www.kdnuggets.com/2017/10/top-15-master-data-science-programs.html
-
Data Science –The need for a Systems Engineering approach
We need a greater emphasis on the Systems Engineering aspects of Data Science. I am exploring these ideas as part of my course "Data Science for Internet of Things" at the University of Oxford.
https://www.kdnuggets.com/2017/10/data-science-systems-engineering-approach.html
-
Find Out What Celebrities Tweet About the Most
Word clouds are a popular data visualisation method. Here we show how to use R to create Twitter word clouds of celebrities and politicians.
https://www.kdnuggets.com/2017/10/what-celebrities-tweet-about-most.html
-
Using Machine Learning to Predict and Explain Employee Attrition
Employee attrition (churn) is a major cost to an organization. We recently used two new techniques to predict and explain employee turnover: automated ML with H2O and variable importance analysis with LIME.
https://www.kdnuggets.com/2017/10/machine-learning-predict-employee-attrition.html
-
Top 10 Videos on Machine Learning in Finance
Talks, tutorials and playlists – you could not get a more gentle introduction to Machine Learning (ML) in Finance. Got a quick 4 minutes or ready to study for hours on end? These videos cover all skill levels and time constraints!
https://www.kdnuggets.com/2017/09/top-10-videos-machine-learning-finance.html
-
Meet Lucy: Creating a Chatbot Prototype
This article walks you through a step by step process and comes with starter code for building your own chatbot. In the end we also provide some pointers for folks looking to take this proof of concept to production stage.
https://www.kdnuggets.com/2017/09/meet-lucy-chatbot-prototype.html
-
Top 10 Active Big Data, Data Science, Machine Learning Influencers on LinkedIn, Updated
Looking for advice? Guidance? Stories? We’ve put together a list of the top ten LinkedIn influencers of the last three months. Follow them and stay up-to-date with the latest news in Big Data, Data Science, Analytics, Machine Learning and AI.
https://www.kdnuggets.com/2017/09/top-10-big-data-science-machine-learning-influencers-linkedin-updated.html
-
Visualizing High Dimensional Data In Augmented Reality
When Data Scientists first get a data set, they often use a matrix of 2D scatter plots to quickly see the contents and relationships between pairs of attributes. But for data with lots of attributes, such analysis does not scale.
https://www.kdnuggets.com/2017/09/ibm-visualizing-high-dimensional-data-augmented-reality.html
-
30 Essential Data Science, Machine Learning & Deep Learning Cheat Sheets
This collection of data science cheat sheets is not a cheat sheet dump, but a curated list of reference materials spanning a number of disciplines and tools.
https://www.kdnuggets.com/2017/09/essential-data-science-machine-learning-deep-learning-cheat-sheets.html
-
How To Lie With Numbers
It takes less effort to lie without numbers, but there are now more numbers and more ways to lie with them than ever before. Poor Reverend Bayes, who understood the true meaning of "evidence".
https://www.kdnuggets.com/2017/09/how-lie-with-numbers.html
-
Keras Tutorial: Recognizing Tic-Tac-Toe Winners with Neural Networks
In this tutorial, we will build a neural network with Keras to determine whether or not tic-tac-toe games have been won by player X for given endgame board configurations. Introductory neural network concerns are covered.
https://www.kdnuggets.com/2017/09/neural-networks-tic-tac-toe-keras.html
-
Data Science and the Imposter Syndrome
You are not the only one who wonders how much longer they can get away with pretending to be a data scientist. You are not the only one who has nightmares about being laughed out of your next interview.
https://www.kdnuggets.com/2017/09/data-science-imposter-syndrome.html
-
Visualizing Cross-validation Code
Cross-validation helps to improve your prediction using the K-Fold strategy. What is K-Fold, you ask? Check out this post for a visualized explanation.
https://www.kdnuggets.com/2017/09/visualizing-cross-validation-code.html
-
Machine Learning vs. Statistics: The Texas Death Match of Data Science
Throughout its history, Machine Learning (ML) has coexisted with Statistics uneasily, like an ex-boyfriend accidentally seated with the groom’s family at a wedding reception: both uncertain where to lead the conversation, but painfully aware of the potential for awkwardness.
https://www.kdnuggets.com/2017/08/machine-learning-vs-statistics.html
-
37 Reasons why your Neural Network is not working
Over the course of many debugging sessions, I’ve compiled my experience along with the best ideas around in this handy list. I hope they would be useful to you.
https://www.kdnuggets.com/2017/08/37-reasons-neural-network-not-working.html
-
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets
In this blog, I explore three sets of APIs—RDDs, DataFrames, and Datasets—available in a pre-release preview of Apache Spark 2.0; why and when you should use each set; outline their performance and optimization benefits; and enumerate scenarios when to use DataFrames and Datasets instead of RDDs.
https://www.kdnuggets.com/2017/08/three-apache-spark-apis-rdds-dataframes-datasets.html
-
The Rise of GPU Databases
The recent but noticeable shift from CPUs to GPUs is mainly due to the unique benefits they bring to sectors like AdTech, finance, telco, retail, or security/IT. We examine where GPU databases shine.
https://www.kdnuggets.com/2017/08/rise-gpu-databases.html
-
Comparing Distance Measurements with Python and SciPy
This post introduces five perfectly valid ways of measuring distances between data points. We will also perform a simple demonstration and comparison with Python and the SciPy library.
https://www.kdnuggets.com/2017/08/comparing-distance-measurements-python-scipy.html
-
Google Analytics Audit Checklist and Tools
In this post, a Google Analytics & Google AdWords expert shares his tips and tools of intelligent Google Analytics auditing. Read on for some practical insight.
https://www.kdnuggets.com/2017/08/google-analytics-audit-checklist-tools.html
-
Insights from Data mining of Airbnb Listings
AirBnB has 2 million listings and operates in 65,000 cities. Here we look at insights related to vacation rental space in the sharing economy using the property listings data for Texas, US.
https://www.kdnuggets.com/2017/08/insights-data-mining-airbnb.html
-
Visualizing Convolutional Neural Networks with Open-source Picasso
Toolkits for standard neural network visualizations exist, along with tools for monitoring the training process, but are often tied to the deep learning framework. Could a general, easy-to-setup tool for generating standard visualizations provide a sanity check on the learning process?
https://www.kdnuggets.com/2017/08/visualizing-convolutional-neural-networks-open-source-picasso.html
-
Summary of Unintuitive Properties of Neural Networks
Neural networks work really well on many problems, including language, image and speech recognition. However, understanding how they work is not simple, and here is a summary of their unusual and counterintuitive properties.
https://www.kdnuggets.com/2017/07/unintuitive-properties-neural-networks.html
-
Exploratory Data Analysis in Python
We view EDA very much like a tree: there is a basic series of steps you perform every time you perform EDA (the main trunk of the tree) but at each step, observations will lead you down other avenues (branches) of exploration by raising questions you want to answer or hypotheses you want to test.
https://www.kdnuggets.com/2017/07/exploratory-data-analysis-python.html
-
What Advice Would You Give Your Younger Data Scientist Self?
I was asked this question recently via LinkedIn message: "What advice would you give your younger data scientist self?" The best piece of advice I honestly think I can give is this: Forget about "data science."
https://www.kdnuggets.com/2017/07/advice-younger-data-scientist-self.html
-
Applying Deep Learning to Real-world Problems
In this blog post I shared three learnings that are important to us at Merantix when applying deep learning to real-world problems. I hope that these ideas are helpful for other people who plan to use deep learning in their business.
https://www.kdnuggets.com/2017/06/applying-deep-learning-real-world-problems.html
-
The world’s first protein database for Machine Learning and AI
dSPP is the world's first interactive database of proteins for AI and Machine Learning, and is fully integrated with Keras and Tensorflow. You can access the database at peptone.io/dspp
https://www.kdnuggets.com/2017/06/dspp-protein-database-machine-learning-ai.html
-
K-means Clustering with Tableau – Call Detail Records Example
We show how to use the Tableau 10 clustering feature to create statistically-based segments that provide insights about similarities in different groups and performance of the groups when compared to each other.
https://www.kdnuggets.com/2017/06/kmeans-clustering-tableau-call-detail-records.html
-
Top 15 Python Libraries for Data Science in 2017
Since all of the libraries are open source, we have added commits, contributor counts and other metrics from GitHub, which could serve as proxy metrics for library popularity.
https://www.kdnuggets.com/2017/06/top-15-python-libraries-data-science.html
-
7 Steps to Mastering Data Preparation with Python
Follow these 7 steps for mastering data preparation, covering the concepts, the individual tasks, as well as different approaches to tackling the entire process from within the Python ecosystem.
https://www.kdnuggets.com/2017/06/7-steps-mastering-data-preparation-python.html
-
Machine Learning Workflows in Python from Scratch Part 1: Data Preparation
This post is the first in a series of tutorials for implementing machine learning workflows in Python from scratch, covering the coding of algorithms and related tools from the ground up. The end result will be a handcrafted ML toolkit. This post starts things off with data preparation.
https://www.kdnuggets.com/2017/05/machine-learning-workflows-python-scratch-part-1.html
-
DataScience.com Releases Python Package for Interpreting the Decision-Making Processes of Predictive Models
DataScience.com's new Python library, Skater, uses a combination of model interpretation algorithms to identify how models leverage data to make predictions.
https://www.kdnuggets.com/2017/05/datascience-skater-python-package-interpreting-predictive-models.html
-
Text Mining 101: Mining Information From A Resume
We show a framework for mining relevant entities from a text resume, and how to separate parsing logic from entity specification.
https://www.kdnuggets.com/2017/05/text-mining-information-resume.html
-
Machine Learning Crash Course: Part 1
This post, the first in a series of ML tutorials, aims to make machine learning accessible to anyone willing to learn. We’ve designed it to give you a solid understanding of how ML algorithms work as well as provide you the knowledge to harness it in your projects.
https://www.kdnuggets.com/2017/05/machine-learning-crash-course-part-1.html
-
Natural Language Generation overview – is NLG is worth a thousand pictures ?
NLG tools automate the analysis and enhance traditional BI platforms by explaining in plain English the significance of visualizations and findings – here is an overview of the market.
https://www.kdnuggets.com/2017/05/nlg-natural-language-generation-overview.html
-
The Quant Crunch: The demand for data science skills
This report, created by analyzing millions of job postings using advanced technology, divides Data Science and Analytics roles into 6 broad categories, and answers many questions, including cities, industries, job roles with most growth.
https://www.kdnuggets.com/2017/05/quant-crunch-demand-data-science-skills.html
-
5 Machine Learning Projects You Can No Longer Overlook, May
In this month's installment of Machine Learning Projects You Can No Longer Overlook, we find some data preparation and exploration tools, a (the?) reinforcement learning "framework," a new automated machine learning library, and yet another distributed deep learning library.
https://www.kdnuggets.com/2017/05/five-machine-learning-projects-cant-overlook-may.html
-
Must-Know: How to determine the most useful number of clusters?
Without knowing the ground truth of a dataset, then, how do we know what the optimal number of data clusters is? We will have a look at 2 particularly popular methods for attempting to answer this question: the elbow method and the silhouette method.
https://www.kdnuggets.com/2017/05/must-know-most-useful-number-clusters.html
-
How to Build a Recurrent Neural Network in TensorFlow
This is a no-nonsense overview of implementing a recurrent neural network (RNN) in TensorFlow. Both theory and practice are covered concisely, and the end result is running TensorFlow RNN code.
https://www.kdnuggets.com/2017/04/build-recurrent-neural-network-tensorflow.html
-
The Value of Exploratory Data Analysis
In this post, we will give a high level overview of what exploratory data analysis (EDA) typically entails and then describe three of the major ways EDA is critical to successfully building a model and interpreting its results.
https://www.kdnuggets.com/2017/04/value-exploratory-data-analysis.html
-
Time Series Analysis with Generalized Additive Models
In this tutorial, we will see an example of how a Generalized Additive Model (GAM) is used, learn how functions in a GAM are identified through backfitting, and learn how to validate a time series model.
https://www.kdnuggets.com/2017/04/time-series-analysis-generalized-additive-models.html
-
Forrester vs Gartner on Data Science Platforms and Machine Learning Solutions
Who leads in Data Science, Machine Learning, and Predictive Analytics? We compare the latest Forrester and Gartner reports for this industry for 2017 Q1, identify gainers and losers, and strong leaders vs contenders.
https://www.kdnuggets.com/2017/04/forrester-gartner-data-science-platforms-machine-learning.html
-
Top mistakes data scientists make when dealing with business people
There are no cover articles praising the fails of the many data scientists that don’t live up to the hype. Here we examine 3 typical mistakes and how to avoid them.
https://www.kdnuggets.com/2017/04/top-mistakes-data-scientists-make-business.html
-
5 Machine Learning Projects You Can No Longer Overlook, April
It's about that time again... 5 more machine learning or machine learning-related projects you may not yet have heard of, but may want to consider checking out. Find tools for data exploration, topic modeling, high-level APIs, and feature selection herein.
https://www.kdnuggets.com/2017/04/five-machine-learning-projects-cant-overlook-april.html
-
The 42 V’s of Big Data and Data Science
It's 2017 now, and we now operate in an ever more sophisticated world of analytics. To keep up with the times, we present our updated 2017 list: The 42 V's of Big Data and Data Science.
https://www.kdnuggets.com/2017/04/42-vs-big-data-data-science.html
-
Top /r/MachineLearning Posts, March: A Super Harsh Guide to Machine Learning; Is it Gaggle or Koogle?!?
A Super Harsh Guide to Machine Learning; Google is acquiring data science community Kaggle; Suggestion by Salesforce chief data scientist; Andrew Ng resigning from Baidu; Distill: An Interactive, Visual Journal for Machine Learning Research
https://www.kdnuggets.com/2017/04/top-reddit-machine-learning-march.html
-
Key Takeaways from Strata + Hadoop World 2017 San Jose, Day 1
The focus is increasingly shifting from storing and processing Big Data in an efficient way, to applying traditional and new machine learning techniques to drive higher value from the data at hand.
https://www.kdnuggets.com/2017/03/strata-hadoop-san-jose-key-takeaways.html
-
How to think like a data scientist to become one
The author went from securities analyst to Head of Data Science at Amazon. He describes what he learned in his journey and gives 4 useful rules based on his experience.
https://www.kdnuggets.com/2017/03/think-like-data-scientist-become-one.html
-
What Is Data Science, and What Does a Data Scientist Do?
This article is intended to help define the data scientist role, including typical skills, qualifications, education, experience, and responsibilities. This definition is somewhat loose, given that the ideal experience and skill set is relatively rare to find in one individual.
https://www.kdnuggets.com/2017/03/data-science-data-scientist-do.html
-
17 More Must-Know Data Science Interview Questions and Answers, Part 3
The third and final part of 17 new must-know Data Science interview questions and answers covers A/B testing, data visualization, Twitter influence evaluation, and Big Data quality.
https://www.kdnuggets.com/2017/03/17-data-science-interview-questions-answers-part-3.html
-
Visualizing Time-Series Change
When creating time-series line charts, it’s important to consider which of the following messages you would like to communicate: Actual value of units? Change in absolute units? Percent change? Change from a specific point in time?
https://www.kdnuggets.com/2017/03/visualizing-time-series-change.html
-
K-Means & Other Clustering Algorithms: A Quick Intro with Python
In this intro cluster analysis tutorial, we'll check out a few algorithms in Python so you can get a basic understanding of the fundamentals of clustering on a real dataset.
https://www.kdnuggets.com/2017/03/k-means-clustering-algorithms-intro-python.html
-
Gartner Data Science Platforms – A Deeper Look
Thomas Dinsmore's critical examination of the Gartner 2017 MQ of Data Science Platforms, including vendors who are out, in, or have big changes, Hadoop and Spark integration, open source software, and what Data Scientists actually use.
https://www.kdnuggets.com/2017/03/thomaswdinsmore-gartner-data-science-platforms.html
-
Every Intro to Data Science Course on the Internet, Ranked
For this guide, I spent 10+ hours trying to identify every online intro to data science course offered as of January 2017, extracting key bits of information from their syllabi and reviews, and compiling their ratings.
https://www.kdnuggets.com/2017/03/every-intro-data-science-course-ranked.html
-
An Overview of Python Deep Learning Frameworks
Read this concise overview of leading Python deep learning frameworks, including Theano, Lasagne, Blocks, TensorFlow, Keras, MXNet, and PyTorch.
https://www.kdnuggets.com/2017/02/python-deep-learning-frameworks-overview.html
-
Moving from R to Python: The Libraries You Need to Know
Are you considering making a move from R to Python? Here are the libraries you need to know, how they stack up to their R contemporaries, and why you should learn them.
https://www.kdnuggets.com/2017/02/moving-r-python-libraries.html
-
17 More Must-Know Data Science Interview Questions and Answers, Part 2
The second part of 17 new must-know Data Science Interview questions and answers covers overfitting, ensemble methods, feature selection, ground truth in unsupervised learning, the curse of dimensionality, and parallel algorithms.
https://www.kdnuggets.com/2017/02/17-data-science-interview-questions-answers-part-2.html
-
The Origins of Big Data
Big Data truly came of age in 2013, when the OED introduced the term “Big Data” for the first time. But when was the term Big Data first used, and why? Here are the results of our investigation.
https://www.kdnuggets.com/2017/02/origins-big-data.html
-
So What is Big Data?
We examine what experts say about Big Data – is it like teenage sex? Is it more than just a large and complex collection of data? And how many Vs are there?
https://www.kdnuggets.com/2017/02/what-is-big-data.html
-
5 Career Paths in Big Data and Data Science, Explained
Sexiest job... massive shortage... blah blah blah. Are you looking to get a real handle on the career paths available in "Data Science" and "Big Data?" Read this article for insight on where to look to sharpen the required entry-level skills.
https://www.kdnuggets.com/2017/02/5-career-paths-data-science-big-data-explained.html
-
Top R Packages for Machine Learning
What are the most popular ML packages? Let's look at a ranking based on package downloads and social website activity.
https://www.kdnuggets.com/2017/02/top-r-packages-machine-learning.html
-
Pandas Cheat Sheet: Data Science and Data Wrangling in Python
The Pandas library can seem very elaborate and it might be hard to find a single point of entry to the material: with other learning materials focusing on different aspects of this library, you can definitely use a reference sheet to help you get the hang of it.
https://www.kdnuggets.com/2017/01/pandas-cheat-sheet.html
-
Bad Data + Good Models = Bad Results
No matter how advanced your Machine Learning algorithm is, the results will be bad if the input data is bad. We examine one popular IMDB dataset and discuss how an analyst can deal with such data.
https://www.kdnuggets.com/2017/01/bad-data-good-models-bad-results.html
-
90 Active Blogs on Analytics, Big Data, Data Mining, Data Science, Machine Learning (updated)
Stay up-to-date in data science with active blogs. This is a list of 90 recently active blogs on Big Data, Data Science, Data Mining, Machine Learning, and Artificial Intelligence.
https://www.kdnuggets.com/2017/01/blogs-analytics-big-data-mining-data-science-machine-learning.html
-
Tidying Data in Python
This post summarizes some tidying examples Hadley Wickham used in his 2014 paper on Tidy Data in R, but will demonstrate how to do so using the Python pandas library.
https://www.kdnuggets.com/2017/01/tidying-data-python.html
-
Laying the Foundation for a Data Team
Admittedly, there is a lot more to building a successful data team, and we would be lying if we pretended we have it all figured out. But hopefully focusing on the elements in this post is a good start.
https://www.kdnuggets.com/2016/12/laying-foundation-data-team.html
-
Academic, Research Positions in Big Data, Data Mining, Data Science, Machine Learning
To add here a short entry for an academic or research position related to AI, Big Data, Data Science, or Machine Learning, email 5
https://www.kdnuggets.com/academic/index.html
-
50+ Data Science, Machine Learning Cheat Sheets, updated
Gear up to speed and have concepts and commands handy in Data Science, Data Mining, and Machine learning algorithms with these cheat sheets covering R, Python, Django, MySQL, SQL, Hadoop, Apache Spark, Matlab, and Java.
https://www.kdnuggets.com/2016/12/data-science-machine-learning-cheat-sheets-updated.html
-
Data Science, Predictive Analytics Main Developments in 2016 and Key Trends for 2017
Key themes included the polling failures in 2016 US Elections, Deep Learning, IoT, greater focus on value and ROI, and increasing adoption of predictive analytics by the "masses" of industry.
https://www.kdnuggets.com/2016/12/data-science-predictive-analytics-main-developments-trends.html
-
Random Forests® in Python
Random forest is a highly versatile machine learning method with numerous applications ranging from marketing to healthcare and insurance. This is a post about random forests using Python.
https://www.kdnuggets.com/2016/12/random-forests-python.html
-
Top 10 Amazon Books in Artificial Intelligence & Machine Learning, 2016 Edition
Given the ongoing explosion in interest for all things Data Science, Artificial Intelligence, Machine Learning, etc., we have updated our Amazon top books lists from last year. Here are the 10 most popular titles in the AI & Machine Learning category.
https://www.kdnuggets.com/2016/11/top-10-amazon-books-ai-machine-learning.html
-
Deep Learning Research Review: Reinforcement Learning
This edition of Deep Learning Research Review explains recent research papers in Reinforcement Learning (RL). If you don't have the time to read the top papers yourself, or need an overview of RL in general, this post has you covered.
https://www.kdnuggets.com/2016/11/deep-learning-research-review-reinforcement-learning.html
-
Linear Regression, Least Squares & Matrix Multiplication: A Concise Technical Overview
Linear regression is a simple algebraic tool which attempts to find the “best” line fitting 2 or more attributes. Read here to discover the relationship between linear regression, the least squares method, and matrix multiplication.
https://www.kdnuggets.com/2016/11/linear-regression-least-squares-matrix-multiplication-concise-technical-overview.html
-
Predictive Science vs Data Science
Is Predictive Science accurately represented by the term Data Science? As a matter of fact, are any of Data Science's constituent sciences well-represented by the umbrella term? This post discusses a few of these points at a high level.
https://www.kdnuggets.com/2016/11/predictive-science-vs-data-science.html
-
Top 20 Python Machine Learning Open Source Projects, updated
Open Source is the heart of innovation and the rapid evolution of technologies these days. This article presents the Top 20 Python Machine Learning Open Source Projects of 2016, along with very interesting insights and trends found during the analysis.
https://www.kdnuggets.com/2016/11/top-20-python-machine-learning-open-source-updated.html
-
Data Science and Big Data, Explained
This article is meant to give the non-data scientist a solid overview of the many concepts and terms behind data science and big data. While related terms will be mentioned at a very high level, the reader is encouraged to explore the references and other resources for additional detail.
https://www.kdnuggets.com/2016/11/big-data-data-science-explained.html
-
An Intuitive Explanation of Convolutional Neural Networks
This article provides an easy-to-understand introduction to what convolutional neural networks are and how they work.
https://www.kdnuggets.com/2016/11/intuitive-explanation-convolutional-neural-networks.html
-
A Quick Introduction to Neural Networks
This article provides a beginner-level introduction to the multilayer perceptron and backpropagation.
https://www.kdnuggets.com/2016/11/quick-introduction-neural-networks.html
-
How to Rank 10% in Your First Kaggle Competition
This post presents a pathway to achieving success in Kaggle competitions as a beginner. The path generalizes beyond competitions, however. Read on for insight into succeeding while approaching any data science project.
https://www.kdnuggets.com/2016/11/rank-ten-precent-first-kaggle-competition.html
-
Data Science 101: How to get good at R
Everybody talks about R programming: how to learn it and how to be good at it. In this article, Ari Lamstein tells the story of why and how he started with R, along with how to publish, market and monetise R projects.
https://www.kdnuggets.com/2016/11/data-science-101-good-at-r.html
-
Learn Data Science in 8 (Easy) Steps
Want to learn data science? Check out these 8 (easy) steps to set out in the right direction!
https://www.kdnuggets.com/2016/10/learn-data-science-8-steps.html
-
A Beginner’s Guide to Neural Networks with Python and SciKit Learn 0.18!
This post outlines setting up a neural network in Python using Scikit-learn, the latest version of which now has built-in support for Neural Network models.
https://www.kdnuggets.com/2016/10/beginners-guide-neural-networks-python-scikit-learn.html
-
Top 10 Data Science Videos on Youtube
Learning and the future are the key topics in the recent Youtube videos on Data Science. The main questions revolve around: “how to become a Data Scientist”, “what is a data scientist”, and “where data science is going”. But why is there so little explanation of data science to the masses?
https://www.kdnuggets.com/2016/10/top-10-data-science-videos-youtube.html
-
Top 12 Interesting Careers to Explore in Big Data
From data-driven strategies to decision making, the true worth of Big Data has been realized, and this has opened up amazing career choices. Check out these 12 interesting careers to explore in Big Data.
https://www.kdnuggets.com/2016/10/top-12-interesting-careers-explore-big-data.html
-
Adversarial Validation, Explained
This post proposes and outlines adversarial validation, a method for selecting training examples most similar to test examples and using them as a validation set, and provides a practical scenario for its usefulness.
https://www.kdnuggets.com/2016/10/adversarial-validation-explained.html
-
Battle of the Data Science Venn Diagrams
First came Drew Conway's data science Venn diagram. Then came all the rest. Read this comparative overview of data science Venn diagrams for both the insight into the profession and the humor that comes along for free.
https://www.kdnuggets.com/2016/10/battle-data-science-venn-diagrams.html
-
Embedded Analytics: The Future of Business Intelligence
An overview of the evolution of Business Intelligence, and some insight into where its future lies: embedded analytics.
https://www.kdnuggets.com/2016/09/embedded-analytics-future-business-intelligence.html
-
Data Science Basics: Data Mining vs. Statistics
As a beginner I was confused about the relationship between data mining and statistics. This is my attempt to help straighten out this connection for others who may now be in my old shoes.
https://www.kdnuggets.com/2016/09/data-science-basics-data-mining-statistics.html
-
9 Key Deep Learning Papers, Explained
If you are interested in understanding the current state of deep learning, this post outlines and thoroughly summarizes 9 of the most influential contemporary papers in the field.
https://www.kdnuggets.com/2016/09/9-key-deep-learning-papers-explained.html
-
The Great Algorithm Tutorial Roundup
This is a collection of tutorials relating to the results of the recent KDnuggets algorithms poll. If you are interested in learning or brushing up on the most used algorithms, as per our readers, look here for suggestions on doing so!
https://www.kdnuggets.com/2016/09/great-algorithm-tutorial-roundup.html
-
SlangSD: A Sentiment Dictionary for Slang Words
The Slang Sentiment Dictionary (SlangSD) includes over 90,000 slang words together with their sentiment scores, facilitating sentiment analysis in user-generated content.
https://www.kdnuggets.com/2016/09/slangsd-sentiment-dictionary-slang-words.html