- Removing Outliers Using Standard Deviation in Python - Feb 16, 2017.
Standard Deviation is one of the most underrated statistical tools out there. It’s an extremely useful metric that most people know how to calculate but very few know how to use effectively.
- Apache Arrow and Apache Parquet: Why We Needed Different Projects for Columnar Data, On Disk and In-Memory - Feb 16, 2017.
Apache Parquet and Apache Arrow both focus on improving performance and efficiency of data analytics. These two projects optimize performance for on-disk and in-memory processing.
- KDnuggets™ News 17:n06, Feb 15: So What is Big Data? 52 Useful Machine Learning APIs; Data Science finds Perfect Valentines Dates - Feb 15, 2017.
Also Making Python Speak SQL with pandasql; 52 Useful Machine Learning & Prediction APIs, updated; New Poll: Do you support Trump Immigration Ban?
- Web Scraping for Dataset Curation, Part 2: Tidying Craft Beer Data - Feb 14, 2017.
This is the second part in a 2 part series on curating data from the web. The first part focused on web scraping, while this post details the process of tidying scraped data after the fact.
- Web Scraping for Dataset Curation, Part 1: Collecting Craft Beer Data - Feb 13, 2017.
This post is the first in a 2 part series on scraping and cleaning data from the web using Python. This first part is concerned with the scraping aspect, while the second part will focus on the cleaning. A concrete example is presented.
- Making Python Speak SQL with pandasql - Feb 8, 2017.
Want to wrangle Pandas data like you would SQL using Python? This post serves as an introduction to pandasql, and details how to get it up and running inside of Rodeo.
- Top KDnuggets tweets, Jan 25-31: Python implementations of Andrew Ng #MachineLearning MOOC exercises - Feb 1, 2017.
#Python implementations of Andrew Ng #MachineLearning MOOC exercises; This repository contains the entire #Python #DataScience Handbook; What are the best #visualizations of #MachineLearning algorithms? Learn #TensorFlow and #DeepLearning, without a PhD.
- Domino Data Science Popup, San Francisco, Feb 22 – KDnuggets Offer - Jan 31, 2017.
Learn about the latest trends in data science applications in technology from the top experts in the industry. Register by Feb 8 and save with code KDNuggetsVIP.
- Pandas Cheat Sheet: Data Science and Data Wrangling in Python - Jan 27, 2017.
The Pandas library can seem very elaborate and it might be hard to find a single point of entry to the material: with other learning materials focusing on different aspects of this library, you can definitely use a reference sheet to help you get the hang of it.
- Great Collection of Minimal and Clean Implementations of Machine Learning Algorithms - Jan 25, 2017.
Interested in learning machine learning algorithms by implementing them from scratch? Need a good set of examples to work from? Check out this post with links to minimal and clean implementations of various algorithms.
- Learn how to Develop and Deploy a Gradient Boosting Machine Model - Jan 20, 2017.
GBM is one of the hottest machine learning methods. Learn how to create GBM using SciKit-Learn and Python and understand the steps required to transform features, train, and deploy a GBM.
- Clean Data Science: Evaluating The Cleanliness of NYC Craft Beer Bar Kitchens - Jan 13, 2017.
An analysis of NYC Open Data health inspections showing that craft beer bar kitchens in Manhattan are cleaner than the average establishment by a statistically significant margin. An encouraging finding for Dry January.
- The Most Popular Language For Machine Learning and Data Science Is … - Jan 11, 2017.
When it comes to choosing a programming language for Data Analytics projects or job prospects, people have different opinions depending on their career backgrounds and the domains they have worked in. Here is an analysis of data from indeed.com on the choice of programming language for machine learning and data science.
- Creating Data Visualization in Matplotlib - Jan 5, 2017.
Matplotlib is the most widely used data visualization library for Python; it's very powerful, but with a steep learning curve. This overview covers a selection of plots useful for a wide range of data analysis problems and discusses how to best deploy each one so you can tell your data story.
- Tidying Data in Python - Jan 4, 2017.
This post summarizes some tidying examples Hadley Wickham used in his 2014 paper on Tidy Data in R, but will demonstrate how to do so using the Python pandas library.
- Supercharge Your Data Science Team with AnacondaCON Team Discount, till Jan 16 - Jan 3, 2017.
AnacondaCON '17 will help you conquer your biggest data science challenges. Learn from industry experts sharing what #OpenDataScienceMeans and their best practices. Get 2 for 1 ticket price thru Jan 16, 2017.
- 5 Machine Learning Projects You Can No Longer Overlook, January - Jan 2, 2017.
There are a lot of popular machine learning projects out there, but many more that are not. Which of these are actively developed and worth checking out? Here is an offering of 5 such projects, the most recent in an ongoing series.
- Over 600 data science, machine learning, Big Data eBooks/videos for only $5 (until Jan 9) - Dec 22, 2016.
Packt have more than 600 data science, analysis, machine learning and Big Data eBooks and video courses. And until Jan 9, 2017 every single one is available for just $5.
- 3 ways to learn Data Science at Statistics.com - Dec 21, 2016.
Get the personal touch you need to deepen your learning with Statistics.com classes that are small, rich and engaging with readings, videos, quizzes, homework, and practical projects, taught online by leading instructors.
- Top KDnuggets tweets, Dec 7-13: Want to learn Numpy? A Github repo of Numpy learning exercises - Dec 14, 2016.
Also Deep Learning Roadmap: "Which paper should I start reading from?"; Free ebooks: #MachineLearning with #Python and Practical Data Analysis; Daily plan for studying to become a Google software engineer.
- 50+ Data Science, Machine Learning Cheat Sheets, updated - Dec 14, 2016.
Gear up to speed and have concepts and commands handy in Data Science, Data Mining, and Machine learning algorithms with these cheat sheets covering R, Python, Django, MySQL, SQL, Hadoop, Apache Spark, Matlab, and Java.
- Introduction to K-means Clustering: A Tutorial - Dec 9, 2016.
A beginner introduction to the widely-used K-means clustering algorithm, using a delivery fleet data example in Python.
- KDnuggets Top Blogs and Bloggers in November 2016 - Dec 8, 2016.
We recognize the best KDnuggets Bloggers who had the most popular blogs by views or social media shares in November 2016.
- R-Brain Platform for Data Science: R, Python, sharing, security, and marketplace - Dec 7, 2016.
R-Brain IDE enables data scientists to use both R and Python with full language support. It enables sharing and has a marketplace for models. Try it free.
- Free ebooks: Machine Learning with Python and Practical Data Analysis - Dec 5, 2016.
Two free ebooks: "Building Machine Learning Systems with Python" and "Practical Data Analysis" will give your skills a boost and make a great start in the New Year.
- Random Forests in Python - Dec 2, 2016.
Random forest is a highly versatile machine learning method with numerous applications ranging from marketing to healthcare and insurance. This is a post about random forests using Python.
- Top KDnuggets tweets, Nov 23-29: The Entire #Python Language in a Single Image ; Great list of Data Science, Machine Learning, AI Resources - Nov 30, 2016.
The Entire #Python Language in a Single Image; Cartoon: Thanksgiving, #BigData, and Turkey #DataScience; 50% of Data Scientists have under 10 GB databases, not #BigData; Machine Learning Algorithms: A Concise Technical Overview
- KDnuggets™ News 16:n42, Nov 30: Python Machine Learning Open Source Projects; Facebook Groups for Big Data & Data Science - Nov 30, 2016.
Python Machine Learning Open Source Projects; Facebook Groups for Big Data & Data Science; Combining Different Methods to Create Advanced Time Series Prediction; Tips for Beginner Machine Learning/Data Scientists Feeling Overwhelmed; Continuous improvement for IoT through AI / Continuous learning
- Introduction to Machine Learning for Developers - Nov 28, 2016.
Whether you are integrating a recommendation system into your app or building a chat bot, this guide will help you get started in understanding the basics of machine learning.
Pages: 1 2
- Neighbors Know Best: (Re) Classifying an Underappreciated Beer - Nov 24, 2016.
A look at beer features to determine whether a specific brew might be better served (pun intended) by being classified under a different style. kNN analysis supported with in-post plots and linked IPython notebook.
- Top KDnuggets tweets, Nov 16-22: Top 20 #Python #MachineLearning #OpenSource Projects; Shortcomings of #DeepLearning - Nov 23, 2016.
Top 20 #Python #MachineLearning #OpenSource Projects; Shortcomings of #DeepLearning; What is the Difference Between #DeepLearning and Regular #MachineLearning?; Questions To Ask When Moving #MachineLearning From Practice to Production; How to Choose the Right #Database System
- Top 20 Python Machine Learning Open Source Projects, updated - Nov 21, 2016.
Open Source is the heart of innovation and rapid evolution of technologies these days. This article presents the Top 20 Python Machine Learning Open Source Projects of 2016, along with very interesting insights and trends found during the analysis.
- Join Us for the first AnacondaCON in February 2017 - Nov 11, 2016.
We're excited to announce that registration for AnacondaCON 2017, the first conference for Open Data Science leaders around the world, is now OPEN and limited to 500!
- How to Rank 10% in Your First Kaggle Competition - Nov 9, 2016.
This post presents a pathway to achieving success in Kaggle competitions as a beginner. The path generalizes beyond competitions, however. Read on for insight into succeeding while approaching any data science project.
Pages: 1 2 3 4
- KDnuggets™ News 16:n40, Nov 9: Trump, Failure of Prediction, and Lessons for Data Scientists; 8 Frustrating Things For R Users When Learning Python - Nov 9, 2016.
We examine the lessons for Data Scientists from the shocking and surprising win of Donald Trump; 8 frustrating things for R users when learning Python; Using Predictive Algorithms to Track Real Time Health Trends.
- Eight Things an R user Will Find Frustrating When Trying to Learn Python - Nov 2, 2016.
Are you an R user considering learning Python? Here's some insight into what you may be up against, and what, specifically, you may find frustrating. But don't worry, it's not all terrible.
- Using Machine Learning to Detect Malicious URLs - Oct 28, 2016.
This is a write-up of an experiment employing a machine learning model to identify malicious URLs. The author provides a link to the code for you to try yourself.
- Automated Machine Learning: An Interview with Randy Olson, TPOT Lead Developer - Oct 28, 2016.
Read an insightful interview with Randy Olson, Senior Data Scientist at University of Pennsylvania Institute for Biomedical Informatics, and lead developer of TPOT, an open source Python tool that intelligently automates the entire machine learning process.
- KDnuggets™ News 16:n38, Oct 26: Free Machine Learning EBooks; Neural Networks in Python with Scikit-learn - Oct 26, 2016.
5 EBooks to Read Before Getting into A Machine Learning Career; A Beginner's Guide to Neural Networks with Python and Scikit-learn 0.18!; New Poll: What was the largest dataset you analyzed / data mined?; Jupyter Notebook Best Practices for Data Science
- Jupyter Notebook Best Practices for Data Science - Oct 20, 2016.
Check out this overview of Jupyter notebook best practices as they pertain to data science. Novice or expert, you may find something of use here.
- A Beginner’s Guide to Neural Networks with Python and SciKit Learn 0.18! - Oct 20, 2016.
This post outlines setting up a neural network in Python using Scikit-learn, the latest version of which now has built in support for Neural Network models.
Pages: 1 2
- K2 Data Science Bootcamp - Oct 14, 2016.
This online, part-time immersive data science bootcamp is geared to help working professionals become data scientists in 24 weeks, with live lectures, one-on-one support, group study sessions, and more. Next session starts Jan 9, 2017.
- 2 must-have tools for blazing fast Python performance - Sep 15, 2016.
Intel has two must-have, highly optimized tools to help you get faster performance out of the box - with the least amount of effort.
- Webinar: Breaking Data Science Open, Sep 15 - Sep 12, 2016.
Learn how to drive collaboration and data science teamwork; how to mitigate legal risk through open source assurance and appropriate package selection, and how to democratize innovation through broad access to open data science tools.
- Introducing Dask for Parallel Programming: An Interview with Project Lead Developer - Sep 7, 2016.
Introducing Dask, a flexible parallel computing library for analytics. Learn more about this project built with interactive data science in mind in an interview with its lead developer.
- Top /r/MachineLearning Posts, August: Google Brain AMA, Image Completion with TensorFlow, Japanese Cucumber Farming - Sep 5, 2016.
Google Brain AMA; Image Completion with Deep Learning in TensorFlow; Japanese Cucumber Farming; Andrew Ng's machine learning class in Python; Google Brain datasets for robotics research
- HPE Haven OnDemand: Powerful Data Connectors for the Cloud and Enterprise - Sep 1, 2016.
HPE Haven OnDemand simplifies how you can interact with data, allowing it to be transformed into an asset anytime, anywhere. Find out how the Connector APIs can facilitate this interaction.
Pages: 1 2
- A Gentle Introduction to Bloom Filter - Aug 24, 2016.
The Bloom Filter is a probabilistic data structure which can make a tradeoff between space and false positive rate. Read more, and see an implementation from scratch, in this post.
- Visualizing 1 Billion Points of Data: Doing It Right – Aug 18 Webinar - Aug 11, 2016.
Join Continuum Analytics on August 18 for a webinar on Big Data visualization with the datashader library. Save your spot today!
- 7 Steps to Understanding Computer Vision - Aug 9, 2016.
A starting point for Computer Vision and how to go deeper. Dive into this post for an overview of the right resources and a little bit of advice.
- Top KDnuggets tweets, Jul 27 – Aug 2: Understanding neural networks with Google TensorFlow Playground; Getting Started with Data Science in Python - Aug 3, 2016.
Understanding neural networks with Google TensorFlow Playground; The 100 Best-Funded #Analytics #DataScience #Startups; Great tutorial: Getting Started with #DataScience - #Python; #MachineLearning over 1M hotel reviews: interesting insights.
- Getting Started with Data Science – Python - Aug 1, 2016.
A great introductory post from DataRobot on getting started with data science in the Python ecosystem, including cleaning data and performing predictive modeling.
Pages: 1 2
- Data Science of Visiting Famous Movie Locations in San Francisco - Jul 30, 2016.
Using the Google Places API and IMDb API, we selected movie locations in The Golden City which every movie fan should visit while they are in town, and optimize sightseeing by solving the travelling salesman problem.
- Deep Learning For Chatbots, Part 2 – Implementing A Retrieval-Based Model In TensorFlow - Jul 29, 2016.
Check out part 2 of this tutorial on building chatbots with deep neural networks. This part gets practical, using Python and TensorFlow for the implementation.
Pages: 1 2 3
- Would You Survive the Titanic? A Guide to Machine Learning in Python Part 3 - Jul 27, 2016.
This is the final part of a 3 part introductory series on machine learning in Python, using the Titanic dataset.
Pages: 1 2
- Would You Survive the Titanic? A Guide to Machine Learning in Python Part 2 - Jul 26, 2016.
This is part 2 of a 3 part introductory series on machine learning in Python, using the Titanic dataset.
Pages: 1 2
- Barley, Hops, and Bayes: Predicting The World Beer Cup - Jul 26, 2016.
This post covers predicting award counts by the United States in an international beer competition. Exploratory data analysis and Bayes methods are also supported.
- Would You Survive the Titanic? A Guide to Machine Learning in Python Part 1 - Jul 25, 2016.
Check out the first of a 3 part introductory series on machine learning in Python, fueled by the Titanic dataset. This is a great place to start for a machine learning newcomer.
- Building a Data Science Portfolio: Machine Learning Project Part 3 - Jul 22, 2016.
The final installment of this comprehensive overview on building an end-to-end data science portfolio project focuses on bringing it all together, and concludes the project quite nicely.
Pages: 1 2
- SAS vs R vs Python: Which Tool Do Analytics Pros Prefer? - Jul 22, 2016.
There are lots of flame wars involving different data science and analytics tools... but this isn't one of them. Check out the quantitative results and analysis of a Burtch Works survey on the subject.
- Building a Data Science Portfolio: Machine Learning Project Part 2 - Jul 21, 2016.
The second part of this comprehensive overview on building an end-to-end data science portfolio project concentrates on data exploration and preparation.
Pages: 1 2
- Interesting Things I Learned at SciPy 2016 - Jul 21, 2016.
Learn about some interesting projects featured at SciPy 2016, brought to you by an attendee who put in the work to bring you this great list of projects.
- Building a Data Science Portfolio: Machine Learning Project Part 1 - Jul 20, 2016.
Dataquest's founder has put together a fantastic resource on building a data science portfolio. This first of three parts lays the groundwork, with subsequent posts over the following 2 days. Very comprehensive!
Pages: 1 2
- Top KDnuggets tweets, Jul 13 – Jul 19: Bayesian #MachineLearning, Explained; Introducing JupyterLab - Jul 20, 2016.
Bayesian #MachineLearning, Explained; JupyterLab: the next generation of the #Jupyter Notebook; On the importance of democratizing #ArtificialIntelligence
- 10 Great Talks From SciPy 2016 - Jul 20, 2016.
Here's a curated short list of interesting and insightful talks to watch from SciPy 2016 to help guide your search through the volume of great video material emerging from the conference.
- Statistical Data Analysis in Python - Jul 18, 2016.
This tutorial will introduce the use of Python for statistical data analysis, using data stored as Pandas DataFrame objects, taking the form of a set of IPython notebooks.
- America’s Next Topic Model - Jul 15, 2016.
Topic modeling is a great way to get a bird's eye view on a large document collection using machine learning. Here are 3 ways to use open source Python tool Gensim to choose the best topic model.
- Top KDnuggets tweets, Jul 6 – Jul 12: Statistical Data Analysis #Python #Jupyter Notebooks; Modern Pandas Notebooks - Jul 13, 2016.
Statistical Data Analysis in #Python (#Jupyter Notebooks); Modern Pandas: idiomatic Pandas notebook collection; New (free) book by @rdpeng: #rstats Programming for #DataScience
- 5 Deep Learning Projects You Can No Longer Overlook - Jul 12, 2016.
There are a number of "mainstream" deep learning projects out there, but many more niche projects flying under the radar. Have a look at 5 such projects worth checking out.
- Interview: Florian Douetteau, Dataiku Founder, on Empowering Data Scientists - Jul 7, 2016.
Here is an interview with Florian Douetteau, founder of Dataiku, on how their tools empower data scientists, and how data science itself is evolving.
- Deep Residual Networks for Image Classification with Python + NumPy - Jul 7, 2016.
This post outlines the results of an innovative Deep Residual Network implementation for Image Classification using Python and NumPy.
- Top KDnuggets tweets, Jun 29 – Jul 5: Big Data Ecosystem is Too Damn Big!; Deep Learning Intro with Caffe and Python - Jul 6, 2016.
The #BigData Ecosystem is Too Damn Big!; A Practical Introduction to #DeepLearning with Caffe and #Python; What do Postgres, Kafka, and Bitcoin have in common?
- Mining Twitter Data with Python Part 7: Geolocation and Interactive Maps - Jul 6, 2016.
The final part of this 7 part series explores using geolocation and interactive maps with Twitter data.
- Mining Twitter Data with Python Part 6: Sentiment Analysis Basics - Jul 5, 2016.
Part 6 of this series builds on the previous installments by exploring the basics of sentiment analysis on Twitter data.
- Mining Twitter Data with Python Part 5: Data Visualisation Basics - Jun 29, 2016.
Part 5 of this series takes on data visualization, as we look to make sense of our data and highlight interesting insights.
- 5 More Machine Learning Projects You Can No Longer Overlook - Jun 28, 2016.
There are a lot of popular machine learning projects out there, but many more that are not. Which of these are actively developed and worth checking out? Here is an offering of 5 such projects.
- Mining Twitter Data with Python Part 4: Rugby and Term Co-occurrences - Jun 27, 2016.
Part 4 of this series employs some of the lessons learned thus far to analyze tweets related to rugby matches and term co-occurrences.
- Doing Data Science: A Kaggle Walkthrough Part 6 – Creating a Model - Jun 24, 2016.
In the final part of this 6 part series on the process of data science, and applying it to a Kaggle competition, building the predictive models is covered, and multiple algorithms are discussed.
Pages: 1 2
- Mining Twitter Data with Python Part 3: Term Frequencies - Jun 22, 2016.
Part 3 of this 7 part series focusing on mining Twitter data discusses the analysis of term frequencies for meaningful term extraction.
- HPE Haven OnDemand Text Extraction API Cheat Sheet for Developers - Jun 21, 2016.
HPE Haven OnDemand provides a native API based on cURL calls, as well as numerous language-specific APIs, providing maximum flexibility for developers. This cheat sheet will cover the native and Python text extraction APIs.
- Mining Twitter Data with Python Part 2: Text Pre-processing - Jun 20, 2016.
Part 2 of this 7 part series on mining Twitter data for a variety of use cases focuses on the pre-processing of tweet text.
- Doing Data Science: A Kaggle Walkthrough Part 5 – Adding New Data - Jun 17, 2016.
Here is part 5 of the weekly 6 part series on doing data science in the context of a Kaggle competition, which concentrates on adding in new data.
Pages: 1 2
- Mining Twitter Data with Python Part 1: Collecting Data - Jun 15, 2016.
Part 1 of a 7 part series focusing on mining Twitter data for a variety of use cases. This first post lays the groundwork, and focuses on data collection.
- What Big Data, Data Science, Deep Learning software goes together? - Jun 14, 2016.
We analyze the associations between top Data Science tools, Commercial vs Free/Open Source, rank tools on R vs Python bias, find tools more associated with Big Data, those more associated with Deep Learning, and uncover strong regional differences.
Pages: 1 2 3
- 10 Useful Python Data Visualization Libraries for Any Discipline - Jun 14, 2016.
A great overview of 10 useful Python data visualization tools. It covers some of the big ones, like matplotlib and Seaborn, but also explores some more obscure libraries, like Gleam, Leather, and missingno.
Pages: 1 2
- Doing Data Science: A Kaggle Walkthrough Part 4 – Data Transformation and Feature Extraction - Jun 10, 2016.
Part 4 of this fantastic 6 part series covering the process of data science, and its application to a Kaggle competition, focuses on feature extraction and data transformation.
Pages: 1 2
- An Introduction to Scientific Python (and a Bit of the Maths Behind It) – Matplotlib - Jun 9, 2016.
An introductory overview of Matplotlib, one of the foundational aspects of Scientific Computing in Python, along with some explanation of the maths involved.
Pages: 1 2
- Top KDnuggets tweets, Jun 1-7: “Deep” vs “Regular” Machine Learning; Introduction to Scientific Python – NumPy - Jun 8, 2016.
How to Build Your Own #DeepLearning Box; What is the Difference Between #DeepLearning and "Regular" #MachineLearning? Data Science of #Variable Selection: A Review; Why choose #Python for #MachineLearning?
- R, Python Duel As Top Analytics, Data Science software – KDnuggets 2016 Software Poll Results - Jun 6, 2016.
R remains the leading tool, with 49% share, but Python grows faster and almost catches up to R. RapidMiner remains the most popular general Data Science platform. Big Data tools used by almost 40%, and Deep Learning usage doubles.
Pages: 1 2
- Doing Data Science: A Kaggle Walkthrough Part 3 – Cleaning Data - Jun 3, 2016.
This is part three in a fantastic 6 part series covering the process of data science, and the application of the process to a Kaggle competition. In this episode, data cleaning and preparation are covered.
Pages: 1 2
- Top KDnuggets tweets, May 25-31: 19 Free eBooks to learn #programming with #Python; Awesome collection of public datasets on Github - Jun 1, 2016.
Introducing Hybrid lda2vec Algorithm via Stitch Fix; #DeepLearning and Deep #Gaussian Processes - explainer; Awesome collection of public #datasets on Github; #DataScience foundations: 19 Free eBooks to learn #programming with #Python.
- An Introduction to Scientific Python (and a Bit of the Maths Behind It) – NumPy - Jun 1, 2016.
An introductory overview of NumPy, one of the foundational aspects of Scientific Computing in Python, along with some explanation of the maths involved.
Pages: 1 2
- Doing Data Science: A Kaggle Walkthrough Part 2 – Understanding the Data - May 27, 2016.
This is the second post in a fantastic 6 part series covering the process of data science, and the application of the process to a Kaggle competition. Read on for a great overview of practicing data science.
Pages: 1 2
- ntent: Senior Software Engineer, Python - May 26, 2016.
Seeking a Senior Software Engineer, Python, to provide code support for an in-house ontology engineering and knowledge acquisition infrastructure, and to work closely with, and support, ontologists and linguists.
- Top KDnuggets tweets, May 18-24: Google supercharges #MachineLearning, #DeepLearning tasks with TPU (Tensor Processing Unit) - May 25, 2016.
Stanford Crowd Course Initiative: #MachineLearning with #Python course; Practical Guide to Matrix Calculus for #DeepLearning; Build your own #DeepLearning Box < $1.5K
- Doing Data Science: A Kaggle Walkthrough Part 1 – Introduction - May 19, 2016.
This is the first post in a fantastic 6 part series covering the process of data science, and the application of the process to a Kaggle competition. Very thorough, and very insightful.
- 5 Machine Learning Projects You Can No Longer Overlook - May 19, 2016.
We all know the big machine learning projects out there: Scikit-learn, TensorFlow, Theano, etc. But what about the smaller niche projects that are actively developed, providing useful services to users? Here are 5 such projects.
- High Performance Python for Open Data Science (Whitepaper) - May 17, 2016.
In this whitepaper, you'll learn from our seasoned experts about the approaches to scaling your data science models, review the various options, and learn how to easily accomplish them using Anaconda, the leading Open Data Science platform powered by Python.
- TPOT: A Python Tool for Automating Data Science - May 13, 2016.
TPOT is an open-source Python data science automation tool, which operates by optimizing a series of feature preprocessors and models, in order to maximize cross-validation accuracy on data sets.
Pages: 1 2
- Top Talks and Tutorials From PyData London - May 11, 2016.
Get some insight into the most recent Python data science talks and presentations with this eclectic mix of videos from PyData London 2016.
- Dealing with Unbalanced Classes, SVMs, Random Forests, and Decision Trees in Python - Apr 29, 2016.
An overview of dealing with unbalanced classes, and implementing SVMs, Random Forests, and Decision Trees in Python.
Pages: 1 2 3
- Webinar: High Performance Hadoop With Python, May 5th - Apr 28, 2016.
On May 5th, Dr. Kristopher Overholt and Dr. Matthew Rocklin of Continuum Analytics will present a webinar on High Performance Hadoop with Python. Reserve your spot today!
- Top KDnuggets tweets, Apr 12-26: The Race For AI: Google, Facebook, Amazon, Apple; Comprehensive Guide to Learning #Python - Apr 27, 2016.
Data Science helps see where your country will stand in WW 3; Recommender Systems: New Comprehensive Textbook; Good read: Deep Learning in Neural Networks - extreme summary; The Race For #AI: Google, Facebook, Amazon, Apple rush to grab #AI startups.
- Top Data Science Courses on Udemy - Apr 27, 2016.
An overview of the very best that Udemy has to offer in data science education. Includes courses covering machine learning, Python, Hadoop, visualization, and more.
Pages: 1 2 3
- KDnuggets™ News 16:n15, Apr 27: Deep Learning vs. SVMs, Random Forests; Python Guide for Data Science - Apr 27, 2016.
When Does Deep Learning Work Better Than SVMs or Random Forests; Comprehensive Guide to Learning Python for Data Science; Top 10 IPython Notebook Tutorials for Data Science and Machine Learning; 5,000 KDnuggets Posts - Examining Our Most Popular Content
- Top 10 IPython Notebook Tutorials for Data Science and Machine Learning - Apr 22, 2016.
A list of 10 useful Github repositories made up of IPython (Jupyter) notebooks, focused on teaching data science and machine learning. Python is the clear target here, but general principles are transferable.
- Black Box Challenge Machine Learning Competition - Apr 21, 2016.
Take part in an unusual machine learning competition — program an agent (in Python) that can play a game with unknown rules.
- Comprehensive Guide to Learning Python for Data Analysis and Data Science - Apr 20, 2016.
Want to make a career change to Data Science using Python? Learning anything on your own can be a challenge, and a little guidance can be a great help. That is exactly what this article provides.
Pages: 1 2
- The MBA Data Science Toolkit: 8 resources to go from the spreadsheet to the command line - Apr 18, 2016.
A great guide for the MBA, or any relatively non-technical convert, for getting comfortable with the command line and other technical skills required to excel in data science.
Pages: 1 2
- From Science to Data Science, a Comprehensive Guide for Transition - Apr 12, 2016.
An in-depth, multifaceted, and all-around very helpful roadmap for making the switch from 'science' to 'data science,' yet generally useful for data science beginners or anyone looking to get into data science.
Pages: 1 2 3
- Top KDnuggets tweets, Mar 22-29: If Hollywood Made Movies About MachineLearning; Data Scientist on Every @AirBNB Leadership Team - Mar 30, 2016.
If Hollywood Made Movies About Machine Learning; Why Airbnb Has a Data Scientist on Every Leadership Team; Very useful guide for Data Cleaning in Python; Data scientist Hilary Mason wants to show you the (near) future.
- PocketConfidant AI: Computational Linguist (NLP/AI) - Mar 26, 2016.
Rely on the user behavior data to design and implement Machine Learning algorithms and methods of Natural Language Processing to build smart conversational robots. Make user experience personal, proactive and empathetic.
- Data Science Tools – Are Proprietary Vendors Still Relevant? - Mar 25, 2016.
We examine and quantify the dramatic impact of open source tools like R and Python on SAS, IBM, Microsoft, and other proprietary Data Science vendors. We also investigate how open source tools are faring against each other, which are growing and which are falling, and look at the R versus Python debate.
Pages: 1 2
- Doing Data Science: A Kaggle Walkthrough – Cleaning Data - Mar 23, 2016.
Gain insight into the process of cleaning data for a specific Kaggle competition, including a step by step overview.
- New KDnuggets Tutorials Page: Learn R, Python, Data Visualization, Data Science, and more - Mar 16, 2016.
Introducing new KDnuggets Tutorials page with useful resources for learning about Business Analytics, Big Data, Data Science, Data Mining, R, Python, Data Visualization, Spark, Deep Learning and more.
- Journey to Open Data Science, March 23 Webinar - Mar 15, 2016.
Learn how to drive collaboration and teamwork through open data science; mitigate legal risk through indemnification and appropriate package selection; bring advanced analytics to Excel-loving analysts with AnacondaXL.
- R or Python? Consider learning both - Mar 8, 2016.
The key to becoming a data science professional is understanding the underlying data science concepts and working to expand your programming toolbox as much as you can. Hence, one should understand when to use Python and when to pick R, rather than mastering just one language.
- scikit-feature: Open-Source Feature Selection Repository in Python - Mar 3, 2016.
scikit-feature is an open-source feature selection repository in Python, with around 40 popular algorithms from feature selection research. It is developed by the Data Mining and Machine Learning Lab at Arizona State University.
- Conversation with data scientist Sebastian Raschka: A New Podcast Episode - Feb 24, 2016.
In this post we present an interview with Sebastian Raschka, data scientist and author of Python Machine Learning, who discusses machine learning, data science, and current and future trends.
- Ensemble Methods: Elegant Techniques to Produce Improved Machine Learning Results - Feb 12, 2016.
Get a handle on ensemble methods from voting and weighting to stacking and boosting, with this well-written overview that includes numerous Python-style pseudocode examples for reinforcement.
- Scikit Flow: Easy Deep Learning with TensorFlow and Scikit-learn - Feb 12, 2016.
Scikit Flow is a new easy-to-use interface for Google's TensorFlow, based on the Scikit-learn fit/predict model. Does it succeed in making deep learning more accessible?
- Data Science Skills for 2016 - Feb 12, 2016.
As demand for the hottest job gets hotter in the new year, the skill set required for it is getting larger. Here, we discuss the skills that will be in high demand for data scientists, including data visualization, Apache Spark, R, Python, and many more.
- Top 10 tweets Jan 25-31: DataViz: how a decision tree works; Nice and Brief Tutorial on Python - Feb 1, 2016.
DataViz - how a decision tree makes classifications; Very Nice and Brief Tutorial on #Python #DataScience #DataViz; Per Einstein, time flows slower in Meetings than in empty space #hum; Top 10 Skills for #DataScience professionals.
- Python Data Science with Pandas vs Spark DataFrame: Key Differences - Jan 29, 2016.
A post describing the key differences between Pandas and Spark's DataFrame format, including specifics on important regular processing features, with code samples.
- Useful Data Science: Feature Hashing - Jan 28, 2016.
Feature engineering plays a major role in solving data science problems. Here, we will learn about Feature Hashing, or the hashing trick, which is a method for turning arbitrary features into a sparse binary vector.
- Implementing Your Own k-Nearest Neighbour Algorithm Using Python - Jan 27, 2016.
A detailed explanation of one of the most used machine learning algorithms, k-Nearest Neighbors, and its implementation from scratch in Python. Enhance your algorithmic understanding with this hands-on coding exercise.
- Learning to Code Neural Networks - Jan 22, 2016.
Learn how to code a neural network, by taking advantage of someone else's experiences learning how to code a neural network.
- Scikit-learn and Python Stack Tutorials: Introduction, Implementing Classifiers - Jan 18, 2016.
A small collection of introductory scikit-learn and Python stack tutorials for those with an existing understanding of machine learning looking to jump right into using a new set of tools.
- MassiveAnalytic: Scala/Python Data Scientist - Jan 18, 2016.
Oscar AP is the world's first precognitive analytics platform. Be involved in critical research and innovation projects, customer prototyping and proofs-of-concept.
- A Look Back on the 1st Three Months of Becoming a Data Scientist - Jan 13, 2016.
A person new to the Data Science field summarizes his surprising findings after a few months on the job.