- Top KDnuggets tweets, Mar 22-28: Big #DataScience: Expectation vs. Reality - Mar 29, 2017.
Also A Gentle Introduction To Graph Theory; An Overview of #Python #DeepLearning Frameworks; The Great Algorithm Tutorial Roundup.
- A Beginner’s Guide to Tweet Analytics with Pandas - Mar 29, 2017.
Unlike a lot of other tutorials which often pull from the real-time Twitter API, we will be using the downloadable Twitter Analytics data, and most of what we do will be done in Pandas.
- Email Spam Filtering: An Implementation with Python and Scikit-learn - Mar 17, 2017.
This post is an overview of a spam filtering implementation using Python and Scikit-learn. The results of 2 classifiers are contrasted and compared: multinomial Naive Bayes and support vector machines.
- Top KDnuggets tweets, Mar 08-14: In-depth introduction to Machine Learning in 15 hours of expert videos - Mar 15, 2017.
Also: #ICYMI The #DataScience Project Playbook; Every Intro to #DataScience Course on the Internet, Ranked; Quick reference to #Python in a single script.
- KDnuggets™ News 17:n10, Mar 15: Becoming a Data Science Unicorn; What Makes a Good Data Visualization? - Mar 15, 2017.
6 Business Concepts you need to become a Data Science Unicorn; What Makes a Good Data Visualization?; Best Data Science Courses from Udemy (only $19 till Mar 31); K-Means & Other Clustering Algorithms: A Quick Intro with Python; Free Online Data Science & Big Data Books
- Open Source Toolkits for Speech Recognition - Mar 14, 2017.
This article reviews the main options for free speech recognition toolkits that use traditional Hidden Markov Models and n-gram language models.
- Best Data Science Courses from Udemy (only $19 till Mar 31) - Mar 10, 2017.
Here is a list of the best courses in data science from Udemy, covering Data Science, Machine Learning, Python, Spark, Tableau, and Hadoop - only $19 until March 31, 2017.
- Working With Numpy Matrices: A Handy First Reference - Mar 10, 2017.
This introductory tutorial does a great job of outlining the most common Numpy array creation and manipulation functionality. A good post to keep handy while taking your first steps in Numpy, or to use as a handy reminder.
- K-Means & Other Clustering Algorithms: A Quick Intro with Python - Mar 8, 2017.
In this intro cluster analysis tutorial, we'll check out a few algorithms in Python so you can get a basic understanding of the fundamentals of clustering on a real dataset.
- KDnuggets™ News 17:n09, Mar 8: 7 More Steps to Mastering Machine Learning w. Python; Every Intro to Data Science Course, Ranked - Mar 8, 2017.
Also The Data Science Project Playbook; Hadoop Is Falling - Why?; Bokeh Cheat Sheet: Data Visualization in Python
- A Simple XGBoost Tutorial Using the Iris Dataset - Mar 7, 2017.
This is an overview of the XGBoost machine learning algorithm, which is fast and shows good results. This example uses multiclass prediction with the Iris dataset from Scikit-learn.
- Bokeh Cheat Sheet: Data Visualization in Python - Mar 3, 2017.
Bokeh is the Python data visualization library that enables high-performance visual presentation of large datasets in modern web browsers. The package is flexible and offers lots of possibilities to visualize your data in a compelling way, but can be overwhelming.
- Gartner Data Science Platforms – A Deeper Look - Mar 3, 2017.
Thomas Dinsmore's critical examination of the Gartner 2017 MQ for Data Science Platforms, including vendors who are out, in, or have big changes, Hadoop and Spark integration, open source software, and what Data Scientists actually use.
- Building a Bot to Answer FAQs: Predicting Text Similarity - Mar 2, 2017.
In this post, learn to build a bot to answer frequently asked questions, reducing lag time for more customers and taking the load off of engineers, ensuring they can concentrate on building products!
- Top KDnuggets tweets, Feb 22-28: 50 Companies Leading the #AI Revolution; #AI Nanodegree Program Syllabus - Mar 1, 2017.
50 Companies Leading the #AI Revolution; #AI Nanodegree Program Syllabus: Term 1, In Depth; What is a Support Vector Machine, and Why Would I Use it?; 6 Easy Steps to Learn Naive #Bayes Algorithm (with code in #Python).
- 7 More Steps to Mastering Machine Learning With Python - Mar 1, 2017.
This post is a follow-up to last year's introductory Python machine learning post, which includes a series of tutorials for extending your knowledge beyond the original.
- KDnuggets™ News 17:n08, Mar 1: Deep Learning Frameworks Dissected; Gartner Magic Quadrant for Data Science Platforms - Mar 1, 2017.
The Anatomy of Deep Learning Frameworks; Gartner 2017 Magic Quadrant for Data Science Platforms: gainers and losers; Data Science vs Fake News Contest; An Overview of Python Deep Learning Frameworks; Moving from R to Python: The Libraries You Need to Know
- What I Learned Implementing a Classifier from Scratch in Python - Feb 28, 2017.
In this post, the author implements a machine learning algorithm from scratch, without the use of a library such as scikit-learn, and instead writes all of the code in order to have a working binary classifier algorithm.
- An Overview of Python Deep Learning Frameworks - Feb 27, 2017.
Read this concise overview of leading Python deep learning frameworks, including Theano, Lasagne, Blocks, TensorFlow, Keras, MXNet, and PyTorch.
- The 6 Best Data Science Courses from Udemy (only $10 till Feb 28) - Feb 25, 2017.
Here is a list of the best courses in data science from Udemy, covering Data Science, Machine Learning, Python, Spark, Tableau, and Hadoop - only $10 until Feb 28, 2017.
- Moving from R to Python: The Libraries You Need to Know - Feb 24, 2017.
Are you considering making a move from R to Python? Here are the libraries you need to know, how they stack up to their R contemporaries, and why you should learn them.
- What is a Support Vector Machine, and Why Would I Use it? - Feb 23, 2017.
Support Vector Machine has become an extremely popular algorithm. In this post I try to give a simple explanation for how it works and give a few examples using the Python Scikits libraries.
- Introduction to Correlation - Feb 22, 2017.
Correlation is one of the most widely used (and widely misunderstood) statistical concepts. We provide the definitions and intuition behind several types of correlation and illustrate how to calculate correlation using the Python pandas library.
- Removing Outliers Using Standard Deviation in Python - Feb 16, 2017.
Standard Deviation is one of the most underrated statistical tools out there. It’s an extremely useful metric that most people know how to calculate but very few know how to use effectively.
- Apache Arrow and Apache Parquet: Why We Needed Different Projects for Columnar Data, On Disk and In-Memory - Feb 16, 2017.
Apache Parquet and Apache Arrow both focus on improving performance and efficiency of data analytics. These two projects optimize performance for on-disk and in-memory processing, respectively.
- KDnuggets™ News 17:n06, Feb 15: So What is Big Data? 52 Useful Machine Learning APIs; Data Science finds Perfect Valentines Dates - Feb 15, 2017.
Also Making Python Speak SQL with pandasql; 52 Useful Machine Learning & Prediction APIs, updated; New Poll: Do you support Trump Immigration Ban?
- Web Scraping for Dataset Curation, Part 2: Tidying Craft Beer Data - Feb 14, 2017.
This is the second part in a 2 part series on curating data from the web. The first part focused on web scraping, while this post details the process of tidying scraped data after the fact.
- Web Scraping for Dataset Curation, Part 1: Collecting Craft Beer Data - Feb 13, 2017.
This post is the first in a 2 part series on scraping and cleaning data from the web using Python. This first part is concerned with the scraping aspect, while the second part will focus on the cleaning. A concrete example is presented.
- Making Python Speak SQL with pandasql - Feb 8, 2017.
Want to wrangle Pandas data like you would SQL using Python? This post serves as an introduction to pandasql, and details how to get it up and running inside of Rodeo.
- Top KDnuggets tweets, Jan 25-31: Python implementations of Andrew Ng #MachineLearning MOOC exercises - Feb 1, 2017.
#Python implementations of Andrew Ng #MachineLearning MOOC exercises; This repository contains the entire #Python #DataScience Handbook; What are the best #visualizations of #MachineLearning algorithms?; Learn #TensorFlow and #DeepLearning, without a PhD.
- Domino Data Science Popup, San Francisco, Feb 22 – KDnuggets Offer - Jan 31, 2017.
Learn about the latest trends in data science applications in technology from the top experts in the industry. Register by Feb 8 and save with code KDNuggetsVIP.
- Pandas Cheat Sheet: Data Science and Data Wrangling in Python - Jan 27, 2017.
The Pandas library can seem very elaborate and it might be hard to find a single point of entry to the material: with other learning materials focusing on different aspects of this library, you can definitely use a reference sheet to help you get the hang of it.
- Great Collection of Minimal and Clean Implementations of Machine Learning Algorithms - Jan 25, 2017.
Interested in learning machine learning algorithms by implementing them from scratch? Need a good set of examples to work from? Check out this post with links to minimal and clean implementations of various algorithms.
- Learn how to Develop and Deploy a Gradient Boosting Machine Model - Jan 20, 2017.
GBM is one of the hottest machine learning methods. Learn how to create a GBM using SciKit-Learn and Python and understand the steps required to transform features, train, and deploy a GBM.
- Clean Data Science: Evaluating The Cleanliness of NYC Craft Beer Bar Kitchens - Jan 13, 2017.
An analysis of NYC Open Data health inspections showing that craft beer bar kitchens in Manhattan are cleaner than the average establishment by a statistically significant margin. An encouraging finding for Dry January.
- The Most Popular Language For Machine Learning and Data Science Is … - Jan 11, 2017.
When it comes to choosing a programming language for Data Analytics projects or job prospects, people have different opinions depending on their career backgrounds and the domains they have worked in. Here is an analysis of data from indeed.com on the choice of programming language for machine learning and data science.
- Creating Data Visualization in Matplotlib - Jan 5, 2017.
Matplotlib is the most widely used data visualization library for Python; it's very powerful, but with a steep learning curve. This overview covers a selection of plots useful for a wide range of data analysis problems and discusses how to best deploy each one so you can tell your data story.
- Tidying Data in Python - Jan 4, 2017.
This post summarizes some tidying examples Hadley Wickham used in his 2014 paper on Tidy Data in R, but will demonstrate how to do so using the Python pandas library.
- Supercharge Your Data Science Team with AnacondaCON Team Discount, till Jan 16 - Jan 3, 2017.
AnacondaCON '17 will help you conquer your biggest data science challenges. Learn from industry experts sharing what #OpenDataScienceMeans and their best practices. Get 2 for 1 ticket price thru Jan 16, 2017.
- 5 Machine Learning Projects You Can No Longer Overlook, January - Jan 2, 2017.
There are a lot of popular machine learning projects out there, but many more that are not. Which of these are actively developed and worth checking out? Here is an offering of 5 such projects, the most recent in an ongoing series.
- Over 600 data science, machine learning, Big Data eBooks/videos for only $5 (until Jan 9) - Dec 22, 2016.
Packt have more than 600 data science, analysis, machine learning and Big Data eBooks and video courses. And until Jan 9, 2017 every single one is available for just $5.
- 3 ways to learn Data Science at Statistics.com - Dec 21, 2016.
Get the personal touch you need to deepen your learning with Statistics.com classes that are small, rich and engaging with readings, videos, quizzes, homework, and practical projects, taught online by leading instructors.
- Top KDnuggets tweets, Dec 7-13: Want to learn Numpy? A Github repo of Numpy learning exercises - Dec 14, 2016.
Also Deep Learning Roadmap: "Which paper should I start reading from?"; Free ebooks: #MachineLearning with #Python and Practical Data Analysis; Daily plan for studying to become a Google software engineer.
- 50+ Data Science, Machine Learning Cheat Sheets, updated - Dec 14, 2016.
Gear up to speed and have concepts and commands handy in Data Science, Data Mining, and Machine learning algorithms with these cheat sheets covering R, Python, Django, MySQL, SQL, Hadoop, Apache Spark, Matlab, and Java.
- Introduction to K-means Clustering: A Tutorial - Dec 9, 2016.
A beginner introduction to the widely-used K-means clustering algorithm, using a delivery fleet data example in Python.
- KDnuggets Top Blogs and Bloggers in November 2016 - Dec 8, 2016.
We recognize the best KDnuggets Bloggers who had the most popular blogs by views or social media shares in November 2016.
- R-Brain Platform for Data Science: R, Python, sharing, security, and marketplace - Dec 7, 2016.
R-Brain IDE enables data scientists to use both R and Python with full language support. It enables sharing and has a marketplace for models. Try it free.
- Free ebooks: Machine Learning with Python and Practical Data Analysis - Dec 5, 2016.
Two free ebooks: "Building Machine Learning Systems with Python" and "Practical Data Analysis" will give your skills a boost and make a great start in the New Year.
- Random Forests in Python - Dec 2, 2016.
Random forest is a highly versatile machine learning method with numerous applications ranging from marketing to healthcare and insurance. This is a post about random forests using Python.
- Top KDnuggets tweets, Nov 23-29: The Entire #Python Language in a Single Image ; Great list of Data Science, Machine Learning, AI Resources - Nov 30, 2016.
The Entire #Python Language in a Single Image; Cartoon: Thanksgiving, #BigData, and Turkey #DataScience; 50% of Data Scientists have under 10 GB databases, not #BigData; Machine Learning Algorithms: A Concise Technical Overview
- KDnuggets™ News 16:n42, Nov 30: Python Machine Learning Open Source Projects; Facebook Groups for Big Data & Data Science - Nov 30, 2016.
Python Machine Learning Open Source Projects; Facebook Groups for Big Data & Data Science; Combining Different Methods to Create Advanced Time Series Prediction; Tips for Beginner Machine Learning/Data Scientists Feeling Overwhelmed; Continuous improvement for IoT through AI / Continuous learning
- Introduction to Machine Learning for Developers - Nov 28, 2016.
Whether you are integrating a recommendation system into your app or building a chat bot, this guide will help you get started in understanding the basics of machine learning.
- Neighbors Know Best: (Re) Classifying an Underappreciated Beer - Nov 24, 2016.
A look at beer features to determine whether a specific brew might be better served (pun intended) by being classified under a different style. kNN analysis supported with in-post plots and a linked IPython notebook.
- Top KDnuggets tweets, Nov 16-22: Top 20 #Python #MachineLearning #OpenSource Projects; Shortcomings of #DeepLearning - Nov 23, 2016.
Top 20 #Python #MachineLearning #OpenSource Projects; Shortcomings of #DeepLearning; What is the Difference Between #DeepLearning and Regular #MachineLearning?; Questions To Ask When Moving #MachineLearning From Practice to Production; How to Choose the Right #Database System
- Top 20 Python Machine Learning Open Source Projects, updated - Nov 21, 2016.
Open source is at the heart of innovation and the rapid evolution of technologies these days. This article presents the Top 20 Python Machine Learning Open Source Projects of 2016, along with very interesting insights and trends found during the analysis.
- Join Us for the first AnacondaCON in February 2017 - Nov 11, 2016.
We're excited to announce that registration for AnacondaCON 2017, the first conference for Open Data Science leaders around the world, is now OPEN and limited to 500!
- How to Rank 10% in Your First Kaggle Competition - Nov 9, 2016.
This post presents a pathway to achieving success in Kaggle competitions as a beginner. The path generalizes beyond competitions, however. Read on for insight into succeeding while approaching any data science project.
- KDnuggets™ News 16:n40, Nov 9: Trump, Failure of Prediction, and Lessons for Data Scientists; 8 Frustrating Things For R Users When Learning Python - Nov 9, 2016.
We examine the lessons for Data Scientists from the shocking and surprising win of Donald Trump; 8 frustrating things for R users when learning Python; Using Predictive Algorithms to Track Real Time Health Trends.
- Eight Things an R user Will Find Frustrating When Trying to Learn Python - Nov 2, 2016.
Are you an R user considering learning Python? Here's some insight into what you may be up against, and what, specifically, you may find frustrating. But don't worry, it's not all terrible.
- Using Machine Learning to Detect Malicious URLs - Oct 28, 2016.
This is a write-up of an experiment employing a machine learning model to identify malicious URLs. The author provides a link to the code for you to try yourself.
- Automated Machine Learning: An Interview with Randy Olson, TPOT Lead Developer - Oct 28, 2016.
Read an insightful interview with Randy Olson, Senior Data Scientist at University of Pennsylvania Institute for Biomedical Informatics, and lead developer of TPOT, an open source Python tool that intelligently automates the entire machine learning process.
- KDnuggets™ News 16:n38, Oct 26: Free Machine Learning EBooks; Neural Networks in Python with Scikit-learn - Oct 26, 2016.
5 EBooks to Read Before Getting into A Machine Learning Career; A Beginner's Guide to Neural Networks with Python and Scikit-learn 0.18!; New Poll: What was the largest dataset you analyzed / data mined?; Jupyter Notebook Best Practices for Data Science
- Jupyter Notebook Best Practices for Data Science - Oct 20, 2016.
Check out this overview of Jupyter notebook best practices as they pertain to data science. Novice or expert, you may find something of use here.
- A Beginner’s Guide to Neural Networks with Python and SciKit Learn 0.18! - Oct 20, 2016.
This post outlines setting up a neural network in Python using Scikit-learn, the latest version of which now has built-in support for Neural Network models.
- K2 Data Science Bootcamp - Oct 14, 2016.
This online, part-time immersive data science bootcamp is geared to help working professionals become data scientists in 24 weeks, with live lectures, one-on-one support, group study sessions, and more. Next session starts Jan 9, 2017.
- 2 must-have tools for blazing fast Python performance - Sep 15, 2016.
Intel has two must-have, highly optimized tools to help you get faster performance out of the box - with the least amount of effort.
- Webinar: Breaking Data Science Open, Sep 15 - Sep 12, 2016.
Learn how to drive collaboration and data science teamwork; how to mitigate legal risk through open source assurance and appropriate package selection, and how to democratize innovation through broad access to open data science tools.
- Introducing Dask for Parallel Programming: An Interview with Project Lead Developer - Sep 7, 2016.
Introducing Dask, a flexible parallel computing library for analytics. Learn more about this project built with interactive data science in mind in an interview with its lead developer.
- Top /r/MachineLearning Posts, August: Google Brain AMA, Image Completion with TensorFlow, Japanese Cucumber Farming - Sep 5, 2016.
Google Brain AMA; Image Completion with Deep Learning in TensorFlow; Japanese Cucumber Farming; Andrew Ng's machine learning class in Python; Google Brain datasets for robotics research
- HPE Haven OnDemand: Powerful Data Connectors for the Cloud and Enterprise - Sep 1, 2016.
HPE Haven OnDemand simplifies how you can interact with data, allowing it to be transformed into an asset anytime, anywhere. Find out how the Connector APIs can facilitate this interaction.
- A Gentle Introduction to Bloom Filter - Aug 24, 2016.
The Bloom Filter is a probabilistic data structure which can make a tradeoff between space and false positive rate. Read more, and see an implementation from scratch, in this post.
- Visualizing 1 Billion Points of Data: Doing It Right – Aug 18 Webinar - Aug 11, 2016.
Join Continuum Analytics on August 18 for a webinar on Big Data visualization with the datashader library. Save your spot today!
- 7 Steps to Understanding Computer Vision - Aug 9, 2016.
A starting point for Computer Vision and how to get going deeper. Dive into this post for some overview of the right resources and a little bit of advice.
- Top KDnuggets tweets, Jul 27 – Aug 2: Understanding neural networks with Google TensorFlow Playground; Getting Started with Data Science in Python - Aug 3, 2016.
Understanding neural networks with Google TensorFlow Playground; The 100 Best-Funded #Analytics #DataScience #Startups; Great tutorial: Getting Started with #DataScience - #Python; #MachineLearning over 1M hotel reviews: interesting insights.
- Getting Started with Data Science – Python - Aug 1, 2016.
A great introductory post from DataRobot on getting started with data science in the Python ecosystem, including cleaning data and performing predictive modeling.
- Data Science of Visiting Famous Movie Locations in San Francisco - Jul 30, 2016.
Using the Google Places API and IMDb API, we selected movie locations in The Golden City which every movie fan should visit while they are in town, and optimized sightseeing by solving the travelling salesman problem.
- Deep Learning For Chatbots, Part 2 – Implementing A Retrieval-Based Model In TensorFlow - Jul 29, 2016.
Check out part 2 of this tutorial on building chatbots with deep neural networks. This part gets practical, using Python and TensorFlow to implement a retrieval-based model.
- Would You Survive the Titanic? A Guide to Machine Learning in Python Part 3 - Jul 27, 2016.
This is the final part of a 3 part introductory series on machine learning in Python, using the Titanic dataset.
- Would You Survive the Titanic? A Guide to Machine Learning in Python Part 2 - Jul 26, 2016.
This is part 2 of a 3 part introductory series on machine learning in Python, using the Titanic dataset.
- Barley, Hops, and Bayes: Predicting The World Beer Cup - Jul 26, 2016.
This post covers predicting award counts by the United States in an international beer competition. Exploratory data analysis and Bayes methods are also supported.
- Would You Survive the Titanic? A Guide to Machine Learning in Python Part 1 - Jul 25, 2016.
Check out the first of a 3 part introductory series on machine learning in Python, fueled by the Titanic dataset. This is a great place to start for a machine learning newcomer.
- Building a Data Science Portfolio: Machine Learning Project Part 3 - Jul 22, 2016.
The final installment of this comprehensive overview on building an end-to-end data science portfolio project focuses on bringing it all together, and concludes the project quite nicely.
- SAS vs R vs Python: Which Tool Do Analytics Pros Prefer? - Jul 22, 2016.
There are lots of flame wars involving different data science and analytics tools... but this isn't one of them. Check out the quantitative results and analysis of a Burtch Works survey on the subject.
- Building a Data Science Portfolio: Machine Learning Project Part 2 - Jul 21, 2016.
The second part of this comprehensive overview on building an end-to-end data science portfolio project concentrates on data exploration and preparation.
- Interesting Things I Learned at SciPy 2016 - Jul 21, 2016.
Learn about some interesting projects featured at SciPy 2016, brought to you by an attendee who put in the work to bring you this great list of projects.
- Building a Data Science Portfolio: Machine Learning Project Part 1 - Jul 20, 2016.
Dataquest's founder has put together a fantastic resource on building a data science portfolio. This first of three parts lays the groundwork, with subsequent posts over the following 2 days. Very comprehensive!
- Top KDnuggets tweets, Jul 13 – Jul 19: Bayesian #MachineLearning, Explained; Introducing JupyterLab - Jul 20, 2016.
Bayesian #MachineLearning, Explained; JupyterLab: the next generation of the #Jupyter Notebook; On the importance of democratizing #ArtificialIntelligence
- 10 Great Talks From SciPy 2016 - Jul 20, 2016.
Here's a curated short list of interesting and insightful talks to watch from SciPy 2016 to help guide your search through the volume of great video material emerging from the conference.
- Statistical Data Analysis in Python - Jul 18, 2016.
This tutorial will introduce the use of Python for statistical data analysis, using data stored as Pandas DataFrame objects, taking the form of a set of IPython notebooks.
- America’s Next Topic Model - Jul 15, 2016.
Topic modeling is a great way to get a bird's eye view on a large document collection using machine learning. Here are 3 ways to use open source Python tool Gensim to choose the best topic model.
- Top KDnuggets tweets, Jul 6 – Jul 12: Statistical Data Analysis #Python #Jupyter Notebooks; Modern Pandas Notebooks - Jul 13, 2016.
Statistical Data Analysis in #Python (#Jupyter Notebooks); Modern Pandas: idiomatic Pandas notebook collection; New (free) book by @rdpeng: #rstats Programming for #DataScience
- 5 Deep Learning Projects You Can No Longer Overlook - Jul 12, 2016.
There are a number of "mainstream" deep learning projects out there, but many more niche projects flying under the radar. Have a look at 5 such projects worth checking out.
- Interview: Florian Douetteau, Dataiku Founder, on Empowering Data Scientists - Jul 7, 2016.
Here is an interview with Florian Douetteau, founder of Dataiku, on how their tools empower data scientists, and how data science itself is evolving.
- Deep Residual Networks for Image Classification with Python + NumPy - Jul 7, 2016.
This post outlines the results of an innovative Deep Residual Network implementation for Image Classification using Python and NumPy.
- Top KDnuggets tweets, Jun 29 – Jul 5: Big Data Ecosystem is Too Damn Big!; Deep Learning Intro with Caffe and Python - Jul 6, 2016.
The #BigData Ecosystem is Too Damn Big!; A Practical Introduction to #DeepLearning with Caffe and #Python; What do Postgres, Kafka, and Bitcoin have in common?
- Mining Twitter Data with Python Part 7: Geolocation and Interactive Maps - Jul 6, 2016.
The final part of this 7 part series explores using geolocation and interactive maps with Twitter data.
- Mining Twitter Data with Python Part 6: Sentiment Analysis Basics - Jul 5, 2016.
Part 6 of this series builds on the previous installments by exploring the basics of sentiment analysis on Twitter data.
- Mining Twitter Data with Python Part 5: Data Visualisation Basics - Jun 29, 2016.
Part 5 of this series takes on data visualization, as we look to make sense of our data and highlight interesting insights.
- 5 More Machine Learning Projects You Can No Longer Overlook - Jun 28, 2016.
There are a lot of popular machine learning projects out there, but many more that are not. Which of these are actively developed and worth checking out? Here is an offering of 5 such projects.
- Mining Twitter Data with Python Part 4: Rugby and Term Co-occurrences - Jun 27, 2016.
Part 4 of this series employs some of the lessons learned thus far to analyze tweets related to rugby matches and term co-occurrences.
- Doing Data Science: A Kaggle Walkthrough Part 6 – Creating a Model - Jun 24, 2016.
In the final part of this 6 part series on the process of data science, and applying it to a Kaggle competition, building the predictive models is covered, and multiple algorithms are discussed.
- Mining Twitter Data with Python Part 3: Term Frequencies - Jun 22, 2016.
Part 3 of this 7 part series focusing on mining Twitter data discusses the analysis of term frequencies for meaningful term extraction.
- HPE Haven OnDemand Text Extraction API Cheat Sheet for Developers - Jun 21, 2016.
HPE Haven OnDemand provides a native API based on cURL calls, as well as numerous language-specific APIs, providing maximum flexibility for developers. This cheat sheet will cover the native and Python text extraction APIs.
- Mining Twitter Data with Python Part 2: Text Pre-processing - Jun 20, 2016.
Part 2 of this 7 part series on mining Twitter data for a variety of use cases focuses on the pre-processing of tweet text.
- Doing Data Science: A Kaggle Walkthrough Part 5 – Adding New Data - Jun 17, 2016.
Here is part 5 of the weekly 6 part series on doing data science in the context of a Kaggle competition, which concentrates on adding in new data.
- Mining Twitter Data with Python Part 1: Collecting Data - Jun 15, 2016.
Part 1 of a 7 part series focusing on mining Twitter data for a variety of use cases. This first post lays the groundwork, and focuses on data collection.
- What Big Data, Data Science, Deep Learning software goes together? - Jun 14, 2016.
We analyze the associations between top Data Science tools, Commercial vs Free/Open Source, rank tools on R vs Python bias, find tools more associated with Big Data, those more associated with Deep Learning, and uncover strong regional differences.
- 10 Useful Python Data Visualization Libraries for Any Discipline - Jun 14, 2016.
A great overview of 10 useful Python data visualization tools. It covers some of the big ones, like matplotlib and Seaborn, but also explores some more obscure libraries, like Gleam, Leather, and missingno.
- Doing Data Science: A Kaggle Walkthrough Part 4 – Data Transformation and Feature Extraction - Jun 10, 2016.
Part 4 of this fantastic 6 part series covering the process of data science, and its application to a Kaggle competition, focuses on feature extraction and data transformation.
- An Introduction to Scientific Python (and a Bit of the Maths Behind It) – Matplotlib - Jun 9, 2016.
An introductory overview of Matplotlib, one of the foundational aspects of Scientific Computing in Python, along with some explanation of the maths involved.
- Top KDnuggets tweets, Jun 1-7: “Deep” vs “Regular” Machine Learning; Introduction to Scientific Python – NumPy - Jun 8, 2016.
How to Build Your Own #DeepLearning Box; What is the Difference Between #DeepLearning and "Regular" #MachineLearning?; Data Science of #Variable Selection: A Review; Why choose #Python for #MachineLearning?
- R, Python Duel As Top Analytics, Data Science software – KDnuggets 2016 Software Poll Results - Jun 6, 2016.
R remains the leading tool, with 49% share, but Python grows faster and almost catches up to R. RapidMiner remains the most popular general Data Science platform. Big Data tools used by almost 40%, and Deep Learning usage doubles.
- Doing Data Science: A Kaggle Walkthrough Part 3 – Cleaning Data - Jun 3, 2016.
This is part three in a fantastic 6 part series covering the process of data science, and the application of the process to a Kaggle competition. In this episode, data cleaning and preparation is covered.
- Top KDnuggets tweets, May 25-31: 19 Free eBooks to learn #programming with #Python; Awesome collection of public datasets on Github - Jun 1, 2016.
Introducing Hybrid lda2vec Algorithm via Stitch Fix; #DeepLearning and Deep #Gaussian Processes - explainer; Awesome collection of public #datasets on Github; #DataScience foundations: 19 Free eBooks to learn #programming with #Python.
- An Introduction to Scientific Python (and a Bit of the Maths Behind It) – NumPy - Jun 1, 2016.
An introductory overview of NumPy, one of the foundational aspects of Scientific Computing in Python, along with some explanation of the maths involved.
- Doing Data Science: A Kaggle Walkthrough Part 2 – Understanding the Data - May 27, 2016.
This is the second post in a fantastic 6-part series covering the process of data science, and the application of the process to a Kaggle competition. Read on for a great overview of practicing data science.
- ntent: Senior Software Engineer, Python - May 26, 2016.
Seeking a Senior Software Engineer, Python, to provide code support for an in-house ontology engineering and knowledge acquisition infrastructure, and to work closely with, and support, ontologists and linguists.
- Top KDnuggets tweets, May 18-24: Google supercharges #MachineLearning, #DeepLearning tasks with TPU (Tensor Processing Unit) - May 25, 2016.
Stanford Crowd Course Initiative: #MachineLearning with #Python course; Practical Guide to Matrix Calculus for #DeepLearning; Build your own #DeepLearning Box < $1.5K
- Doing Data Science: A Kaggle Walkthrough Part 1 – Introduction - May 19, 2016.
This is the first post in a fantastic 6-part series covering the process of data science, and the application of the process to a Kaggle competition. Very thorough, and very insightful.
- 5 Machine Learning Projects You Can No Longer Overlook - May 19, 2016.
We all know the big machine learning projects out there: Scikit-learn, TensorFlow, Theano, etc. But what about the smaller niche projects that are actively developed, providing useful services to users? Here are 5 such projects.
- High Performance Python for Open Data Science (Whitepaper) - May 17, 2016.
In this whitepaper, you'll learn from our seasoned experts about approaches to scaling your data science models, review the various options, and learn how to apply them using Anaconda, the leading Open Data Science platform powered by Python.
- TPOT: A Python Tool for Automating Data Science - May 13, 2016.
TPOT is an open-source Python tool for automating data science. It optimizes a series of feature preprocessors and models to maximize cross-validation accuracy on data sets.
- Top Talks and Tutorials From PyData London - May 11, 2016.
Get some insight into the most recent Python data science talks and presentations with this eclectic mix of videos from PyData London 2016.
- Dealing with Unbalanced Classes, SVMs, Random Forests, and Decision Trees in Python - Apr 29, 2016.
An overview of dealing with unbalanced classes, and implementing SVMs, Random Forests, and Decision Trees in Python.
- Webinar: High Performance Hadoop With Python, May 5th - Apr 28, 2016.
On May 5th, Dr. Kristopher Overholt and Dr. Matthew Rocklin of Continuum Analytics will present a webinar on High Performance Hadoop with Python. Reserve your spot today!
- Top KDnuggets tweets, Apr 12-26: The Race For AI: Google, Facebook, Amazon, Apple; Comprehensive Guide to Learning #Python - Apr 27, 2016.
Data Science helps see where your country will stand in WW 3; Recommender Systems: New Comprehensive Textbook; Good read: Deep Learning in Neural Networks - extreme summary; The Race For #AI: Google, Facebook, Amazon, Apple rush to grab #AI startups.
- Top Data Science Courses on Udemy - Apr 27, 2016.
An overview of the very best that Udemy has to offer in data science education. Includes courses covering machine learning, Python, Hadoop, visualization, and more.
- KDnuggets™ News 16:n15, Apr 27: Deep Learning vs. SVMs, Random Forests; Python Guide for Data Science - Apr 27, 2016.
When Does Deep Learning Work Better Than SVMs or Random Forests; Comprehensive Guide to Learning Python for Data Science; Top 10 IPython Notebook Tutorials for Data Science and Machine Learning; 5,000 KDnuggets Posts - Examining Our Most Popular Content
- Top 10 IPython Notebook Tutorials for Data Science and Machine Learning - Apr 22, 2016.
A list of 10 useful GitHub repositories made up of IPython (Jupyter) notebooks, focused on teaching data science and machine learning. Python is the clear target here, but the general principles are transferable.
- Black Box Challenge Machine Learning Competition - Apr 21, 2016.
Take part in an unusual machine learning competition — program an agent (in Python) that can play a game with unknown rules.