2019 Jun Tutorials, Overviews
- Make your Data Talk!
- Jun 28, 2019.
Matplotlib and Seaborn are two of the most powerful and popular data visualization libraries in Python. Read on to learn how to create some of the most frequently used graphs and charts using Matplotlib and Seaborn.
- An Overview of Outlier Detection Methods from PyOD – Part 1
- Jun 27, 2019.
PyOD is an outlier detection package developed with a comprehensive API to support multiple techniques. This post will showcase Part 1 of an overview of techniques that can be used to analyze anomalies in data.
- Optimization with Python: How to make the most amount of money with the least amount of risk?
- Jun 26, 2019.
Learn how to apply Python data science libraries to develop a simple optimization problem based on a Nobel Prize-winning economic theory for maximizing investment profits while minimizing risk.
- 10 Gradient Descent Optimisation Algorithms + Cheat Sheet
- Jun 26, 2019.
Gradient descent is an optimization algorithm used for minimizing the cost function in various ML algorithms. Here are some common gradient descent optimisation algorithms used in the popular deep learning frameworks such as TensorFlow and Keras.
- Understanding Cloud Data Services
- Jun 24, 2019.
Ready to move your systems to a cloud vendor or just learning more about big data services? This overview will help you understand big data system architectures, components, and offerings with an end-to-end taxonomy of what is available from the big three cloud providers.
- 7 Steps to Mastering Data Preparation for Machine Learning with Python — 2019 Edition
- Jun 24, 2019.
Interested in mastering data preparation with Python? Follow these 7 steps which cover the concepts, the individual tasks, as well as different approaches to tackling the entire process from within the Python ecosystem.
- Modelplotr v1.0 now on CRAN: Visualize the Business Value of your Predictive Models
- Jun 21, 2019.
Explaining the business value of your predictive models to your business colleagues is a challenging task. Using Modelplotr, an R package, you can easily create stunning visualizations that clearly communicate the business value of your models.
- Natural Language Interface to DataTable
- Jun 21, 2019.
You have to write SQL queries to retrieve data from a relational database, and sometimes those queries are complex. Wouldn't it be amazing if you could use a chatbot to retrieve data from a database using simple English? That's what this tutorial is all about.
- The Emergence of Cooperative and Competitive AI Agents
- Jun 19, 2019.
Without specific training in collaboration or competition, a recent AI model from DeepMind uses reinforcement learning to evolve these behaviors in game-playing agents. Learn how this emergent collective intelligence outperforms its human counterparts in 3D multiplayer games.
- How to select rows and columns in Pandas using [ ], .loc, iloc, .at and .iat
- Jun 19, 2019.
Subset selection is one of the most frequently performed tasks while manipulating data. Pandas provides different ways to efficiently select subsets of data from your DataFrame.
- One Simple Trick for Speeding up your Python Code with Numpy
- Jun 19, 2019.
Looping over Python arrays, lists, or dictionaries can be slow. Vectorized operations in Numpy, by contrast, are mapped to highly optimized C code, making them much faster than their standard Python counterparts.
- Spark NLP: Getting Started With The World’s Most Widely Used NLP Library In The Enterprise
- Jun 18, 2019.
The Spark NLP library has become a popular AI framework that delivers speed and scalability to your projects. Check out what's under the hood and learn how to get started with Spark NLP from John Snow Labs.
- K-means Clustering with Dask: Image Filters for Cat Pictures
- Jun 18, 2019.
How to recreate an original cat image with the fewest possible colors: an interesting use case of unsupervised machine learning with K-means clustering in Python.
- Evolving Deep Neural Networks
- Jun 18, 2019.
This article reviews how evolutionary algorithms have been proposed and tested as a competitive alternative to address a number of issues related to neural network design.
- Data Science Jobs Report 2019: Python Way Up, TensorFlow Growing Rapidly, R Use Double SAS
- Jun 17, 2019.
Data science jobs continue to grow in 2019, and this report shares the change and spread of jobs by software over recent years.
- How to Use Python’s datetime
- Jun 17, 2019.
Python's datetime package is a convenient set of tools for working with dates and times. With just the five tricks that I’m about to show you, you can handle most of your datetime processing needs.
- The Machine Learning Puzzle, Explained
- Jun 17, 2019.
Lots of moving parts go into creating a machine learning model. Let's take a look at some of these core concepts and see how the machine learning puzzle comes together.
- Become a Pro at Pandas, Python’s Data Manipulation Library
- Jun 13, 2019.
Pandas is one of the most popular Python libraries for cleaning, transforming, manipulating and analyzing data. Learn how to efficiently handle large amounts of data using Pandas.
- Scalable Python Code with Pandas UDFs: A Data Science Application
- Jun 13, 2019.
There is still a gap between the corpus of libraries that developers want to apply in a scalable runtime and the set of libraries that support distributed execution. This post discusses how to bridge this gap using the functionality provided by Pandas UDFs in Spark 2.3+.
- All Models Are Wrong – What Does It Mean?
- Jun 12, 2019.
During your adventures in data science, you may have heard “all models are wrong.” Let’s unpack this famous quote to understand how we can still make models that are useful.
- Overview of Different Approaches to Deploying Machine Learning Models in Production
- Jun 12, 2019.
Learn the different methods for putting machine learning models into production, and how to determine which method is best for which use case.
- How to Automate Hyperparameter Optimization
- Jun 12, 2019.
A step-by-step guide to performing hyperparameter optimization on a deep learning model using Bayesian optimization with a Gaussian process. The task is carried out with the gp_minimize function provided by the Scikit-Optimize (skopt) library.
- 3 Main Approaches to Machine Learning Models
- Jun 11, 2019.
Machine learning encompasses a vast set of conceptual approaches. We classify the three main algorithmic methods based on mathematical foundations to guide your exploration for developing models.
- If you’re a developer transitioning into data science, here are your best resources
- Jun 11, 2019.
This article will provide a background on the data scientist role and why your background might be a good fit for data science, plus tangible stepwise actions that you, as a developer, can take to ramp up on data science.
- 5 Ways to Deal with the Lack of Data in Machine Learning
- Jun 10, 2019.
Effective solutions exist when you don't have enough data for your models. While there is no perfect approach, five proven ways will get your model to production.
- Choosing an Error Function
- Jun 10, 2019.
The error function expresses how much we care about a deviation of a certain size. The choice of error function depends entirely on how our model will be used.
- Random Forests® vs Neural Networks: Which is Better, and When?
- Jun 7, 2019.
Random Forests and Neural Networks are two widely used machine learning algorithms. What is the difference between the two approaches? When should one use a Neural Network or a Random Forest?
- PyViz: Simplifying the Data Visualisation Process in Python
- Jun 6, 2019.
There are Python libraries suitable for basic data visualizations but not for complicated ones, and there are libraries suitable only for complex visualizations. Is there a single library that handles both these tasks efficiently? The answer is yes. It's PyViz.
- Jupyter Notebooks: Data Science Reporting
- Jun 6, 2019.
Jupyter lets us organize code, but many of us still find ourselves with messy and unnecessary code chunks. Here are some ways, including a NEW EXTENSION, that anyone can use to begin organizing the code in their notebooks.
- Mongo DB Basics
- Jun 5, 2019.
MongoDB is a document-oriented NoSQL database, unlike HBase, which is a wide-column store. The advantage of the document-oriented model over the relational model is that fields can be changed as and when required for each document, rather than every row sharing the same columns.
- The Whole Data Science World in Your Hands
- Jun 5, 2019.
Testing MatrixDS capabilities on different languages and tools: Python, R and Julia. If you work with data you have to check this out.
- How to choose a visualization
- Jun 4, 2019.
Visualizations based on the structure of data are needed during analysis, which might be different than for the end user. A new guide for choosing the right visualization helps you flexibly understand the data first.
- Separating signal from noise
- Jun 4, 2019.
When we are building a model, we are making the assumption that our data has two parts, signal and noise. Signal is the real pattern, the repeatable process that we hope to capture and describe. The noise is everything else that gets in the way of that.
- The Hitchhiker’s Guide to Feature Extraction
- Jun 3, 2019.
Check out this collection of tricks and code for Kaggle and everyday work.
- 7 Steps to Mastering Intermediate Machine Learning with Python — 2019 Edition
- Jun 3, 2019.
This is the second part of this new learning path series for mastering machine learning with Python. Check out these 7 steps to help master intermediate machine learning with Python!