-
Plotnine: Python Alternative to ggplot2
Python's plotting libraries such as matplotlib and seaborn do allow the user to create elegant graphics, but they lack a standardized syntax for implementing the grammar of graphics. Compared with the simple, readable, layered approach of ggplot2 in R, this makes such graphics more difficult to implement in Python.
-
Open Source Projects by Google, Uber and Facebook for Data Science and AI
By Asel Mendis, KDnuggets on November 28, 2019 in Advice, AI, Data Science, Data Scientist, Data Visualization, Deep Learning, Facebook, Google, Open Source, Python, Uber
Open source is becoming the standard for sharing and improving technology. Some of the largest organizations in the world, namely Google, Facebook and Uber, are open sourcing technologies that they use in their own workflows.
-
Python, Selenium & Google for Geocoding Automation: Free and Paid
This tutorial will take you through two options that automate the geocoding process using Python, Selenium and the Google Geocoding API.
-
How Bad is Multicollinearity?
For some people anything below 60% is acceptable, while for others even a correlation of 30% to 40% is considered too high, because one variable may just end up exaggerating the performance of the model or completely messing up parameter estimates.
-
Types of Bias in Machine Learning
The sample data used for training has to be as close a representation of the real scenario as possible. There are many factors that can bias a sample from the beginning, and those factors differ by domain (e.g. business, security, medical, education).
-
Statistical Modelling vs Machine Learning
At times it may seem that Machine Learning can be done these days without a sound statistical background, but people who believe this do not really understand the nuances involved. Code written to make things easier does not negate the need for an in-depth understanding of the problem.
-
Is Bias in Machine Learning all Bad?
We have been taught over our years of predictive model building that bias will harm our model. Bias control needs to be in the hands of someone who can differentiate between the right kind and wrong kind of bias.
-
What’s wrong with the approach to Data Science?
The job of ‘Data Scientist’ has been around for decades; it was just not called “Data Scientist”. Statisticians have applied their knowledge and skills, using machine learning techniques such as Logistic Regression and Random Forest for prediction and insights, for longer than people actually realize.
-
An Overview of Outlier Detection Methods from PyOD – Part 1
PyOD is an outlier detection package developed with a comprehensive API to support multiple techniques. This post will showcase Part 1 of an overview of techniques that can be used to analyze anomalies in data.
-
Jupyter Notebooks: Data Science Reporting
Jupyter does bring us the benefit of being able to organize code, but many of us still find ourselves with messy and unnecessary code chunks. Here are some ways, including a NEW EXTENSION, that anyone can use to begin organizing the code in their notebooks.