2019 Jun


An Overview of Human Pose Estimation with Deep Learning

Human Pose Estimation is one of the main research areas in computer vision. The reason for its importance is the abundance of applications that can benefit from such a technology. Here's an introduction to the different techniques used in Human Pose Estimation based on Deep Learning.

on Jun 28, 2019 in Computer Vision, Convolutional Neural Networks, Deep Learning, Image Recognition, Object Detection
How To Get Funding For AI Startups

What are the biggest challenges AI startups have when pitching to investors? Learn how to grab their attention with these recommendations on how to start building your AI company.

on Jun 27, 2019 in AI, Startups, VC
PySyft and the Emergence of Private Deep Learning

PySyft is an open-source framework that enables secured, private computations in deep learning, by combining federated learning and differential privacy in a single programming model integrated into different deep learning frameworks such as PyTorch, Keras or TensorFlow.

on Jun 27, 2019 in Deep Learning, Differential Privacy, Privacy, Python, Security
An Overview of Outlier Detection Methods from PyOD – Part 1

PyOD is an outlier detection package developed with a comprehensive API to support multiple techniques. This post will showcase Part 1 of an overview of techniques that can be used to analyze anomalies in data.

on Jun 27, 2019 in Algorithms, Big Data, Outliers, Python
Octoparse: A Revolutionary Web Scraping Software

Octoparse is the ultimate tool for data extraction (web crawling, data crawling and data scraping), which lets you turn the whole internet into a structured format. The newly launched Web Scraping Template makes it very easy even for people with no technical training.

on Jun 26, 2019 in Octoparse, Web Scraping
Optimization with Python: How to make the most amount of money with the least amount of risk?

Learn how to apply Python data science libraries to develop a simple optimization problem based on a Nobel Prize-winning economic theory for maximizing investment profits while minimizing risk.

on Jun 26, 2019 in Finance, Investment, Optimization, Python, Risk Modeling, Stocks
10 Gradient Descent Optimisation Algorithms + Cheat Sheet

Gradient descent is an optimization algorithm used for minimizing the cost function in various ML algorithms. Here are some common gradient descent optimisation algorithms used in the popular deep learning frameworks such as TensorFlow and Keras.

on Jun 26, 2019 in Algorithms, Deep Learning, Gradient Descent, Optimization
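
As a rough illustration of the idea in the teaser above, the sketch below runs plain (vanilla) gradient descent on a one-dimensional cost function. The cost function, starting point, learning rate, and step count are all assumptions chosen for illustration; they are not taken from the article or its cheat sheet, and none of the more elaborate variants (momentum, Adam, and so on) are shown.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Vanilla gradient descent: repeatedly step against the gradient."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)  # move downhill by a fixed fraction
    return x


# Hypothetical cost function f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
# Its minimum is at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(minimum)
```

With these settings the distance to the minimum shrinks by a factor of 0.8 on each step, so the result lands very close to 3.
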
How to Make a Success Story of your Data Science Team

Today, data science is a crucial component of an organization's growth. Given how important data science has become, it’s important to think about what data scientists add to an organization, how they fit in, and how to hire and build effective data science teams.

on Jun 25, 2019 in Culture, Data Science Team, Hiring, Team
The Data Fabric for Machine Learning – Part 2: Building a Knowledge-Graph

Before being able to develop a Data Fabric we need to build a Knowledge-Graph. In this article I’ll lay out the basics of how to create one; in the next article we’ll put this into practice.

on Jun 25, 2019 in Advice, Data Science, Data Scientist, Graphs, Machine Learning
Understanding Cloud Data Services

Ready to move your systems to a cloud vendor or just learning more about big data services? This overview will help you understand big data system architectures, components, and offerings with an end-to-end taxonomy of what is available from the big three cloud providers.

on Jun 24, 2019 in AWS, Cloud Computing, Google Cloud, Microsoft Azure
10 New Things I Learnt from fast.ai Course V3

fast.ai offers some really good courses in machine learning and deep learning for programmers. I recently took their "Practical Deep Learning for Coders" course and found it really interesting. Here is what I learned from the course.

on Jun 24, 2019 in Deep Learning, fast.ai, Jeremy Howard, Machine Learning, MOOC
7 Steps to Mastering Data Preparation for Machine Learning with Python — 2019 Edition

Interested in mastering data preparation with Python? Follow these 7 steps which cover the concepts, the individual tasks, as well as different approaches to tackling the entire process from within the Python ecosystem.

on Jun 24, 2019 in 7 Steps, Data Preparation, Data Preprocessing, Data Science, Data Wrangling, Machine Learning, Pandas, Python
How Google uses Reinforcement Learning to Train AI Agents in the Most Popular Sport in the World

Researchers from the Google Brain team open sourced Google Research Football, a new environment that leverages reinforcement learning to teach AI agents how to master the most popular sport in the world.

on Jun 21, 2019 in Agents, AI, Football, Google, Reinforcement Learning, Soccer
Natural Language Interface to DataTable

You have to write SQL queries to query data from a relational database. Sometimes, you even have to write complex queries to do that. Wouldn't it be amazing if you could use a chatbot to retrieve data from a database using simple English? That's what this tutorial is all about.

on Jun 21, 2019 in AI, Chatbot, Natural Language Processing, NLP
Data Literacy: Using the Socratic Method

How can organizations and individuals promote Data Literacy? Data literacy is all about critical thinking, so the time-tested method of Socratic questioning can stimulate high-level engagement with data.

on Jun 20, 2019 in Advice, Data Science
Examining the Transformer Architecture: The OpenAI GPT-2 Controversy

GPT-2 is a generative model, created by OpenAI, trained on 40GB of Internet text to predict the next word. OpenAI found this model to be SO good that they did not release the fully trained model due to their concerns about malicious applications of the technology.

on Jun 20, 2019 in AI, Architecture, GPT-2, NLP, OpenAI, Transformer
Ten random useful things in R that you might not know about

Because the R ecosystem is so rich and constantly growing, people can often miss out on knowing about something that can really help them in a task that they have to complete.

on Jun 20, 2019 in Advice, Analytics, Data Science, R
The Emergence of Cooperative and Competitive AI Agents

Without specific training in collaboration or competition, a recent AI model from DeepMind uses reinforcement learning to evolve these behaviors in game-playing agents. Learn how these agents, with their emergent collective intelligence, outperform their human counterparts in 3D multiplayer games.

on Jun 19, 2019 in Agents, AI, DeepMind, Reinforcement Learning
One Simple Trick for Speeding up your Python Code with Numpy

Looping over Python arrays, lists, or dictionaries can be slow. Vectorized operations in NumPy, by contrast, are mapped to highly optimized C code, making them much faster than their standard Python counterparts.

on Jun 19, 2019 in Big Data, numpy, Python
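
To make the teaser above concrete, here is a minimal sketch comparing an explicit Python loop with a vectorized NumPy call for the same computation. The sum-of-squares task, the array size, and the function names are assumptions chosen for illustration; the article itself may use a different example.

```python
import numpy as np

# Hypothetical input: the floats 0.0, 1.0, ..., 99999.0.
values = np.arange(100_000, dtype=np.float64)


def sum_of_squares_loop(arr):
    """Element-by-element Python loop; each iteration runs in the interpreter."""
    total = 0.0
    for v in arr:
        total += v * v
    return float(total)


def sum_of_squares_vectorized(arr):
    """Single NumPy call; the loop over elements happens in compiled code."""
    return float(np.dot(arr, arr))


loop_result = sum_of_squares_loop(values)
vectorized_result = sum_of_squares_vectorized(values)
print(np.isclose(loop_result, vectorized_result))
```

Both functions compute the same quantity; wrapping each call with `time.perf_counter()` is one way to measure the speed difference on your own machine.
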
Spark NLP: Getting Started With The World’s Most Widely Used NLP Library In The Enterprise

The Spark NLP library has become a popular AI framework that delivers speed and scalability to your projects. Check out what's under the hood and learn how to get started with Spark NLP from John Snow Labs.

on Jun 18, 2019 in Apache Spark, Enterprise, John Snow Labs, NLP, Spark NLP
K-means Clustering with Dask: Image Filters for Cat Pictures

How to recreate an original cat image with the fewest possible colors. An interesting use case of Unsupervised Machine Learning with K-means Clustering in Python.

on Jun 18, 2019 in Clustering, Dask, Image Classification, Image Recognition, K-means, Python, Unsupervised Learning
Evolving Deep Neural Networks

This article reviews how evolutionary algorithms have been proposed and tested as a competitive alternative to address a number of issues related to neural network design.

on Jun 18, 2019 in Architecture, Automated Machine Learning, Evolutionary Algorithm, Neural Networks
Data Science Jobs Report 2019: Python Way Up, TensorFlow Growing Rapidly, R Use Double SAS

Data science jobs continue to grow in 2019, and this report shares the change and spread of jobs by software over recent years.

on Jun 17, 2019 in Data Science, indeed, Jobs, Python, R, SAS, TensorFlow
How to Use Python’s datetime

Python's datetime module is a convenient set of tools for working with dates and times. With just the five tricks that I’m about to show you, you can handle most of your datetime processing needs.

on Jun 17, 2019 in Programming, Python, Time Series
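
The teaser above does not say which five tricks the article covers, so the sketch below simply shows a few common operations from the standard library's `datetime` module: parsing, arithmetic, differences, and formatting. The timestamp and variable names are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

# Parse a timestamp string into a datetime object.
published = datetime.strptime("2019-06-17 09:30", "%Y-%m-%d %H:%M")

# Shift a datetime with timedelta arithmetic.
one_week_later = published + timedelta(days=7)

# Subtracting two datetimes gives a timedelta.
elapsed = one_week_later - published

# Format a datetime back into a string.
label = one_week_later.strftime("%Y-%m-%d")

# weekday() counts from Monday = 0 to Sunday = 6.
day_index = published.weekday()

print(label, elapsed.days, day_index)
```
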
The Machine Learning Puzzle, Explained

Lots of moving parts go into creating a machine learning model. Let's take a look at some of these core concepts and see how the machine learning puzzle comes together.

on Jun 17, 2019 in Algorithms, Explained, Machine Learning, Modeling
How to Learn Python for Data Science the Right Way

The biggest mistake you can make while learning Python for data science is to learn Python programming from courses meant for programmers. Avoid this mistake, and learn Python the right way by following this approach.

By Manu Jeevan on Jun 14, 2019 in Advice, Data Science, Jupyter, Matplotlib, Pandas, Python, scikit-learn, StatsModels
5 Useful Statistics Data Scientists Need to Know

A data scientist should know how to effectively use statistics to gain insights from data. Here are five useful and practical statistical concepts that every data scientist must know.

on Jun 14, 2019 in Data Science, Data Scientist, Statistics
Why Machine Learning is vulnerable to adversarial attacks and how to fix it

Machine learning can pick up on patterns in data that are imperceptible to humans and still produce the expected results. These patterns are inherent in the data but may make models vulnerable to adversarial attacks. How can developers harness these features without losing control of AI?

on Jun 13, 2019 in Adversarial, Machine Learning, Safety, Security
Become a Pro at Pandas, Python’s Data Manipulation Library

Pandas is one of the most popular Python libraries for cleaning, transforming, manipulating and analyzing data. Learn how to efficiently handle large amounts of data using Pandas.

on Jun 13, 2019 in Matplotlib, numpy, Pandas, Python, SQL
Scalable Python Code with Pandas UDFs: A Data Science Application

There is still a gap between the corpus of libraries that developers want to apply in a scalable runtime and the set of libraries that support distributed execution. This post discusses how to bridge this gap using the functionality provided by Pandas UDFs in Spark 2.3+.

on Jun 13, 2019 in Apache Spark, Big Data, Pandas, Python
All Models Are Wrong – What Does It Mean?

During your adventures in data science, you may have heard “all models are wrong.” Let’s unpack this famous quote to understand how we can still make models that are useful.

on Jun 12, 2019 in Advice, Linear Regression, Modeling, Statistics
Overview of Different Approaches to Deploying Machine Learning Models in Production

Learn the different methods for putting machine learning models into production, and how to determine which method is best for which use case.

on Jun 12, 2019 in Deployment, Jupyter, Machine Learning, Production, Training Data
How to Automate Hyperparameter Optimization

A step-by-step guide to performing a hyperparameter optimization task on a deep learning model by employing Bayesian Optimization with Gaussian Processes. We used the gp_minimize function provided by the Scikit-Optimize (skopt) library to perform this task.

on Jun 12, 2019 in Bayesian, Deep Learning, Hyperparameter, Machine Learning, Neural Networks, Optimization, Python, TensorFlow
3 Main Approaches to Machine Learning Models

Machine learning encompasses a vast set of conceptual approaches. We classify the three main algorithmic methods based on mathematical foundations to guide your exploration for developing models.

on Jun 11, 2019 in Decision Trees, Linear Regression, Machine Learning, Naive Bayes
If you’re a developer transitioning into data science, here are your best resources

This article will provide a background on the data scientist role and why your background might be a good fit for data science, plus tangible stepwise actions that you, as a developer, can take to ramp up on data science.

on Jun 11, 2019 in Advice, Career, Data Science, Data Scientist
What you need to know: The Modern Open-Source Data Science/Machine Learning Ecosystem

We identify the 6 tools in the modern open-source Data Science ecosystem, examine the Python vs R question, and determine which tools are used the most with Deep Learning and Big Data.

on Jun 10, 2019 in Anaconda, Apache Spark, Big Data Software, Deep Learning, Excel, Keras, Poll, Python, R, RapidMiner, scikit-learn, Software, SQL, Tableau, TensorFlow
Choosing an Error Function

The error function expresses how much we care about a deviation of a certain size. The choice of error function depends entirely on how our model will be used.

on Jun 10, 2019 in Cost Function, Machine Learning
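
One way to see how an error function encodes "how much we care about a deviation of a certain size" is to compare two common choices on the same predictions. The sketch below uses mean squared error and mean absolute error; these two functions and the toy numbers are assumptions chosen for illustration and are not necessarily the ones the article discusses.

```python
def mean_squared_error(actual, predicted):
    """Average of squared deviations; large misses are penalized heavily."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """Average of absolute deviations; every unit of error counts the same."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical data: two models with the same total absolute error.
actual = [10.0, 10.0, 10.0, 10.0]
many_small_misses = [9.0, 11.0, 9.0, 11.0]  # every prediction off by 1
one_large_miss = [10.0, 10.0, 10.0, 14.0]   # a single prediction off by 4

print(mean_absolute_error(actual, many_small_misses),
      mean_absolute_error(actual, one_large_miss))
print(mean_squared_error(actual, many_small_misses),
      mean_squared_error(actual, one_large_miss))
```

Mean absolute error rates the two models equally, while mean squared error rates the model with one large miss four times worse, so the right choice depends on whether a single large deviation is costlier in practice than several small ones.
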
The Infinity Stones of Data Science

Do you love data science 3000? Don't want to be embarrassed in front of the other analytics wizards? Aspire to be one of Earth's mightiest heroes, like Kevin Bacon? Help make data science a snap with these simple insights.

on Jun 10, 2019 in Comic, Data Science
A Step-by-Step Guide to Transitioning your Career to Data Science – Part 2

How do you identify the technical skills a hiring manager is looking for? How do you build a data science project that draws the attention of a hiring manager?

on Jun 7, 2019 in Career Advice, Data Science, Skills, SQL, Tableau
Top 10 Statistics Mistakes Made by Data Scientists

The following are some of the most common statistics mistakes made by data scientists. Check this list often to make sure you are not making any of these while applying statistics to data science.

on Jun 7, 2019 in Data Science, Data Scientist, GitHub, Mistakes, Statistics
Random Forests® vs Neural Networks: Which is Better, and When?

Random Forests and Neural Networks are two widely used machine learning algorithms. What is the difference between the two approaches? When should one use a Neural Network or a Random Forest?

on Jun 7, 2019 in Decision Trees, Neural Networks, random forests algorithm
Using the ‘What-If Tool’ to investigate Machine Learning models

The machine learning practitioner must be a detective, and this tool from teams at Google enables you to investigate and understand your models.

on Jun 6, 2019 in Advice, Data Science Tools, Data Visualization, Machine Learning, TensorFlow
PyViz: Simplifying the Data Visualisation Process in Python

There are Python libraries suitable for basic data visualizations but not for complicated ones, and there are libraries suitable only for complex visualizations. Is there a single library that handles both these tasks efficiently? The answer is yes. It's PyViz.

on Jun 6, 2019 in Data Visualization, GitHub, Matplotlib, Python
Jupyter Notebooks: Data Science Reporting

Jupyter does bring us the benefit of being able to organize code, but many of us still find ourselves with messy and unnecessary code chunks. Here are some ways, including a NEW EXTENSION, that anyone can use to begin organizing the code in their notebooks.

on Jun 6, 2019 in Anaconda, Data Science, Jupyter
NLP and Computer Vision Integrated

Computer vision and NLP developed as separate fields, and researchers are now combining these tasks to solve long-standing problems across multiple disciplines.

on Jun 5, 2019 in Computer Vision, NLP, Sciforce
Mongo DB Basics

MongoDB is a document-oriented NoSQL database, unlike HBase, which is a wide-column store. The advantage of the document-oriented model over the relational one is that fields can be changed as and when required for each document, rather than every row sharing the same columns.

on Jun 5, 2019 in Big Data, Data Engineering, Data Science, MongoDB
The Whole Data Science World in Your Hands

Testing MatrixDS capabilities on different languages and tools: Python, R and Julia. If you work with data you have to check this out.

on Jun 5, 2019 in Data Science, Data Scientist, Julia, Jupyter, MatrixDS, Python, R
How to choose a visualization

Visualizations based on the structure of the data are needed during analysis, and these may differ from the ones built for the end user. A new guide for choosing the right visualization helps you flexibly understand the data first.

on Jun 4, 2019 in Advice, Data Visualization, Tableau
Separating signal from noise

When we are building a model, we are making the assumption that our data has two parts, signal and noise. Signal is the real pattern, the repeatable process that we hope to capture and describe. The noise is everything else that gets in the way of that.

on Jun 4, 2019 in Noise, Regression, Statistics, Time Series
Top Stories, May 27 – Jun 2: A Step-by-Step Guide to Transitioning your Career to Data Science – Part 1; Python leads the 11 top Data Science, Machine Learning platforms: Trends and Analysis

Understanding Backpropagation as Applied to LSTM; How the Lottery Ticket Hypothesis is Challenging Everything we Knew About Training Neural Networks; AI in the Family: how to teach machine learning to your kids

on Jun 3, 2019 in Top stories
Clearing air around “Boosting”

We explain the reasoning behind the massive success of boosting algorithms, how they came to be and what we can expect from them in the future.

on Jun 3, 2019 in Boosting, Gradient Boosting, Machine Learning, XGBoost
The Hitchhiker’s Guide to Feature Extraction

Check out this collection of tricks and code for Kaggle and everyday work.

on Jun 3, 2019 in Feature Engineering, Feature Extraction, Feature Selection, Kaggle, Python
7 Steps to Mastering Intermediate Machine Learning with Python — 2019 Edition

This is the second part of this new learning path series for mastering machine learning with Python. Check out these 7 steps to help master intermediate machine learning with Python!

on Jun 3, 2019 in 7 Steps, Classification, Cross-validation, Dimensionality Reduction, Feature Engineering, Feature Selection, Image Classification, K-nearest neighbors, Machine Learning, Modeling, Naive Bayes, numpy, Pandas, PCA, Python, scikit-learn, Transfer Learning
