2019 Jun
All (77) | Courses, Education (1) | Meetings (2) | News (6) | Opinions (25) | Top Stories, Tweets (9) | Tutorials, Overviews (33) | Webcasts & Webinars (1)
- Make your Data Talk! - Jun 28, 2019.
Matplotlib and Seaborn are two of the most powerful and popular data visualization libraries in Python. Read on to learn how to create some of the most frequently used graphs and charts using Matplotlib and Seaborn.
- An Overview of Human Pose Estimation with Deep Learning - Jun 28, 2019.
Human Pose Estimation is one of the main research areas in computer vision. The reason for its importance is the abundance of applications that can benefit from such a technology. Here's an introduction to the different techniques used in Human Pose Estimation based on Deep Learning.
- How To Get Funding For AI Startups - Jun 27, 2019.
What are the biggest challenges AI startups have when pitching to investors? Learn how to grab their attention with these recommendations on how to start building your AI company.
- PySyft and the Emergence of Private Deep Learning - Jun 27, 2019.
PySyft is an open-source framework that enables secured, private computations in deep learning, by combining federated learning and differential privacy in a single programming model integrated into different deep learning frameworks such as PyTorch, Keras or TensorFlow.
- An Overview of Outlier Detection Methods from PyOD – Part 1 - Jun 27, 2019.
PyOD is an outlier detection package developed with a comprehensive API to support multiple techniques. This post will showcase Part 1 of an overview of techniques that can be used to analyze anomalies in data.
- Top KDnuggets Tweets, Jun 19 – 25: Learn how to efficiently handle large amounts of data using #Pandas; The biggest mistake while learning #Python for #datascience - Jun 26, 2019.
Also: Data Science Jobs Report 2019; Harvard CS109 #DataScience Course, Resources #Free and Online; Google launches TensorFlow; Mastering SQL for Data Science
- Octoparse: A Revolutionary Web Scraping Software - Jun 26, 2019.
Octoparse is the ultimate tool for data extraction (web crawling, data crawling and data scraping), which lets you turn the whole internet into a structured format. The newly launched Web Scraping Template makes it very easy even for people with no technical training.
- Optimization with Python: How to make the most amount of money with the least amount of risk? - Jun 26, 2019.
Learn how to apply Python data science libraries to develop a simple optimization problem based on a Nobel-prize winning economic theory for maximizing investment profits while minimizing risk.
- 10 Gradient Descent Optimisation Algorithms + Cheat Sheet - Jun 26, 2019.
Gradient descent is an optimization algorithm used for minimizing the cost function in various ML algorithms. Here are some common gradient descent optimisation algorithms used in the popular deep learning frameworks such as TensorFlow and Keras.
- Why do we need AWS SageMaker? - Jun 26, 2019.
Today, several platforms are available in the industry that help software developers, data scientists, and even laypeople develop and deploy machine learning models in no time.
- Do Conv-nets Dream of Psychedelic Sheep? - Jun 25, 2019.
In deep learning, understanding your model well enough to interpret its behavior will help improve model performance and reduce the black-box mystique of neural networks.
- How to Make a Success Story of your Data Science Team - Jun 25, 2019.
Today, data science is a crucial component of an organization's growth. Given how important data science has become, it’s worth thinking about what data scientists add to an organization, how they fit in, and how to hire and build effective data science teams.
- The Data Fabric for Machine Learning – Part 2: Building a Knowledge-Graph - Jun 25, 2019.
Before we can develop a Data Fabric, we need to build a Knowledge-Graph. In this article I’ll lay out the basis for creating it; in the next article we’ll put it into practice.
- Understanding Cloud Data Services - Jun 24, 2019.
Ready to move your systems to a cloud vendor or just learning more about big data services? This overview will help you understand big data system architectures, components, and offerings with an end-to-end taxonomy of what is available from the big three cloud providers.
- Top Stories, Jun 17 – 23: Data Science Jobs Report 2019; Spark NLP: Getting Started With The World’s Most Widely Used NLP Library In The Enterprise - Jun 24, 2019.
5 Useful Statistics Data Scientists Need to Know; How to Learn Python for Data Science the Right Way; The Machine Learning Puzzle, Explained; How to select rows and columns in Pandas using [ ], .loc, iloc, .at and .iat
- 10 New Things I Learnt from fast.ai Course V3 - Jun 24, 2019.
Fastai offers some really good courses in machine learning and deep learning for programmers. I recently took their "Practical Deep Learning for Coders" course and found it really interesting. Here are my learnings from the course.
- Natural Language Processing Q&A - Jun 24, 2019.
In this Q&A, Jos Martin, Senior Engineering Manager at MathWorks, discusses recent NLP developments and the applications that are benefitting from the technology.
- 7 Steps to Mastering Data Preparation for Machine Learning with Python – 2019 Edition - Jun 24, 2019.
Interested in mastering data preparation with Python? Follow these 7 steps which cover the concepts, the individual tasks, as well as different approaches to tackling the entire process from within the Python ecosystem.
- Modelplotr v1.0 now on CRAN: Visualize the Business Value of your Predictive Models - Jun 21, 2019.
Explaining the business value of your predictive models to your business colleagues is a challenging task. Using Modelplotr, an R package, you can easily create stunning visualizations that clearly communicate the business value of your models.
- How Google uses Reinforcement Learning to Train AI Agents in the Most Popular Sport in the World - Jun 21, 2019.
Researchers from the Google Brain team open sourced Google Research Football, a new environment that leverages reinforcement learning to teach AI agents how to master the most popular sport in the world.
- Natural Language Interface to DataTable - Jun 21, 2019.
You have to write SQL queries to query data from a relational database. Sometimes, you even have to write complex queries to do that. Wouldn't it be amazing if you could use a chatbot to retrieve data from a database using simple English? That's what this tutorial is all about.
- Data Literacy: Using the Socratic Method - Jun 20, 2019.
How can organizations and individuals promote Data Literacy? Data literacy is all about critical thinking, so the time-tested method of Socratic questioning can stimulate high-level engagement with data.
- Examining the Transformer Architecture: The OpenAI GPT-2 Controversy - Jun 20, 2019.
GPT-2 is a generative model, created by OpenAI, trained on 40GB of Internet text to predict the next word. OpenAI found this model to be so good that they did not release the fully trained model, citing concerns about malicious applications of the technology.
- Ten random useful things in R that you might not know about - Jun 20, 2019.
Because the R ecosystem is so rich and constantly growing, people can often miss out on something that could really help them with a task they have to complete.
- Top KDnuggets Tweets, Jun 12 – 18: The biggest mistake while learning #Python for #datascience; 5 practical statistical concepts for data scientists - Jun 19, 2019.
Also: Resources for developers transitioning into data science; Best Data Visualization Techniques for small and large data; Top Data Science and Machine Learning Methods Used in 2018, 2019
- The Emergence of Cooperative and Competitive AI Agents - Jun 19, 2019.
Without specific training in collaboration or competition, a recent AI model from DeepMind uses reinforcement learning to evolve these behaviors in game-playing agents. Learn how agents with this emergent collective intelligence outperform their human counterparts in 3D multiplayer games.
- One Simple Trick for Speeding up your Python Code with Numpy - Jun 19, 2019.
Looping over Python arrays, lists, or dictionaries can be slow. Vectorized operations in NumPy, by contrast, are mapped to highly optimized C code, making them much faster than their standard Python counterparts.
- Python Users Come From All Sorts of Backgrounds - Jun 18, 2019.
Python users come from all sorts of backgrounds, but computer science skills make the difference between a Python apprentice and a Python master. Save 50% on Classic Computer Science Problems in Python today, using the code kdcsprob50 when you buy from manning.com.
- Spark NLP: Getting Started With The World’s Most Widely Used NLP Library In The Enterprise - Jun 18, 2019.
The Spark NLP library has become a popular AI framework that delivers speed and scalability to your projects. Check out what's under the hood and learn how to get started with Spark NLP from John Snow Labs.
- K-means Clustering with Dask: Image Filters for Cat Pictures - Jun 18, 2019.
How to recreate an original cat image with the fewest possible colors. An interesting use case of Unsupervised Machine Learning with K-means Clustering in Python.
- Evolving Deep Neural Networks - Jun 18, 2019.
This article reviews how evolutionary algorithms have been proposed and tested as a competitive alternative to address a number of issues related to neural network design.
- Data-driven to Model-driven: The Strategic Shift Being Made by Leading Organizations - Jun 17, 2019.
You can have all the data you want, do all the machine learning you want, but if you aren’t running your business on models, you’ll soon be left behind. In this webinar, we will demystify the model-driven business.
- Data Science Jobs Report 2019: Python Way Up, TensorFlow Growing Rapidly, R Use Double SAS - Jun 17, 2019.
Data science jobs continue to grow in 2019, and this report shares the change and spread of jobs by software over recent years.
- Top Stories, Jun 10 – 16: Best resources for developers transitioning into data science; 5 Useful Statistics Data Scientists Need to Know - Jun 17, 2019.
The Infinity Stones of Data Science; What you need to know about the Modern Open-Source Data Science ecosystem; Scalable Python Code with Pandas UDFs: A Data Science Application; Become a Pro at Pandas
- How to Use Python’s datetime - Jun 17, 2019.
Python's datetime package is a convenient set of tools for working with dates and times. With just the five tricks that I’m about to show you, you can handle most of your datetime processing needs.
- The Machine Learning Puzzle, Explained - Jun 17, 2019.
Lots of moving parts go into creating a machine learning model. Let's take a look at some of these core concepts and see how the machine learning puzzle comes together.
- How to Learn Python for Data Science the Right Way, by Manu Jeevan - Jun 14, 2019.
The biggest mistake you can make while learning Python for data science is to learn Python programming from courses meant for programmers. Avoid this mistake, and learn Python the right way by following this approach.
- Show off your Data Science skills with Kaggle Kernels - Jun 14, 2019.
Kaggle is not just about data science competitions. It also has a platform called Kaggle Kernels, which you can use to build a stellar data science portfolio.
- 5 Useful Statistics Data Scientists Need to Know - Jun 14, 2019.
A data scientist should know how to effectively use statistics to gain insights from data. Here are five useful and practical statistical concepts that every data scientist must know.
- First hand experience from Uber, Microsoft & more at PAW in London - Jun 13, 2019.
Hear top practitioners describe the design, deployment and business impact of their machine learning projects at Predictive Analytics World London, 16-17 Oct 2019!
- Why Machine Learning is vulnerable to adversarial attacks and how to fix it - Jun 13, 2019.
Machine learning can process data imperceptible to humans to produce expected results. These inconceivable patterns are inherent in the data but may make models vulnerable to adversarial attacks. How can developers harness these features to not lose control of AI?
- Become a Pro at Pandas, Python’s Data Manipulation Library - Jun 13, 2019.
Pandas is one of the most popular Python libraries for cleaning, transforming, manipulating and analyzing data. Learn how to efficiently handle large amounts of data using Pandas.
- Scalable Python Code with Pandas UDFs: A Data Science Application - Jun 13, 2019.
There is still a gap between the corpus of libraries that developers want to apply in a scalable runtime and the set of libraries that support distributed execution. This post discusses how to bridge this gap using the functionality provided by Pandas UDFs in Spark 2.3+.
- Top KDnuggets Tweets, Jun 5 – 11: A New Extension to Organize your Code on Jupyter Notebooks; Data Science Cheat Sheet - Jun 12, 2019.
Also: Cognitive Biases are Making Sure You Aren’t So Smart; 3 Machine Learning Books that Helped me Level Up as a Data Scientist; Mastering Intermediate Machine Learning with Python
- All Models Are Wrong – What Does It Mean? - Jun 12, 2019.
During your adventures in data science, you may have heard “all models are wrong.” Let’s unpack this famous quote to understand how we can still make models that are useful.
- Overview of Different Approaches to Deploying Machine Learning Models in Production - Jun 12, 2019.
Learn the different methods for putting machine learning models into production, and how to determine which method is best for which use case.
- Crowdsourcing vs. Managed Teams: A Study in Data Labeling Quality - Jun 12, 2019.
You need data labeled for ML. You can do it in-house, crowdsource it, or hire a managed service. If data quality matters, read this.
- How to Automate Hyperparameter Optimization - Jun 12, 2019.
A step-by-step guide to performing hyperparameter optimization on a deep learning model using Bayesian optimization with a Gaussian process. We used the gp_minimize function provided by the Scikit-Optimize (skopt) library to perform this task.
- Top May Stories: A Step-by-Step Guide to Transitioning your Career to Data Science; 7 Steps to Mastering SQL for Data Science – 2019 Edition - Jun 11, 2019.
Also: The Third Wave Data Scientist; Python leads the 11 top Data Science, Machine Learning platforms: Trends and Analysis
- 3 Main Approaches to Machine Learning Models - Jun 11, 2019.
Machine learning encompasses a vast set of conceptual approaches. We classify the three main algorithmic methods based on mathematical foundations to guide your exploration for developing models.
- The Data Fabric for Machine Learning Part 1-b – Deep Learning on Graphs - Jun 11, 2019.
Deep learning on graphs is growing in importance by the day. Here I’ll show the basics of thinking about machine learning and deep learning on graphs with the library Spektral and the platform MatrixDS.
- If you’re a developer transitioning into data science, here are your best resources - Jun 11, 2019.
This article will provide a background on the data scientist role and why your background might be a good fit for data science, plus tangible stepwise actions that you, as a developer, can take to ramp up on data science.
- First Speakers Announced For Data Driven Government This Fall - Jun 10, 2019.
A line-up of world-class speakers at Data Driven Government, Sep 25 in Washington, DC, will show you how to use data and analytics to more effectively accomplish your mission, increase efficiency, and improve evidence-based policymaking.
- What you need to know: The Modern Open-Source Data Science/Machine Learning Ecosystem - Jun 10, 2019.
We identify the 6 tools in the modern open-source Data Science ecosystem, examine the Python vs R question, and determine which tools are used the most with Deep Learning and Big Data.
- Top Stories, Jun 3-9: 7 Steps to Mastering Intermediate Machine Learning with Python 2019 Edition; How to choose a visualization - Jun 10, 2019.
A Step-by-Step Guide to Transitioning your Career to Data Science Part 1; Math for Programmers; PyViz: Simplifying the Data Visualisation Process in Python
- Choosing an Error Function - Jun 10, 2019.
The error function expresses how much we care about a deviation of a certain size. The choice of error function depends entirely on how our model will be used.
- The Infinity Stones of Data Science - Jun 10, 2019.
Do you love data science 3000? Don't want to be embarrassed in front of the other analytics wizards? Aspire to be one of Earth's mightiest heroes, like Kevin Bacon? Help make data science a snap with these simple insights.
- A Step-by-Step Guide to Transitioning your Career to Data Science – Part 2 - Jun 7, 2019.
How do you identify the technical skills a hiring manager is looking for? How do you build a data science project that draws the attention of a hiring manager?
- Math for Programmers. - Jun 7, 2019.
Math for Programmers teaches you the math you need to know for a career in programming, concentrating on what you need to know as a developer.
- Top 10 Statistics Mistakes Made by Data Scientists - Jun 7, 2019.
The following are some of the most common statistics mistakes made by data scientists. Check this list often to make sure you are not making any of these while applying statistics to data science.
- Random Forests® vs Neural Networks: Which is Better, and When? - Jun 7, 2019.
Random Forests and Neural Networks are two widely used machine learning algorithms. What is the difference between the two approaches? When should one use a Neural Network or a Random Forest?
- Using the ‘What-If Tool’ to investigate Machine Learning models - Jun 6, 2019.
The machine learning practitioner must be a detective, and this tool from teams at Google enables you to investigate and understand your models.
- PyViz: Simplifying the Data Visualisation Process in Python - Jun 6, 2019.
There are Python libraries suitable for basic data visualizations but not for complicated ones, and there are libraries suitable only for complex visualizations. Is there a single library that handles both these tasks efficiently? The answer is yes. It's PyViz.
- Jupyter Notebooks: Data Science Reporting - Jun 6, 2019.
Jupyter does bring us some benefits of being able to organize code but many of us still find ourselves with messy and unnecessary code chunks. Here are some ways including a NEW EXTENSION that anyone can use to begin organizing your code on your notebooks.
- Top KDnuggets Tweets, May 29 – Jun 4: Difference between #MachineLearning and #AI; Understanding Backpropagation as Applied to LSTM - Jun 5, 2019.
Also: Animations with Matplotlib; Python leads the 11 top Data Science, Machine Learning platforms: Trends and Analysis; The 3 Biggest Mistakes on Learning Data Science
- Math for Machine Learning. - Jun 5, 2019.
This ebook explains the math involved and introduces you directly to the foundational topics in machine learning.
- NLP and Computer Vision Integrated - Jun 5, 2019.
Computer vision and NLP developed as separate fields, and researchers are now combining these tasks to solve long-standing problems across multiple disciplines.
- Mongo DB Basics - Jun 5, 2019.
MongoDB is a document-oriented NoSQL database, unlike HBase, which is a wide-column store. The advantage of the document-oriented model over the relational model is that fields can be changed as and when required for each document, rather than every row sharing the same columns.
- The Whole Data Science World in Your Hands - Jun 5, 2019.
Testing MatrixDS capabilities on different languages and tools: Python, R and Julia. If you work with data you have to check this out.
- Statistical Thinking for Industrial Problem Solving (STIPS): a free online course. - Jun 4, 2019.
This online course is available – for free – to anyone interested in building practical skills in using data to solve problems better.
- How to choose a visualization - Jun 4, 2019.
Visualizations based on the structure of data are needed during analysis, which might be different than for the end user. A new guide for choosing the right visualization helps you flexibly understand the data first.
- Data Scientists Are Thinkers: Execution vs. exploration and what it means for you - Jun 4, 2019.
Data scientists serve a very technical purpose, but one that is vastly different from other individual contributors. Unlike engineers, designers, and project managers, data scientists are exploration-first, rather than execution-first.
- Separating signal from noise - Jun 4, 2019.
When we are building a model, we are making the assumption that our data has two parts, signal and noise. Signal is the real pattern, the repeatable process that we hope to capture and describe. The noise is everything else that gets in the way of that.
- Top Stories, May 27 – Jun 2: A Step-by-Step Guide to Transitioning your Career to Data Science – Part 1; Python leads the 11 top Data Science, Machine Learning platforms: Trends and Analysis - Jun 3, 2019.
Understanding Backpropagation as Applied to LSTM; How the Lottery Ticket Hypothesis is Challenging Everything we Knew About Training Neural Networks; AI in the Family: how to teach machine learning to your kids
- Clearing air around “Boosting” - Jun 3, 2019.
We explain the reasoning behind the massive success of boosting algorithms, how they came to be, and what we can expect from them in the future.
- The Hitchhiker’s Guide to Feature Extraction - Jun 3, 2019.
Check out this collection of tricks and code for Kaggle and everyday work.
- 7 Steps to Mastering Intermediate Machine Learning with Python – 2019 Edition - Jun 3, 2019.
This is the second part of this new learning path series for mastering machine learning with Python. Check out these 7 steps to help master intermediate machine learning with Python!