A selection of top tips to obtain great results on Kaggle leaderboards, including useful code examples showing how best to use Latitude and Longitude features.
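The article's own code examples aren't reproduced here, but as a minimal stdlib-only sketch of two standard latitude/longitude transformations (function names are our own): the haversine distance to a reference point as a derived feature, and projection onto the unit sphere so models see smooth x/y/z values instead of a longitude that wraps at ±180°.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def to_cartesian(lat, lon):
    """Project (lat, lon) onto the unit sphere -> smooth x/y/z features."""
    la, lo = math.radians(lat), math.radians(lon)
    return math.cos(la) * math.cos(lo), math.cos(la) * math.sin(lo), math.sin(la)

# Example derived feature: distance from Berlin to Paris, roughly 878 km
print(haversine_km(52.52, 13.405, 48.857, 2.352))
```

Distance-to-landmark and Cartesian projections are two of the more common ways to make raw coordinates useful to tree-based and linear models alike.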
Looking for papers with code? If so, this GitHub repository, a clearinghouse for research papers and their corresponding implementation code, is definitely worth checking out.
There are two main tasks in speech processing: transforming speech to text, and converting text into human speech. We describe the general aspects of each API and then compare their main features in a table.
An extensive look at the history of machine learning models, using historical data from the number of publications of each type to attempt to answer the question: what is the most popular model?
In support of the explainable AI cause, we present a variety of use cases covering operational needs, regulatory compliance, public trust, and social acceptance.
A brief rundown of methods, packages, and ideas for generating synthetic data for self-driven data science projects, with a deep dive into machine learning methods.
BERT’s key technical innovation is applying the bidirectional training of Transformer, a popular attention model, to language modelling. It has caused a stir in the Machine Learning community by presenting state-of-the-art results in a wide variety of NLP tasks.
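The masked-token objective behind that bidirectional training can be illustrated in a few lines of plain Python (a conceptual sketch only: real BERT masking works on subword pieces, and sometimes swaps in random or unchanged tokens instead of [MASK]):

```python
import random

def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Replace a random subset of tokens with [MASK]; the model must predict
    them from BOTH left and right context -- the bidirectional objective."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            masked.append("[MASK]")
            targets[i] = tok  # the label the model is trained to recover
        else:
            masked.append(tok)
    return masked, targets

masked, targets = mask_tokens("the cat sat on the mat".split(), seed=3)
print(masked, targets)
```

Because the prediction target is an interior token rather than the next token, the model is free to attend to context on both sides, which is the stir-causing change relative to left-to-right language models.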
What makes decision trees special in the realm of ML models is really their clarity of information representation. The “knowledge” learned by a decision tree through training is directly formulated into a hierarchical structure.
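That hierarchical "knowledge" can be sketched in plain Python: the learned structure is just nested if/else splits that a human can read directly (the tree, feature names, and thresholds below are invented for illustration):

```python
# A toy tree as it might be learned for loan approval (hypothetical
# feature names and thresholds, for illustration only).
tree = {
    "feature": "income", "threshold": 50_000,
    "left":  {"feature": "debt", "threshold": 10_000,
              "left": {"label": "approve"}, "right": {"label": "deny"}},
    "right": {"label": "approve"},
}

def predict(node, sample):
    """Walk the hierarchy: each split is a readable if/else on one feature."""
    while "label" not in node:
        branch = "left" if sample[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]

print(predict(tree, {"income": 42_000, "debt": 5_000}))   # → approve
print(predict(tree, {"income": 42_000, "debt": 20_000}))  # → deny
```

Every prediction corresponds to one root-to-leaf path, which is why the representation is so easy to inspect compared with, say, a neural network's weights.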
A brief introduction to feature engineering, covering coordinate transformation, continuous data, categorical features, missing values, normalization, and more.
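As a minimal, dependency-free sketch of two of the listed steps, one-hot encoding for categorical features and min-max normalization for continuous data (function names are our own):

```python
def one_hot(values):
    """Encode a categorical column as binary indicator features."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values], categories

def min_max(values):
    """Rescale a continuous column to the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

encoded, cats = one_hot(["red", "blue", "red"])
print(cats, encoded)          # ['blue', 'red'] [[0, 1], [1, 0], [0, 1]]
print(min_max([10, 20, 30]))  # [0.0, 0.5, 1.0]
```

In practice you would fit the category list and min/max on the training split only, then apply them to new data, to avoid leaking information from the test set.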
By following six critical steps to prepare data for both analytics and machine learning initiatives, teams can accelerate their data science projects and deliver an immersive business-consumer experience that speeds up and automates the data-to-insight pipeline.
We explain the key differences between explainability and interpretability and why they're so important for machine learning and AI, before taking a look at several techniques and methods for improving machine learning interpretability.
This article provides an overview of recent trends in machine learning and data science automation tools and addresses how those tools will change data science.
This is a collection of data science, machine learning, analytics, and AI predictions for next year from a number of top industry organizations. See what the insiders feel is on the horizon for 2019!
At Figure Eight, we're big believers in active learning. We think it holds the promise to better models, and that it's just about to go mainstream. In our new eBook, An Introduction to Active Learning, we cover the essentials. Download now!
This tutorial helps explain the central limit theorem, covering populations and samples, sampling distribution, intuition, and contains a useful video so you can continue your learning.
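A quick simulation makes the theorem concrete: sample means drawn from a decidedly non-normal Uniform(0, 1) population cluster around the population mean with standard error σ/√n (a stdlib-only sketch, not the tutorial's own code):

```python
import random, statistics, math

random.seed(42)
n, trials = 30, 2000
# Draw many samples of size n and record each sample's mean.
sample_means = [statistics.mean(random.random() for _ in range(n))
                for _ in range(trials)]

# CLT: means center on the population mean (0.5) with spread
# sigma / sqrt(n), where sigma = sqrt(1/12) for Uniform(0, 1).
print(statistics.mean(sample_means))                       # close to 0.5
print(statistics.stdev(sample_means),
      "vs predicted", math.sqrt(1 / 12) / math.sqrt(n))    # both ~0.053
```

Plotting `sample_means` as a histogram shows the characteristic bell shape emerging even though the underlying population is flat.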
Also 5 Data Science Projects That Will Get You Hired in 2018; Top 20 Python AI and Machine Learning Open Source Projects; Neural network AI is simple. So... Stop pretending you are a genius.
But it’s hard to avoid becoming a generalist if you don’t know which common problem classes you could specialize in in the first place. That’s why I put together a list of the five problem classes that are often lumped together under the “data science” heading.
In this post we summarise some of the key developments in deep learning in the second half of 2018, before briefly discussing the road ahead for the deep learning community.
This article teaches you how to use transfer learning to solve image classification problems. A practical example using Keras and its pre-trained models is given for demonstration purposes.
We discuss several explainability techniques being championed today, including LOCO (leave one column out), permutation importance, and LIME (local interpretable model-agnostic explanations).
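The permutation idea is simple enough to sketch without any library: shuffle one column and see how much the score drops (toy model and data below are invented for illustration; this is not LOCO or LIME, just the permutation technique):

```python
import random

def accuracy(model, X, y):
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, col, seed=0):
    """Shuffle one column and measure the accuracy drop --
    a large drop means the model relies on that feature."""
    base = accuracy(model, X, y)
    shuffled = [row[:] for row in X]
    column = [row[col] for row in shuffled]
    random.Random(seed).shuffle(column)
    for row, v in zip(shuffled, column):
        row[col] = v
    return base - accuracy(model, shuffled, y)

# Toy data: feature 0 determines the label, feature 1 is pure noise.
X = [[i, random.Random(i).random()] for i in range(100)]
y = [1 if row[0] >= 50 else 0 for row in X]

def model(row):           # a stand-in "trained" model
    return 1 if row[0] >= 50 else 0

print(permutation_importance(model, X, y, col=0))  # large drop
print(permutation_importance(model, X, y, col=1))  # 0.0: unused noise column
```

Because it only needs predictions, the same recipe works on any fitted model, which is what makes it (like LOCO and LIME) model-agnostic.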
In this post, I will show you how you can tune the hyperparameters of your existing keras models using Hyperas and run everything in a Google Colab Notebook.
As we bid farewell to one year and look to ring in another, KDnuggets has solicited opinions from numerous Machine Learning and AI experts as to the most important developments of 2018 and their 2019 key trend predictions.
We clarify some important and often-overlooked distinctions between Machine Learning and Data Science, covering education, scalable vs non-scalable jobs, career paths, and more.
An overview of the current situation for data scientists, from its origins and history, to the recent growth in job postings, and looking at what changes the future might bring.
When I first heard about Machine Learning, I couldn't contain my amazement. I couldn't get my mind around the fact that, unlike the conventional software programs I was accustomed to, I wouldn't even have to teach a computer the "how" in detail for every future scenario up front.
We report on the most popular IDEs and Editors, based on our poll. Jupyter is the favorite across all regions and employment types, but there is competition for the no. 2 and no. 3 spots.
In an effort to further refine our internal models, this post will present an overview of Aurélien Géron's Machine Learning Project Checklist, as seen in his bestselling book, "Hands-On Machine Learning with Scikit-Learn & TensorFlow."
We examine typical mistakes in the Data Science process, including poor data visualization, incorrect handling of missing values, wrong transformation of categorical variables, and more. Learn what to avoid!
The aim of this article is to give you a good understanding of existing, traditional model interpretation methods, their limitations and challenges. We will also cover the classic model accuracy vs. model interpretability trade-off and finally take a look at the major strategies for model interpretation.
There are many techniques to detect and optionally remove outliers from a dataset. In this blog post, we show an implementation in KNIME Analytics Platform of four of the most frequently used techniques for outlier detection, both traditional and novel.
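The post implements its techniques in KNIME; as a language-agnostic illustration of one of the classics, here are Tukey's IQR fences in plain Python (our own sketch, not the KNIME workflow):

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Flag points outside Tukey's fences: [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]

data = [10, 12, 11, 13, 12, 11, 95, 10, 13, 12]
print(iqr_outliers(data))  # [95]
```

The multiplier `k=1.5` is the conventional default; widening it to 3 flags only extreme outliers, which is often the safer choice before deleting anything.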
A demonstration using an analysis of Berlin rental prices, covering how to extract data from the web and clean it, gaining deeper insights, engineering of features using external APIs, and more.
When people want to launch data science careers but haven't made the first move, they're in a scenario that's understandably daunting and full of uncertainty. Here are six steps to get started.
Download this immediately useful book chapter, and learn how to create derived variables, which allow statistical and Data Science models to incorporate human insights.
The best way to create better data science projects that employers want to see is to provide a business impact. This article highlights the process using customer churn prediction in R as a case study.
It’s important to understand why we balance classes, so that we can be sure it’s a valuable investment: class balancing techniques are only really necessary when we actually care about the minority classes.
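As a minimal sketch of one common balancing technique, random oversampling of the minority class (function names and toy data invented for illustration; libraries like imbalanced-learn offer more sophisticated variants):

```python
import random

def oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every
    class matches the size of the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for row, lab in zip(rows, labels):
        by_class.setdefault(lab, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for lab, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows += group + extra
        out_labels += [lab] * target
    return out_rows, out_labels

rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2        # 8:2 imbalance
_, new_labels = oversample(rows, labels)
print(new_labels.count(0), new_labels.count(1))  # 8 8
```

Oversample only the training split, never the evaluation data, or the duplicated rows will inflate your test metrics.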
Review of 2018 and Predictions for 2019 from our panel of experts, including Meta Brown, Tom Davenport, Carla Gentry, Bob E Hayes, Cassie Kozyrkov, Doug Laney, Bill Schmarzo, Kate Strachnyi, Ronald van Loon, Favio Vazquez, and Jen Underwood.
We cover a variety of topics, from machine learning to deep learning, from data visualization to data tools, with comments and explanations from experts in the relevant fields.