- KDnuggets™ News 19:n27, Jul 24: Bayesian deep learning and near-term quantum computers; DeepMind’s CASP13 Protein Folding Upset Summary - Jul 24, 2019.
This week on KDnuggets: Learn how DeepMind dominated the last CASP competition for advancing protein folding models; Bayesian deep learning and near-term quantum computers: A cautionary tale in quantum machine learning; The Evolution of a ggplot; Adapters: A Compact and Extensible Transfer Learning Method for NLP; 12 Things I Learned During My First Year as a Machine Learning Engineer; Things I Learned From the SciPy 2019 Lightning Talks; and much more!
Tags: Bayesian, Deep Learning, DeepMind, Kaggle, Machine Learning Engineer, NLP, Quantum Computing, SciPy, Transfer Learning
- Adapters: A Compact and Extensible Transfer Learning Method for NLP - Jul 18, 2019.
Adapters obtain comparable results to BERT on several NLP tasks while achieving parameter efficiency.
Tags: BERT, NLP, Transfer Learning, Transformer
- Scaling a Massive State-of-the-art Deep Learning Model in Production - Jul 15, 2019.
A new NLP text writing app based on OpenAI's GPT-2 aims to write with you -- whenever you ask. Find out from an engineer working on the team how the developers set up and deployed their model into production.
Tags: Deep Learning, Deployment, NLP, OpenAI, Scalability, Transformer
- Pre-training, Transformers, and Bi-directionality - Jul 12, 2019.
Bidirectional Encoder Representations from Transformers (BERT; Devlin et al., 2018) is a language representation model that combines the power of pre-training with the bi-directionality of the Transformer’s encoder (Vaswani et al., 2017). BERT improves the state-of-the-art performance on a wide array of downstream NLP tasks with minimal additional task-specific training.
Tags: AISC, BERT, NLP, Training, Transformer
- Datarama: Sr Machine Learning (NLP) Engineer [Singapore] - Jul 12, 2019.
Datarama is seeking a Senior Machine Learning Engineer in Singapore to assist their team with technological enhancement, design and develop deep learning demonstrations and solutions, and deliver deep learning expertise to the Data Science Team.
Tags: Datarama, Machine Learning Engineer, NLP, Singapore
- A Gentle Guide to Starting Your NLP Project with AllenNLP - Jul 10, 2019.
For those who aren’t familiar with AllenNLP, I will give a brief overview of the library and let you know the advantages of integrating it into your project.
Tags: Allen Institute, NLP, Python, Sentiment Analysis
- KDnuggets™ News 19:n25, Jul 10: 5 Probability Distributions for Data Scientists; What the Machine Learning Engineer Job is Really Like - Jul 10, 2019.
This edition of the KDnuggets newsletter is double-sized after taking the holiday week off. Learn about probability distributions every data scientist should know, what the machine learning engineering job is like, making the most money with the least amount of risk, the difference between NLP and NLU, get a take on Nvidia's new data science workstation, and much, much more.
Tags: Data Science, Data Scientist, Distribution, Machine Learning, Machine Learning Engineer, NLP, NVIDIA, Probability, Risk Modeling
- Practical Speech Recognition with Python: The Basics - Jul 9, 2019.
Do you fear implementing speech recognition in your Python apps? Read this tutorial for a simple approach to getting practical with speech recognition using open source Python libraries.
Tags: Google, NLP, Python, Speech Recognition
- NLP vs. NLU: from Understanding a Language to Its Processing - Jul 3, 2019.
As AI progresses and the technology becomes more sophisticated, we expect existing techniques to evolve. With these changes, will the well-founded natural language processing give way to natural language understanding? Or are the two concepts distinct enough to hold their own niches in AI?
Tags: AI, NLP, NLU, Sciforce
- Examining the Transformer Architecture – Part 2: A Brief Description of How Transformers Work - Jul 2, 2019.
As the Transformer may become the new NLP standard, this review explores its architecture and compares it with existing RNN-based approaches.
Tags: BERT, Deep Learning, Exxact, GPU, NLP, Recurrent Neural Networks, Transfer Learning, Transformer
- XLNet Outperforms BERT on Several NLP Tasks - Jul 1, 2019.
XLNet is a new pretraining method for NLP that achieves state-of-the-art results on several NLP tasks.
Tags: BERT, NLP, Performance
- KDnuggets™ News 19:n24, Jun 26: Understand Cloud Services; Pandas Tips & Tricks; Master Data Preparation w/ Python - Jun 26, 2019.
Happy summer! This week on KDnuggets: Understanding Cloud Data Services; How to select rows and columns in Pandas using [ ], .loc, .iloc, .at and .iat; 7 Steps to Mastering Data Preparation for Machine Learning with Python; Examining the Transformer Architecture: The OpenAI GPT-2 Controversy; Data Literacy: Using the Socratic Method; and much more!
Tags: Cloud, Data Preparation, Machine Learning, NLP, OpenAI, Pandas, Python
- Natural Language Processing Q&A - Jun 24, 2019.
In this Q&A, Jos Martin, Senior Engineering Manager at MathWorks, discusses recent NLP developments and the applications that are benefitting from the technology.
Tags: LSTM, Machine Translation, MathWorks, NLP
- Natural Language Interface to DataTable - Jun 21, 2019.
You have to write SQL queries to query data from a relational database. Sometimes, you even have to write complex queries to do that. Wouldn't it be amazing if you could use a chatbot to retrieve data from a database using simple English? That's what this tutorial is all about.
Tags: AI, Chatbot, Natural Language Processing, NLP
- Examining the Transformer Architecture: The OpenAI GPT-2 Controversy - Jun 20, 2019.
GPT-2 is a generative model, created by OpenAI, trained on 40GB of Internet text to predict the next word. OpenAI found this model to be so good that they did not release the fully trained model, citing concerns about malicious applications of the technology.
Tags: AI, Architecture, GPT-2, NLP, OpenAI, Transformer
- Spark NLP: Getting Started With The World’s Most Widely Used NLP Library In The Enterprise - Jun 18, 2019.
The Spark NLP library has become a popular AI framework that delivers speed and scalability to your projects. Check out what's under the hood and learn how to get started with Spark NLP from John Snow Labs.
Tags: Apache Spark, Enterprise, John Snow Labs, NLP, Spark NLP
- Monash University: Academic Opportunities in Dialogue Research [Melbourne, Australia] - Jun 18, 2019.
Seeking outstanding academics who want to join this world-class team to deliver the highest quality teaching and research that will shape the future of AI for conversational assistants, human-robot interaction, customer service, and many other application domains.
Tags: Australia, Melbourne, Monash University, NLP, Research
- NLP and Computer Vision Integrated - Jun 5, 2019.
Computer vision and NLP developed as separate fields, and researchers are now combining these tasks to solve long-standing problems across multiple disciplines.
Tags: Computer Vision, NLP, Sciforce
- KDnuggets™ News 19:n21, Jun 5: Transitioning your Career to Data Science; 11 top Data Science, Machine Learning platforms; 7 Steps to Mastering Intermediate ML w. Python - Jun 5, 2019.
The results of KDnuggets 20th Annual Software Poll; How to transition to a Data Science career; Mastering Intermediate Machine Learning with Python; Understanding Natural Language Processing (NLP); Backprop as applied to LSTM, and much more.
Tags: Backpropagation, Data Science Platform, LSTM, Machine Learning, NLP, Python
- Analyzing Tweets with NLP in Minutes with Spark, Optimus and Twint - May 24, 2019.
Social media has been gold for studying the way people communicate and behave. In this article I’ll show you the easiest way of analyzing tweets without the Twitter API, in a way that scales to Big Data.
Tags: Apache Spark, Big Data, Deep Learning, Machine Learning, NLP, Optimus, Python, Twint
- Your Guide to Natural Language Processing (NLP) - May 23, 2019.
This extensive post covers NLP use cases, basic examples, Tokenization, Stop Words Removal, Stemming, Lemmatization, Topic Modeling, the future of NLP, and more.
Tags: AI, Data Science, Machine Learning, Natural Language Processing, NLP, Tokenization
- When Too Likely Human Means Not Human: Detecting Automatically Generated Text - May 23, 2019.
Passably-human automated text generation is a reality. How do we best go about detecting it? As it turns out, being too predictably human may actually be a reasonably good indicator of not being human at all.
Tags: Generative Models, NLP, Text Analytics
- Extracting Knowledge from Knowledge Graphs Using Facebook’s Pytorch-BigGraph - May 22, 2019.
We are using state-of-the-art Deep Learning tools to build a model for predicting a word using the surrounding words as labels.
Tags: Deep Learning, Facebook, Machine Learning, NLP, Python, PyTorch, word2vec
- Brookhaven National Laboratory: Postdoc in Materials Informatics [Upton, NY] - May 14, 2019.
Seeking candidates to develop and apply information retrieval, information extraction, and various Natural Language Processing (NLP) techniques to the scientific literature in materials science and crystallography for the purpose of building prototype computational data systems.
Tags: Brookhaven National Laboratory, Information Retrieval, NLP, NY, Postdoc, Research, Upton
- A Complete Exploratory Data Analysis and Visualization for Text Data: Combine Visualization and NLP to Generate Insights - May 9, 2019.
Visually representing the content of a text document is one of the most important text mining tasks for a Data Scientist or NLP specialist. However, there are some gaps between visualizing unstructured (text) data and structured data.
Tags: Data Visualization, NLP, Plotly, Python, Text Analytics
- Build Your First Chatbot Using Python & NLTK - May 1, 2019.
Today we will learn to create a simple chat assistant or chatbot using Python’s NLTK library.
Tags: Chatbot, NLP, NLTK, Python
- Building a Flask API to Automatically Extract Named Entities Using SpaCy - Apr 17, 2019.
This article discusses how to use the Named Entity Recognition module in spaCy to identify people, organizations, or locations in text, then deploy a Python API with Flask.
Tags: API, Flask, NLP, Python
- KDnuggets™ News 19:n14, Apr 10: Which Data Science/ML methods and algorithms you used? Predict Age and Gender Using Neural Nets - Apr 10, 2019.
Getting started with NLP using the PyTorch framework; Building a Recommender System; Advice for New Data Scientists; All you need to know about text preprocessing for NLP and Machine Learning; Advanced Keras - Constructing Complex Custom Losses and Metrics; Top 8 Data Science Use Cases in Gaming
Tags: Career Advice, Convolutional Neural Networks, Courses, Data Preprocessing, Neural Networks, NLP, PyTorch, Recommender Systems
- All you need to know about text preprocessing for NLP and Machine Learning - Apr 9, 2019.
We present a comprehensive introduction to text preprocessing, covering techniques including stemming, lemmatization, noise removal, and normalization, with examples and explanations of when you should use each of them.
Tags: Data Preprocessing, Machine Learning, NLP, Python, Text Analysis, Text Mining

- Another 10 Free Must-See Courses for Machine Learning and Data Science - Apr 5, 2019.
Check out another follow-up collection of free machine learning and data science courses to give you some spring study ideas.
Tags: AI, Data Science, Deep Learning, Keras, Machine Learning, NLP, Reinforcement Learning, TensorFlow, U. of Washington, UC Berkeley, Unsupervised Learning
- Getting started with NLP using the PyTorch framework - Apr 3, 2019.
We discuss the classes that PyTorch provides for helping with Natural Language Processing (NLP) and how they can be used for related tasks using recurrent layers.
Tags: Neural Networks, NLP, PyTorch, Recurrent Neural Networks
- What Does GPT-2 Think About the AI Arms Race? - Apr 1, 2019.
It may be April first, but that doesn't mean you will necessarily be fooled by GPT-2's views on the AI arms race. Why not have a read for fun and to see what the language generation model is capable of?
Tags: AI, GPT-2, Natural Language Generation, NLP
- KDnuggets™ News 19:n11, Mar 20: Another 10 Free Must-Read Books for Data Science; 19 Inspiring Women in AI, Big Data, Machine Learning - Mar 20, 2019.
Also: Who is a typical Data Scientist in 2019?; The Pareto Principle for Data Scientists; My favorite mind-blowing Machine Learning/AI breakthroughs; Building NLP Classifiers Cheaply With Transfer Learning and Weak Supervision; Advanced Keras - Accurately Resuming a Training Process
Tags: AI, Big Data, Books, Data Science, Keras, Machine Learning, NLP, Transfer Learning, Women
- Building NLP Classifiers Cheaply With Transfer Learning and Weak Supervision - Mar 15, 2019.
In this blog, I’ll walk you through a personal project in which I cheaply built a classifier to detect anti-semitic tweets, with no public dataset available, by combining weak supervision and transfer learning.
Tags: Bias, fast.ai, NLP, Python, Text Classification, Transfer Learning, Twitter, ULMFiT
- Beyond news contents: the role of social context for fake news detection - Mar 7, 2019.
Today we’re looking at a more general fake news problem: detecting fake news that is being spread on a social network. This is a summary of a recent paper which demonstrates why we should also look at the social context: the publishers and the users spreading the information!
Tags: Fake News, NLP, Social Media
- Top KDnuggets tweets, Feb 27 – Mar 05: How to Setup a Python Environment for Machine Learning; How to do Everything in Computer Vision - Mar 6, 2019.
Also: Python Data Science for Beginners; Deep Learning for Natural Language Processing (NLP) - using RNNs and CNNs.
Tags: NLP, Python, Top tweets
- Deconstructing BERT, Part 2: Visualizing the Inner Workings of Attention - Mar 6, 2019.
In this post, the author shows how BERT can mimic a Bag-of-Words model. The visualization tool from Part 1 is extended to probe deeper into the mind of BERT, to expose the neurons that give BERT its shape-shifting superpowers.
Tags: Attention, BERT, NLP, Word Embeddings
- OpenAI’s GPT-2: the model, the hype, and the controversy - Mar 4, 2019.
OpenAI recently released a very large language model called GPT-2. Controversially, they decided not to release the data or the parameters of their biggest model, citing concerns about potential abuse. Read this researcher's take on the issue.
Tags: AI, Ethics, GPT-2, Hype, NLP, OpenAI
- Deconstructing BERT: Distilling 6 Patterns from 100 Million Parameters - Feb 27, 2019.
Google’s BERT algorithm has emerged as a sort of “one model to rule them all.” BERT builds on two key ideas that have been responsible for many of the recent advances in NLP: (1) the transformer architecture and (2) unsupervised pre-training.
Tags: Attention, BERT, NLP, Word Embeddings
- Deep Learning for Natural Language Processing (NLP) – using RNNs & CNNs - Feb 21, 2019.
We investigate several Natural Language Processing tasks and explain how Deep Learning can help, looking at Language Modeling, Sentiment Analysis, Language Translation, and more.
Tags: Convolutional Neural Networks, Deep Learning, NLP, Recurrent Neural Networks, Sentiment Analysis
- Word Embeddings in NLP and its Applications - Feb 20, 2019.
Word embeddings such as Word2Vec are a key AI method that bridges the human understanding of language to that of a machine, and they are essential to solving many NLP problems. Here we discuss applications of Word2Vec to survey responses, comment analysis, recommendation engines, and more.
Tags: Applications, NLP, Recommender Systems, Word Embeddings, word2vec
- State of the art in AI and Machine Learning – highlights of papers with code - Feb 20, 2019.
We introduce papers with code, the free and open resource of state-of-the-art Machine Learning papers, code and evaluation tables.
Tags: AI, Machine Learning, Multitask Learning, NLP, Papers with code, Recommender Systems, Semantic Segmentation, TensorFlow, Transfer Learning
- Are BERT Features InterBERTible? - Feb 19, 2019.
This is a short analysis of the interpretability of BERT contextual word representations. Does BERT learn a semantic vector representation like Word2Vec?
Tags: BERT, Interpretability, NLP, Word Embeddings
- KDnuggets™ News 19:n07, Feb 13: The Best and Worst Data Visualizations of 2018; Gartner 2019 Magic Quadrant for Data Science Platforms - Feb 13, 2019.
Also: Data-science? Agile? Cycles?; How I used NLP (Spacy) to screen Data Science Resumes; Neural Networks - an Intuition; A Quick Guide to Feature Engineering; Understanding Gradient Boosting Machines
Tags: Agile, Data Science, Data Visualization, Gartner, Machine Learning, Magic Quadrant, NLP
- Natural Language Processing for Social Media - Feb 12, 2019.
Marketing scientist Kevin Gray asks Dr. Anna Farzindar of the University of Southern California about Natural Language Processing and how it is used in social media analytics.
Tags: Interview, NLP, Social Media
- How I used NLP (Spacy) to screen Data Science Resumes - Feb 6, 2019.
A real-life example of how NLP can help filter down a list of candidates for a job opening, with full source code and methodology.
Tags: Data Science, Hiring, NLP
- ELMo: Contextual Language Embedding - Jan 31, 2019.
Create a semantic search engine using deep contextualised language representations from ELMo, and learn why context is everything in NLP.
Tags: Data Visualization, NLP, Plotly, Python, Word Embeddings
- Building an image search service from scratch - Jan 30, 2019.
By the end of this post, you should be able to build a quick semantic search model from scratch, no matter the size of your dataset.
Tags: Computer Vision, Image Recognition, NLP, Search, Search Engine, Word Embeddings
- What were the most significant machine learning/AI advances in 2018? - Jan 22, 2019.
2018 was an exciting year for Machine Learning and AI. We saw “smarter” AI, real-world applications, improvements in underlying algorithms and a greater discussion on the impact of AI on human civilization. In this post, we discuss some of the highlights.
Tags: 2019 Predictions, AI, AlphaZero, BERT, Deep Learning, Machine Learning, NLP, Trends
- 10 Exciting Ideas of 2018 in NLP - Jan 16, 2019.
We outline a selection of exciting developments in NLP from the last year, and include useful recent papers and images to help further assist with your learning.
Tags: BERT, Bias, ICLR, Machine Translation, NLP, Transformer, Unsupervised Learning
- Word Embeddings & Self-Supervised Learning, Explained - Jan 16, 2019.
There are many algorithms to learn word embeddings. Here, we consider only one of them: word2vec, and only one version of word2vec called skip-gram, which works well in practice.
Tags: Andriy Burkov, NLP, Word Embeddings, word2vec
- KDnuggets™ News 19:n03, Jan 16: Top 10 Books on NLP and Text Analysis; End To End Guide For Machine Learning Projects - Jan 16, 2019.
Also: Why Vegetarians Miss Fewer Flights - Five Bizarre Insights from Data; 4 Myths of Big Data and 4 Ways to Improve with Deep Data; The Role of the Data Engineer is Changing; How to solve 90% of NLP problems: a step-by-step guide
Tags: Big Data, Data Engineer, Data Science, Insights, Machine Learning, Myths, NLP, Text Analysis
- How to solve 90% of NLP problems: a step-by-step guide - Jan 14, 2019.
Read this insightful, step-by-step article on how to use machine learning to understand and leverage text.
Tags: LIME, NLP, Text Analytics, Text Classification, Word Embeddings, word2vec
- Top 10 Books on NLP and Text Analysis - Jan 9, 2019.
When it comes to choosing the right book, you become immediately overwhelmed with the abundance of possibilities. In this review, we have collected our Top 10 NLP and Text Analysis Books of all time, ranging from beginners to experts.
Tags: Books, NLP, Text Analysis
- NLP Overview: Modern Deep Learning Techniques Applied to Natural Language Processing - Jan 8, 2019.
Trying to keep up with advancements at the overlap of neural networks and natural language processing can be troublesome. That's where today's spotlighted resource comes in.
Tags: Deep Learning, Neural Networks, NLP
- Comparison of the Text Distance Metrics - Jan 7, 2019.
There are many different approaches to comparing two texts (strings of characters). Each has its own advantages and disadvantages and is good only for a range of specific use cases.
Tags: Metrics, NLP, Text Analytics
- Approaches to Text Summarization: An Overview - Jan 3, 2019.
This article will present the main approaches to text summarization currently employed, as well as discuss some of their characteristics.
Tags: NLP, Text Analytics, Text Summarization
- Comparison of the Top Speech Processing APIs - Dec 28, 2018.
There are two main tasks in speech processing. The first is to transform speech to text. The second is to convert text into human speech. We will describe the general aspects of each API and then compare their main features in a table.
Tags: Amazon, API, Google Cloud, IBM Watson, Microsoft Azure, NLP, Speech Recognition
- BERT: State of the Art NLP Model, Explained - Dec 26, 2018.
BERT’s key technical innovation is applying the bidirectional training of Transformer, a popular attention model, to language modelling. It has caused a stir in the Machine Learning community by presenting state-of-the-art results in a wide variety of NLP tasks.
Tags: Explained, Modeling, Neural Networks, NLP, Transformer
- 10 More Must-See Free Courses for Machine Learning and Data Science - Dec 20, 2018.
Have a look at this follow-up collection of free machine learning and data science courses to give you some winter study ideas.
Tags: AI, Algorithms, Big Data, Data Science, Deep Learning, Machine Learning, MIT, NLP, Reinforcement Learning, U. of Washington, UC Berkeley, Yandex
- KDnuggets™ News 18:n48, Dec 19: Why You Shouldn’t be a Data Science Generalist; Industry Data Science & Machine Learning 2019 Predictions - Dec 19, 2018.
Also: Top Stories of 2018; NLP Breakthrough Imagenet Moment has arrived; Four Approaches to Explaining AI and Machine Learning; Solve any Image Classification Problem Quickly and Easily
Tags: 2019 Predictions, AI, Analytics, Career Advice, Data Science, Machine Learning, NLP
- NLP Breakthrough Imagenet Moment has arrived - Dec 14, 2018.
A comprehensive review of the current state of Natural Language Processing, covering the process from shallow to deep pre-training, what's in an ImageNet, the case for language modelling, and more.
Tags: Deep Learning, ImageNet, NLP, OpenAI, ULMFiT
- State of Deep Learning and Major Advances: H2 2018 Review - Dec 13, 2018.
In this post we summarise some of the key developments in deep learning in the second half of 2018, before briefly discussing the road ahead for the deep learning community.
Tags: Deep Learning, Generative Adversarial Network, NLP, PyTorch, TensorFlow, Trends
- P&G: Data Scientist – Machine Learning/NLP [Cincinnati, OH] - Dec 11, 2018.
P&G is seeking a Data Scientist - Machine Learning/NLP in Cincinnati, OH. In this role you will have multiple projects on which you will leverage machine learning tools to solve these types of problems.
Tags: Cincinnati, Data Scientist, Machine Learning, NLP, OH, Procter and Gamble
- Introduction to Named Entity Recognition - Dec 11, 2018.
Named Entity Recognition is a tool which invariably comes in handy when we do Natural Language Processing tasks. Read on to find out how.
Tags: NLP, Python, Text Classification
- Word Morphing – an original idea - Nov 20, 2018.
In this post, we describe how to utilise word2vec's embeddings and the A* search algorithm to morph between words.
Tags: NLP, Python, Text Classification
- Sorry I didn’t get that! How to understand what your users want - Nov 16, 2018.
Creating a chatbot is difficult: it involves knowledge of many AI-hard tasks, such as Natural Language Understanding, Machine Comprehension, Inference, or Automatic Language Generation (in fact, solving these tasks is close to solving AI), and it requires a large human effort.
Tags: AI, Chatbot, NLP
- Multi-Class Text Classification with Doc2Vec & Logistic Regression - Nov 9, 2018.
Doc2vec is an NLP tool for representing documents as vectors and is a generalization of the word2vec method. To understand doc2vec, it is advisable to first understand the word2vec approach.
Tags: Logistic Regression, NLP, Python, Text Classification
- 10 Free Must-See Courses for Machine Learning and Data Science - Nov 8, 2018.
Check out a collection of free machine learning and data science courses to kick off your winter learning season.
Tags: Data Science, Deep Learning, fast.ai, Google, Linear Algebra, Machine Learning, MIT, NLP, Reinforcement Learning, Stanford, Yandex
- KDnuggets™ News 18:n42, Nov 7: The Most in Demand Skills for Data Scientists; How Machines Understand Our Language: Intro to NLP - Nov 7, 2018.
Also: Machine Learning Classification: A Dataset-based Pictorial; Quantum Machine Learning: A look at myths, realities, and future projections; Multi-Class Text Classification Model Comparison and Selection; Top 13 Python Deep Learning Libraries
Tags: Classification, Data Science Skills, Deep Learning, Myths, NLP, Python, Text Classification
- Text Preprocessing in Python: Steps, Tools, and Examples - Nov 6, 2018.
We outline the basic steps of text preprocessing, which are needed for transferring text from human language to machine-readable format for further processing. We will also discuss text preprocessing tools.
Tags: Data Preparation, NLP, Python, Text Analysis, Text Mining, Tokenization
- Data Representation for Natural Language Processing Tasks - Nov 2, 2018.
In NLP we must find a way to represent our data (a series of texts) to our systems (e.g. a text classifier). As Yoav Goldberg asks, "How can we encode such categorical data in a way which is amenable for use by a statistical classifier?" Enter the word vector.
Tags: NLP, Representation, Text Mining, Word Embeddings, word2vec
- Multi-Class Text Classification Model Comparison and Selection - Nov 1, 2018.
This is what we are going to do today: use everything that we have presented about text classification in the previous articles (and more) and compare the text classification models we trained in order to choose the most accurate one for our problem.
Tags: Modeling, NLP, Python, Text Classification
- Labeling Unstructured Text for Meaning to Achieve Predictive Lift - Oct 31, 2018.
In this post, we examine several advanced NLP techniques, including: labeling nouns and noun phrases for meaning, labeling (most often) adverbs and adjectives for sentiment, and labeling verbs for intent.
Tags: NLP, Overfitting, Text Mining, Unstructured data
- How Machines Understand Our Language: An Introduction to Natural Language Processing - Oct 31, 2018.
The applications of NLP are endless. This is how a machine classifies whether an email is spam or not, if a review is positive or negative, and how a search engine recognizes what type of person you are based on the content of your query to customize the response accordingly.
Tags: Machine Learning, NLP, NLTK, Python, Tokenization
- KDnuggets™ News 18:n41, Oct 31: Introduction to Deep Learning with Keras; Easy Named Entity Recognition with Scikit-Learn - Oct 31, 2018.
Also: Generative Adversarial Networks - Paper Reading Road Map; How I Learned to Stop Worrying and Love Uncertainty; Implementing Automated Machine Learning Systems with Open Source Tools; Notes on Feature Preprocessing: The What, the Why, and the How
Tags: Automated Machine Learning, Data Preprocessing, Deep Learning, Generative Adversarial Network, Keras, NLP, Python, scikit-learn
- Named Entity Recognition and Classification with Scikit-Learn - Oct 25, 2018.
Named Entity Recognition and Classification is a process of recognizing information units like names, including person, organization and location names, and numeric expressions from unstructured text. The goal is to develop practical and domain-independent techniques in order to detect named entities with high accuracy automatically.
Tags: NLP, Text Classification, Text Mining
- Monash University: Academic Opportunities in Dialogue Research [Melbourne, Australia] - Oct 25, 2018.
Monash University is seeking to fill multiple academic opportunities in Dialogue Research in Melbourne, Australia: Level B Lecturer (equivalent to Assistant Professor in North America), Level C Senior Lecturer (equivalent to Associate Professor in North America).
Tags: Academics, Australia, Melbourne, Monash University, NLP
- Building a Question-Answering System from Scratch - Oct 24, 2018.
This part will focus on introducing Facebook sentence embeddings and how they can be used in building QA systems. In future parts, we will try to implement deep learning techniques, specifically sequence modeling, for this problem.
Tags: Machine Learning, NLP, Question answering
- The Main Approaches to Natural Language Processing Tasks - Oct 17, 2018.
Let's have a look at the main approaches to NLP tasks that we have at our disposal. We will then have a look at the concrete NLP tasks we can tackle with said approaches.
Tags: Machine Learning, Neural Networks, NLP, Text Classification
- GitHub Python Data Science Spotlight: High Level Machine Learning & NLP, Ensembles, Command Line Viz & Docker Made Easy - Oct 16, 2018.
This post spotlights 5 data science projects, all of which are open source and are present on GitHub repositories, focusing on high level machine learning libraries and low level support tools.
Tags: Data Science, Docker, Ensemble Methods, fast.ai, GitHub, Machine Learning, NLP, Python
- Sequence Modeling with Neural Networks – Part I - Oct 3, 2018.
In the context of this post, we will focus on modeling sequences as a well-known data structure and will study its specific learning framework.
Tags: Neural Networks, NLP, Recurrent Neural Networks, Sequences
- KDnuggets™ News 18:n37, Oct 3: Mathematics of Machine Learning; Effective Transfer Learning for NLP; Path Analysis with R - Oct 3, 2018.
Also: Introducing VisualData: A Search Engine for Computer Vision Datasets; Raspberry Pi IoT Projects for Fun and Profit; Recent Advances for a Better Understanding of Deep Learning; Basic Image Data Analysis Using Python - Part 3; Introduction to Deep Learning
Tags: Computer Vision, Deep Learning, Machine Learning, Mathematics, NLP, R, Transfer Learning
- More Effective Transfer Learning for NLP - Oct 1, 2018.
Until recently, the natural language processing community was lacking its ImageNet equivalent — a standardized dataset and training objective to use for training base models.
Tags: Neural Networks, NLP, Transfer Learning, Word Embeddings
- ODSC India Highlights: Deep Learning Revolution in Speech, AI Engineer vs Data Scientist, and Reinforcement Learning for Enterprise - Sep 26, 2018.
Key takeaways and highlights from the ODSC India 2018 conference about the latest trends, breakthroughs and revolutions in the field of Data Science and Artificial Intelligence.
Tags: AI, Chatbot, Deep Learning, Machine Learning, NLP, ODSC, Recurrent Neural Networks, Reinforcement Learning, Speech Recognition
- Beyond Refuge: Natural Language Understanding Engineer [Remote Position] - Sep 25, 2018.
Beyond Refuge is seeking a Natural Language Understanding Engineer passionate about social change and getting involved on a leadership level with a startup-like idea within an innovative, agile nonprofit.
Tags: Engineer, NLP, United Nations
- Free resources to learn Natural Language Processing - Sep 18, 2018.
An extensive list of free resources to help you learn Natural Language Processing, including explanations on Text Classification, Sequence Labeling, Machine Translation and more.
Tags: Beginners, Machine Learning, Machine Translation, NLP, Sentiment Analysis, Text Classification
- The Data Science of “Someone Like You” or Sentiment Analysis of Adele’s Songs - Sep 13, 2018.
An extensive analysis of Adele's songs using Natural Language Processing (NLP) on the lyrics, to uncover the underlying emotions and sentiments.
Tags: Adele, Music, NLP, Sentiment Analysis
- Machine Learning for Text Classification Using SpaCy in Python - Sep 11, 2018.
In this post, we will demonstrate how text classification can be implemented using spaCy without having any deep learning experience.
Tags: NLP, Python, Text Analytics, Text Classification, Text Mining
- Deep Learning for NLP: An Overview of Recent Trends - Sep 5, 2018.
A new paper discusses some of the recent trends in deep learning based natural language processing (NLP) systems and applications. The focus is on the review and comparison of models and methods that have achieved state-of-the-art (SOTA) results on various NLP tasks and some of the current best practices for applying deep learning in NLP.
Tags: Deep Learning, NLP, Word Embeddings, word2vec
- Topic Modeling with LSA, PLSA, LDA & lda2Vec - Aug 30, 2018.
This article is a comprehensive overview of Topic Modeling and its associated techniques.
Tags: LDA, NLP, Text Analytics, Topic Modeling
- Word Vectors in Natural Language Processing: Global Vectors (GloVe) - Aug 29, 2018.
A well-known model that learns vectors of words from their co-occurrence information is Global Vectors (GloVe). While word2vec is a predictive model (a feed-forward neural network that learns vectors to improve its predictive ability), GloVe is a count-based model.
Tags: NLP, Sciforce, Text Analytics, word2vec
- Multi-Class Text Classification with Scikit-Learn - Aug 27, 2018.
The vast majority of text classification articles and tutorials on the internet cover binary text classification, such as email spam filtering and sentiment analysis. Real-world problems are much more complicated than that.
Tags: NLP, Python, scikit-learn, Text Classification, Text Mining
- Emotion and Sentiment Analysis: A Practitioner’s Guide to NLP - Aug 24, 2018.
Sentiment analysis is widely used, especially as a part of social media analysis for any domain, be it a business, a recent movie, or a product launch, to understand its reception by the people and what they think of it based on their opinions or, you guessed it, sentiment!
Tags: NLP, Text Analytics, Workflow
- Comparison of the Most Useful Text Processing APIs - Aug 23, 2018.
There is a need to compare different APIs to understand their key pros and cons and when it is better to use one API instead of another. Let us proceed with the comparison.
Tags: NLP, Text Analytics, Text Mining
- Named Entity Recognition: A Practitioner’s Guide to NLP - Aug 17, 2018.
Named entity recognition (NER), also known as entity chunking/extraction, is a popular technique used in information extraction to identify and segment the named entities and classify or categorize them under various predefined classes.
Tags: NLP, Text Analytics, Workflow
- Understanding Language Syntax and Structure: A Practitioner’s Guide to NLP - Aug 10, 2018.
Knowledge about the structure and syntax of language is helpful in many areas like text processing, annotation, and parsing for further operations such as text classification or summarization.
Tags: NLP, Text Analytics, Workflow
- GitHub Python Data Science Spotlight: AutoML, NLP, Visualization, ML Workflows - Aug 8, 2018.
This post includes a wide spectrum of data science projects, all of which are open source and are present on GitHub repositories.
Tags: Automated Machine Learning, Data Science, Data Visualization, GitHub, Keras, Machine Learning, MLflow, NLP, Python, Workflow
- Text Wrangling & Pre-processing: A Practitioner’s Guide to NLP - Aug 3, 2018.
I will highlight some of the most important steps that are used heavily in Natural Language Processing (NLP) pipelines and that I frequently use in my own NLP projects.
Tags: Data Preprocessing, Data Wrangling, NLP, Text Analytics, Workflow
- Data Retrieval with Web Scraping: A Practitioner’s Guide to NLP - Jul 26, 2018.
Proven and tested hands-on strategies to tackle NLP tasks.
Tags: Data Preprocessing, NLP, Text Analytics, Workflow
- Comparison of Top 6 Python NLP Libraries - Jul 23, 2018.
Today, we want to outline and compare the most popular and helpful natural language processing libraries, based on our experience.
Tags: NLP, Python
- Efficient Graph-based Word Sense Induction - Jul 18, 2018.
This paper describes a set of algorithms for Natural Language Processing (NLP) that match or exceed the state of the art on several evaluation tasks, while also being much more computationally efficient.
Tags: AI, Machine Learning, NLP, Word Embeddings
- Text Mining on the Command Line - Jul 13, 2018.
In this tutorial, I use raw bash commands and regex to process a raw, messy JSON file and a raw HTML page. The tutorial helps us understand the text processing mechanism under the hood.
Tags: Data Preparation, Data Preprocessing, NLP, Text Mining
- Text Classification & Embeddings Visualization Using LSTMs, CNNs, and Pre-trained Word Vectors - Jul 5, 2018.
In this tutorial, I classify the Yelp round-10 review datasets. After processing the review comments, I trained three models in three different ways and obtained three word embeddings.
Tags: Convolutional Neural Networks, Keras, LSTM, NLP, Python, Text Classification, Word Embeddings
- Overview and benchmark of traditional and deep learning models in text classification - Jul 3, 2018.
In this post, traditional and deep learning models in text classification will be thoroughly investigated, including a discussion of both recurrent and convolutional neural networks.
Tags: Deep Learning, NLP, Text Classification
- 30 Free Resources for Machine Learning, Deep Learning, NLP & AI - Jun 25, 2018.
Check out this collection of 30 ML, DL, NLP & AI resources for beginners, starting from zero and slowly progressing to the point that readers should have an idea of where to go next.
Tags: AI, Deep Learning, Machine Learning, NLP
- Detecting Sarcasm with Deep Convolutional Neural Networks - Jun 21, 2018.
Detection of sarcasm is important in other areas such as affective computing and sentiment analysis because such expressions can flip the polarity of a sentence.
Tags: arXiv, Convolutional Neural Networks, NLP, Sentiment Analysis
- KDnuggets™ News 18:n24, Jun 20: Data Lakes – The evolution of data processing; Text Generation with RNNs in 4 Lines of Code - Jun 20, 2018.
How to spot a beginner Data Scientist; How To Create Natural Language Semantic Search For Arbitrary Objects With Deep Learning; Statistics, Causality, and What Claims are Difficult to Swallow: Judea Pearl debates Kevin Gray; Cartoon: FIFA World Cup Football and Machine Learning
Tags: Beginners, Causality, Data Lake, Data Processing, Data Scientist, NLP, Recurrent Neural Networks, Text Analytics
- Natural Language Processing Nuggets: Getting Started with NLP - Jun 19, 2018.
Check out this collection of NLP resources for beginners, starting from zero and slowly progressing to the point that readers should have an idea of where to go next.
Tags: Beginners, Data Preparation, NLP, Text Mining
- Generating Text with RNNs in 4 Lines of Code - Jun 14, 2018.
Want to generate text with little trouble, and without building and tuning a neural network yourself? Let's check out a project which allows you to "easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code."
Tags: Donald Trump, LSTM, NLP, Python, Recurrent Neural Networks, Twitter
- How To Create Natural Language Semantic Search For Arbitrary Objects With Deep Learning - Jun 13, 2018.
An end-to-end example of how to build a system that can search objects semantically.
Tags: Deep Learning, GitHub, Neural Networks, NLP, Semantic Analysis
- 5 Machine Learning Projects You Should Not Overlook, June 2018 - Jun 12, 2018.
Here is a new installment of 5 more machine learning or machine learning-related projects you may not yet have heard of, but may want to consider checking out!
Tags: Interpretability, Keras, Machine Learning, Model Performance, NLP, Overlook, Recurrent Neural Networks, Visualization
- On the contribution of neural networks and word embeddings in Natural Language Processing - May 31, 2018.
In this post I will try to explain, in a very simplified way, how to apply neural networks and integrate word embeddings in text-based applications, and some of the main implicit benefits of using neural networks and word embeddings in NLP.
Tags: Neural Networks, NLP, Word Embeddings, word2vec
- NLP in Online Courses: an Overview - May 28, 2018.
This article examines several Natural Language Processing (NLP) courses across a variety of online sources and programming languages.
Tags: Coursera, edX, NLP, NLTK, Online Education, Python, Sciforce, Udemy
- Top KDnuggets tweets, May 16-22: Python eats away at R; Data Science Plan 2018 - May 23, 2018.
Also: AI is learning to see in the dark; Introducing state of the art text classification with universal language models; Top 100 Books for Data Scientists.
Tags: Image Recognition, NLP, Python vs R, Top tweets
- If chatbots are to succeed, they need this - May 22, 2018.
Can logic be used to make chatbots intelligent? In the 1960s this was taken for granted. Now we have all but forgotten the logical approach. Is it time for a revival?
Tags: AI, AlphaGo, Chatbot, Logic, NLP
- Getting Started with spaCy for Natural Language Processing - May 2, 2018.
spaCy is a Python natural language processing library specifically designed with the goal of being a useful library for implementing production-ready systems. It is particularly fast and intuitive, making it a top contender for NLP tasks.
Tags: Data Preparation, Data Preprocessing, NLP, Python, Text Analytics, Text Mining
- Implementing Deep Learning Methods and Feature Engineering for Text Data: FastText - May 1, 2018.
Overall, FastText is a framework for learning word representations and also performing robust, fast and accurate text classification. The framework is open-sourced by Facebook on GitHub.
Tags: Facebook, Feature Engineering, NLP, Python
- Implementing Deep Learning Methods and Feature Engineering for Text Data: The GloVe Model - Apr 25, 2018.
GloVe stands for Global Vectors. It is an unsupervised learning model that can be used to obtain dense word vectors, similar to Word2Vec.
Tags: Deep Learning, Feature Engineering, NLP, Python, Text Mining
- KDnuggets™ News 18:n17, Apr 25: Python Regular Expressions Cheat Sheet; Deep Learning With Apache Spark; Building a Question Answering Model - Apr 25, 2018.
Also: Derivation of Convolutional Neural Network from Fully Connected Network Step-By-Step; Presto for Data Scientists - SQL on anything; Why Deep Learning is perfect for NLP (Natural Language Processing); Top 16 Open Source Deep Learning Libraries and Platforms
Tags: Apache Spark, Cheat Sheet, Deep Learning, NLP, Python, Question answering, SQL
- Why Deep Learning is perfect for NLP (Natural Language Processing) - Apr 20, 2018.
Deep learning brings multiple benefits in learning multiple levels of representation of natural language. Here we will cover the motivation of using deep learning and distributed representation for NLP, word embeddings and several methods to perform word embeddings, and applications.
Tags: Deep Learning, Neural Networks, NLP, Packt Publishing, word2vec
- NLP – Building a Question Answering Model - Apr 20, 2018.
In this blog, I want to cover the main building blocks of a question answering model.
Tags: Chatbot, NLP, Question answering
- Understanding What is Behind Sentiment Analysis – Part 2 - Apr 20, 2018.
Fine-tuning our sentiment classifier...
Tags: Classification, NLP, Sentiment Analysis
- Let’s Admit It: We’re a Long Way from Using “Real Intelligence” in AI - Apr 19, 2018.
With the growth of AI systems and unstructured data, there is a need for an independent means of data curation, evaluation and measurement of output that does not depend on the natural language constructs of AI and creates a comparative method of how the data is processed.
Tags: AI, Machine Learning, NLP, Unstructured data
- Robust Word2Vec Models with Gensim & Applying Word2Vec Features for Machine Learning Tasks - Apr 17, 2018.
The gensim framework, created by Radim Řehůřek, consists of a robust, efficient and scalable implementation of the Word2Vec model.
Tags: Feature Engineering, NLP, Python, Word Embeddings, word2vec
- Top 10 Technology Trends of 2018 - Apr 13, 2018.
In this article, we will focus on the modern trends that took off well on the market by the end of 2017 and discuss the major breakthroughs expected in 2018.
Tags: AI, Blockchain, Chief Data Officer, Deep Learning, Ethics, IoT, NLP, Privacy, Top 10, Trends
- Understanding What is Behind Sentiment Analysis – Part 1 - Apr 13, 2018.
Build your first sentiment classifier in 3 steps.
Tags: Classification, NLP, Sentiment Analysis
- Implementing Deep Learning Methods and Feature Engineering for Text Data: The Skip-gram Model - Apr 10, 2018.
Just as we discussed for the CBOW model, we now need to model the Skip-gram architecture as a deep learning classification model that takes the target word as input and tries to predict the context words.
Tags: Deep Learning, Feature Engineering, NLP, Python, Text Mining, Word Embeddings
- Implementing Deep Learning Methods and Feature Engineering for Text Data: The Continuous Bag of Words (CBOW) - Apr 3, 2018.
The CBOW model architecture tries to predict the current target word (the center word) based on the source context words (surrounding words).
Tags: Deep Learning, Neural Networks, NLP, word2vec
- Understanding Feature Engineering: Deep Learning Methods for Text Data - Mar 28, 2018.
Newer, advanced strategies for taming unstructured, textual data: In this article, we will be looking at more advanced feature engineering strategies which often leverage deep learning models.
Tags: Deep Learning, Feature Engineering, NLP, Python, Text Mining