- The Best NLP with Deep Learning Course is Free - May 22, 2020.
Stanford's Natural Language Processing with Deep Learning is one of the most respected courses on the topic that you will find anywhere, and the course materials are freely available online.
- Spotting Controversy with NLP - May 21, 2020.
In this article, I’ll introduce you to a hot topic in financial services and describe how a leading data provider is using data science and NLP to streamline how they find insights in unstructured data.
- Google Unveils TAPAS, a BERT-Based Neural Network for Querying Tables Using Natural Language - May 19, 2020.
The new neural network extends BERT to interact with tabular datasets.
- Easy Text-to-Speech with Python - May 18, 2020.
Python comes with a lot of handy and easily accessible libraries and we’re going to look at how we can deliver text-to-speech with Python in this article.
- Facebook Open Sources Blender, the Largest-Ever Open Domain Chatbot - May 15, 2020.
The new conversational agent exhibits human-like behavior in conversations about almost any topic.
- Text Mining in Python: Steps and Examples - May 12, 2020.
Most data exists in textual form, which is highly unstructured. To produce meaningful insights from text data, we need to follow a method called Text Analysis.
- Chatbots in a Nutshell - May 7, 2020.
Marketing scientist Kevin Gray asks Dr. Anna Farzindar of the University of Southern California about chatbots and the ways they are used.
- KDnuggets™ News 20:n18, May 6: Five Cool Python Libraries for Data Science; NLP Recipes: Best Practices - May 6, 2020.
5 cool Python libraries for Data Science; NLP Recipes: Best Practices and Examples; Deep Learning: The Free eBook; Demystifying the AI Infrastructure Stack; and more.
- Natural Language Processing Recipes: Best Practices and Examples - May 1, 2020.
Here is an overview of another great natural language processing resource, this time from Microsoft, which demonstrates best practices and implementation guidelines for a variety of tasks and scenarios.
- Five Cool Python Libraries for Data Science - Apr 30, 2020.
Check out these 5 cool Python libraries that the author has come across during an NLP project, and which have made their life easier.
- KDnuggets™ News 20:n17, Apr 29: The Super Duper NLP Repo; Free Machine Learning & Data Science Books & Courses for Quarantine - Apr 29, 2020.
Also: Should Data Scientists Model COVID19 and other Biological Events; Learning during a crisis (Data Science 90-day learning challenge); Data Transformation: Standardization vs Normalization; DBSCAN Clustering Algorithm in Machine Learning; Find Your Perfect Fit: A Quick Guide for Job Roles in the Data World
- The Super Duper NLP Repo: 100 Ready-to-Run Colab Notebooks - Apr 24, 2020.
Check out this repository of more than 100 freely-accessible NLP notebooks, curated from around the internet, and ready to launch in Colab with a single click.
- Top KDnuggets tweets, Apr 08-14: Mathematics for #MachineLearning: The Free eBook – KDnuggets - Apr 15, 2020.
Also: Exploratory Data Analysis for Natural Language Processing: A Complete Guide to Python Tools; A professor with 20 years of experience to all high school seniors (and their parents). If you were planning to enroll in college next fall - don't.
- Simple Question Answering (QA) Systems That Use Text Similarity Detection in Python - Apr 7, 2020.
How exactly are smart algorithms able to engage and communicate with us like humans? The answer lies in Question Answering systems that are built on a foundation of Machine Learning and Natural Language Processing. Let's build one here.
- Why you should NOT use MS MARCO to evaluate semantic search - Apr 2, 2020.
If we want to investigate the power and limitations of semantic vectors (pre-trained or not), we should ideally prioritize datasets that are less biased towards term-matching signals. This piece shows that the MS MARCO dataset is more biased towards those signals than we expected and that the same issues are likely present in many other datasets due to similar data collection designs.
- A Comprehensive Data Repository for Fake Health News Detection - Mar 19, 2020.
We introduce the FakeHealth, a new data repository for fake health news detection. Following a preliminary analysis to demonstrate its features, we consider additional potential directions for better identifying fake news.
- Salesforce Open Sources a Framework for Open Domain Question Answering Using Wikipedia - Mar 16, 2020.
The framework uses a multi-hop QA method to answer complex questions by reasoning through Wikipedia’s datasets.
- How To Build Your Own Feedback Analysis Solution - Mar 12, 2020.
Automating the analysis of customer feedback will sound like a great idea after reading a couple hundred reviews. Building an NLP solution to provide in-depth analysis of what your customers are thinking is a serious undertaking, and this guide helps you scope out the entire project.
- Tokenization and Text Data Preparation with TensorFlow & Keras - Mar 6, 2020.
This article will look at tokenizing and further preparing text data for feeding into a neural network using TensorFlow and Keras preprocessing tools.
- The Big Bad NLP Database: Access Nearly 300 Datasets - Feb 28, 2020.
Check out this database of nearly 300 freely-accessible NLP datasets, curated from around the internet.
- Microsoft Open Sources ZeRO and DeepSpeed: The Technologies Behind the Biggest Language Model in History - Feb 24, 2020.
The two efforts enable the training of deep learning models at massive scale.
- Illustrating the Reformer - Feb 12, 2020.
In this post, we will try to dive into the Reformer model and try to understand it with some visual guides.
- Intent Recognition with BERT using Keras and TensorFlow 2 - Feb 10, 2020.
TL;DR Learn how to fine-tune the BERT model for text classification. Train and evaluate it on a small dataset for detecting seven intents. The results might surprise you!
- Microsoft Open Sources Jericho to Train Reinforcement Learning Using Linguistic Games - Feb 3, 2020.
The new framework provides an OpenAI-like environment for language-based games.
- Top 10 AI, Machine Learning Research Articles to know - Jan 30, 2020.
We’ve seen many predictions for what new advances are expected in the field of AI and machine learning. Here, we review a “data set” based on what researchers were apparently studying at the turn of the decade to take a fresh glimpse into what might come to pass in 2020.
- Generating English Pronoun Questions Using Neural Coreference Resolution - Jan 29, 2020.
This post will introduce a practical method for generating English pronoun questions from any story or article. Learn how to take an additional step toward computationally understanding language.
- A bird’s-eye view of modern AI from NeurIPS 2019 - Jan 28, 2020.
With the explosion of the field of AI/ML impacting so many applications and industries, there is great value coming out of recent progress. This review highlights many research areas covered at the NeurIPS 2019 conference recently held in Vancouver, Canada, and features many important areas of progress we expect to see in the coming year.
- Uber Has Been Quietly Assembling One of the Most Impressive Open Source Deep Learning Stacks in the Market - Jan 27, 2020.
Many of the technologies used by Uber teams have been open sourced and received accolades from the machine learning community. Let’s look at some of my favorites.
- NLP Year in Review — 2019 - Jan 23, 2020.
In this blog post, I want to highlight some of the most important stories related to machine learning and NLP that I came across in 2019.
- The Future of Machine Learning - Jan 17, 2020.
This summary overviews the keynote at TensorFlow World by Jeff Dean, Head of AI at Google, that considered the advancements of computer vision and language models and predicted the direction machine learning model building should follow for the future.
- Top 10 Technology Trends for 2020 - Jan 16, 2020.
With integrations of multiple emerging technologies just in the past year, AI development continues at a fast pace. Following the blueprint of science and technology advancements in 2019, we predict 10 trends we expect to see in 2020 and beyond.
- KDnuggets™ News 20:n02, Jan 15: Top 5 Must-have Data Science Skills; Learn Machine Learning with THIS Book - Jan 15, 2020.
This week: learn the 5 must-have data science skills for the new year; find out which book is THE book to get started learning machine learning; pick up some Python tips and tricks; learn SQL, but learn it the hard way; and find an introductory guide to learning common NLP techniques.
- An Introductory Guide to NLP for Data Scientists with 7 Common Techniques - Jan 9, 2020.
Data Scientists work with tons of data, and many times that data includes natural language text. This guide reviews 7 common techniques with code examples to introduce you to the essentials of NLP, so you can begin performing analysis and building models from textual data.
- Top 5 must-have Data Science skills for 2020 - Jan 8, 2020.
The standard job description for a Data Scientist has long highlighted skills in R, Python, SQL, and Machine Learning. With the field evolving, these core competencies are no longer enough to stay competitive in the job market.
- Automatic Text Summarization in a Nutshell - Dec 18, 2019.
Marketing scientist Kevin Gray asks Dr. Anna Farzindar of the University of Southern California about Automatic Text Summarization and the various ways it is used.
- Let’s Build an Intelligent Chatbot - Dec 17, 2019.
Check out this step by step approach to building an intelligent chatbot in Python.
- Xavier Amatriain’s Machine Learning and Artificial Intelligence 2019 Year-end Roundup - Dec 16, 2019.
It is an annual tradition for Xavier Amatriain to write a year-end retrospective of advances in AI/ML, and this year is no different. Gain an understanding of the important developments of the past year, as well as insights into what to expect in 2020.
- What just happened in the world of AI? - Dec 12, 2019.
The speed at which AI made advancements and news during 2019 makes it imperative now to step back and place these events into order and perspective. It's important to separate the interest that any one advancement initially attracts, from its actual gravity and its consequential influence on the field. This review unfolds the parallel threads of these AI stories over this year and isolates their significance.
- Deploying a pretrained GPT-2 model on AWS - Dec 12, 2019.
This post attempts to summarize my recent detour into NLP, describing how I exposed a Huggingface pre-trained Language Model (LM) on an AWS-based web application.
- The 4 Hottest Trends in Data Science for 2020 - Dec 9, 2019.
The field of Data Science is growing with new capabilities and reach into every industry. With digital transformations occurring in organizations around the world, 2019 included trends of more companies leveraging more data to make better decisions. Check out these next trends in Data Science expected to take off in 2020.
- Webinar: Natural Language Processing for Digital Transformation of Unstructured Text - Dec 6, 2019.
Learn how pharma and healthcare organizations are using the power of Natural Language Processing (NLP) to transform unstructured text into actionable structured data.
- 10 Free Top Notch Machine Learning Courses - Dec 6, 2019.
Are you interested in studying machine learning over the holidays? This collection of 10 free top notch courses will allow you to do just that, with something for every approach to improving your machine learning skills.
- KDnuggets™ News 19:n46, Dec 4: The Future of Data Science Careers; Which Data Visualization Should I Use? - Dec 4, 2019.
This week: The Future of Careers in Data Science & Analysis; Task-based effectiveness of basic visualizations; Open Source Projects by Google, Uber and Facebook for Data Science and AI; Getting Started with Automated Text Summarization; A Non-Technical Reading List for Data Science; and much more!
- Markov Chains: How to Train Text Generation to Write Like George R. R. Martin - Nov 29, 2019.
Read this article on training Markov chains to generate George R. R. Martin style text.
- Lit BERT: NLP Transfer Learning In 3 Steps - Nov 29, 2019.
PyTorch Lightning is a lightweight framework which allows anyone using PyTorch to scale deep learning code easily while making it reproducible. In this tutorial we’ll use Huggingface's implementation of BERT to do a finetuning task in Lightning.
- Spark NLP 101: LightPipeline - Nov 27, 2019.
A Pipeline is specified as a sequence of stages, and each stage is either a Transformer or an Estimator. These stages are run in order, and the input DataFrame is transformed as it passes through each stage. Now let’s see how this can be done in Spark NLP using Annotators and Transformers.
- KDnuggets™ News 19:n45, Nov 27: Interpretable vs black box models; Advice for New and Junior Data Scientists - Nov 27, 2019.
This week: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead; Advice for New and Junior Data Scientists; Python Tuples and Tuple Methods; Can Neural Networks Develop Attention? Google Thinks they Can; Three Methods of Data Pre-Processing for Text Classification
- Content-based Recommender Using Natural Language Processing (NLP) - Nov 26, 2019.
A guide to build a content-based movie recommender model based on NLP.
- Text Encoding: A Review - Nov 22, 2019.
We will focus here exactly on that part of the analysis that transforms words into numbers and texts into number vectors: text encoding.
- Topics Extraction and Classification of Online Chats - Nov 14, 2019.
This article covers how to automatically identify the topics within a corpus of textual data by using unsupervised topic modelling, and then apply a supervised classification algorithm to assign topic labels to each textual document by using the result of the previous step as target labels.
- KDnuggets™ News 19:n43, Nov 13: Dynamic Reports in Python and R; Creating NLP Vocabularies; What is Data Science? - Nov 13, 2019.
On KDnuggets this week: Orchestrating Dynamic Reports in Python and R with Rmd Files; How to Create a Vocabulary for NLP Tasks in Python; What is Data Science?; The Complete Data Science LinkedIn Profile Guide; Set Operations Applied to Pandas DataFrames; and much, much more.
- Understanding NLP and Topic Modeling Part 1 - Nov 12, 2019.
In this post, we seek to understand why topic modeling is important and how it helps us as data scientists.
- How to Create a Vocabulary for NLP Tasks in Python - Nov 7, 2019.
This post will walk through a Python implementation of a vocabulary class for storing processed text data and related metadata in a manner useful for subsequently performing NLP tasks.
- Research Guide for Transformers - Oct 30, 2019.
The problem with RNNs and CNNs is that they aren’t able to keep up with context and content when sentences are too long. This limitation has been solved by paying attention to the word that is currently being operated on. This guide will focus on how this problem can be addressed by Transformers with the help of deep learning.
- KDnuggets™ News 19:n41, Oct 30: Feature Selection: Beyond feature importance?; Time Series Analysis Using KNIME and Spark - Oct 30, 2019.
This week in KDnuggets: Feature Selection: Beyond feature importance?; Time Series Analysis: A Simple Example with KNIME and Spark; 5 Advanced Features of Pandas and How to Use Them; How to Measure Foot Traffic Using Data Analytics; Introduction to Natural Language Processing (NLP); and much, much more!
- Harnessing Semiotics and Discourse Communities to Understand User Intent - Oct 25, 2019.
Semiotics helps us understand the importance of context in determining the meaning of a term, and discourse communities provide us with the background context (mental model) by which to interpret its meaning correctly.
- Introduction to Natural Language Processing (NLP) - Oct 25, 2019.
Have you ever wondered how your personal assistant (e.g., Siri) is built? Do you want to build your own? Perfect! Let’s talk about Natural Language Processing.
- KDnuggets™ News 19:n39, Oct 16: Key Ideas in Document Embedding; The problem with metrics is a big problem for AI - Oct 16, 2019.
This week on KDnuggets: Beyond Word Embedding: Key Ideas in Document Embedding; The problem with metrics is a big problem for AI; Activation maps for deep learning models in a few lines of code; There is No Such Thing as a Free Lunch; 8 Paths to Getting a Machine Learning Job Interview; and much, much more.
- Beyond Word Embedding: Key Ideas in Document Embedding - Oct 11, 2019.
This literature review on document embedding techniques thoroughly covers the many ways practitioners develop rich vector representations of text -- from single sentences to entire books.
- Lemma, Lemma, Red Pyjama: Or, doing words with AI - Oct 10, 2019.
If we want a machine learning model to be able to generalize these forms together, we need to map them to a shared representation. But when are two different words the same for our purposes? It depends.
- 10 Free Top Notch Natural Language Processing Courses - Oct 7, 2019.
Are you looking to learn natural language processing? This collection of 10 free top notch courses will allow you to do just that, with something for every approach to learning NLP and its varied topics.
- Multi-Task Learning – ERNIE 2.0: State-of-the-Art NLP Architecture Intuitively Explained - Oct 2, 2019.
The tech giant Baidu unveiled its state-of-the-art NLP architecture ERNIE 2.0 earlier this year, which scored significantly higher than XLNet and BERT on all tasks in the GLUE benchmark. This major breakthrough in NLP takes advantage of a new innovation called “Continual Incremental Multi-Task Learning”.
- KDnuggets™ News 19:n37, Oct 2: The Future of Analytics & Data Science! Starting NLP with spaCy & Python - Oct 2, 2019.
This week, find out what the future of analytics and data science holds; get an introduction to spaCy for natural language processing; find out how to use time series analysis for baseball; get to know your data; read 6 bits of advice for data scientists; and much, much more!
- Sentiment and Emotion Analysis for Beginners: Types and Challenges - Oct 1, 2019.
There are three types of emotion AI, and their combinations. In this article, I’ll briefly go through these three types and the challenges of their real-life applications.
- Natural Language in Python using spaCy: An Introduction - Sep 26, 2019.
This article provides a brief introduction to working with natural language (sometimes called “text analytics”) in Python using spaCy and related libraries.
- A 2019 Guide for Automatic Speech Recognition - Sep 24, 2019.
In this article, we’ll look at a couple of papers aimed at solving the problem of automated speech recognition with machine and deep learning.
- Introducing IceCAPS: Microsoft’s Framework for Advanced Conversation Modeling - Sep 23, 2019.
The new open source framework brings multi-task learning to conversational agents.
- Reddit Post Classification - Sep 18, 2019.
This article covers the implementation of a data scraping and natural language processing project which had two parts: scrape as many posts from Reddit’s API as allowed, and then use classification models to predict the origin of the posts.
- BERT, RoBERTa, DistilBERT, XLNet: Which one to use? - Sep 17, 2019.
Lately, varying improvements over BERT have been shown — and here I will contrast the main similarities and differences so you can choose which one to use in your research or application.
- The State of Transfer Learning in NLP - Sep 13, 2019.
This post expands on the NAACL 2019 tutorial on Transfer Learning in NLP organized by Matthew Peters, Swabha Swayamdipta, Thomas Wolf, and Sebastian Ruder. This post highlights key insights and takeaways and provides updates based on recent work.
- BERT is changing the NLP landscape - Sep 9, 2019.
BERT is changing the NLP landscape and making chatbots much smarter by enabling computers to better understand speech and respond intelligently in real-time.
- A 2019 Guide to Speech Synthesis with Deep Learning - Sep 9, 2019.
In this article, we’ll look at research and model architectures that have been written and developed to do just that using deep learning.
- Build Your First Voice Assistant - Sep 6, 2019.
Hone your practical speech recognition application skills with this overview of building a voice assistant using Python.
- An Overview of Topics Extraction in Python with Latent Dirichlet Allocation - Sep 4, 2019.
A recurring subject in NLP is understanding a large corpus of texts through topic extraction. Whether you analyze users’ online reviews, products’ descriptions, or text entered in search bars, understanding key topics will always come in handy.
- KDnuggets™ News 19:n33, Sep 4: Data Science Skills Poll; Object-oriented Programming for Data Scientists - Sep 4, 2019.
This week: Object-oriented programming for data scientists; Deep Learning Next Step: Transformers and Attention Mechanism; R Users' Salaries from the 2019 Stackoverflow Survey; Types of Bias in Machine Learning; 4 Tips for Advanced Feature Engineering and Preprocessing; and much more!
- TensorFlow vs PyTorch vs Keras for NLP - Sep 3, 2019.
These three deep learning frameworks are your go-to tools for NLP, so which is the best? Check out this comparative analysis based on the needs of NLP, and find out where things are headed in the future.
- Deep Learning Next Step: Transformers and Attention Mechanism - Aug 29, 2019.
With the pervasive importance of NLP in so many of today's applications of deep learning, find out how advanced translation techniques can be further enhanced by transformers and attention mechanisms.
- KDnuggets™ News 19:n31, Aug 21: Become a Marketable Data Scientist; Data Science Command Line Basics; Chatbots with Keras - Aug 21, 2019.
This week's news: Become More Marketable as a Data Scientist; Command Line Basics Every Data Scientist Should Know; Chatbots with Keras!; Understanding Cancer using Machine Learning; Statistical Modelling vs Machine Learning; Is Kaggle Learn a "Faster Data Science Education?"; and much more!
- Deep Learning for NLP: Creating a Chatbot with Keras! - Aug 19, 2019.
Learn how to use Keras to build a Recurrent Neural Network and create a Chatbot! Who doesn’t like a friendly-robotic personal assistant?
- Introducing the Plato Research Dialogue System: Building Conversational Applications at Uber’s Scale - Aug 15, 2019.
While the process of building simple, domain-specific chatbots has gotten way easier, building large scale, multi-agent conversational applications remains a massive challenge. Recently, the Uber engineering team open sourced the Plato Research Dialogue System, which is the framework powering conversational agents across Uber’s different applications.
- Top KDnuggets tweets, Aug 07-13: Deep Learning Cheat Sheets; 12 NLP Researchers, Practitioners To Follow - Aug 14, 2019.
Deep Learning Cheat Sheets; 12 NLP Researchers, Practitioners & Innovators You Should Be Following; Knowing Your Neighbours: Machine Learning on Graphs.
- Domain-Specific Language Processing Mines Value From Unstructured Data - Aug 14, 2019.
Processing unstructured text data in real-time is challenging when applying NLP or NLU. Find out how Domain-Specific Language Processing can also help mine valuable information from data by following your guidance and using the language of your business.
- KDnuggets™ News 19:n30, Aug 14: Know Your Neighbor: Machine Learning on Graphs; 12 NLP Researchers, Practitioners You Should Follow - Aug 14, 2019.
Machine Learning on Graphs; 12 amazing leaders in NLP; Deep Learning for NLP explained, including ANNs, RNNs and LSTMs; Benford's Law and why it is important for data science; Key concepts in Andrew Ng's "Machine Learning Yearning".
- 12 NLP Researchers, Practitioners & Innovators You Should Be Following - Aug 12, 2019.
Check out this list of NLP researchers, practitioners and innovators you should be following, including academics, practitioners, developers, entrepreneurs, and more.
- Deep Learning for NLP: ANNs, RNNs and LSTMs explained! - Aug 7, 2019.
Learn about Artificial Neural Networks, Deep Learning, Recurrent Neural Networks and LSTMs like never before and use NLP to build a Chatbot!
- Neural Code Search: How Facebook Uses Neural Networks to Help Developers Search for Code Snippets - Jul 24, 2019.
Developers are always searching for answers to questions about their code. But how do they ask the right questions? Facebook is creating new NLP neural networks to help search code repositories that may advance information retrieval algorithms.
- KDnuggets™ News 19:n27, Jul 24: Bayesian deep learning and near-term quantum computers; DeepMind’s CASP13 Protein Folding Upset Summary - Jul 24, 2019.
This week on KDnuggets: Learn how DeepMind dominated the last CASP competition for advancing protein folding models; Bayesian deep learning and near-term quantum computers: A cautionary tale in quantum machine learning; The Evolution of a ggplot; Adapters: A Compact and Extensible Transfer Learning Method for NLP; 12 Things I Learned During My First Year as a Machine Learning Engineer; Things I Learned From the SciPy 2019 Lightning Talks; and much more!
- Adapters: A Compact and Extensible Transfer Learning Method for NLP - Jul 18, 2019.
Adapters obtain comparable results to BERT on several NLP tasks while achieving parameter efficiency.
- Scaling a Massive State-of-the-art Deep Learning Model in Production - Jul 15, 2019.
A new NLP text writing app based on OpenAI's GPT-2 aims to write with you -- whenever you ask. Find out from an engineer on the team how the developers set up and deployed their model into production.
- Pre-training, Transformers, and Bi-directionality - Jul 12, 2019.
Bidirectional Encoder Representations from Transformers (BERT) (Devlin et al., 2018) is a language representation model that combines the power of pre-training with the bi-directionality of the Transformer’s encoder (Vaswani et al., 2017). BERT improves the state-of-the-art performance on a wide array of downstream NLP tasks with minimal additional task-specific training.
- Datarama: Sr Machine Learning (NLP) Engineer [Singapore] - Jul 12, 2019.
Datarama is seeking a Senior Machine Learning Engineer in Singapore to assist their team with technological enhancement, design and develop deep learning demonstrations and solutions, and deliver deep learning expertise to the Data Science Team.
- A Gentle Guide to Starting Your NLP Project with AllenNLP - Jul 10, 2019.
For those who aren’t familiar with AllenNLP, I will give a brief overview of the library and let you know the advantages of integrating it into your project.
- KDnuggets™ News 19:n25, Jul 10: 5 Probability Distributions for Data Scientists; What the Machine Learning Engineer Job is Really Like - Jul 10, 2019.
This edition of the KDnuggets newsletter is double-sized after taking the holiday week off. Learn about probability distributions every data scientist should know, what the machine learning engineering job is like, making the most money with the least amount of risk, the difference between NLP and NLU, get a take on Nvidia's new data science workstation, and much, much more.
- Practical Speech Recognition with Python: The Basics - Jul 9, 2019.
Do you fear implementing speech recognition in your Python apps? Read this tutorial for a simple approach to getting practical with speech recognition using open source Python libraries.
- NLP vs. NLU: from Understanding a Language to Its Processing - Jul 3, 2019.
As AI progresses and the technology becomes more sophisticated, we expect existing techniques to evolve. With these changes, will the well-founded natural language processing give way to natural language understanding? Or are the two concepts distinct enough to each hold their own niche in AI?
- Examining the Transformer Architecture – Part 2: A Brief Description of How Transformers Work - Jul 2, 2019.
As the Transformer may become the new NLP standard, this review explores its architecture along with a comparison to existing RNN-based approaches.
- XLNet Outperforms BERT on Several NLP Tasks - Jul 1, 2019.
XLNet is a new pretraining method for NLP that achieves state-of-the-art results on several NLP tasks.
- KDnuggets™ News 19:n24, Jun 26: Understand Cloud Services; Pandas Tips & Tricks; Master Data Preparation w/ Python - Jun 26, 2019.
Happy summer! This week on KDnuggets: Understanding Cloud Data Services; How to select rows and columns in Pandas using [ ], .loc, iloc, .at and .iat; 7 Steps to Mastering Data Preparation for Machine Learning with Python; Examining the Transformer Architecture: The OpenAI GPT-2 Controversy; Data Literacy: Using the Socratic Method; and much more!
- Natural Language Processing Q&A - Jun 24, 2019.
In this Q&A, Jos Martin, Senior Engineering Manager at MathWorks, discusses recent NLP developments and the applications that are benefitting from the technology.
- Natural Language Interface to DataTable - Jun 21, 2019.
You have to write SQL queries to query data from a relational database. Sometimes, you even have to write complex queries to do that. Won't it be amazing if you could use a chatbot to retrieve data from a database using simple English? That's what this tutorial is all about.
- Examining the Transformer Architecture: The OpenAI GPT-2 Controversy - Jun 20, 2019.
GPT-2 is a generative model, created by OpenAI, trained on 40GB of Internet text to predict the next word. And OpenAI found this model to be SO good that they did not release the fully trained model due to their concerns about malicious applications of the technology.
- Spark NLP: Getting Started With The World’s Most Widely Used NLP Library In The Enterprise - Jun 18, 2019.
The Spark NLP library has become a popular AI framework that delivers speed and scalability to your projects. Check out what's under the hood and learn how to get started with Spark NLP from John Snow Labs.
- Monash University: Academic Opportunities in Dialogue Research [Melbourne, Australia] - Jun 18, 2019.
Seeking outstanding academics who want to join this world-class team to deliver the highest quality teaching and research that will shape the future of AI for conversational assistants, human-robot interaction, customer service, and many other application domains.
- NLP and Computer Vision Integrated - Jun 5, 2019.
Computer vision and NLP developed as separate fields, and researchers are now combining these tasks to solve long-standing problems across multiple disciplines.
- KDnuggets™ News 19:n21, Jun 5: Transitioning your Career to Data Science; 11 top Data Science, Machine Learning platforms; 7 Steps to Mastering Intermediate ML w. Python - Jun 5, 2019.
The results of KDnuggets 20th Annual Software Poll; How to transition to a Data Science career; Mastering Intermediate Machine Learning with Python; Understanding Natural Language Processing (NLP); Backprop as applied to LSTM, and much more.
- Analyzing Tweets with NLP in Minutes with Spark, Optimus and Twint - May 24, 2019.
Social media has been gold for studying the way people communicate and behave. In this article, I’ll show you the easiest way to analyze tweets without the Twitter API, in a way that scales to Big Data.
- Your Guide to Natural Language Processing (NLP) - May 23, 2019.
This extensive post covers NLP use cases, basic examples, Tokenization, Stop Words Removal, Stemming, Lemmatization, Topic Modeling, the future of NLP, and more.
- When Too Likely Human Means Not Human: Detecting Automatically Generated Text - May 23, 2019.
Passably-human automated text generation is a reality. How do we best go about detecting it? As it turns out, being too predictably human may actually be a reasonably good indicator of not being human at all.
- Extracting Knowledge from Knowledge Graphs Using Facebook’s Pytorch-BigGraph - May 22, 2019.
We use state-of-the-art Deep Learning tools to build a model that predicts a word using the surrounding words as labels.
- Brookhaven National Laboratory: Postdoc in Materials Informatics [Upton, NY] - May 14, 2019.
Seeking candidates to develop and apply information retrieval, information extraction, and various Natural Language Processing (NLP) techniques to the scientific literature in materials science and crystallography for the purpose of building prototype computational data systems.
- A Complete Exploratory Data Analysis and Visualization for Text Data: Combine Visualization and NLP to Generate Insights - May 9, 2019.
For a Data Scientist or NLP specialist, visually representing the content of a text document is one of the most important tasks in text mining. However, there are some gaps between visualizing unstructured (text) data and structured data.
- Build Your First Chatbot Using Python & NLTK - May 1, 2019.
Today we will learn to create a simple chat assistant or chatbot using Python’s NLTK library.
- Building a Flask API to Automatically Extract Named Entities Using SpaCy - Apr 17, 2019.
This article discusses how to use the Named Entity Recognition module in spaCy to identify people, organizations, or locations in text, then deploy a Python API with Flask.
- KDnuggets™ News 19:n14, Apr 10: Which Data Science/ML methods and algorithms you used? Predict Age and Gender Using Neural Nets - Apr 10, 2019.
Getting started with NLP using the PyTorch framework; Building a Recommender System; Advice for New Data Scientists; All you need to know about text preprocessing for NLP and Machine Learning; Advanced Keras - Constructing Complex Custom Losses and Metrics; Top 8 Data Science Use Cases in Gaming
- All you need to know about text preprocessing for NLP and Machine Learning - Apr 9, 2019.
We present a comprehensive introduction to text preprocessing, covering techniques including stemming, lemmatization, noise removal, and normalization, with examples and explanations of when you should use each of them.
- Another 10 Free Must-See Courses for Machine Learning and Data Science - Apr 5, 2019.
Check out another follow-up collection of free machine learning and data science courses to give you some spring study ideas.
- Getting started with NLP using the PyTorch framework - Apr 3, 2019.
We discuss the classes that PyTorch provides for helping with Natural Language Processing (NLP) and how they can be used for related tasks using recurrent layers.
- What Does GPT-2 Think About the AI Arms Race? - Apr 1, 2019.
It may be April first, but that doesn't mean you will necessarily be fooled by GPT-2's views on the AI arms race. Why not have a read for fun and see what the language generation model is capable of?
- KDnuggets™ News 19:n11, Mar 20: Another 10 Free Must-Read Books for Data Science; 19 Inspiring Women in AI, Big Data, Machine Learning - Mar 20, 2019.
Also: Who is a typical Data Scientist in 2019?; The Pareto Principle for Data Scientists; My favorite mind-blowing Machine Learning/AI breakthroughs; Building NLP Classifiers Cheaply With Transfer Learning and Weak Supervision; Advanced Keras - Accurately Resuming a Training Process
- Building NLP Classifiers Cheaply With Transfer Learning and Weak Supervision - Mar 15, 2019.
In this blog, I’ll walk you through a personal project in which I cheaply built a classifier to detect anti-semitic tweets, with no public dataset available, by combining weak supervision and transfer learning.
- Beyond news contents: the role of social context for fake news detection - Mar 7, 2019.
Today we’re looking at a more general fake news problem: detecting fake news that is being spread on a social network. This is a summary of a recent paper which demonstrates why we should also look at the social context: the publishers and the users spreading the information!
- Top KDnuggets tweets, Feb 27 – Mar 05: How to Setup a Python Environment for Machine Learning; How to do Everything in Computer Vision - Mar 6, 2019.
Also: Python Data Science for Beginners; Deep Learning for Natural Language Processing (NLP) - using RNNs and CNNs.
- Deconstructing BERT, Part 2: Visualizing the Inner Workings of Attention - Mar 6, 2019.
In this post, the author shows how BERT can mimic a Bag-of-Words model. The visualization tool from Part 1 is extended to probe deeper into the mind of BERT, to expose the neurons that give BERT its shape-shifting superpowers.
- OpenAI’s GPT-2: the model, the hype, and the controversy - Mar 4, 2019.
OpenAI recently released a very large language model called GPT-2. Controversially, they decided not to release the data or the parameters of their biggest model, citing concerns about potential abuse. Read this researcher's take on the issue.
- Deconstructing BERT: Distilling 6 Patterns from 100 Million Parameters - Feb 27, 2019.
Google’s BERT algorithm has emerged as a sort of “one model to rule them all.” BERT builds on two key ideas that have been responsible for many of the recent advances in NLP: (1) the transformer architecture and (2) unsupervised pre-training.
- Deep Learning for Natural Language Processing (NLP) – using RNNs & CNNs - Feb 21, 2019.
We investigate several Natural Language Processing tasks and explain how Deep Learning can help, looking at Language Modeling, Sentiment Analysis, Language Translation, and more.
- Word Embeddings in NLP and its Applications - Feb 20, 2019.
Word embeddings such as Word2Vec are a key AI method that bridges the human understanding of language to that of a machine, and they are essential to solving many NLP problems. Here we discuss applications of Word2Vec to survey responses, comment analysis, recommendation engines, and more.
- State of the art in AI and Machine Learning – highlights of papers with code - Feb 20, 2019.
We introduce Papers with Code, the free and open resource of state-of-the-art Machine Learning papers, code, and evaluation tables.
- Are BERT Features InterBERTible? - Feb 19, 2019.
This is a short analysis of the interpretability of BERT contextual word representations. Does BERT learn a semantic vector representation like Word2Vec?
- KDnuggets™ News 19:n07, Feb 13: The Best and Worst Data Visualizations of 2018; Gartner 2019 Magic Quadrant for Data Science Platforms - Feb 13, 2019.
Also: Data-science? Agile? Cycles?; How I used NLP (Spacy) to screen Data Science Resumes; Neural Networks - an Intuition; A Quick Guide to Feature Engineering; Understanding Gradient Boosting Machines