- KDnuggets™ News 19:n45, Nov 27: Interpretable vs black box models; Advice for New and Junior Data Scientists - Nov 27, 2019.
This week: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead; Advice for New and Junior Data Scientists; Python Tuples and Tuple Methods; Can Neural Networks Develop Attention? Google Thinks they Can; Three Methods of Data Pre-Processing for Text Classification
- Three Methods of Data Pre-Processing for Text Classification - Nov 21, 2019.
This blog shows how text data representations can be used to build a classifier to predict a developer’s deep learning framework of choice based on the code that they wrote, via examples of TensorFlow and PyTorch projects.
- What my first Silver Medal taught me about Text Classification and Kaggle in general? - May 13, 2019.
A first-hand account of ideas tried by a competitor at the recent Kaggle competition 'Quora Insincere Questions Classification', with a brief summary of some of the other winning solutions.
- Building NLP Classifiers Cheaply With Transfer Learning and Weak Supervision - Mar 15, 2019.
In this blog, I’ll walk you through a personal project in which I cheaply built a classifier to detect anti-semitic tweets, with no public dataset available, by combining weak supervision and transfer learning.
- How to solve 90% of NLP problems: a step-by-step guide - Jan 14, 2019.
Read this insightful, step-by-step article on how to use machine learning to understand and leverage text.
- Introduction to Named Entity Recognition - Dec 11, 2018.
Named Entity Recognition is a tool that invariably comes in handy when we do Natural Language Processing tasks. Read on to find out how.
- Word Morphing – an original idea - Nov 20, 2018.
In this post, we describe how to use word2vec embeddings and the A* search algorithm to morph between words.
- Multi-Class Text Classification with Doc2Vec & Logistic Regression - Nov 9, 2018.
Doc2vec is an NLP tool for representing documents as vectors and is a generalization of the word2vec method. To understand doc2vec, it is advisable to first understand the word2vec approach.
- KDnuggets™ News 18:n42, Nov 7: The Most in Demand Skills for Data Scientists; How Machines Understand Our Language: Intro to NLP - Nov 7, 2018.
Also: Machine Learning Classification: A Dataset-based Pictorial; Quantum Machine Learning: A look at myths, realities, and future projections; Multi-Class Text Classification Model Comparison and Selection; Top 13 Python Deep Learning Libraries
- Multi-Class Text Classification Model Comparison and Selection - Nov 1, 2018.
This is what we are going to do today: use everything that we have presented about text classification in the previous articles (and more) and compare the text classification models we trained in order to choose the most accurate one for our problem.
- Named Entity Recognition and Classification with Scikit-Learn - Oct 25, 2018.
Named Entity Recognition and Classification is a process of recognizing information units like names, including person, organization and location names, and numeric expressions from unstructured text. The goal is to develop practical and domain-independent techniques in order to detect named entities with high accuracy automatically.
- The Main Approaches to Natural Language Processing Tasks - Oct 17, 2018.
Let's have a look at the main approaches to NLP tasks that we have at our disposal. We will then have a look at the concrete NLP tasks we can tackle with said approaches.
- Machine Reading Comprehension: Learning to Ask & Answer - Oct 11, 2018.
Investigating the dual ask-answer network, covering the embedding, encoding, attention and output layer, as well as the loss function, with code examples to help you get started.
- Free resources to learn Natural Language Processing - Sep 18, 2018.
An extensive list of free resources to help you learn Natural Language Processing, including explanations on Text Classification, Sequence Labeling, Machine Translation and more.
- Machine Learning for Text Classification Using SpaCy in Python - Sep 11, 2018.
In this post, we will demonstrate how text classification can be implemented using spaCy without having any deep learning experience.
- Multi-Class Text Classification with Scikit-Learn - Aug 27, 2018.
The vast majority of text classification articles and tutorials on the internet cover binary text classification, such as email spam filtering and sentiment analysis. Real-world problems are much more complicated than that.
- Text Classification & Embeddings Visualization Using LSTMs, CNNs, and Pre-trained Word Vectors - Jul 5, 2018.
In this tutorial, I classify Yelp round-10 review datasets. After processing the review comments, I trained three models in three different ways and obtained three word embeddings.
- Overview and benchmark of traditional and deep learning models in text classification - Jul 3, 2018.
In this post, traditional and deep learning models in text classification will be thoroughly investigated, including a discussion of both recurrent and convolutional neural networks.
- Creating a simple text classifier using Google CoLaboratory - Mar 15, 2018.
Google CoLaboratory is Google’s latest contribution to AI, wherein users can code in Python using a Chrome browser in a Jupyter-like environment. In this article I share a method, and code, to create a simple binary text classifier using Scikit-Learn within the Google CoLaboratory environment.
- Automated Text Classification Using Machine Learning - Jan 30, 2018.
In this post, we talk about the technology, applications, customization, and segmentation related to our automated text classification API.
- Top KDnuggets tweets, Jul 12-18: 10 Free #MustRead Books for #MachineLearning and #DataScience; Why #AI and Machine Learning? - Jul 19, 2017.
Also: Top 32 Reasons #DataScience Projects and Teams Fail; Text Classifier Algorithms in #MachineLearning; The 4 Types of #Data #Analytics: Descriptive, Diagnostic ...
- New James Bond is a Data Scientist: Data Science Challenge sponsored by UK MI5 and MI6 - Apr 5, 2017.
Two Data Science challenges were launched by UK Government agencies, including MI5 and MI6. One challenge involves classifying vehicles from aerial images, and the other involves analyzing crisis reports. Can you take part and be the next James Bond?
- Measuring Topic Interpretability with Crowdsourcing - Nov 30, 2016.
Topic modelling is an important statistical modelling technique for discovering abstract topics in a collection of documents. This article talks about a new measure for assessing the semantic properties of statistical topics and how to use it.
- SAS: Machine Learning Algorithm Research/Developer - Jul 15, 2015.
Developer with a strong analytical background and excellent programming skills to collaborate with a team developing new machine learning algorithms for NLP, text classification, sentiment analysis, and similar tasks.
- Excellent Tutorial on Sequence Learning using Recurrent Neural Networks - Jun 26, 2015.
Excellent tutorial explaining Recurrent Neural Networks (RNNs), which hold great promise for learning general sequences and have applications in text analysis, handwriting recognition, and even machine translation.
- Deep Learning for Text Understanding from Scratch - Mar 13, 2015.
Forget about the meaning of words, forget about grammar, forget about syntax, forget even the very concept of a word. Now let the machine learn everything by itself.
- Text Analysis 101: Document Classification - Jan 24, 2015.
Document classification is an example of Machine Learning (ML) in the form of Natural Language Processing (NLP). By classifying text, we are aiming to assign one or more classes or categories to a document, making it easier to manage and sort.
- Lexalytics: Stop ignoring your text data - Dec 9, 2014.
Have a bunch of open-ended questions that need analyzing? Curious about social media? Want to track online reviews? Try Lexalytics Semantria Web Service - first 20k documents analyzed free.
- WordStat 7: Shifting Text Analytics Into High Gear - Dec 2, 2014.
WordStat 7 lets users get valuable and actionable insights from text faster, connect unstructured and structured information, and get better help with creating and validating accurate text-categorization dictionaries.
- Top KDnuggets tweets, Aug 8-10: Forget SQL vs NoSQL. New trend is HTAP: Hybrid Transaction/Analytical Processing - Aug 11, 2014.
Forget SQL vs NoSQL. New trend is HTAP: Hybrid Transaction/Analytical Processing; Metrics that Matter - The Key to Perfect Dashboards; Machine Learning Tutorial: The Max Entropy Text Classifier; Six Thinking Hats and the Life of a Data Scientist.
- Exclusive Interview: Richard Socher, founder of etcML, Easy Text Classification - Mar 31, 2014.
An exclusive interview with Richard Socher, co-founder of etcML, a new, free tool that helps users create text classifiers using machine learning.
- etcML Promises to Make Text Classification Easy - Mar 5, 2014.
etcML is a new and free tool that allows even novice users to use the power of machine learning and text classification.
- More Data Mining with Weka - Jan 30, 2014.
This online course teaches both principles and practical data mining techniques, lets students work on very big datasets, classify text, experiment with clustering, and much more.