- Multilabel Document Categorization, step by step example - Aug 31, 2021.
This detailed guide explores an unsupervised and supervised learning two-stage approach with LDA and BERT to develop a domain-specific document categorizer on unlabeled documents.
- Topic Modeling with Streamlit - May 26, 2021.
What does it take to create and deploy a topic modeling web application quickly? Read this post to see how the author uses Python NLP packages for topic modeling, Streamlit for the web application framework, and Streamlit Sharing for deployment.
- 6 NLP Techniques Every Data Scientist Should Know - Feb 12, 2021.
Natural language processing has already begun to transform to way humans interact with computers, and its advances are moving rapidly. The field is built on core methods that must first be understood, with which you can then launch your data science projects to a new level of sophistication and value.
- Topic Modeling with BERT - Nov 3, 2020.
Leveraging BERT and TF-IDF to create easily interpretable topics.
- Innovating versus Doing: NLP and CORD19 - Jun 30, 2020.
How I learned to trust the process and find value in the road most traveled.
- An Introductory Guide to NLP for Data Scientists with 7 Common Techniques - Jan 9, 2020.
Data Scientists work with tons of data, and many times that data includes natural language text. This guide reviews 7 common techniques with code examples to introduce you the essentials of NLP, so you can begin performing analysis and building models from textual data.
- Topics Extraction and Classification of Online Chats - Nov 14, 2019.
This article provides covers how to automatically identify the topics within a corpus of textual data by using unsupervised topic modelling, and then apply a supervised classification algorithm to assign topic labels to each textual document by using the result of the previous step as target labels.
- Understanding NLP and Topic Modeling Part 1 - Nov 12, 2019.
In this post, we seek to understand why topic modeling is important and how it helps us as data scientists.
- Beyond Word Embedding: Key Ideas in Document Embedding - Oct 11, 2019.
This literature review on document embedding techniques thoroughly covers the many ways practitioners develop rich vector representations of text -- from single sentences to entire books.
- An Overview of Topics Extraction in Python with Latent Dirichlet Allocation - Sep 4, 2019.
A recurring subject in NLP is to understand large corpus of texts through topics extraction. Whether you analyze users’ online reviews, products’ descriptions, or text entered in search bars, understanding key topics will always come in handy.
- Towards Automatic Text Summarization: Extractive Methods - Mar 13, 2019.
The basic idea looks simple: find the gist, cut off all opinions and detail, and write a couple of perfect sentences, the task inevitably ended up in toil and turmoil. Here is a short overview of traditional approaches that have beaten a path to advanced deep learning techniques.
- KDnuggets™ News 18:n33, Sep 5: Practical Topic Modeling with Python; Classifying AI Technologies; Data Science Project Inspiration - Sep 5, 2018.
Also: An End-to-End Project on Time Series Analysis and Forecasting with Python; Financial Data Analysis - Data Processing 1: Loan Eligibility Prediction; OLAP queries in SQL: A Refresher; Word Vectors in Natural Language Processing: Global Vectors (GloVe)
- Topic Modeling with LSA, PLSA, LDA & lda2Vec - Aug 30, 2018.
This article is a comprehensive overview of Topic Modeling and its associated techniques.
- 5 Machine Learning Projects You Can No Longer Overlook, April - Apr 13, 2017.
It's about that time again... 5 more machine learning or machine learning-related projects you may not yet have heard of, but may want to consider checking out. Find tools for data exploration, topic modeling, high-level APIs, and feature selection herein.
- Measuring Topic Interpretability with Crowdsourcing - Nov 30, 2016.
Topic modelling is an important statistical modelling technique to discover abstract topics in collection of documents. This article talks about a new measure for assessing the semantic properties of statistical topics and how to use it.
- America’s Next Topic Model - Jul 15, 2016.
Topic modeling is a a great way to get a bird's eye view on a large document collection using machine learning. Here are 3 ways to use open source Python tool Gensim to choose the best topic model.
- Text Mining 101: Topic Modeling - Jul 1, 2016.
We introduce the concept of topic modelling and explain two methods: Latent Dirichlet Allocation and TextRank. The techniques are ingenious in how they work – try them yourself.
- New Book: Mining Latent Entity Structures - Jul 21, 2015.
This collection investigate the principles and methodologies of mining latent entity structures from massive unstructured and interconnected data. We propose a text-rich information network model for modeling data in many different domains.
- Webinar: Implementing a Better Search Experience, April 28 - Apr 15, 2015.
Learn how to make SharePoint more than a place where you put documents and start transforming your collected knowledge into your *collective* knowledge.
- Text By the Bay conference, San Francisco, Apr 24-25 - Apr 2, 2015.
The inaugural Text By the Bay conference has an amazing program, with speakers from top universities, Big text data powerhouses, Growing global players, Startups, Text/NLP tech providers, and more. KDnuggets discount.
- Top /r/MachineLearning posts, Jan 25-31 - Feb 6, 2015.
Downsides to jobs in machine learning fields, AI learning materials, novel topic modelling techniques and weekly simple question threads are all topics of discussion this week on Reddit /r/MachineLearning.
- Interview: Thomas Levi, POF on How Online Dating is Improving Matching through Big Data - Jul 29, 2014.
We discuss Big Data use cases at Plenty of Fish, insights from text mining of user profiles, using topic modeling for developing user archetypes, challenges and more.