- KDnuggets™ News 20:n40, Oct 21: fastcore: An Underrated Python Library; Goodhart’s Law for Data Science: what happens when a measure becomes a target? - Oct 21, 2020.
fastcore: An Underrated Python Library; Goodhart's Law for Data Science and what happens when a measure becomes a target?; Text Mining with R: The Free eBook; Free From MIT: Intro to Computational Thinking and Data Science; How to ace the data science coding challenge
- Text Mining with R: The Free eBook - Oct 15, 2020.
This freely-available book will show you how to perform text analytics in R, using packages from the tidyverse.
- eBook: Vocabularies, Text Mining and FAIR Data: The Strategic Role Information Managers Play - Aug 31, 2020.
How can information managers find strategic roles to play in their organization's AI and data analysis projects? Download this book to learn more.
- Linguistic Fundamentals for Natural Language Processing: 100 Essentials from Semantics and Pragmatics - Aug 31, 2020.
Algorithms for text analytics must model how language works to incorporate meaning in language—and so do the people deploying these algorithms. Bender & Lascarides 2019 is an accessible overview of what the field of linguistics can teach NLP about how meaning is encoded in human languages.
- Text Mining in Python: Steps and Examples - May 12, 2020.
Most data exists in textual form, which is highly unstructured. To produce meaningful insights from text data, we need to follow a method called Text Analysis.
- The Big Bad NLP Database: Access Nearly 300 Datasets - Feb 28, 2020.
Check out this database of nearly 300 freely-accessible NLP datasets, curated from around the internet.
- Domain-Specific Language Processing Mines Value From Unstructured Data - Aug 14, 2019.
Processing unstructured text data in real-time is challenging when applying NLP or NLU. Find out how Domain-Specific Language Processing can also help mine valuable information from data by following your guidance and using the language of your business.
- All you need to know about text preprocessing for NLP and Machine Learning - Apr 9, 2019.
We present a comprehensive introduction to text preprocessing, covering techniques including stemming, lemmatization, noise removal, and normalization, with examples and explanations of when you should use each of them.
- Towards Automatic Text Summarization: Extractive Methods - Mar 13, 2019.
The basic idea looks simple: find the gist, cut off all opinions and detail, and write a couple of perfect sentences. Yet the task has inevitably ended up in toil and turmoil. Here is a short overview of traditional approaches that have beaten a path to advanced deep learning techniques.
- Webinar: Supercharging Search: AI in Information Discovery for the Life Sciences - Mar 5, 2019.
Learn how to cut through the complexity of scientific language; hear how SciBite puts AI techniques in action, and find how to supercharge search at your organization.
- Text Preprocessing in Python: Steps, Tools, and Examples - Nov 6, 2018.
We outline the basic steps of text preprocessing, which are needed for transferring text from human language to machine-readable format for further processing. We will also discuss text preprocessing tools.
- Data Representation for Natural Language Processing Tasks - Nov 2, 2018.
In NLP we must find a way to represent our data (a series of texts) to our systems (e.g. a text classifier). As Yoav Goldberg asks, "How can we encode such categorical data in a way which is amenable for use by a statistical classifier?" Enter the word vector.
- Labeling Unstructured Text for Meaning to Achieve Predictive Lift - Oct 31, 2018.
In this post, we examine several advanced NLP techniques, including: labeling nouns and noun phrases for meaning, labeling (most often) adverbs and adjectives for sentiment, and labeling verbs for intent.
- Named Entity Recognition and Classification with Scikit-Learn - Oct 25, 2018.
Named Entity Recognition and Classification is a process of recognizing information units like names, including person, organization and location names, and numeric expressions from unstructured text. The goal is to develop practical and domain-independent techniques in order to detect named entities with high accuracy automatically.
- Machine Learning for Text Classification Using SpaCy in Python - Sep 11, 2018.
In this post, we will demonstrate how text classification can be implemented using spaCy without having any deep learning experience.
- Multi-Class Text Classification with Scikit-Learn - Aug 27, 2018.
The vast majority of text classification articles and tutorials on the internet cover binary text classification, such as email spam filtering and sentiment analysis. Real-world problems are much more complicated than that.
- Comparison of the Most Useful Text Processing APIs - Aug 23, 2018.
There is a need to compare different APIs to understand key pros and cons they have and when it is better to use one API instead of the other. Let us proceed with the comparison.
- Affordable online news archives for academic research - Aug 10, 2018.
Many researchers need access to multi-year historical repositories of online news articles. We identified three companies that make such access affordable, and spoke with their CEOs.
- WTF is TF-IDF? - Aug 2, 2018.
Relevant words are not necessarily the most frequent words since stopwords like “the”, “of” or “a” tend to occur very often in many documents.
- KDnuggets™ News 18:n27, Jul 18: Data Scientist was the sexiest job until…; Text Mining on the Command Line; Does PCA Really Work? - Jul 18, 2018.
Also: What is Minimum Viable (Data) Product?; Beating the 4-Year Slump: Mid-Career Growth in Data Science; GDPR after 2 months - What does it mean for Machine Learning?; Basic Image Data Analysis Using Numpy and OpenCV; fast.ai Deep Learning Part 2 Complete Course Notes
- Text Mining on the Command Line - Jul 13, 2018.
In this tutorial, I use raw bash commands and regex to process a raw, messy JSON file and a raw HTML page. The tutorial helps us understand the text processing mechanism under the hood.
- Natural Language Processing Nuggets: Getting Started with NLP - Jun 19, 2018.
Check out this collection of NLP resources for beginners, starting from zero and slowly progressing to the point that readers should have an idea of where to go next.
- Getting Started with spaCy for Natural Language Processing - May 2, 2018.
spaCy is a Python natural language processing library specifically designed with the goal of being a useful library for implementing production-ready systems. It is particularly fast and intuitive, making it a top contender for NLP tasks.
- Implementing Deep Learning Methods and Feature Engineering for Text Data: The GloVe Model - Apr 25, 2018.
GloVe stands for Global Vectors. It is an unsupervised learning model that can be used to obtain dense word vectors, similar to Word2Vec.
- Implementing Deep Learning Methods and Feature Engineering for Text Data: The Skip-gram Model - Apr 10, 2018.
Just like we discussed in the CBOW model, we need to model this Skip-gram architecture now as a deep learning classification model such that we take in the target word as our input and try to predict the context words.
- Machine Learning for Text - Apr 9, 2018.
This book covers machine learning techniques from text using both bag-of-words and sequence-centric methods. The scope of coverage is vast, and it includes traditional information retrieval methods and also recent methods from neural networks and deep learning.
- Understanding Feature Engineering: Deep Learning Methods for Text Data - Mar 28, 2018.
Newer, advanced strategies for taming unstructured, textual data: In this article, we will be looking at more advanced feature engineering strategies which often leverage deep learning models.
- Text Data Preprocessing: A Walkthrough in Python - Mar 26, 2018.
This post will serve as a practical walkthrough of a text data preprocessing task using some common Python tools.
- Text Processing in R - Mar 9, 2018.
There are good reasons to want to use R for text processing, namely that we can do it, and that we can fit it in with the rest of our analyses. Furthermore, there is a lot of very active development going on in the R text analysis community right now.
- Training and Visualising Word Vectors - Jan 23, 2018.
In this tutorial I want to show how you can implement a skip-gram model in TensorFlow to generate word vectors for any text you are working with, and then use TensorBoard to visualize them.
- Elasticsearch for Dummies - Jan 12, 2018.
In this blog, you’ll get to know the basics of Elasticsearch, its advantages, how to install it, and how to index documents with it.
- OpenMinTED Open Tender Phase II Funding opportunity for text and data mining developers - Jan 11, 2018.
OpenMinTED invites researchers, service providers and SMEs to submit proposals related to the development and integration of existing text mining/NLP applications or software components. Apply by Jan 26, 2018.
- Top KDnuggets tweets, Nov 29 – Dec 5: Teaching the Data Science Process - Dec 6, 2017.
Also: An Introduction to Key Data Science Concepts; Using Deep Learning To Extract Knowledge From Job Descriptions; A General Approach to Preprocessing Text Data; keras-text - A Text Classification Library in #Keras.
- A General Approach to Preprocessing Text Data - Dec 1, 2017.
Recently we had a look at a framework for textual data science tasks in their totality. Now we focus on putting together a generalized approach to attacking text data preprocessing, regardless of the specific textual data science task you have in mind.
- KDnuggets™ News 17:n45, Nov 29: New Poll: Data Science Methods Used? Deep Learning Specialization: 21 Lessons Learned - Nov 29, 2017.
Also: The 10 Statistical Techniques Data Scientists Need to Master; Did Spark Really Kill Hadoop?; A Framework for Textual Data Science.
- Building a Wikipedia Text Corpus for Natural Language Processing - Nov 23, 2017.
Wikipedia is a rich source of well-organized textual data, and a vast collection of knowledge. What we will do here is build a corpus from the set of English Wikipedia articles, which is freely and conveniently available online.
- A Framework for Approaching Textual Data Science Tasks - Nov 22, 2017.
Although NLP and text mining are not the same thing, they are closely related, deal with the same raw data type, and have some crossover in their uses. Let's discuss the steps in approaching these types of tasks.
- Tips for Getting Started with Text Mining in R and Python - Nov 8, 2017.
This article opens up the world of text mining in a simple and intuitive way and provides great tips to get started with text mining.
- Webinar: Taking Semantic Search to Full Text, Nov 7 - Oct 31, 2017.
Learn about content challenges of R&D teams in the life sciences, the benefits of semantic enrichment, and a solution that reduces overhead and adds value to information discovery and innovation initiatives.
- Top 10 Machine Learning with R Videos - Oct 24, 2017.
A complete video guide to Machine Learning in R! This great compilation of tutorials and lectures is an amazing recipe to start developing your own Machine Learning projects.
- Tackling Unstructured Data With Text Exploration – On-demand webcast - Sep 7, 2017.
Discover how to use a platform to organize unstructured data to see the linkages between word usage and document of origin, see the themes in a word cloud, and use topic extraction and document clustering.
- Search Millions of Documents for Thousands of Keywords in a Flash - Sep 1, 2017.
We present a Python library called FlashText that can search or replace keywords and synonyms in documents in O(n), i.e. linear, time.
- Text Exploration Info Kit - Aug 4, 2017.
Get the free kit, which includes a webcast with a text analytics expert on how he helps clients make sense of text data, a book chapter on text mining, and more.
- Web Scraping with R: Online Food Blogs Example - Jun 29, 2017.
We consider scraping data from online food blogs to construct a data set of recipes with ingredients, nutritional information and more, and do exploratory analysis which provides tasty insights.
- Text Mining 101: Mining Information From A Resume - May 24, 2017.
We show a framework for mining relevant entities from a text resume, and how to separate parsing logic from entity specification.
- Using Deep Learning To Extract Knowledge From Job Descriptions - May 9, 2017.
We present a deep learning approach to extract knowledge from a large amount of data from the recruitment space. A learning to rank approach is followed to train a convolutional neural network to generate job title and job description embeddings.
- Text Analytics: A Primer - Mar 14, 2017.
Marketing scientist Kevin Gray asks Professor Bing Liu to give us a quick snapshot of text analytics in this informative interview.
- Text Mining Amazon Mobile Phone Reviews: Interesting Insights - Jan 10, 2017.
We analyzed more than 400 thousand reviews of unlocked mobile phones sold on Amazon.com to find insights into reviews, ratings, price and their relationships.
- Social Media for Marketing and Healthcare: Focus on Adverse Side Effects - Jan 9, 2017.
Social media platforms like Twitter and Facebook are important sources of big data on the internet, and text mining can surface valuable insights about a product or service to help marketing teams. Let's see how healthcare companies are using big data and text mining to improve their marketing strategies.
- Measuring Topic Interpretability with Crowdsourcing - Nov 30, 2016.
Topic modelling is an important statistical modelling technique for discovering abstract topics in a collection of documents. This article talks about a new measure for assessing the semantic properties of statistical topics and how to use it.
- Easy Access to Full-Text Articles for Text Mining, Nov 15 Webinar - Nov 8, 2016.
Learn about the benefits of text mining full-text articles compared to abstracts, how to streamline the text mining process, how the new CCC and Linguamatics text mining solution works, and more.
- What is emotion analytics and why is it important? - Oct 19, 2016.
In today’s Internet world, people express their emotions, sentiments and feelings via text/comments, emojis, likes and dislikes. Understanding the true meaning behind combinations of these electronic symbols is crucial, and that is what this article explains.
- Webinar: The Role of Text Mining in Patent Research, Oct 6 - Sep 30, 2016.
Learn how Dr. Thorsten Schweikardt developed a patent analysis workflow, making a previously difficult task achievable by using text mining.
- The Great Algorithm Tutorial Roundup - Sep 20, 2016.
This is a collection of tutorials relating to the results of the recent KDnuggets algorithms poll. If you are interested in learning or brushing up on the most used algorithms, as per our readers, look here for suggestions on doing so!
- America’s Next Topic Model - Jul 15, 2016.
Topic modeling is a great way to get a bird's eye view of a large document collection using machine learning. Here are 3 ways to use the open source Python tool Gensim to choose the best topic model.
- Mining Twitter Data with Python Part 7: Geolocation and Interactive Maps - Jul 6, 2016.
The final part of this 7 part series explores using geolocation and interactive maps with Twitter data.
- KDnuggets™ News 16:n24, Jul 6: Text Mining 101; Softmax and Logistic Regression; Data Mining History: Support Vector Machines - Jul 6, 2016.
What is Softmax Regression and How is it Related to Logistic Regression; Text Mining 101: Topic Modeling; Data Mining History: The Invention of Support Vector Machines; Mining Twitter Data with Python Part 5: Data Visualisation Basics
- Mining Twitter Data with Python Part 6: Sentiment Analysis Basics - Jul 5, 2016.
Part 6 of this series builds on the previous installments by exploring the basics of sentiment analysis on Twitter data.
- Text Mining 101: Topic Modeling - Jul 1, 2016.
We introduce the concept of topic modelling and explain two methods: Latent Dirichlet Allocation and TextRank. The techniques are ingenious in how they work – try them yourself.
- Mining Twitter Data with Python Part 5: Data Visualisation Basics - Jun 29, 2016.
Part 5 of this series takes on data visualization, as we look to make sense of our data and highlight interesting insights.
- Mining Twitter Data with Python Part 4: Rugby and Term Co-occurrences - Jun 27, 2016.
Part 4 of this series employs some of the lessons learned thus far to analyze tweets related to rugby matches and term co-occurrences.
- Mining Twitter Data with Python Part 3: Term Frequencies - Jun 22, 2016.
Part 3 of this 7 part series focusing on mining Twitter data discusses the analysis of term frequencies for meaningful term extraction.
- Data Mining Panama Papers & Graph Analytics – Two Upcoming Webinars - May 16, 2016.
Ontotext offers a pair of free live webinars: Diving in Panama Papers and Open Data to Discover Emerging News, and GraphDB Fundamentals: Transforming your Graph Analytics with GraphDB. Reserve your spot today.
- A Data Science Approach to Writing a Good GitHub README - May 4, 2016.
The README is the first file users look for when checking out a code repository. Learn what you should write in your README files and analyze the effectiveness of your existing ones.
- New Books on Text Mining, Visualization, Social Media Analysis - Feb 16, 2016.
New books on "Text Mining and Visualization with Open-Source Tools" and "Graph-Based Social Media Analysis" provide essential and up-to-date information on these key topics. Use code BZQ31 to save 20%.
- New Tools Predict Markets with 99.9% certainty - Feb 8, 2016.
Predicting financial markets is a relatively new field of research. It is cross-disciplinary and difficult, and it requires some insight into trading, computational linguistics, behavioral finance, pattern recognition, and learning models.
- Webinar: Text Mining Along the Drug Development Pipeline, Jan 28 - Jan 19, 2016.
Research and development of a single drug can take 10 years and cost billions. Learn about applications and business value of text mining in the life sciences through a series of real world examples.
- Everything You Need to Know about Natural Language Processing - Dec 21, 2015.
Natural language processing (NLP) helps computers understand human speech and language. We define the key NLP concepts and explain how it fits in the bigger picture of Artificial Intelligence.
- PAKDD 2016 Data Science Contest: Sarcasm detection on Reddit comments - Dec 17, 2015.
The contest task is to design an effective algorithm for sarcasm detection in the domain of opinion mining. Submissions due Feb 15, 2016.
- Roche (Basel): Postdoctoral Fellow, Biomedical Text and Data Mining - Dec 2, 2015.
You will receive scientific mentoring from both Roche and its academic partners, and gain valuable research experience from both academic and industrial perspectives.
- Data By the Bay: Data Science and Engineering in Four Directions - Nov 16, 2015.
The main goal of Data By the Bay is connecting the best data engineers, data scientists and data-driven startup leaders with each other. Co-located conferences will focus on Data, Text, Democracy, AI and IoT, and Life Sciences, May 17 - 20, 2016.
- Tutorial: Building a Twitter Sentiment Analysis Process - Nov 3, 2015.
Tutorial on collecting and analyzing tweets using the “Text Analysis by AYLIEN” extension for RapidMiner.
- BABELNET 3.5, Largest Multilingual Dictionary and Semantic Network - Sep 29, 2015.
BabelNet 3.5 covers 272 languages, and offers an improved user interface, new integrated resources of Wikiquote, VerbNet, Microsoft Terminology, GeoNames, WoNeF and ImageNet, and a very large knowledge base with over 380 million semantic relations.
- EPA: Text Mining Specialist - Sep 28, 2015.
Conduct research into the development and application of semantic search and text mining approaches to analyze large collections of documents.
- SentimentBuilder: Visual Analysis of Unstructured Texts - Sep 18, 2015.
Sankey diagrams are mainly used to visualize energy flows, material flows and trade-offs. SentimentBuilder found a way to use them with unstructured text in their online NLP tool.
- NSA Patents Analysis and Visualization - Sep 6, 2015.
To understand the details and images of nearly 300 patents filed by the National Security Agency, Alice Corona collected data made available by the USPTO and put it into the Silk data publishing and data visualization platform.
- NASARI: a Novel Approach to a Semantically-Aware Representation of Items - Sep 3, 2015.
NASARI 2.0 provides semantic vector representations for BabelNet synsets in several languages. BabelNet covers WordNet and Wikipedia among other resources, making these vectors applicable to representations of concepts.
- CHEMDNER competition: Chemical and drug name recognition task in patents - Jul 18, 2015.
Want to test your data science skills? Get ready for the next text mining and information retrieval challenge by CHEMDNER.
- Mitra Capital: Data Scientist/Machine Learning Engineer - Jul 16, 2015.
Seeking an experienced Data Scientist to help create innovative technology solutions at the intersection of language analysis, data-driven insights and logic-based workflow.
- Top KDnuggets tweets, Jun 16-22: Deep Learning resources from O’Reilly; Free Kaggle Machine Learning Tutorial in R - Jun 23, 2015.
#DeepLearning resources from @OReillyMedia to help you get started; Free @Kaggle #MachineLearning Tutorial in R - learn how to compete in #DataScience; Data Scientists, enjoy your fat salaries while you can; Computational Aesthetics #Algorithm Spots #Beauty That Humans Overlook.
- Most Viewed Data Mining Videos on YouTube - May 18, 2015.
The top Data Mining YouTube videos, from sources such as Google and Revolution Analytics, cover topics ranging from statistics in data mining to using R for data mining to data mining in sports.
- Provalis Research WordStat for Stata combines Numerical, Text Analysis - Apr 14, 2015.
This new collaboration couples the cutting-edge numerical analysis of Stata with the unique text analytics functionality of Provalis Research.
- Upcoming Webcasts on Analytics, Big Data, Data Science – Feb 24 and beyond - Feb 23, 2015.
Winning with Big Data Analytics, a Roadmap for Data-Driven Culture, Data Science for Workforce Optimization, Text Mining and Knowledge Graphs in the Cloud, Performance and Scale Options for R with Hadoop, and more.
- Prismatic Interest Graph [API]: Organize and Recommend Content - Feb 20, 2015.
Prismatic Interest Graph API provides a set of tools for automatically analyzing unstructured text and annotating it with a variety of tags that are useful for organizing and recommending content.
- Fun and Top! US States in 2 Words using twitteR - Feb 19, 2015.
Combining twitteR package with text mining techniques and visualization tools can produce interesting outputs. Find out which US state is fun and top, and which is good and crazy, according to Twitter.
- Ontotext: Integrated Text Mining and Triplestores, a form of graph database - Feb 12, 2015.
Learn about 2 hot trends: RDF triplestores, a form of graph database, and the use of text mining to extract meaning from Big Data, and how Ontotext enables both. Free eval, Feb 26 webinar, and more.
- Upcoming Webcasts on Analytics, Big Data, Data Science – Feb 10 and beyond - Feb 9, 2015.
Data Mining: Failure to Launch, 3 Ways to Improve your Regression, The Pragmatic Text Miner, Make It Big As a Data Scientist in 2015, Managing Big Data in Production and more.
- Year 2014 in Review as Seen by a Event Detection System - Jan 29, 2015.
We examine the significant events of 2014 found by event/trend detection tool Signi-Trend, including Sochi, Ukraine and Russia, Malaysian airlines, and Islamic State (ISIS).
- Webinar: The Pragmatic Text Miner, Feb 11 - Jan 29, 2015.
Learn about text mining of biomedical literature, challenges in building large collections, and how bioinformatics professionals can overcome those challenges.
- BabelNet 3.0, Large Multilingual Dictionary and Semantic Network - Dec 20, 2014.
BabelNet 3.0 covers 271 languages, and offers a brand-new user interface, improved accuracy, seamless integration of WordNet, Open Multilingual WordNet, Wikipedia, OmegaWiki, Wikidata and Wiktionary, and around 2 billion RDF triples available via a public SPARQL endpoint.
- Lexalytics: Stop ignoring your text data - Dec 9, 2014.
Have a bunch of open-ended questions that need analyzing? Curious about social media? Want to track online reviews? Try Lexalytics Semantria Web Service - first 20k documents analyzed free.
- Apple: Text Mining Analyst, Retail – Online - Dec 7, 2014.
Design, develop, and field unstructured data analyses that have direct and measurable impact to the management of the Apple Online and Retail Stores.
- Big Data & Analytics Innovation Summit, Australia: Day 2 Highlights - Oct 28, 2014.
Highlights from the presentations by Big Data leaders from Paypal, Huawei and Qantas on day 2 of Big Data & Analytics Innovation Summit 2014 in Sydney, Australia.
- Catalytic DS: Biomedical Text Mining Developer - Oct 16, 2014.
Help develop cloud-based text analytics solutions that enable researchers to use biomedical information locked in vast repositories of 'read only' scientific publications.
- Big Data & Analytics for Retail Summit 2014 Chicago: Day 2 Highlights - Oct 9, 2014.
Highlights from the presentations by Big Data leaders from The Hershey Company, Gongos, Clarks, and Mediacom on day 2 of Big Data & Analytics for Retail Summit 2014 in Chicago.
- Top stories for Sep 28 – Oct 4: Mirador, a free tool for visual exploration of complex datasets - Oct 5, 2014.
Mirador, a free tool for visual exploration of complex datasets; Data Science is mainly a Human Science; Get Started in Text Analytics; Associations and Text Mining of World Events.
- Get Started in Text Analytics - Sep 30, 2014.
Text analytics / text mining is the natural extension of predictive analytics and has wide applications in marketing, business, and many industries. Learn text analytics with Statistics.com online program that starts Feb 6.
- Associations and Text Mining of World Events - Sep 30, 2014.
Applying frequent itemset analysis to text may seem daunting, but parallel hardware and two insights open the door to theme extraction.
- Most Viewed Web Mining Lectures - Sep 18, 2014.
Discover interesting lectures on topics like mining information networks and identifying influential members of online communities in this list of the top viewed web mining lectures on videolectures.net.
- Top KDnuggets tweets, Sep 10-14: Most Viewed Machine Learning and Text Mining Talks - Sep 16, 2014.
Most Viewed Machine Learning Talks and Text Mining Lectures at Videolectures; 3 Marks of Real #DataScience; MOOC: Process Mining: Data science in Action.
- Most Viewed Text Mining Lectures - Sep 12, 2014.
View some of the most popular text mining lectures from videolectures.net and learn about topics including mining web data and building large-scale information retrieval systems.
- Ontotext text mining, semantic search, and graph database - Sep 2, 2014.
Ontotext blends text mining, semantic annotation and semantic search with a graph database that infers new meaning at scale, helping organizations find meaning in large volumes of structured and unstructured data.
- Best Text Analytics Summit Presentations - Aug 26, 2014.
Before this year's Text Analytics Summit West, read the best-received presentations from the previous year's summit and learn about leveraging text analytics for business gain.
- Rule14: Text Mining/Machine Learning Expert - Aug 7, 2014.
Design and implementation of key-phrase/sentence extraction and summarization algorithms, within a variety of domains including: legal, healthcare, banking/finance/insurance.
- Interview: Vita Markman, LinkedIn on Practical Solutions for Sentiment Mining Challenges - Aug 4, 2014.
We discuss sentiment data models, significance of linguistic features, handling the noise in social conversations, industry challenges, important use cases and the appropriateness of over-simplified binary classification.
- Interview: Thomas Levi, POF on How Online Dating is Improving Matching through Big Data - Jul 29, 2014.
We discuss Big Data use cases at Plenty of Fish, insights from text mining of user profiles, using topic modeling for developing user archetypes, challenges and more.
- Interview: Kavita Ganesan, FindiLike on Building Decision Support Systems based on User Opinions - Jul 27, 2014.
We discuss the founding story of FindiLike, Opinion-driven Decision Support Systems (ODSS), challenges in analyzing user opinions, future of Sentiment Analysis, favorite books and more.
- Interview: Piero Ferrante, BCBS on Why Healthcare is Rich in Data but Poor in Information - Jul 17, 2014.
We discuss the role of analytics in healthcare payer firms, major challenges in leveraging healthcare data, the shift to value-based payments, personal motivation towards analytics, career advice and more.
- Interview: Samaneh Moghaddam, Applied Researcher, eBay on Aspect-based Opinion Mining - Jun 26, 2014.
We discuss aspect-based opinion mining, major challenges, cold start items, the need for accurate opinion mining models for cold start items and how factorized LDA can be leveraged.
- Top KDnuggets tweets, May 9-11: Data Mining for Statisticians; For teachers (and students) of Machine Learning - May 12, 2014.
Data Mining for Statisticians; For teachers (and students) of #MachineLearning - Slides for LIONbook; Build a word cloud using R text mining tools - step-by-step; Graph Theory: Key to Understanding #BigData - graphs are not just for Google or eBay.
- Additions to KDnuggets Directory in February - Mar 25, 2014.
CSMR Data Miner software suite, Ascribe text mining software, and more added to KDnuggets in February 2014.
- New book: Big Data, Mining, and Analytics: Components of Strategic Decision Making - Mar 15, 2014.
This book ties together big data, data mining, and analytics to explain how readers can leverage them to extract valuable insights from their data.
- Poll Results: Text Analytics Use Shows No Significant Change - Feb 18, 2014.
Surprisingly, the latest KDnuggets Poll did not find a significant change in Text Analytics use over the past 2 years. While 66% make some use of text analytics, only 19% use it on the majority of their projects. Text Analytics seems to be taking off very slowly.