5 Fantastic Natural Language Processing Books

This curated collection of 5 natural language processing books attempts to cover a number of different aspects of the field, balancing the practical and the theoretical. Check out these 5 fantastic selections now in order to improve your NLP skills.



 

With all of the available options for learning, sometimes books get overlooked. It's an odd thought, but with all of the tutorials, blog posts, courses, etc. available all over the internet, oftentimes the tried-and-true book takes a backseat. And even if you are looking for a book on a subject, you quickly find that there are far too many of them out there to make a snap judgment as to which one will be best for you.

To help solve this problem, here are 5 fantastic books that can help you build your natural language processing knowledge. Unlike a lot of other book lists, I can say that I own, have read, and recommend each of the books in this collection. With the exception of the first entry, these books are not free, but they have proven to be worth the investment, at least from my point of view. I have chosen a diverse set of books covering different areas of study, and so I hope that there is something herein which everyone finds useful.

 

1. Natural Language Processing with Python

 
Our first book, by Steven Bird, Ewan Klein & Edward Loper, is a great starting point for learning the practical basics of natural language processing from the point of view of the Python ecosystem. Also known as the NLTK Book, Natural Language Processing with Python leans heavily on the NLTK library throughout, which is a useful piece of software for learning purposes.

From the book's preface:

This book provides a highly accessible introduction to the field of NLP. It can be used for individual study or as the textbook for a course on natural language processing or computational linguistics, or as a supplement to courses in artificial intelligence, text mining, or corpus linguistics. The book is intensely practical, containing hundreds of fully-worked examples and graded exercises.
[...]
This book is intended for a diverse range of people who want to learn how to write programs that analyze written language, regardless of previous programming experience.

As stated above, the book is definitely of a practical nature. While you will assuredly have concepts explained as you go, there is little doubt that the book is crafted as a launchpad for those looking to get going with implementing NLP solutions with Python, and doing so now.
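To give a flavor of the kind of program the book has you writing, here is a minimal sketch that uses only Python's standard library rather than NLTK itself. The `tokenize` and `top_words` helpers and the sample sentence are hypothetical, chosen purely for illustration; the book's own examples rely on NLTK's tooling instead.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into tokens made of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(text, n=3):
    """Return the n most frequent tokens along with their counts."""
    return Counter(tokenize(text)).most_common(n)


# Hypothetical sample text, not taken from the book.
sample = "The cat sat on the mat. The mat was flat."
print(top_words(sample, 2))
```

Counting token frequencies like this is about the simplest useful text analysis there is, and it is a natural first step before moving on to a full library.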

 

2. Natural Language Processing with PyTorch

 
Written by Delip Rao & Brian McMahan, the second book in our collection moves on from traditional NLP techniques to those using neural networks. Another practical approach to the subject, Natural Language Processing with PyTorch jumps straight into applying neural network NLP methods using PyTorch.

Directly from the book's website, some of the topics covered include:

  • Explore computational graphs and the supervised learning paradigm
  • Master the basics of the PyTorch optimized tensor manipulation library
  • Get an overview of traditional NLP concepts and methods
  • Learn the basic ideas involved in building neural networks
  • Use embeddings to represent words, sentences, documents, and other features
  • Explore sequence prediction and generate sequence-to-sequence models
  • Learn design patterns for building production NLP systems

This is a great starting point for transitioning from more traditional (non-neural-network-based) NLP techniques to the deep learning methods that have come to dominate the field in the past few years.

 

3. Neural Network Methods for Natural Language Processing

 
Yoav Goldberg writes this book on neural network methods for NLP. You may have begun implementing such methods using the previous book, and while "Natural Language Processing with PyTorch" does a fine job of outlining the intuitions behind its methods, Goldberg's book takes a deeper dive into explaining these concepts without the burden of implementing them in code.

From the book's website:

This book focuses on the application of neural network models to natural language data. The first half of the book (Parts I and II) covers the basics of supervised machine learning and feed-forward neural networks, the basics of working with machine learning over language data, and the use of vector-based rather than symbolic representations for words.
[...]
The second part of the book (Parts III and IV) introduces more specialized neural network architectures, including 1D convolutional neural networks, recurrent neural networks, conditioned-generation models, and attention-based models. These architectures and techniques are the driving force behind state-of-the-art algorithms for machine translation, syntactic parsing, and many other applications.

Firmly in the realm of the theoretical or explanatory, Neural Network Methods for Natural Language Processing will go a long way toward shoring up your understanding of how modern neural-network-based approaches to NLP work, and why they are employed.

 

4. Linguistic Fundamentals for Natural Language Processing

 
Of course, flying blind with respect to linguistic fundamentals is not a great idea when working with NLP, and can be of special concern when approaching NLP or computational linguistics from the purely computational side, lacking any formal study in linguistics. This book by Emily M. Bender seeks to help bridge this gap.

The book's website describes its purpose as follows:

The purpose of this book is to present in a succinct and accessible fashion information about the morphological and syntactic structure of human languages that can be useful in creating more linguistically sophisticated, more language-independent, and thus more successful NLP systems.

Bender backs this up with the following from Chapter 1:

[K]nowledge about linguistic structures can inform the design of features for machine learning approaches to NLP. Put more strongly: knowledge of linguistic structure will lead to the design of better features for machine learning.

The book is organized as 100 individual "essentials" for better understanding morphology and syntax, with the essentials grouped into chapters of related topics. If you do not have a linguistics background (I do not), this book may be a painstaking read (it is supposed to be) but will undoubtedly lead to a better linguistic understanding you can put to use in your NLP career.
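As a rough illustration of Bender's point that linguistic knowledge can shape feature design, the sketch below derives simple morphological cues (word-final suffixes and capitalization) for a single word. The `suffix_features` function and its feature names are hypothetical and are not taken from the book; they only show how a morphological observation might be turned into machine learning features.

```python
def suffix_features(word, max_len=3):
    """Build a feature dict from a word's final characters and its casing."""
    lowered = word.lower()
    features = {"is_capitalized": word[:1].isupper()}
    # Record suffixes of length 1..max_len, but only when shorter than the word itself.
    for k in range(1, max_len + 1):
        if len(lowered) > k:
            features[f"suffix_{k}"] = lowered[-k:]
    return features


print(suffix_features("Walking"))
```

In English, a suffix such as "ing" is a strong hint about a word's grammatical role, which is why features like these can help a model. Other languages mark such information differently, which is the kind of variation the book explains.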

 

5. Natural Language Processing in Action

 
Finally, this book by Hobson Lane, Hannes Hapke & Cole Howard is a return to the practical. Covering both traditional and neural-network-based approaches to NLP, Natural Language Processing in Action could be considered a combination of the first two books in this list, covering practical coding solutions using modern tools such as TensorFlow and Keras, among others.

From the book's website:

Natural Language Processing in Action is your guide to building machines that can read and interpret human language. In it, you’ll use readily available Python packages to capture the meaning in text and react accordingly. The book expands traditional NLP approaches to include neural networks, modern deep learning algorithms, and generative techniques as you tackle real-world problems like extracting dates and names, composing text, and answering free-form questions.

As the most recently released book in this list (just narrowly edging out Natural Language Processing with PyTorch) and the one with the most pages, it is likely the most up-to-date and comprehensive practical book here, and perhaps even on the market. But that doesn't mean it should be your default choice; that depends on the ecosystem you want to work in and the level of detail you are looking to gain, among other considerations.

 
You can't go wrong with any of the books on this list. First figure out what exactly you are looking to learn, and make a selection accordingly.

 