Master Transformers with This Free Stanford Course!

If you want a deep dive on transformers, this Stanford course has made its courseware freely available, including lecture videos, readings, assignments, and more.

Screenshot from Transformers United: DL Models that have revolutionized NLP, CV, RL


State of the art (SOTA) natural language processing (NLP) means transformers!

Not Optimus Prime or Bumblebee (80s kids stand up!) but the neural network architecture that has turned the world of NLP on its head over the past couple of years.

There are many resources for learning about transformers and how they apply to SOTA NLP, but there seems to be only one course that takes a methodical, academic approach to really understanding transformers inside and out, and that class is Stanford's CS25: Transformers United.

Let me get this out of the way right away: this is not a course you can enroll in and receive credit for. This is the freely-available content of the first university course on transformers (as far as I can tell), from Stanford instructors, under the tutelage of renowned NLP researcher and faculty member, Chris Manning. But don't get upset; all of the course's material is available, from readings to videos and beyond. If you are really interested in understanding transformers under the hood, this is the course for you.

And while transformers may be most closely associated with NLP applications, there's more than meets the eye.


Since their introduction in 2017, transformers have revolutionized Natural Language Processing (NLP). Now, transformers are finding applications all over Deep Learning, be it computer vision (CV), reinforcement learning (RL), Generative Adversarial Networks (GANs), Speech or even Biology. Among other things, transformers have enabled the creation of powerful language models like GPT-3 and were instrumental in DeepMind's recent AlphaFold2, that tackles protein folding.

From Stanford's CS25: Transformers United course homepage


This course covers it all, spread out over 10 weekly lectures. You start with attention, the concept that gave rise to transformers, and move on to transformers themselves, then on to their NLP applications, vision transformers, pretrained transformers, switch transformers, and much more. You get lectures by Geoff Hinton, Chris Olah, Aidan Gomez, and many other knowledgeable researchers in the field. The course instructors are Div Garg, Chetanya Rastogi, and Advay Pal, with Chris Manning, as previously mentioned, serving as faculty advisor.
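To give a flavor of what the course's opening material covers, here is a minimal, hedged sketch of scaled dot-product attention, the core operation behind transformers, written in plain NumPy (the variable names and the toy example are my own illustration, not code from the course):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the max before exponentiating
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V"""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of each query to each key
    weights = softmax(scores, axis=-1)   # each row is a probability distribution
    return weights @ V                   # weighted average of the value vectors

# Toy example: a "sequence" of 3 tokens with embedding dimension 4
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # each token gets a context-aware vector: (3, 4)
```

In a real transformer, Q, K, and V are learned linear projections of the same input, and many such attention "heads" run in parallel; the course builds up to that full picture from this basic operation.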

You can find the lecture videos below, as well as on YouTube.



This seems to be the current pinnacle of transformer courseware available anywhere, so if you are serious about learning transformers in depth, head over to the course page and get started right now.

Matthew Mayo (@mattmayo13) is a Data Scientist and the Editor-in-Chief of KDnuggets, the seminal online Data Science and Machine Learning resource. His interests lie in natural language processing, algorithm design and optimization, unsupervised learning, neural networks, and automated approaches to machine learning. Matthew holds a Master's degree in computer science and a graduate diploma in data mining. He can be reached at editor1 at kdnuggets[dot]com.