PyTorch or TensorFlow? Comparing popular Machine Learning frameworks
Machine Learning with PyTorch and Scikit-Learn is the PyTorch book from the widely acclaimed and bestselling Python Machine Learning series, fully updated and expanded to cover PyTorch, transformers, graph neural networks, and best practices.
You may wonder, with TensorFlow remaining a prominent framework in the deep learning industry, why we bothered to write a PyTorch edition of the Python Machine Learning series, Machine Learning with PyTorch and Scikit-Learn. As a matter of fact, PyTorch has become the most widely used deep learning framework in the academic and research community. To examine this further, let me provide an up-to-date and more comprehensive comparison between PyTorch and TensorFlow.
Dynamic Computational Graphs
Support for dynamic computational graphs is an advantage of PyTorch over TensorFlow. PyTorch builds its computational graphs on demand at runtime, in contrast to TensorFlow's traditionally static graphs. This makes PyTorch more debug-friendly: you can execute the code line by line while having full access to all variables. TensorFlow 2.0 adopted eager execution, which also evaluates operations immediately, similar to PyTorch; however, this mode can come at the cost of reduced speed.
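As a minimal sketch of what "built on demand" means, the toy function below (hypothetical, not from the book) uses ordinary Python control flow inside the forward computation; autograd simply records whichever path is actually taken for a given input, and you can step through it in a debugger like any other Python code:

```python
import torch

def scale_until_large(x):
    # Plain Python control flow: the number of loop iterations, and hence
    # the shape of the computational graph, depends on the input values.
    y = x * 2
    while y.norm() < 10:
        y = y * 2
    return y.sum()

x = torch.tensor([1.0, 2.0], requires_grad=True)
out = scale_until_large(x)
out.backward()  # gradients flow through the path that was actually executed
print(out.item(), x.grad)
```

For this input the loop runs until `y = 8 * x`, so the gradient of the sum with respect to each element of `x` is 8, a result you can verify interactively, which is exactly the debugging workflow dynamic graphs enable.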
State-of-the-Art Model Availability
Researchers, startups with limited computational resources, and individual practitioners often use pre-trained State-of-the-Art (SOTA) models for transfer learning and fine-tuning. As of early 2022, most SOTA models are in PyTorch, for example:
- In the largest hub of ready-to-use ML models, HuggingFace (https://huggingface.co/models), there are 21,000 models in PyTorch and only 2,000 in TensorFlow.
- According to http://horace.io/pytorch-vs-tensorflow/, PyTorch papers account for more than 75% of the total publications from the nine top ML journals and conferences.
- The trend in Papers With Code (https://paperswithcode.com/trends) shows that close to 60% of implementations are in PyTorch, while only 11% are in TensorFlow.
Clearly, PyTorch is the winner from the perspective of model availability.
While PyTorch dominates in the research domain, TensorFlow is more mature in terms of production deployment due to its robust deployment framework. You can painlessly deploy models on servers using TensorFlow Serving and use TensorFlow Extended (TFX) to create and manage end-to-end pipelines. Moreover, TensorFlow Lite (TFLite) optimizes models to serve on mobile and IoT devices.
Deploying PyTorch models used to rely on third-party APIs, but PyTorch has closed the gap in production deployment in recent years with the native deployment tools TorchServe and PyTorch Live, released around 2020. TorchServe is a great tool for deploying trained PyTorch models without having to write custom code. It provides an easy-to-use command-line interface and supports lightweight serving at scale in many environments, including Amazon SageMaker, EKS, and Kubernetes. PyTorch Live targets mobile devices, similar to TFLite.
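TorchServe serves models from a self-contained archive, and a common first step toward that is exporting a trained model to TorchScript so it can be loaded without the original Python class definitions. A minimal sketch, using a stand-in model rather than anything from the book:

```python
import torch
import torch.nn as nn

# Stand-in model; in practice this would be your trained network.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

# Trace the model into TorchScript with an example input.
example = torch.randn(1, 4)
scripted = torch.jit.trace(model, example)
scripted.save("model.pt")

# The serialized artifact reloads without the original module code,
# which is what serving tools such as TorchServe package and run.
reloaded = torch.jit.load("model.pt")
output = reloaded(example)
```

The saved `model.pt` is then bundled (together with a handler) into a model archive that TorchServe's command-line interface can register and serve.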
Final Words: Which Should I Choose?
There is no definite answer to this. However, for ML or DL beginners, PyTorch has a gentler learning curve, as it is more Pythonic, more debug-friendly, and more straightforward when it comes to handling data. On the other hand, TensorFlow can present a steeper learning curve because of the low-level implementations of neural network structures. Although TensorFlow offers the high-level Keras API, which makes it easy to get started with the basic concepts, PyTorch strikes a good trade-off: it is nearly as easy to use as Keras while being more easily customizable.
Given their reputations in academia and industry, if you are more into research, PyTorch could be your first choice, unless you are working on reinforcement learning; in that case, TensorFlow has a phenomenal module, TF-Agents (https://www.tensorflow.org/agents/overview), designed for such projects.