Deep Learning on your phone: PyTorch C++ API for use on Mobile Platforms
The PyTorch Deep Learning framework has a C++ API for use on mobile platforms. This article presents an end-to-end demo of writing a simple C++ application with Deep Learning capabilities using the PyTorch C++ API, such that the same code can be built for both Android and iOS.
By Dhruv Matani, software engineer at Meta (Facebook).
PyTorch is a Deep Learning framework for training and running Machine Learning (ML) models, accelerating the path from research to production.
The PyTorch C++ API can be used to write compact, performance-sensitive code with Deep Learning capabilities to perform ML inference on mobile platforms. For a general introduction to deploying a PyTorch model to production, please see this article.
PyTorch Mobile currently supports deploying pre-trained models for inference on both Android and iOS platforms.
Steps at a high level
- Train a model (or fetch a pre-trained model), and save it in the lite-interpreter format. The lite-interpreter format makes a PyTorch model suitable for running on mobile platforms. This step is not covered in detail in this article; a brief sketch follows this list.
- Download the PyTorch source code, and build PyTorch from source. This is recommended for mobile deployments.
- Create a .cpp file with code to load your model, run inference using forward(), and print the results.
- Build the .cpp file, and link against PyTorch shared object files.
- Run the resulting binary and check the output.
- Profit!
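For context, here is a minimal sketch of how the first step might look in Python, assuming a trivial model that simply adds two tensors (matching the AddTensorsModelOptimized.ptl file used later). The class name and code are illustrative, not the exact script used to produce that file.

# Hypothetical sketch of step 1: script a trivial "add two tensors" model,
# optimize it for mobile, and save it in the lite-interpreter (.ptl) format.
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

class AddTensorsModel(torch.nn.Module):
    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        return a + b

scripted = torch.jit.script(AddTensorsModel())
optimized = optimize_for_mobile(scripted)
# Produces the file that the C++ demo below loads.
optimized._save_for_lite_interpreter("AddTensorsModelOptimized.ptl")

The _save_for_lite_interpreter() call is what produces a model that the mobile (lite) interpreter can load via torch::jit::_load_for_mobile().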
Steps in Detail
Download and install PyTorch from source on a Linux machine.
For other platforms, please see this link.
# Setup conda and install dependencies needed by PyTorch
conda install astunparse numpy ninja pyyaml mkl mkl-include setuptools cmake cffi typing_extensions future six requests dataclasses

# Get the PyTorch source code from github
cd /home/$USER/
# Repo will be cloned into /home/$USER/pytorch/
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch

# if you are updating an existing checkout
git submodule sync
git submodule update --init --recursive --jobs 0

# Build PyTorch
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
python setup.py install
Create source .cpp file
Store the .cpp file in the same folder as your AddTensorsModelOptimized.ptl file. Let’s call the C++ source file PTDemo.cpp. Let’s call this directory /home/$USER/ptdemo/.
#include <ATen/ATen.h>
#include <torch/library.h>
#include <torch/csrc/jit/mobile/import.h>
#include <iostream>

using namespace std;

int main() {
  // Load the PyTorch model in lite interpreter format.
  torch::jit::mobile::Module model = torch::jit::_load_for_mobile(
      "AddTensorsModelOptimized.ptl");

  std::vector<c10::IValue> inputs;

  // Create 2 float vectors with values for the 2 input tensors.
  std::vector<float> v1 = {1.0, 5.0, -11.0};
  std::vector<float> v2 = {10.0, -15.0, 9.0};

  // Create tensors efficiently from the float vectors using the
  // from_blob() API.
  std::vector<int64_t> dims{static_cast<int64_t>(v1.size())};
  at::Tensor t1 = at::from_blob(
      v1.data(), at::IntArrayRef(dims), at::kFloat);
  at::Tensor t2 = at::from_blob(
      v2.data(), at::IntArrayRef(dims), at::kFloat);

  // Add tensors to inputs vector.
  inputs.push_back(t1);
  inputs.push_back(t2);

  // Run the model and capture results in 'ret'.
  c10::IValue ret = model.forward(inputs);

  // Print the return value.
  std::cout << "Return Value:\n" << ret << std::endl;

  // You can also convert the return value into a tensor and
  // fetch the underlying values using the data_ptr() method.
  float *data = ret.toTensor().data_ptr<float>();
  const int numel = ret.toTensor().numel();

  // Print the data buffer.
  std::cout << "\nData Buffer: ";
  for (int i = 0; i < numel; ++i) {
    std::cout << data[i];
    if (i != numel - 1) {
      std::cout << ", ";
    }
  }
  std::cout << std::endl;
}
Note: We use symbols from the at:: namespace rather than the torch:: namespace (which is what the tutorials on the PyTorch website use), because mobile builds do not include the full-JIT (TorchScript) interpreter. Instead, we have access only to the lite-interpreter.
Build and Link
# This is where your PTDemo.cpp file is
cd /home/$USER/ptdemo/

# Set LD_LIBRARY_PATH so that the runtime linker can
# find the .so files
LD_LIBRARY_PATH=/home/$USER/pytorch/build/lib/
export LD_LIBRARY_PATH

# Compile the PTDemo.cpp file. The command below should
# produce a file named 'a.out'
g++ PTDemo.cpp \
  -I/home/$USER/pytorch/torch/include/ \
  -L/home/$USER/pytorch/build/lib/ \
  -lc10 -ltorch_cpu -ltorch
Run the application
./a.out
The command should print the following:
Return Value:
 11
 -5
  8
[ CPUFloatType{3} ]

Data Buffer: 11, -5, 8
Conclusion
In this article, we saw how to use the cross-platform PyTorch C++ API to load a pre-trained lite-interpreter model, feed it inputs, and run inference. The C++ code is platform-agnostic and can be built and run on all mobile platforms supported by PyTorch.
References
- Introduction to PyTorch For Deep Learning
- Getting Started With PyTorch
- Deploy your PyTorch Model to Production
- Contributing to PyTorch: By someone who doesn’t know a ton about PyTorch
- PyTorch Mobile
- PyTorch C++ API
- Build PyTorch from source
- PyTorch iOS 101 Video walk through
Bio: Dhruv Matani is a software engineer at Meta (Facebook), where he leads projects related to PyTorch (Open Source AI Framework). He is an expert on PyTorch internals and PyTorch Mobile, and is a significant contributor to PyTorch. His work is impacting billions of users across the world. He has extensive experience building and scaling infrastructure for the Facebook Data Platform. Of note are contributions to Scuba, a real-time data analytics platform at Facebook, used for rapid product and system insights. He has an M.S. in Computer Science from Stony Brook University.