Deep Learning on your phone: PyTorch C++ API for use on Mobile Platforms

The PyTorch Deep Learning framework has a C++ API for use on mobile platforms. This article shows an end-to-end demo of how to write a simple C++ application with Deep Learning capabilities using the PyTorch C++ API such that the same code can be built for use on mobile platforms (both Android and iOS).

By Dhruv Matani, software engineer at Meta (Facebook).

PyTorch is a Deep Learning framework for training and running Machine Learning (ML) models, accelerating the path from research to production.

The PyTorch C++ API can be used to write compact, performance-sensitive code with Deep Learning capabilities to perform ML inference on mobile platforms. For a general introduction to deploying a PyTorch model to production, please see this article.

PyTorch Mobile currently supports deploying pre-trained models for inference on both Android and iOS platforms.


Steps at a high level


  1. Train a model (or fetch a pre-trained model), and save it in the lite-interpreter format. The lite-interpreter format makes a PyTorch model compatible with mobile platforms. This step is not covered in this article.
  2. Download the PyTorch source code, and build PyTorch from source. This is recommended for mobile deployments.
  3. Create a .cpp file with code to load your model, run inference using forward(), and print the results.
  4. Build the .cpp file, and link against PyTorch shared object files.
  5. Run the file and see the output.
  6. Profit!


Steps in Detail


Download and install PyTorch from source on a Linux machine.

For other platforms, please see this link.

# Set up conda and install dependencies needed by PyTorch
conda install astunparse numpy ninja pyyaml mkl mkl-include setuptools cmake cffi typing_extensions future six requests dataclasses

# Get the PyTorch source code from github
cd /home/$USER/
# Repo will be cloned into /home/$USER/pytorch/
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
# if you are updating an existing checkout
git submodule sync
git submodule update --init --recursive --jobs 0

# Build PyTorch
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
python setup.py install


Create source .cpp file

Store the C++ source file, which we will call PTDemo.cpp, in the same directory as your AddTensorsModelOptimized.ptl file. We will call this directory /home/$USER/ptdemo/.

#include <ATen/ATen.h>
#include <torch/library.h>
#include <torch/csrc/jit/mobile/import.h>

#include <cstdint>
#include <iostream>
#include <vector>

using namespace std;

int main() {
  // Load the PyTorch model in lite interpreter format.
  torch::jit::mobile::Module model = torch::jit::_load_for_mobile(
    "AddTensorsModelOptimized.ptl");

  std::vector<c10::IValue> inputs;

  // Create 2 float vectors with values for the 2 input tensors.
  std::vector<float> v1 = {1.0, 5.0, -11.0};
  std::vector<float> v2 = {10.0, -15.0, 9.0};

  // Create tensors efficiently from the float vector using the
  // from_blob() API. The tensors share memory with v1 and v2,
  // so the vectors must stay alive while the tensors are in use.
  std::vector<int64_t> dims{static_cast<int64_t>(v1.size())};
  at::Tensor t1 = at::from_blob(v1.data(), at::IntArrayRef(dims), at::kFloat);
  at::Tensor t2 = at::from_blob(v2.data(), at::IntArrayRef(dims), at::kFloat);

  // Add tensors to inputs vector.
  inputs.push_back(t1);
  inputs.push_back(t2);

  // Run the model and capture results in 'ret'.
  c10::IValue ret = model.forward(inputs);

  // Print the return value.
  std::cout << "Return Value:\n" << ret << std::endl;

  // You can also convert the return value into a tensor and
  // fetch the underlying values using the data_ptr() method.
  float *data = ret.toTensor().data_ptr<float>();
  const int numel = ret.toTensor().numel();

  // Print the data buffer.
  std::cout << "\nData Buffer: ";
  for (int i = 0; i < numel; ++i) {
    std::cout << data[i];
    if (i != numel - 1) {
      std::cout << ", ";
    }
  }
  std::cout << std::endl;
}


Note: For tensor types and functions (at::Tensor, at::from_blob), we use symbols from the at:: namespace rather than the torch:: namespace used by the tutorials on the PyTorch website. Mobile builds do not include the full JIT (TorchScript) interpreter, so only the lite interpreter is available.
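One detail of the listing above is easy to miss: at::from_blob() wraps existing memory without copying it or taking ownership of it, so the float vectors must outlive the tensors built from them. The following is a minimal, standard-library-only sketch of that idea, not PyTorch code. FloatView, make_view, and view_at are hypothetical names introduced here purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor created by at::from_blob():
// it records a pointer and a length but never copies or frees the data.
struct FloatView {
  float* data;
  std::size_t size;
};

// Wrap a vector's storage without copying it. The caller must keep
// 'v' alive (and must not resize it) for as long as the view is used.
inline FloatView make_view(std::vector<float>& v) {
  return FloatView{v.data(), v.size()};
}

// Read an element through the view, with a bounds check.
inline float view_at(const FloatView& view, std::size_t i) {
  assert(i < view.size);
  return view.data[i];
}
```

Because the view aliases the vector's storage, writing to the vector after calling make_view() changes what view_at() returns. The same aliasing applies between the input vectors and the tensors in PTDemo.cpp, which is why the vectors are kept in scope until after forward() has run.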

Build and Link

# This is where your PTDemo.cpp file is
cd /home/$USER/ptdemo/

# Set LD_LIBRARY_PATH so that the runtime linker can
# find the .so files
export LD_LIBRARY_PATH=/home/$USER/pytorch/build/lib/:$LD_LIBRARY_PATH

# Compile the PTDemo.cpp file. The command below should
# produce a file named 'a.out'
g++ PTDemo.cpp \
  -I/home/$USER/pytorch/torch/include/ \
  -L/home/$USER/pytorch/build/lib/ \
  -lc10 -ltorch_cpu -ltorch


Run the application

./a.out

Assuming the model adds its two input tensors element-wise, as its name suggests, the command should print the following:

Return Value:
 11
-10
 -2
[ CPUFloatType{3} ]

Data Buffer: 11, -10, -2
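As a sanity check that does not need PyTorch, the expected data buffer can be reproduced with the standard library alone. The sketch below assumes that the model simply adds its two inputs element-wise, which its name suggests but which this article does not show. add_elementwise and format_buffer are hypothetical helper names, not part of any PyTorch API.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Assumption: the model computes an element-wise sum of its two inputs.
inline std::vector<float> add_elementwise(const std::vector<float>& a,
                                          const std::vector<float>& b) {
  assert(a.size() == b.size());
  std::vector<float> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    out[i] = a[i] + b[i];
  }
  return out;
}

// Join values with ", " the same way the demo prints its data buffer.
inline std::string format_buffer(const std::vector<float>& values) {
  std::ostringstream os;
  for (std::size_t i = 0; i < values.size(); ++i) {
    os << values[i];
    if (i + 1 != values.size()) {
      os << ", ";
    }
  }
  return os.str();
}
```

For the demo inputs {1.0, 5.0, -11.0} and {10.0, -15.0, 9.0}, format_buffer(add_elementwise(v1, v2)) yields "11, -10, -2" under that assumption. If your model prints something else, check what it actually computes before suspecting the C++ side.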




In this article, we saw how to use the cross-platform PyTorch C++ API to load a pre-trained model saved in the lite-interpreter format, pass it inputs, and run inference. The C++ code is platform-agnostic and can be built and run on all mobile platforms supported by PyTorch.




Bio: Dhruv Matani is a software engineer at Meta (Facebook), where he leads projects related to PyTorch (Open Source AI Framework). He is an expert on PyTorch internals and PyTorch Mobile, and is a significant contributor to PyTorch. His work is impacting billions of users across the world. He has extensive experience building and scaling infrastructure for the Facebook Data Platform. Of note are contributions to Scuba, a realtime data analytics platform at Facebook, used for rapid product and system insights. He has an M.S. in Computer Science from Stony Brook University.