12 Docker Commands Every Data Scientist Should Know
Looking to add Docker to your data science toolbox? Here’s a list of essential Docker commands to help you get started.
Working on a data science project is always exciting. However, it is not without challenges. Each project requires you to install a (possibly) long list of libraries, often at specific versions. So wrapping your head around the project’s dependencies can be quite challenging. Here’s where Docker can help.
Docker is a popular containerization technology. With Docker, you can package your data science application, along with the code and required dependencies, into a portable artifact called an image. Docker thus makes it easier to replicate the development environment and makes local development a breeze.
Here’s a list of essential Docker commands that’ll come in handy as you’re coding your way through your next project. We’ll work with images from Docker Hub, one of the most popular platforms to find, share, and manage container images.
1. docker pull
To pull an image from Docker Hub, you can run the docker pull command as shown:
docker pull <name-of-the-image>
For example, to pull the Python image from Docker Hub, you can run the following command:
docker pull python
By default, this command pulls the image tagged latest, which usually points to the most recent version available. You can optionally add a tag, as in <name-of-the-image>:<tag>, to pull a specific version of the image.
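As a small sketch of pulling a pinned version, the helper below wraps docker pull. The function name pull_image and the 3.11-slim tag are illustrative assumptions, not part of the original walkthrough; check the image's page on Docker Hub for the tags it actually publishes.

```shell
# Hypothetical helper: pull an image at a given tag (defaults to "latest").
pull_image() {
    image="$1"
    tag="${2:-latest}"
    docker pull "${image}:${tag}"
}

# Example calls (uncomment to run; they need a working Docker installation):
# pull_image python             # same as: docker pull python:latest
# pull_image python 3.11-slim   # a pinned tag, assumed to exist on Docker Hub
```

Pinning a tag tends to make environments easier to reproduce, because latest can point to a different version the next time you pull.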
Note: If you'd like to run the Docker commands as a user without superuser permissions, create the docker group and add the user to that group.
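On a Linux host, the note above might translate into something like the following sketch. The function name add_user_to_docker_group is hypothetical, and the commands assume a typical Linux system with sudo, getent, groupadd, and usermod available.

```shell
# Hypothetical helper for Linux hosts: add a user to the docker group.
add_user_to_docker_group() {
    user="${1:-$USER}"
    # Create the group only if it does not exist yet.
    getent group docker >/dev/null || sudo groupadd docker
    # Append (-a) the docker group to the user's supplementary groups (-G).
    sudo usermod -aG docker "$user"
}

# Example call (uncomment to run):
# add_user_to_docker_group
```

The new group membership usually takes effect after you log out and back in.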
2. docker images
To view the list of all the downloaded images, you can run the docker images command.
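If the default table is too wide, a sketch like the one below trims the listing to a few columns using the --format option. The helper name list_images is an assumption made for illustration.

```shell
# Hypothetical helper: list local images as "repository:tag size".
list_images() {
    # With no argument, every local image is listed;
    # with a repository name, only images from that repository.
    docker images --format '{{.Repository}}:{{.Tag}} {{.Size}}' "$@"
}

# Example calls (uncomment to run):
# list_images          # all local images
# list_images python   # only images from the python repository
```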
3. docker run
After you’ve pulled the image from the registry, you can use the docker run command to spin up a Docker container, a running instance of the image, as shown:
docker run <name-of-the-image>
docker run [options] <name-of-the-image>
For example, the -i option keeps standard input open so that you can interact with the Python REPL when the container starts, and the -t option assigns a pseudo-tty, as shown:
docker run -it python
An image is a portable artifact and a container is a running instance of the image. This means you can run multiple containers from a single Docker image.
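To make the "multiple containers from one image" point concrete, here is a minimal sketch. The helper name run_named and the container names repl-one and repl-two are made up for illustration.

```shell
# Hypothetical helper: start a named container in the background from an image.
run_named() {
    name="$1"
    image="$2"
    # -d runs the container detached, -i keeps STDIN open, -t allocates a pseudo-TTY,
    # and --name assigns a readable name instead of a randomly generated one.
    docker run -dit --name "$name" "$image"
}

# Example calls (uncomment to run):
# run_named repl-one python
# run_named repl-two python   # a second, independent container from the same image
```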
4. docker ps
You can run the docker ps command to get a list of all the running containers.
Note that there’s a CONTAINER ID associated with each Docker container. Over the next few minutes, we’ll learn Docker commands to stop and restart containers, examine logs, and more. We’ll use the CONTAINER ID of a particular container in those commands.
Suppose you ran a container in one of the previous sessions, and the container is not running anymore. In this case, you can run the docker ps command with the -a option. This will list all the containers: those that are currently running as well as those that were stopped previously.
docker ps -a
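When the list grows long, filtering by state can help. The sketch below combines the --filter and --format options of docker ps; the helper name containers_with_status is hypothetical.

```shell
# Hypothetical helper: list containers (running or stopped) in a given state.
containers_with_status() {
    status="$1"   # for example: running, exited, created, paused
    docker ps -a --filter "status=${status}" --format '{{.ID}} {{.Names}} {{.Status}}'
}

# Example call (uncomment to run):
# containers_with_status exited
```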
5. docker stop
You may sometimes need to stop a running container. To do so, run the docker stop command.
docker stop <CONTAINER ID>
6. docker start
You can use the docker start command to restart a previously stopped container. Run the docker ps -a command, grab the container ID, and then use it in the docker start command.
docker start <CONTAINER ID>
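Putting docker stop and docker start together, a stop-then-start cycle could be sketched as follows. The helper name bounce_container and the 5-second grace period are illustrative assumptions.

```shell
# Hypothetical helper: stop a container, then start it again.
bounce_container() {
    container="$1"   # a CONTAINER ID or a container name
    # -t sets how many seconds Docker waits after asking the container to stop
    # before it kills the container.
    docker stop -t 5 "$container" && docker start "$container"
}

# Example call (uncomment to run):
# bounce_container <CONTAINER ID>
```

Docker also provides a docker restart command that does both steps in one go.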
7. docker rmi
To remove a specific image, you can run the docker rmi command.
docker rmi <name-of-the-image>
Running this command removes the image from your local development environment. The next time you’d like to start a container from the image, it will have to be pulled from Docker Hub again.
8. docker rm
To remove a container permanently from your development environment, you can run the docker rm command. By default, docker rm refuses to remove a running container, so stop the container before attempting to remove it.
docker rm <CONTAINER ID>
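Since a container should be stopped before it is removed, the two steps can be chained. This is only a sketch, and the helper name stop_and_remove is hypothetical.

```shell
# Hypothetical helper: stop a container, then remove it.
stop_and_remove() {
    container="$1"   # a CONTAINER ID or a container name
    # docker rm runs only if docker stop succeeded.
    docker stop "$container" && docker rm "$container"
}

# Example call (uncomment to run):
# stop_and_remove <CONTAINER ID>
```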
9. docker logs
The docker logs command can be especially helpful when debugging containers.
docker logs <CONTAINER ID>
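For a long-running container the full log can be overwhelming. As a sketch, the helper below (the name tail_logs and the default of 20 lines are assumptions) shows only the most recent lines with timestamps.

```shell
# Hypothetical helper: show the last N log lines of a container (default 20).
tail_logs() {
    container="$1"
    lines="${2:-20}"
    # --tail limits output to the last N lines;
    # --timestamps prefixes each line with the time it was logged.
    docker logs --tail "$lines" --timestamps "$container"
}

# Example calls (uncomment to run):
# tail_logs <CONTAINER ID>
# tail_logs <CONTAINER ID> 5
```

To keep streaming new log lines as they arrive, docker logs also accepts the -f option.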
10. docker exec
With the docker exec command, you can run commands inside a running container.
docker exec <CONTAINER ID> <COMMAND> <ARGS>
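A common use of docker exec is opening a shell in a container that is already running. The sketch below assumes the image ships the requested shell; the helper name shell_into is made up for illustration.

```shell
# Hypothetical helper: open an interactive shell inside a running container.
shell_into() {
    container="$1"
    shell="${2:-sh}"   # assumes this shell exists in the image; many images also ship bash
    # -i keeps STDIN open and -t allocates a pseudo-TTY, as with docker run.
    docker exec -it "$container" "$shell"
}

# Example calls (uncomment to run):
# shell_into <CONTAINER ID>
# shell_into <CONTAINER ID> bash
```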
Try it yourself: As a quick exercise to sum up what you've learned, pull the official Bash image from Docker Hub. Next, try starting an interactive terminal session when spinning up the container, and run a basic Bash command.
11. docker version
To check the version of Docker installed in your working environment, run the docker version command.
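If you only need the version numbers, for example in a setup script, the --format option can extract them. This is a minimal sketch; the helper name docker_versions is hypothetical.

```shell
# Hypothetical helper: print the client and server versions on one line.
docker_versions() {
    # Client is the docker CLI; Server is the Docker daemon it talks to.
    docker version --format 'client={{.Client.Version}} server={{.Server.Version}}'
}

# Example call (uncomment to run):
# docker_versions
```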
12. docker info
The docker info command provides more granular information on the system-wide installation of Docker.
Output of docker info (truncated)
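The full docker info output is long, so a short summary can be handy. The sketch below picks a few fields with --format; the helper name docker_summary is an assumption, and the set of available fields may differ between Docker versions.

```shell
# Hypothetical helper: summarize container and image counts from docker info.
docker_summary() {
    docker info --format 'containers={{.Containers}} running={{.ContainersRunning}} images={{.Images}}'
}

# Example call (uncomment to run):
# docker_summary
```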
I hope you found this tutorial on essential Docker commands helpful. Once you’re familiar with Docker, you can try dockerizing your Python and data science applications. You can then push your application’s image to Docker Hub. Other developers will then be able to pull your image and spin up containers in their working environment, all with a single command.
Bala Priya C is a technical writer who enjoys creating long-form content. Her areas of interest include math, programming, and data science. She shares her learning with the developer community by authoring tutorials, how-to guides, and more.