KDnuggets Home » News » 2018 » Feb » Tutorials, Overviews » 3 Essential Google Colaboratory Tips & Tricks ( 18:n07 )

3 Essential Google Colaboratory Tips & Tricks


Google Colaboratory is a promising machine learning research platform. Here are 3 tips to simplify its usage and facilitate using a GPU, installing libraries, and uploading data files.




Like many of you, I have been very excited by Google's Colaboratory project. While it isn't exactly new, its recent public release has generated a lot of renewed interest in the collaborative platform.

For those who don't know, Google Colaboratory is...

[...] a Google research project created to help disseminate machine learning education and research. It's a Jupyter notebook environment that requires no setup to use and runs entirely in the cloud.

Here are a few simple tips for making better use of Colab's capabilities while you play around with it. To be clear, these aren't hidden hacks, but a handy collection of documented (and further clarified) functionality that may be essential.

 
1. Using a Free GPU Runtime

Select "Runtime," then "Change runtime type," to open the runtime settings pop-up.

Ensure "Hardware accelerator" is set to GPU (the default is CPU). Afterward, ensure that you are connected to the runtime (there is a green check next to "connected" in the menu ribbon).

To check whether you have a visible GPU (i.e. you are currently connected to a GPU instance), run the following excerpt (directly from Google's code samples):

import tensorflow as tf
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))


If you are connected, here is the response:

Found GPU at: /device:GPU:0


Alternatively, supply and demand issues may mean that no GPU is available, and the connection attempt fails.

And there you go. This allows you to access a free GPU for up to 12 hours at a time.
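
If you would rather fall back to the CPU than halt the notebook when no GPU shows up, the check above can be softened. What follows is only a sketch: the helper names detect_gpu_name and choose_device are my own (they are not part of Colab or TensorFlow), and '/cpu:0' is assumed as the fallback device string.

```python
def detect_gpu_name():
    """Return TensorFlow's GPU device name, or '' if there is no GPU
    (or if TensorFlow is not installed in this environment)."""
    try:
        import tensorflow as tf
    except ImportError:
        return ''
    return tf.test.gpu_device_name()


def choose_device(gpu_name, fallback='/cpu:0'):
    """Use the GPU when one was found; otherwise use the fallback device.
    The '/cpu:0' default is an assumption made for this sketch."""
    return gpu_name if gpu_name else fallback


device = choose_device(detect_gpu_name())
print('Running on: {}'.format(device))
```

The resulting string can then be handed to tf.device(...) so the same notebook runs whether or not a GPU was assigned.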

 
2. Installing Libraries

Currently, software installations within Google Colaboratory are not persistent: you must reinstall libraries every time you (re-)connect to an instance. Since Colab has numerous useful common libraries installed by default, this is less of an issue than it may seem, and libraries which are not pre-installed are easily added in one of a few different ways.

You will want to be aware, however, that installing any software which needs to be built from source may take longer than is feasible when connecting/reconnecting to your instance.

Colab supports both the pip and apt package managers. Regardless of which you are using, remember to prefix any shell command with a !.

# Install Keras with pip
!pip install -q keras
import keras

>>> Using TensorFlow backend.


# Install GraphViz with apt
!apt-get install graphviz -y
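
Since installs vanish on every reconnect, it can be handy to check what is already importable before reinstalling anything. Here is a rough sketch using only the standard library; the function name missing_modules and the wish list are made up for illustration, and note that a module's import name does not always match its pip package name.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this runtime."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Hypothetical wish list; replace with the modules your notebook imports.
wanted = ['json', 'keras']
to_install = missing_modules(wanted)

if to_install:
    # Only prints a suggestion; the actual install still happens in a cell.
    print('Run in a cell: !pip install -q ' + ' '.join(to_install))
else:
    print('Everything on the list is already available.')
```

Anything the helper reports as missing can then be installed with the pip command shown earlier.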



 
3. Uploading and Using Data Files

You need data to use in your Colab notebooks, right? You could use something like wget to grab data from the web, but what if you have some local files that you want to upload to your Colab environment within your Google Drive and use there?

Here's the easiest way to do so, IMO, with a little direction from here.

This is a three-step process. First, invoke a file selector within your notebook with this:

from google.colab import files
uploaded = files.upload()


After your file(s) is/are selected, iterate over the uploaded files to find their key names:

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(name=fn, length=len(uploaded[fn])))


Example output:

User uploaded file "iris.csv" with length 3716 bytes


Now, load the contents of the file into a Pandas DataFrame using the following:

import pandas as pd
import io
df = pd.read_csv(io.StringIO(uploaded['iris.csv'].decode('utf-8')))
print(df)


There you go. There are other ways to upload and use data files, but I find this one the most straightforward.
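
One detail worth spelling out: the object returned by files.upload() maps each file name to that file's raw bytes, which is why the pandas snippet decodes before parsing. The sketch below shows the same idea with only the standard library, so it runs outside Colab too. The uploaded dict here is a stand-in with made-up contents, and rows_from_upload is a hypothetical helper name.

```python
import csv
import io

# Stand-in for the dict returned by files.upload(): file name -> raw bytes.
# The file name and contents are invented for illustration.
uploaded = {
    'sample.csv': b'sepal_length,species\n5.1,setosa\n7.0,versicolor\n',
}


def rows_from_upload(uploaded_files, name, encoding='utf-8'):
    """Decode one uploaded file and parse it into a list of row dicts."""
    text = uploaded_files[name].decode(encoding)
    return list(csv.DictReader(io.StringIO(text)))


# Parse every uploaded CSV without hard-coding a file name.
tables = {name: rows_from_upload(uploaded, name)
          for name in uploaded if name.endswith('.csv')}

for name, rows in tables.items():
    print('{}: {} rows, columns {}'.format(name, len(rows), list(rows[0].keys())))
```

Looping over the keys this way avoids hard-coding 'iris.csv', which helps when you upload several files at once.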

 
Google Colab has me excited to try machine learning much as I would with Jupyter notebooks, but with less setup and administration. That's the idea, anyway; we'll see how it plays out.

If you have any helpful Colab tips or tricks, leave them in the comments below.

 