Jupyter Notebooks: Data Science Reporting
Jupyter does help us organize code, but many of us still end up with messy and unnecessary code chunks. Here are some ways, including a new extension, that anyone can use to start organizing the code in their notebooks.
Jupyter has become a de facto platform for many of us thanks to its simple design, interactivity and cross-language support, all in one place. There are other notebook environments, but none that I have seen so far offers as many benefits as Jupyter.
I say Jupyter because previously there was only Jupyter Notebook, but now there is JupyterLab as well, along with other notebook environments based on Jupyter.
Here are some simple ways to organize your project (this is based on my personal experience).
1. Install nb-extensions
This is the basis for efficient reporting in Jupyter.
I recommend installing through Anaconda, since it will also automatically install the JavaScript and CSS files that are needed.
conda install -c conda-forge jupyter_contrib_nbextensions
conda install -c conda-forge jupyter_nbextensions_configurator
If you're having trouble installing it through Anaconda, use pip instead:
pip install jupyter_nbextensions_configurator jupyter_contrib_nbextensions
jupyter contrib nbextension install --user
jupyter nbextensions_configurator enable --user
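To confirm that both packages landed in the environment your notebook server uses, a quick check from Python can help. This is a minimal sketch that only tests whether the two modules can be found; the helper name `check_nbextensions` is made up for illustration, and it says nothing about whether the extensions are actually enabled.

```python
import importlib.util

# Module names match the two packages installed above.
REQUIRED_MODULES = (
    "jupyter_contrib_nbextensions",
    "jupyter_nbextensions_configurator",
)


def check_nbextensions(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it can be found."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in modules
    }


# Print a short installed/missing report for the current environment.
for name, found in check_nbextensions().items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

Run it inside a notebook cell so that it checks the same environment as the kernel you are working in.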
Once you have finished installing nbextensions, start your Jupyter notebook environment and navigate to the Nbextensions tab.
From there, enable the extensions you would like and experiment to see which ones help you be more productive.
If you do not see this tab at any point while using Jupyter notebooks, open a new notebook and go to Edit -> nbextensions config.
2. Table of Contents
You can enable an interactive TOC as well as one that appears at the very top of your notebook. The interactive version appears on the left side of your screen by default, but it can be moved to another part of your notebook if you wish.
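To get a feel for what a table of contents is built from, here is a rough sketch that pulls markdown headings out of a notebook's JSON structure. It is not how the extension itself works; the function names `extract_headings` and `format_toc` are invented for this example, and the sketch does not skip `#` lines that sit inside fenced code within a markdown cell.

```python
import re

# Matches ATX-style markdown headings such as "## Data cleaning".
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def extract_headings(notebook):
    """Return (level, title) pairs for every markdown heading in a notebook dict."""
    headings = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        for line in text.splitlines():
            match = HEADING_RE.match(line)
            if match:
                headings.append((len(match.group(1)), match.group(2)))
    return headings


def format_toc(headings):
    """Render headings as an indented plain-text outline."""
    return "\n".join("  " * (level - 1) + title for level, title in headings)


# A tiny in-memory notebook used only for demonstration.
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Report\n", "Intro text\n", "## Data"]},
        {"cell_type": "code", "source": "# a comment, not a heading\nprint(1)"},
    ]
}
print(format_toc(extract_headings(example)))
```

To try it on a real notebook, load the .ipynb file with `json.load` and pass the resulting dict to `extract_headings`.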
3. Use Markdown
This one is more or less obvious to those who already use it. Some basic commands are:
- Esc + m to convert a code cell into a markdown cell
- Ctrl + Enter to run the markdown cell and render it as formatted text
- Enter (or a double-click) if you want to edit a rendered markdown cell again
Here is a list of general markdown commands: https://www.markdownguide.org/basic-syntax/
Use Headings
This goes hand in hand with markdown and the Table of Contents. As can be seen in the above image, adding headings with markdown in Jupyter automatically sections the notebook and adds each heading to the list of contents, making it easier for the reader to scroll to and locate whatever they want.
Use LaTeX
This depends on the user, of course, as not everyone needs mathematical equations rendered in their notebook.
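As an illustration, typing the following into a markdown cell and running it should render a displayed equation for the sample mean. The formula is only an example chosen for this post, not something taken from a particular notebook.

```latex
$$
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
$$
```

For inline math, wrap the expression in single dollar signs instead, for example `$\bar{x}$`.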
See here for more info on using markdown in Jupyter
4. Use Scratchpad
This extension lets us cut down on nonessential code that would otherwise end up in the notebook. If you need to verify anything from previous output, you can just type it into the Scratchpad and it will not end up in the notebook.
Go here for more info: Scratchpad for Jupyter Notebooks
5. Hide code
Hide selected cells
If you only want to hide certain code / input cells and keep some visible, use the following extension:
Hide all code / input cells
If you wish to hide all your code so that people will only see the output, use the following extension:
6. Render/Convert Notebook to PDF/HTML etc
Jupyter Notebook gives us the ability to render the notebook into many formats with the nbconvert command-line tool. Below you will find the list of available options.
I find this a more reliable process than going to File -> Download as....
The simplest way to use nbconvert is
> jupyter nbconvert mynotebook.ipynb
which will convert mynotebook.ipynb to the default format (probably HTML).
You can specify the export format with `--to`. Options include ['asciidoc', 'custom', 'hide_code_html', 'hide_code_latex', 'hide_code_pdf', 'hide_code_slides', 'html', 'html_ch', 'html_embed', 'html_toc', 'html_with_lenvs', 'html_with_toclenvs', 'latex', 'latex_with_lenvs', 'markdown', 'notebook', 'pdf', 'python', 'rst', 'script', 'selectLanguage', 'slides', 'slides_with_lenvs']
> jupyter nbconvert --to latex mynotebook.ipynb
Both HTML and LaTeX support multiple output templates. LaTeX includes 'base', 'article' and 'report'. HTML includes 'basic' and 'full'. You can specify the flavor of the format used.
> jupyter nbconvert --to html --template basic mynotebook.ipynb
You can also pipe the output to stdout, rather than a file
> jupyter nbconvert mynotebook.ipynb --stdout
PDF is generated via latex
> jupyter nbconvert mynotebook.ipynb --to pdf
You can get (and serve) a Reveal.js-powered slideshow
> jupyter nbconvert myslides.ipynb --to slides --post serve
Multiple notebooks can be given at the command line in a couple of different ways:
> jupyter nbconvert notebook*.ipynb
> jupyter nbconvert notebook1.ipynb notebook2.ipynb
or you can specify the notebooks list in a config file, containing::
c.NbConvertApp.notebooks = ["my_notebook.ipynb"]
> jupyter nbconvert --config mycfg.py
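If you render the same report regularly, it can be convenient to drive these commands from Python. The sketch below only assembles and runs the command line shown above, using the `--to`, `--template` and `--stdout` options; the helper names `build_nbconvert_command` and `convert` are my own and are not part of nbconvert.

```python
import subprocess


def build_nbconvert_command(notebooks, to=None, template=None, stdout=False):
    """Assemble the argument list for a `jupyter nbconvert` call."""
    if isinstance(notebooks, str):
        notebooks = [notebooks]
    command = ["jupyter", "nbconvert"]
    if to:
        command += ["--to", to]
    if template:
        command += ["--template", template]
    if stdout:
        command.append("--stdout")
    # Notebook paths go last, after all options.
    return command + list(notebooks)


def convert(notebooks, **options):
    """Run nbconvert and raise CalledProcessError if the conversion fails."""
    subprocess.run(build_nbconvert_command(notebooks, **options), check=True)
```

For example, `convert("mynotebook.ipynb", to="pdf")` would run the PDF conversion, assuming `jupyter` is on your PATH and the file exists.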
You may need to install external dependencies such as LaTeX or pandoc. Consult the nbconvert documentation for more on rendering.
Finally, here is an extension that you may not have heard of but should download straight away (especially if you are a beginner).
7. Setup
Check out this handy extension by Will Koehrsen:
Set Your Jupyter Notebook up Right with this Extension
Will has created this amazing extension, which gives a great setup to anyone looking to organize their notebooks for an easier workflow.
Some more handy tips are:
- Esc+a to add cells above
- Esc+b to add cells below
- Esc+m to switch to a markdown cell and Ctrl+Enter to run it
- Esc, then d twice, to delete a cell