KDnuggets Top Blog Winner

12 Essential VSCode Extensions for Data Science

Learn about the data science VSCode extensions for super productivity and better user experience.


Visual Studio Code (VSCode) is a free integrated development environment (IDE) that is popular among developers and data practitioners. It provides rich functionality, extensions (plugins), built-in Git, the ability to run and debug code, and complete customization of the workspace. You can build, test, deploy, and monitor your data science application without leaving the editor.

I have tried multiple IDEs, and to be honest, I find VSCode the best because it provides a lightweight, powerful, and customizable work environment. Its biggest plus point is the huge collection of extensions for all kinds of IT professionals.

In this post, we are going to learn about the extensions that are an essential part of my workspace.

A quick overview of the list:

  1. GitHub Copilot
  2. Python
  3. Pylance
  4. Python Indent
  5. Indent-rainbow
  6. Jupyter
  7. Jupyter Notebook Renderers
  8. R
  9. Julia
  10. DVC
  11. GitLens
  12. Todo MD

 

1. GitHub Copilot

 

GitHub Copilot is your AI assistant. It suggests a single line or a whole function as you type, using OpenAI Codex to provide the suggestions in real time. The best part of the extension is that it adapts to the context of your code. Whenever I have to write a similar Python script, it suggests the comments, function, and docstring. I just have to press “Tab”.

Sign up today for the technical preview.

 

2. Python

 

The Python extension provides language support such as linting, debugging, code navigation, code formatting, refactoring, a variable explorer, and a test explorer. It automatically installs the Pylance and Jupyter extensions to give you the best experience with Python files and Jupyter Notebook files.
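To make the test explorer concrete, here is a minimal sketch of the kind of file it can pick up. It assumes `unittest` is selected as the test framework in the workspace; the `normalize` helper and the test names are made up for illustration.

```python
import unittest


def normalize(values):
    """Scale a list of numbers to the 0-1 range (hypothetical helper)."""
    low, high = min(values), max(values)
    if high == low:
        # Avoid division by zero when all values are identical.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


class TestNormalize(unittest.TestCase):
    """Each test_* method shows up as a separate entry that can be run or debugged."""

    def test_range(self):
        self.assertEqual(normalize([2, 4, 6]), [0.0, 0.5, 1.0])

    def test_constant_input(self):
        self.assertEqual(normalize([3, 3]), [0.0, 0.0])
```

Saving this as a file whose name starts with `test_` should be enough for discovery under the default `unittest` pattern, though the exact pattern depends on your workspace settings.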

 

3. Pylance

 

Pylance works with the Python extension to supercharge language support. It provides parameter suggestions, code completion, auto imports, type checking, and semantic highlighting. I highly recommend it, as it has noticeably improved my Python development experience. Pylance is far more than autocomplete for Python.
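As a small illustration of the type checking, consider the sketch below. The `mean` function is a made-up example, and I am assuming type checking is switched on through the `python.analysis.typeCheckingMode` setting; with it off, you would still get completions and hover information from the annotations.

```python
from typing import Optional


def mean(values: list[float]) -> Optional[float]:
    """Return the arithmetic mean, or None for an empty list."""
    if not values:
        return None
    return sum(values) / len(values)


result = mean([1.0, 2.0, 3.0])

# A type checker would be expected to flag `result + 1` at this point,
# because `result` may be None. Narrowing the type first resolves that.
if result is not None:
    print(result + 1)
```

The annotations cost a few extra characters, but they let the editor catch the "forgot to handle None" class of bugs before the script ever runs.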

 

4. Python Indent

 

Python Indent is the extension that you always knew you needed. Every time you type a line of code and press Enter, it gives you the correct Python indentation. It works with bracket pairs, hanging indents, keywords, and extending comments.

 

5. Indent-rainbow

 

Indent-rainbow brings peace to my world of HTML and Python coding. I now see clean and well-organized indentation. This extension has helped me debug code faster and write cleaner code. Indent-rainbow colorizes the indentation in front of your text, alternating four colors for each step.

 

6. Jupyter

 

Jupyter lets you edit, run, and save Jupyter Notebooks within VSCode. It is simple to use and, beyond Python, supports other languages through Jupyter kernels, for example Julia, R, Scala, and SQL. It combines Jupyter functionality with the VSCode editor to provide the ultimate Python development experience. Jupyter comes with fast `.ipynb` file loading, a notebook diff tool, Python and Pylance integration, and code folding.

I highly recommend using Jupyter Notebook within VSCode.
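You do not need an `.ipynb` file to work this way. As a minimal sketch, a plain `.py` file can be split into runnable cells with `# %%` comment markers, which the extension treats as cell boundaries. The data below is made up for illustration.

```python
# %%
# First cell: build a small, made-up dataset with the standard library only.
import statistics

scores = [72, 85, 90, 66, 78]

# %%
# Second cell: summarize the data. Each cell can be run on its own.
summary = {
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
}
print(summary)
```

Because the markers are ordinary comments, the same file still runs as a normal script from the terminal, which keeps it friendly to Git diffs.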

 

7. Jupyter Notebook Renderers

 

Jupyter Notebook Renderers works with the Jupyter extension to provide interactive data visualization. It is a must-have extension for data analysts, data scientists, and data engineers who want to view Plotly, Vega, Bokeh, GIF, PNG, SVG, and JPEG outputs.
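To see where such outputs come from, here is a rough sketch of an object that produces SVG output using only the standard library. It relies on IPython's `_repr_svg_` display hook; the `BarChart` class itself is a hypothetical example, not part of any extension or library.

```python
class BarChart:
    """Tiny bar chart that exposes an SVG representation (illustrative only)."""

    def __init__(self, values, bar_width=30, height=100):
        self.values = values
        self.bar_width = bar_width
        self.height = height

    def _repr_svg_(self):
        # IPython's display protocol calls this method to obtain SVG markup.
        top = max(self.values)
        bars = []
        for i, value in enumerate(self.values):
            bar_height = self.height * value / top
            x = i * self.bar_width
            y = self.height - bar_height  # SVG's y axis points downward
            bars.append(
                f'<rect x="{x}" y="{y}" width="{self.bar_width - 4}" '
                f'height="{bar_height}" fill="steelblue" />'
            )
        width = self.bar_width * len(self.values)
        return (
            f'<svg xmlns="http://www.w3.org/2000/svg" '
            f'width="{width}" height="{self.height}">{"".join(bars)}</svg>'
        )


chart = BarChart([3, 7, 5])
chart  # as the last expression in a notebook cell, this displays the chart
```

Libraries such as Plotly and Bokeh ship much richer output than this, but the idea is the same: the kernel emits a typed output and the editor picks a renderer for it.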

 

8. R

 

The R extension provides rich language support. If you are a data analyst or researcher, you are probably familiar with the R language and its ecosystem. The VSCode extension enhances your experience with syntax highlighting, code analysis, an R terminal, and support for R Markdown. It also allows you to view data, plots, and variables.

 

9. Julia

 

The Julia extension provides language support similar to the Python and R extensions. In my opinion, Julia is the future of machine learning and data science. The extension comes with syntax highlighting, snippets, a Julia REPL, code completion, a linter, hover help, and debugging. Similar to R, it provides a plot gallery, a grid viewer for tabular data, and the ability to test, build, and benchmark programs.

 

10. DVC

 

DVC is a new and, in my opinion, most valuable extension for versioning and tracking your machine learning experiments. Many data teams depend on DVC to version datasets for reproducibility. Apart from data, you can version metadata, plots, and models, track and store experiments, create data and ML pipelines, and share them the way you share code with Git. The extension comes with experiment tracking, a dashboard, live tracking, and GUI-based data management.

The DVC extension makes versioning large files simpler.
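The core idea behind versioning large files can be sketched in a few lines: identify each version of a file by a hash of its content, so that Git only needs to track a small pointer instead of the data itself. The snippet below is a simplified illustration of that idea under my own assumptions, not DVC's actual implementation; `file_md5` and the sample file are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files never need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "data.csv"

    data.write_text("id,label\n1,cat\n")
    first = file_md5(data)  # identifies the first version of the dataset

    data.write_text("id,label\n1,dog\n")
    second = file_md5(data)  # any change in content yields a different hash

print(first != second)
```

A pointer file holding only the hash is small enough to commit, while the data itself can live in a separate cache or remote storage.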

 

11. GitLens

 

GitLens brings your Git repository to life. Instead of typing commands in the terminal, you can use an interactive user interface to perform all Git-related tasks. It comes with revision navigation, current line blame, authorship, file annotation, a sidebar view, a Git command palette, and customizable menus and toolbars. It improves your development experience by providing visual composing, seamless team collaboration, and the ability to analyze project progress.

 

12. Todo MD

 

Todo MD is the best task-tracking extension I have used. You can find multiple todo extensions that might suit your particular development environment, but Todo MD lets you set priorities and track daily tasks, projects, tags, and contexts. Using Markdown syntax, you can create your list of tasks with special tags such as “overdue”, or filter your tasks by simple tags and special tags.

I use it to track my recurring tasks. For example, running and automating Python scripts for editorial tasks.

 

Conclusion

 

If you are more into the development and deployment of data science solutions, other extensions are worth adding, such as GitHub Pull Requests and Issues, Docker, and Kubernetes. The extensions that I have mentioned are the ones I need to build, test, and run Python scripts daily.

If you have better suggestions for data science extensions, please mention them in the comments. I am always looking to improve my current workspace by replacing old extensions with better alternatives. I am currently looking to automate my workflow using GitHub Actions, and I am open to suggestions.

 
 
Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in Technology Management and a bachelor's degree in Telecommunication Engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.