Computer Vision Recipes: Best Practices and Examples

This is an overview of a great computer vision resource from Microsoft, which demonstrates best practices and implementation guidelines for a variety of tasks and scenarios.



Having recently spotlighted a similar resource from Microsoft, the Natural Language Processing Best Practices & Examples repository, today we bring you its computer vision counterpart.


 

The Computer Vision Best Practices & Examples GitHub repository describes itself like this:

 

The goal of this repository is to build a comprehensive set of tools and examples that leverage recent advances in Computer Vision algorithms, neural architectures, and operationalizing such systems. Rather than creating implementations from scratch, we draw from existing state-of-the-art libraries and build additional utility around loading image data, optimizing and evaluating models, and scaling up to the cloud. In addition, having worked in this space for many years, we aim to answer common questions, point out frequently observed pitfalls, and show how to use the cloud for training and deployment.

 

The repo contains a number of Jupyter notebooks designed to highlight this "comprehensive set of tools and examples." The repo's README further emphasizes its fast-iteration approach to implementing practical solutions:

 

We hope that these examples and utilities can significantly reduce the “time to market” by simplifying the experience from defining the business problem to development of solution by orders of magnitude. In addition, the example notebooks would serve as guidelines and showcase best practices and usage of the tools in a wide variety of languages.

 


 

The notebooks and their accompanying utility functions are organized into common computer vision scenarios, such as classification, segmentation, and crowd counting. Clicking on any of these scenarios takes you to the corresponding folder of recipes and best practices, an example of which is shown below:

Image

 

The notebooks also lean on scripts in the utils_cv module "to simplify common tasks used when developing and evaluating computer vision systems." Be sure to check out these utilities developed by Microsoft Research, which are intended to save time and speed up some of the more laborious tasks associated with computer vision.
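
To give a feel for the kind of laborious task such utilities take off your hands, here is a minimal sketch of my own (not code from utils_cv, and not its API) that assumes an image dataset laid out with one folder per class. The function names list_labeled_images and split_by_label are hypothetical, and the sketch uses only the Python standard library.

```python
import random
from collections import defaultdict
from pathlib import Path

# Hypothetical helpers for illustration only; these are NOT part of utils_cv.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_labeled_images(root):
    """Return sorted (path, label) pairs, taking each label from the parent folder name."""
    pairs = []
    for path in sorted(Path(root).glob("*/*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            pairs.append((path, path.parent.name))
    return pairs


def split_by_label(pairs, valid_fraction=0.2, seed=0):
    """Split pairs into train/validation lists, holding out a fraction of each label."""
    by_label = defaultdict(list)
    for path, label in pairs:
        by_label[label].append((path, label))

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, valid = [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_valid = int(len(items) * valid_fraction)
        valid.extend(items[:n_valid])
        train.extend(items[n_valid:])
    return train, valid
```

Calling split_by_label(list_labeled_images("data/"), valid_fraction=0.2) on a folder-per-class dataset would return a training list and a validation list that each contain every class. The repo's own utilities cover far more ground than this, so treat the sketch as motivation rather than a substitute.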

I'm not a computer vision expert, and have not spent as much time studying this area as a lot of other folks in machine learning. As such, I find resources like this one incredibly valuable. Whether you are in a similar situation, or you are a full-time computer vision engineer looking for a handy collection of code to draw inspiration and ideas from, I suggest you check out this best-practice-oriented repo from Microsoft.

 