KDnuggets Top Blog Winner

A Brief Introduction to Papers With Code

One-stop shop to learn about state-of-the-art research papers with access to open-source resources including machine learning models, datasets, methods, evaluation tables, and code.



Image by author

 

The name says it all. Papers with Code is a platform that hosts research papers together with code implementations by the authors or the community. Recently, Papers with Code has grown both in popularity and as a complete ecosystem for machine learning research.
 
You can reproduce results using the code, check previous implementations alongside their model performance metrics, and view the datasets, models, and methods used in a research paper. It is a next-generation, community-driven knowledge-sharing platform that, like Wikipedia, is open to edits, with content under the CC BY-SA license.

Apart from machine learning, the platform has specialized portals for papers with code in astronomy, physics, computer science, mathematics, and statistics. You can also check stats on trending papers, frameworks, and code coverage.

 

Papers With Code
Image from Papers With Code

 

Anyone can contribute. To add code to a paper, or to modify an evaluation table, task, or dataset, click the edit button on the relevant page. The user interface is friendly, so finding papers and adding resources is easy. All submitted code and results are under the free CC BY-SA license.

 

State of the Art

 
The State of the Art section contains 6,434 benchmark machine learning models, 2,735 tasks and sub-tasks (such as Knowledge Distillation and Few-Shot Image Classification), and 65,649 papers with code. These models are categorized by field of study, such as Computer Vision, Natural Language Processing, and Time Series.

After selecting a field of study, you can explore its subfields and results. For example, in the Computer Vision sub-class Image Classification, the best accuracy score on the ImageNet benchmark is 90.88%, achieved by the CoAtNet-7 model. You can view the code implementations, read the paper, view the parameters used in the neural networks, and compare results in detail across similar datasets.
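
As a rough illustration of how such a leaderboard can be read, the sketch below ranks the rows of an evaluation table by accuracy and picks the top entry. Only the CoAtNet-7 row comes from this article. The other rows, the `Result` class, and the `rank` helper are hypothetical placeholders, not data or an API from Papers with Code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One row of an evaluation table (hypothetical structure)."""
    model: str
    accuracy: float  # accuracy in percent


def rank(results):
    """Return the results sorted from highest to lowest accuracy."""
    return sorted(results, key=lambda r: r.accuracy, reverse=True)


# Only the CoAtNet-7 row is taken from the article. The two placeholder
# rows are made-up values that exist purely to give the sort some input.
leaderboard = [
    Result("placeholder-model-a", 85.0),
    Result("CoAtNet-7", 90.88),
    Result("placeholder-model-b", 88.5),
]

ranked = rank(leaderboard)
state_of_the_art = ranked[0]  # the best-scoring entry heads the table
print(state_of_the_art.model, state_of_the_art.accuracy)
```

Sorting in descending order assumes a metric where higher is better. For an error-rate metric the sort direction would be reversed.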
 

ImageNet Benchmark
Image from ImageNet Benchmark

 

Datasets

 
The Datasets section contains 5,628 machine learning datasets. You can search for a dataset directly or filter datasets by modality, task, and language. You get more than access to the data: you also see which datasets are popular in a particular category, based on benchmark results and research papers.
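
To make the filtering idea concrete, here is a minimal sketch of narrowing a dataset catalog by modality, task, and language. The `Dataset` record, the `filter_datasets` helper, and the placeholder entries are assumptions made for illustration. They do not reflect how Papers with Code stores or serves its data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """A catalog entry (hypothetical structure)."""
    name: str
    modality: str
    tasks: tuple = ()
    languages: tuple = ()


def filter_datasets(datasets, modality=None, task=None, language=None):
    """Keep the datasets that match every filter that is not None."""
    matches = []
    for d in datasets:
        if modality is not None and d.modality != modality:
            continue
        if task is not None and task not in d.tasks:
            continue
        if language is not None and language not in d.languages:
            continue
        matches.append(d)
    return matches


# ImageNet is mentioned in the article; the other entries are placeholders.
catalog = [
    Dataset("ImageNet", "Images", ("Image Classification",)),
    Dataset("placeholder-text-corpus", "Texts", ("Question Answering",), ("English",)),
    Dataset("placeholder-image-set", "Images", ("Object Detection",)),
]

images = filter_datasets(catalog, modality="Images")
classification = filter_datasets(catalog, modality="Images", task="Image Classification")
print([d.name for d in images])
print([d.name for d in classification])
```

Each filter is optional, so combining them narrows the result step by step, much like ticking several filter boxes on a search page.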

Every dataset page links to the paper or website of the original dataset. The dataset page is easy to navigate, and within a few minutes you can understand the modality, license information, published papers, and benchmarks by subcategory. For example, the top benchmark result for Self-Supervised Image Classification on ImageNet is iBOT with 82.3% accuracy.

To share your dataset with the ML community, fill in the Add a dataset form and provide links and detailed information about the dataset.

 

Machine Learning Datasets
Image from Machine Learning Datasets

 

Methods

 
The platform is well structured and organized as it divides various sections into smaller sub-sections. The state-of-the-art models are organized by various fields of machine learning (Computer Vision, Speech), and each field of study consists of tasks (Object Detection, Image Generation) and sub-tasks (Unsupervised Image Classification, Fine-Grained Image Classification). Finally, these sub-tasks are built using various methods (Stochastic Optimization, Convolutional Neural Networks).
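
The hierarchy described above (field, task, sub-task, method) can be pictured as a nested mapping. The sketch below uses only names mentioned in this article, but the specific pairing of methods with sub-tasks and the `iter_paths` helper are illustrative assumptions, not the platform's actual data model.

```python
# field -> task -> sub-task -> list of methods
# The method assignments are illustrative guesses, not taken from the site.
taxonomy = {
    "Computer Vision": {
        "Image Classification": {
            "Unsupervised Image Classification": [
                "Convolutional Neural Networks",
            ],
            "Fine-Grained Image Classification": [
                "Convolutional Neural Networks",
                "Stochastic Optimization",
            ],
        },
        "Object Detection": {},
        "Image Generation": {},
    },
    "Speech": {},
}


def iter_paths(tree):
    """Yield every (field, task, sub_task, method) path in the tree."""
    for field_name, tasks in tree.items():
        for task, sub_tasks in tasks.items():
            for sub_task, methods in sub_tasks.items():
                for method in methods:
                    yield field_name, task, sub_task, method


paths = list(iter_paths(taxonomy))
methods_used = sorted({path[3] for path in paths})

for path in paths:
    print(" > ".join(path))
print(methods_used)
```

Walking the tree this way mirrors how a reader drills down on the site: pick a field, then a task, then a sub-task, and finally the methods behind it.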

The Methods section is divided by type, and each type consists of various methods. For example, the General type includes Attention and Activation Functions. Each method has variations that have been used to build models or to process data. If you want to improve your current machine learning system, the Methods section is the best place to look for solutions.

 

Example

 
Let’s see what information the CoAtNet page offers. The page shows the full title of the research paper and the author names with social media links. You can read the abstract or download the full paper from arXiv or other publication venues. If you like the paper and want to know about code implementations and results, scroll down the page to find multiple GitHub repository links, tasks, datasets, results, and methods. The platform enhances the researcher’s experience by connecting the various components of the machine learning ecosystem.

 

example page
Gif from CoAtNet

 

Conclusion 

 
Papers with Code has several features that enable machine learning practitioners and researchers to learn about and contribute to cutting-edge technologies. The platform also links to Hugging Face Spaces alongside the GitHub repository so that you can try out how a model works. Apart from that, you can mirror competition results on Papers with Code. For example, you can add results to a Hugging Face model, and they will show up on Papers with Code with the dataset, model, and model metrics.
In this blog, we explored the main sections of the platform and how it helps researchers all over the world learn about top research papers. We also saw how state-of-the-art models, datasets, tasks, sub-tasks, and methods are interconnected to improve the reading experience. It is the most popular platform in the machine learning community because of its integrations and universal inclusiveness.
 
 
 
Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in Technology Management and a bachelor's degree in Telecommunication Engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.