Awesome Public Datasets on GitHub

A long, categorized list of large datasets (available for public use) to try your analytics skills on. Which one would you pick?

No matter how many books you read on technology, some knowledge comes only from experience. This is even truer in the field of Big Data. Despite a good number of resources available online for large datasets (including KDnuggets datasets), many aspirants and practitioners, primarily newcomers, are often unaware of how many options exist for trying their Data Science skills on real-life large datasets. Thus, we are consistently on the lookout for larger and better datasets available for public use.

In our next endeavor on this journey, we are sharing an awesome list of public data sources by Xia Ming (bio given at the end), collected and organized from blogs, answers, and user responses. Most of the datasets listed below are free; however, some are not.




Complex Networks

Computer Networks