- 8 Places for Data Professionals to Find Datasets - Dec 17, 2020.
Here is a curated list of sites and resources invaluable for data professionals to acquire practice datasets.
Data Science, Datasets, Google, Government, Kaggle, Reddit, UCI
- What is Data Catalog and Why You Should Care? - Dec 23, 2019.
Learn why data catalogs could be just the thing you need to meet the challenges of data and metadata management and collaboration.
Compliance, Consistency, Data Catalog, Data Governance, Datasets, Metadata, Reddit
- Reddit Post Classification - Sep 18, 2019.
This article covers the implementation of a data scraping and natural language processing project with two parts: scrape as many posts from Reddit’s API as allowed, and then use classification models to predict the origin of the posts.
Classification, NLP, Reddit
- Data Science for Newbies: An Introductory Tutorial Series for Software Engineers - May 31, 2017.
This post summarizes and links to the individual tutorials which make up this introductory look at data science for newbies, mainly focusing on the tools, with a practical bent, written by a software engineer from the perspective of a software engineering approach.
Apache Spark, Data Science, Jupyter, Machine Learning, Pandas, Python, Reddit, Scala, SQL
- Top /r/MachineLearning Posts, March: A Super Harsh Guide to Machine Learning; Is it Gaggle or Koogle?!? - Apr 4, 2017.
A Super Harsh Guide to Machine Learning; Google is acquiring data science community Kaggle; Suggestion by Salesforce chief data scientist; Andrew Ng resigning from Baidu; Distill: An Interactive, Visual Journal for Machine Learning Research
Advice, Andrew Ng, Distill, Google, Kaggle, Machine Learning, Reddit, Salesforce
- Top /r/MachineLearning Posts, October: NSFW Image Recognition, Differentiable Neural Computers, Hinton on Coursera - Nov 4, 2016.
NSFW Image Recognition, Differentiable Neural Computers, Hinton's Neural Networks for Machine Learning Coursera course; Introducing the AI Open Network; Making a Self-driving RC Car
DeepMind, Geoff Hinton, Image Recognition, Machine Learning, Neural Networks, Reddit, Self-Driving Car
- Top /r/MachineLearning Posts, September: Open Images Dataset; Whopping Deep Learning Grant; Advanced ML Courseware - Oct 7, 2016.
Google Research announces the Open Images dataset; Canadian Government Deep Learning Research grant; DeepMind: WaveNet - A Generative Model for Raw Audio; Machine Learning in a Year - From total noob to using it at work; PhD-level machine learning courses; xkcd: Linear Regression
Canada, Courses, Deep Learning, Generative Models, Geoff Hinton, Machine Learning, Reddit, xkcd
- AMA Data Scientist, Jan 13: Jake Porway of DataKind - Jan 7, 2016.
Jake Porway is a machine learning and technology enthusiast and the founder of DataKind, a nonprofit that helps organizations use the power of data science in the service of humanity. He will do a Reddit AMA on Jan 13, 2016.
DataKind, Jake Porway, Reddit
- Top /r/MachineLearning Posts, September: Implement a neural network from scratch in C++ - Oct 6, 2015.
Neural network in C++ for beginners, Chinese character handwriting recognition beats humans, a handy machine learning algorithm cheat sheet, neural nets versus functional programming, and a neural nets paper repository.
C++, Deep Learning, Matthew Mayo, Neural Networks, Python, R, Reddit
- Google BigQuery Public Datasets - Feb 20, 2015.
Google BigQuery is not only a fantastic tool for analyzing data, but it also has a repository of public data, including the GDELT world events database, NYC Taxi rides, the GitHub archive, Reddit top posts, and more.
BigQuery, GDELT, Google, New York City, Reddit
- Geoffrey Hinton talks about Deep Learning, Google and Everything - Dec 1, 2014.
A review of Dr. Geoffrey Hinton’s Ask Me Anything on Reddit. He talked about his current research and his thoughts on some deep learning issues.
Deep Learning, DeepMind, Geoff Hinton, Google, Neural Networks, Reddit, Yann LeCun