Matthew Mayo (@mattmayo13) holds a Master's degree in computer science and a graduate diploma in data mining. As Managing Editor, Matthew aims to make complex data science concepts accessible. His professional interests include natural language processing, machine learning algorithms, and exploring emerging AI. He is driven by a mission to democratize knowledge in the data science community. Matthew has been coding since he was 6 years old.
In this month's installment of Machine Learning Projects You Can No Longer Overlook, we find some data preparation and exploration tools, a (the?) reinforcement learning "framework," a new automated machine learning library, and yet another distributed deep learning library.
Without knowing the ground truth of a dataset, then, how do we know what the optimal number of data clusters are? We will have a look at 2 particular popular methods for attempting to answer this question: the elbow method and the silhouette method.
Check out this Python deep learning virtual machine image, built on top of Ubuntu, which includes a number of machine learning tools and libraries, along with several projects to get up and running with right away.
This is an introduction to recent research which presents an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples.
It's about that time again... 5 more machine learning or machine learning-related projects you may not yet have heard of, but may want to consider checking out. Find tools for data exploration, topic modeling, high-level APIs, and feature selection herein.
Spring. Rejuvenation. Rebirth. Everything’s blooming. And, of course, people want free ebooks. With that in mind, here's a list of 10 free machine learning and data science titles to get your spring reading started right.
Unlike a lot of other tutorials which often pull from the real-time Twitter API, we will be using the downloadable Twitter Analytics data, and most of what we do will be done in Pandas.
The third and final part of 17 new must-know Data Science interview questions and answers covers A/B testing, data visualization, Twitter influence evaluation, and Big Data quality.
What if a simple, deterministic approach which did not rely on randomization could be used for centroid initialization? Naive sharding is such a method, and its time-saving and efficient results, though preliminary, are promising.