Data Science Glossary

Data architect shares an extensive data science glossary covering terms from statistics, data science, and machine learning, ranging from algorithm to vector space.

By Bob DuCharme.

As I've studied up on data science lately in KDnuggets and other sources, I've found myself learning a lot of new terms, especially in the worlds of statistics and machine learning. Eventually, though, I found myself forgetting some of these terms as I learned more.

To make it easier to remember what terms like "gradient descent" meant when I came across them for the ninth or tenth time, I created a little glossary in my data science notes. In addition to terms that I found in articles, blogs, and books, I also looked in data science job postings to see what kinds of terms came up the most.

As my glossary grew bigger, I thought that it might be helpful to others, so I decided to publish it. At first, I planned to publish it on my blog, but when I saw that the domain name was up for grabs, I thought it would be fun to put it there. (The domain name was already taken, but it's just got one of those placeholder web pages that hosting companies use for domain names that have no real content.)

I hope that it is useful to others as they try to assimilate all the new terms from the different disciplines that are coming together to contribute to this new field. I know it will be useful to me.

Bio: Bob DuCharme is a solutions architect at TopQuadrant, a provider of software for modeling, developing and deploying semantic web applications.