# Tag: Probability (23)

**How to count Big Data: Probabilistic data structures and algorithms**- Aug 26, 2019.

Learn how probabilistic data structures and algorithms can be used for cardinality estimation in Big Data streams.

**What is Poisson Distribution?**- Aug 14, 2019.

A solid overview of the Poisson distribution, starting from why it is needed, how it stacks up to the binomial distribution, deriving its formula mathematically, and more.

**KDnuggets™ News 19:n25, Jul 10: 5 Probability Distributions for Data Scientists; What the Machine Learning Engineer Job is Really Like**- Jul 10, 2019.

This edition of the KDnuggets newsletter is double-sized after taking the holiday week off. Learn about probability distributions every data scientist should know, what the machine learning engineering job is like, making the most money with the least amount of risk, the difference between NLP and NLU, get a take on Nvidia's new data science workstation, and much, much more.

**5 Probability Distributions Every Data Scientist Should Know**- Jul 4, 2019.

Having an understanding of probability distributions should be a priority for data scientists. Make sure you know what you should by reviewing this post on the subject.

**Probability Mass and Density Functions**- May 21, 2019.

This content is part of a series about Chapter 3, on probability, from the Deep Learning Book by Goodfellow, I., Bengio, Y., and Courville, A. (2016). It aims to provide intuitions, drawings, and Python code for mathematical theories, and reflects my understanding of these concepts.

**Unfolding Naive Bayes From Scratch**- Sep 25, 2018.

Whether you are a beginner in Machine Learning or you have been trying hard to understand the Super Natural Machine Learning Algorithms and you still feel that the dots do not connect somehow, this post is definitely for you!

**Machine Learning Cheat Sheets**- Sep 11, 2018.

Check out this collection of machine learning concept cheat sheets based on Stanford CS 229 material, including supervised and unsupervised learning, neural networks, tips & tricks, probability & stats, and algebra & calculus.

**Basic Statistics in Python: Probability**- Aug 21, 2018.

At the most basic level, probability seeks to answer the question, "What is the chance of an event happening?" To calculate the chance of an event happening, we also need to consider all the other events that can occur.

**Why Data Scientists Love Gaussian**- Jun 26, 2018.

The Gaussian distribution model, often identified by its iconic bell-shaped curve and also referred to as the Normal distribution, is so popular mainly for three reasons.

**How Bayesian Networks Are Superior in Understanding Effects of Variables**- Nov 9, 2017.

Bayes Nets have remarkable properties that make them better than many traditional methods in determining variables’ effects. This article explains the principal advantages.

**30 Essential Data Science, Machine Learning & Deep Learning Cheat Sheets**- Sep 22, 2017.

This collection of data science cheat sheets is not a cheat sheet dump, but a curated list of reference materials spanning a number of disciplines and tools.

**The Surprising Complexity of Randomness**- Jun 15, 2017.

The reason we have pseudorandom numbers is that generating true random numbers using a computer is difficult. Computers, by design, are excellent at taking a set of instructions and carrying them out in the exact same way, every single time.

**Stuff Happens: A Statistical Guide to the “Impossible”**- Apr 6, 2017.

Why are some people struck by lightning multiple times or, more encouragingly, how could anyone possibly win the lottery more than once? The odds against these sorts of things are enormous.

**Introduction to Bayesian Inference**- Dec 16, 2016.

Bayesian inference is a powerful toolbox for modeling uncertainty, combining researcher understanding of a problem with data, and providing a quantitative measure of how plausible various facts are. This overview from Datascience.com introduces Bayesian probability and inference in an intuitive way, and provides examples in Python to help get you started.

**What Statistics Topics are Needed for Excelling at Data Science?**- Aug 2, 2016.

Here is a list of skills and statistical concepts suggested for excelling at data science, roughly in order of increasing complexity.

**Big Data, Bible Codes, and Bonferroni**- Jul 8, 2016.

This discussion focuses on two statistical issues to be on the lookout for in your own work and in the work of others mining and learning from Big Data, with real-world examples emphasizing the importance of statistical processes in practice.

**Deep Learning, Pachinko, and James Watt: Efficiency is the Driver of Uncertainty**- Jun 8, 2016.

A reasoned discussion of why the next generation of data-efficient learning approaches relies on us developing new algorithms that can propagate stochasticity or uncertainty right through the model, and which are mathematically more involved than the standard approaches.

**Do You Need Big Data or Smart Data? Part 2**- Jun 2, 2016.

It can be easy to get carried away with the deluge of big data and to rely on its abundance to deliver better models. However, using data without context and objective could prove counterproductive; contextual and objective-driven samples from the large volume and variety of data can be effective tools.

**Do You Need Big Data or Smart Data? Part 1**- Jun 1, 2016.

Analyzing Big Data without paying attention to its characteristics and objective can be detrimental; the fix is correct and effective sampling. Read on to transform your Big Data to Smart Data.

**Bayes Theorem for Computer Scientists, Explained**- Feb 16, 2016.

Data science is vain without a solid understanding of probability and statistics. Learn the basic concepts of probability, including the law of total probability, relevant theorem and Bayes’ theorem, along with their computer science applications.

**Plausibility vs. probability, prior distributions, and the garden of forking paths**- Jan 14, 2016.

A discussion on plausibility vs. probability: while many given events may be plausible, they can’t all be probable.

**Top /r/MachineLearning Posts, Apr 5-11: Amazon Machine Learning, Numerical Optimization, and Conditional Random Fields**- Apr 14, 2015.

Amazon Machine Learning as a Service, Numerical Optimization, Extracting data from NYTimes recipes, Intro to Machine Learning with sci-kit, and more.

**INFORMS The Business of Big Data 2014: Day 1 Highlights**- Aug 21, 2014.

Highlights from the presentations by Big Data technology practitioners from Teradata, Booz Allen Hamilton, Databricks and ProbabilityManagement.org during INFORMS The Business of Big Data in San Jose.