Probability Distribution Functions

Discrete Probability Functions

Experiments with countable outcomes, that is, successes and failures you can count (e.g., 1, 2, 3, ...), can be modelled using discrete distributions. Common ones are the binomial, negative-binomial, Poisson and hypergeometric. A description of each, along with a calculator, follows in the tabs next to this one. Each of these distributions is used in a particular situation under its own set of assumptions. The situation dictates which distribution is needed to make statements about the chances of certain outcomes.

After the description of the distribution in each tab, you will have the chance to input assumptions of your own. The calculator will calculate the mean and variance of the distribution under the assumptions you input and then generate graphs that illustrate the distribution under those assumptions. You will see two graphs: one for the probability density function (strictly a probability mass function, since the outcomes are discrete), showing the probability of one specific outcome at a time, and another for the cumulative distribution function, showing the probability of all outcomes up to a specific level.

You are then asked for further inputs to answer specific probability questions about the distribution defined by your earlier inputs.

The Binomial Distribution

Often one would like to model a situation in which a single, fixed probability is assigned to a specific event, such as the outcome of a dice roll. In these situations, there is a probability of success (p) and a probability of failure (1 - p). In a number of trials (n), one might have a certain number of successes (x), with the rest being failures. In our dice example, you might be playing a game where you lose automatically if you roll two ones with two dice. Maybe you want to know the probability of rolling two ones exactly once in, say, 10 attempts. Or maybe you want to know the probability of rolling two ones fewer than 3 times in 20 attempts. Or maybe you want to know the probability of rolling two ones between 5 and 8 times in 30 attempts. In any case, the number of successes has a binomial distribution if

  1. The number of observations (n) is fixed.
  2. The observations are independent of each other.
  3. Each observation has a probability of success and a probability of failure.
  4. The probability of success is the same for each trial.

Under these conditions, the probability of observing x successes in n trials is represented by the probability density function (pdf) $P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$.

The mean of such a distribution is np. The variance is np(1 - p). You can interpret the mean of this distribution as the expected number of successes in n trials.
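As a rough illustration of the formula above, the following Python sketch computes the three dice probabilities mentioned earlier. The function names `binomial_pmf` and `binomial_cdf` are illustrative and are not part of the calculator. The sketch assumes fair dice, so that two ones come up with probability 1/36 on each attempt, and it reads "between 5 and 8" as inclusive.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x): probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x): probability of at most x successes in n independent trials."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


# Assumption: fair dice, so rolling two ones has probability 1/36 per attempt.
p = 1 / 36

# Exactly one double-one in 10 attempts.
exactly_once_in_10 = binomial_pmf(1, 10, p)

# Fewer than 3 (that is, 0, 1 or 2) double-ones in 20 attempts.
fewer_than_3_in_20 = binomial_cdf(2, 20, p)

# Between 5 and 8 double-ones (inclusive, by assumption) in 30 attempts.
between_5_and_8_in_30 = binomial_cdf(8, 30, p) - binomial_cdf(4, 30, p)

# Mean and variance for n = 10 attempts: np and np(1 - p).
mean = 10 * p
variance = 10 * p * (1 - p)

print(f"P(exactly 1 in 10)   = {exactly_once_in_10:.6f}")
print(f"P(fewer than 3 in 20) = {fewer_than_3_in_20:.6f}")
print(f"P(5 to 8 in 30)       = {between_5_and_8_in_30:.3e}")
print(f"mean = {mean:.4f}, variance = {variance:.4f}")
```

The cumulative function is written as a plain sum of the individual probabilities, which mirrors the relationship between the two graphs the calculator draws.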

You can use the calculator below to calculate probabilities under your own scenario using the assumptions you input. The probability density function yields the probability of exactly a specified number of successes, whereas the cumulative distribution function yields the total probability of every number of successes up to and including a specified number. The graphs generated are shaped in accordance with the assumptions you enter.

Assumptions:

Number of trials attempted (n) =

Probability of success (p) =

The Negative-Binomial Distribution

Often one would like to model a situation in which a single, fixed probability is assigned to a specific event, such as a machine failing. In these situations, there is a probability of success (p) and a probability of failure (1 - p), where a "success" is simply the event being counted (here, a breakdown). The number of successes (r) is set in advance, and the quantity of interest is the number of trials (x) needed to reach it. Maybe you are a manufacturer who will offer to replace, rather than just repair, a machine purchased by one of your customers if it breaks down a certain number of times within a number of uses. You could model this situation with a negative-binomial distribution. The assumptions underlying the negative-binomial are

  1. The number of successes (r) is fixed in advance; the number of observations (x) needed to reach it is the random quantity.
  2. The observations are independent of each other.
  3. Each observation has a probability of success and a probability of failure.
  4. The probability of success is the same for each trial.

Under these conditions, the probability of observing the rth success on the xth trial is represented by the probability density function (pdf) $P(X = x) = \binom{x - 1}{r - 1} p^r (1 - p)^{x - r}$.

The mean of such a distribution is $r / p$. The variance is $r(1 - p) / p^2$. You can interpret the mean of this distribution as the expected number of trials needed to observe the rth success, counting the trial on which that success occurs.
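The formula can be sketched in Python as follows. The numbers are purely hypothetical: a machine that breaks down with probability 0.05 on each use, and a replacement policy triggered by the 3rd breakdown. The function names are illustrative and are not taken from the calculator.

```python
from math import comb


def neg_binomial_pmf(x: int, r: int, p: float) -> float:
    """P(X = x): probability that the r-th success occurs on trial x."""
    if x < r:
        return 0.0  # r successes cannot occur in fewer than r trials
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)


def neg_binomial_cdf(x: int, r: int, p: float) -> float:
    """P(X <= x): probability that the r-th success occurs on or before trial x."""
    return sum(neg_binomial_pmf(k, r, p) for k in range(r, x + 1))


# Hypothetical assumptions: a breakdown ("success") has probability 0.05 per use,
# and the machine is replaced on its 3rd breakdown.
r, p = 3, 0.05

mean_trials = r / p             # expected number of uses up to the 3rd breakdown
variance = r * (1 - p) / p**2   # variance of that number of uses

# Probability that the 3rd breakdown happens within the first 20 uses.
prob_within_20 = neg_binomial_cdf(20, r, p)

print(f"mean = {mean_trials:.1f} uses, variance = {variance:.1f}")
print(f"P(3rd breakdown within 20 uses) = {prob_within_20:.6f}")
```

Note that the sum in `neg_binomial_cdf` starts at r, since the rth success cannot arrive before the rth trial.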

You can use the calculator below to calculate probabilities under your own scenario using the assumptions you input. The probability density function yields the probability that exactly a specified number of trials is needed to obtain r successes, whereas the cumulative distribution function yields the total probability that r successes are obtained within a specified number of trials. The graphs generated are shaped in accordance with the assumptions you enter.

Assumptions:

Number of successes (r) =

Probability of success (p) =

The Poisson Distribution

Sometimes you may have an idea of the expected number of successes in a range or interval. In insurance, it is common to collect data that indicate the expected number of accidents a cohort of drivers may have in a year. This situation can be modelled using a Poisson distribution. The distribution can answer questions such as: what is the probability that a cohort will have exactly 300 accidents this year, given that the expected number of auto accidents in one year is 400? Or: what is the probability of having 450 or more accidents in a given year if you believe the expected number of accidents in a year to be 500?

The probability of observing x occurrences over the specified interval (whether it be time, length, etc.) is represented by the probability density function (pdf) $P(X = x) = e^{-n} n^x / x!$, where n is equal to the expected number of successes in a range or interval.

The mean of such a distribution is n and the variance is also n.
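The two insurance questions above can be sketched in Python as follows. The function names are illustrative and are not part of the calculator. Because a term such as 400^300 is too large for ordinary floating-point arithmetic, this sketch evaluates the formula on a logarithmic scale; that is an implementation choice made here, not something the formula itself requires.

```python
from math import exp, lgamma, log


def poisson_pmf(x: int, n: float) -> float:
    """P(X = x) = e^(-n) * n^x / x!, for an expected count n > 0.

    Computed as exp(-n + x*log(n) - log(x!)) to avoid overflow for large x.
    """
    return exp(-n + x * log(n) - lgamma(x + 1))


def poisson_cdf(x: int, n: float) -> float:
    """P(X <= x): probability of at most x occurrences in the interval."""
    return sum(poisson_pmf(k, n) for k in range(x + 1))


# Probability of exactly 300 accidents when 400 are expected.
p_exactly_300 = poisson_pmf(300, 400.0)

# Probability of 450 or more accidents when 500 are expected:
# the complement of "449 or fewer".
p_450_or_more = 1.0 - poisson_cdf(449, 500.0)

print(f"P(X = 300 | n = 400)  = {p_exactly_300:.3e}")
print(f"P(X >= 450 | n = 500) = {p_450_or_more:.4f}")
```

Here `lgamma(x + 1)` returns the logarithm of x!, so the whole expression stays within floating-point range even for counts in the hundreds.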

You can use the calculator below to calculate the probabilities under your own scenario using the assumptions you input. The probability density function yields the probability of exactly a specified number of occurrences in the interval or range, whereas the cumulative distribution function yields the total probability of every number of occurrences up to and including a specified number. The graphs generated are shaped in accordance with the assumptions you enter.

Assumption:

Expected number of successes per interval or range =

The Hypergeometric Distribution

The hypergeometric distribution is much like the binomial distribution, except that the probability of success is not constant from one observation to the next. The population is not left untouched after each observation; instead, each observed item is removed from the set of possibilities and not replaced. Take a deck of cards, for example. Under a hypergeometric distribution, a card drawn from the deck is not returned to the deck before the next draw. If it were, we would use the binomial distribution instead. That is the only difference.

The probability of observing x successes in a sample of size n, drawn without replacement from a population of size N that contains k items counting as successes, is represented by the probability density function (pdf) $P(X = x) = \binom{k}{x} \binom{N - k}{n - x} / \binom{N}{n}$.

The mean of such a distribution is $n k / N$ and the variance is $n k (N - k)(N - n) / (N^2 (N - 1))$. You can think of the mean of this distribution as the expected number of successes in your sample of size n.
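Continuing the card-deck example, here is a minimal Python sketch of the formula. The specific question (the chance of exactly one ace in a five-card hand from a standard 52-card deck) and the function name are assumptions chosen for illustration. The sketch also checks the mean and variance formulas against values computed directly from the probabilities.

```python
from math import comb


def hypergeom_pmf(x: int, N: int, k: int, n: int) -> float:
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N items, k of which count as successes."""
    # Outside this range the outcome is impossible.
    if x < max(0, n - (N - k)) or x > min(n, k):
        return 0.0
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


# Assumed example: 52 cards, 4 aces, 5 cards drawn.
N, k, n = 52, 4, 5

# Probability of exactly one ace in the hand.
p_one_ace = hypergeom_pmf(1, N, k, n)

# Mean and variance from the closed-form expressions.
mean_formula = n * k / N
var_formula = n * k * (N - k) * (N - n) / (N**2 * (N - 1))

# The same quantities computed directly from the probabilities.
probs = [hypergeom_pmf(x, N, k, n) for x in range(n + 1)]
mean_from_pmf = sum(x * q for x, q in enumerate(probs))
var_from_pmf = sum((x - mean_from_pmf) ** 2 * q for x, q in enumerate(probs))

print(f"P(exactly one ace) = {p_one_ace:.6f}")
print(f"mean: formula {mean_formula:.6f}, from pmf {mean_from_pmf:.6f}")
print(f"variance: formula {var_formula:.6f}, from pmf {var_from_pmf:.6f}")
```

The range check at the top of `hypergeom_pmf` reflects the fact that a sample cannot contain more successes than the population holds, nor more failures than the population holds.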

You can use the calculator below to calculate the probabilities under your own scenario using the assumptions you input. The probability density function yields the probability of exactly a specified number of successes in your sample, whereas the cumulative distribution function yields the total probability of every number of successes in your sample up to and including a specified number. The graphs generated are shaped in accordance with the assumptions you enter.

Assumptions:

Total population size =

Total number of possibilities from the population that could result in a success =

Total number sampled =