In this section, we present some specific types of discrete random variables and derive their probability distributions, expectations and variances.

### The binomial probability distribution

Suppose a random experiment has a binary outcome (1 or 0, success or failure, etc.). Let $X$ be a random variable that indicates the result of this random experiment. The PMF of $X$ can be written as

$$
P(X = x) = p^x (1-p)^{1-x}, \quad x \in \{0, 1\},
$$

where $0 \le p \le 1$. Then we say $X \sim \text{Bern}(p)$, meaning $X$ is a **Bernoulli random variable** with parameter $p$, or $X$ is drawn from a Bernoulli distribution with parameter $p$.

A **binomial** experiment is a random experiment that contains $n$ independent and identical Bernoulli experiments, e.g. tossing a coin $n$ times. Denote $X$ as the number of successes observed in the $n$ trials. Then $X \sim \text{Bin}(n, p)$, where $p$ is the probability of success in each trial. $X$ can take any integer value from $0$ to $n$. The PMF is

$$
P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, \dots, n.
$$

The animation for this section (generated by the R code below) visualizes the binomial distribution with $n = 10$ for different values of $p$. We can also see that the Bernoulli distribution is a special case of the binomial distribution where $n = 1$.

R code for the animation:

```r
library(tidyverse)
library(ggpubr)
library(gganimate)

# Binomial PMF for n = 10 under five success probabilities
dat <- data.frame(
  x = rep(0:10, times = 5),
  Param = rep(c(0.1, 0.3, 0.5, 0.7, 0.9), each = 11)
) %>%
  mutate(
    # map2_dbl() returns a numeric column (map2() would return a list column)
    freq = map2_dbl(x, Param, ~ dbinom(.x, 10, .y)),
    x = factor(x)
  )

# One animation state per value of p
ggbarplot(dat, "x", "freq") +
  transition_states(Param) +
  ease_aes("cubic-in-out") +
  ggtitle("p = {closest_state}") +
  labs(x = "X ~ Bin(10, {closest_state})", y = "Freq")
```

#### Properties

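We first want to check that $\sum_x P(X = x) = 1$, since it’s a probability distribution. Then we derive the expectation and variance of the two distributions. Doing so for the Bernoulli distribution is straightforward:

$$
\begin{aligned}
\sum_{x=0}^{1} P(X = x) &= (1-p) + p = 1, \\
E[X] &= 0 \cdot (1-p) + 1 \cdot p = p, \\
\text{Var}(X) &= E[X^2] - (E[X])^2 = p - p^2 = p(1-p).
\end{aligned}
$$

For the binomial distribution, we know that

$$
\sum_{x=0}^{n} P(X = x) = \sum_{x=0}^{n} \binom{n}{x} p^x (1-p)^{n-x}.
$$

The **binomial expansion** is given by

$$
(a + b)^n = \sum_{x=0}^{n} \binom{n}{x} a^x b^{n-x}.
$$

By using the binomial expansion with $a = p$ and $b = 1 - p$, we have

$$
\sum_{x=0}^{n} \binom{n}{x} p^x (1-p)^{n-x} = \big(p + (1-p)\big)^n = 1.
$$

The **expected value** of a binomial random variable is

$$
E[X] = np.
$$

**Lemma:** Suppose we have $X \sim \text{Bin}(n, p)$ and $Y \sim \text{Bin}(n-1, p)$. We can show that

$$
E[X^k] = np \, E\big[(Y+1)^{k-1}\big].
$$

Indeed, using the identity $x \binom{n}{x} = n \binom{n-1}{x-1}$ and substituting $j = x - 1$,

$$
\begin{aligned}
E[X^k] &= \sum_{x=1}^{n} x^{k-1} \, x \binom{n}{x} p^x (1-p)^{n-x} \\
&= np \sum_{x=1}^{n} x^{k-1} \binom{n-1}{x-1} p^{x-1} (1-p)^{n-x} \\
&= np \sum_{j=0}^{n-1} (j+1)^{k-1} \binom{n-1}{j} p^{j} (1-p)^{n-1-j} \\
&= np \, E\big[(Y+1)^{k-1}\big].
\end{aligned}
$$

Since this equation holds, we can get $E[X] = np$ by setting $k = 1$.

Let $k = 2$, we have

$$
E[X^2] = np \, E[Y + 1] = np\big((n-1)p + 1\big)
$$

because $E[Y] = (n-1)p$. So the variance of $X$ is

$$
\text{Var}(X) = E[X^2] - (E[X])^2 = np\big((n-1)p + 1\big) - (np)^2 = np(1-p).
$$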

The binomial distribution has many applications, such as modeling defectives in quality control, or anything else that can be put into a success-failure setting.
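As a quick numerical sanity check of the binomial properties (total probability 1, $E[X] = np$, $\text{Var}(X) = np(1-p)$), the following sketch sums the PMF directly. The values `n = 10` and `p = 0.3` are arbitrary choices for illustration, and `dbinom()` is base R’s binomial PMF.

```r
# Arbitrary illustrative parameters (not taken from the text)
n <- 10
p <- 0.3

x   <- 0:n
pmf <- dbinom(x, size = n, prob = p)   # P(X = x) for x = 0, ..., n

total  <- sum(pmf)                     # should be 1
mean_x <- sum(x * pmf)                 # should equal n * p
var_x  <- sum(x^2 * pmf) - mean_x^2    # should equal n * p * (1 - p)

all.equal(total, 1)
all.equal(mean_x, n * p)
all.equal(var_x, n * p * (1 - p))
```

Each `all.equal()` call compares a directly computed quantity with its closed form up to floating-point tolerance.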

### The geometric probability distribution

Suppose we have a sequence of independent and identical Bernoulli trials with probability of success $p$. We can define $Y$ as a random variable describing the number of the trial on which the first success occurs. For example, if we keep tossing a coin, $Y$ is the number of the toss on which a head first appears.

$Y$ is said to have a **geometric probability distribution** if and only if

$$
P(Y = y) = (1-p)^{y-1} p,
$$

where $y = 1, 2, 3, \dots$ and $0 < p \le 1$.
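A minimal sketch of this PMF in R, assuming the "trial number of the first success" definition $P(Y = y) = (1-p)^{y-1} p$. The helper `geom_pmf()` and the value `p = 0.2` are hypothetical names and values introduced only for illustration. Note that base R’s `dgeom()` uses a different convention: it counts the failures before the first success, so it has to be shifted by one.

```r
# PMF of the geometric distribution with
# Y = number of the trial on which the first success occurs (y = 1, 2, ...)
geom_pmf <- function(y, p) (1 - p)^(y - 1) * p

p <- 0.2          # arbitrary illustrative value
y <- 1:10

# dgeom() counts failures before the first success,
# so it is evaluated at y - 1 to match the definition used here.
all.equal(geom_pmf(y, p), dgeom(y - 1, prob = p))
```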

#### Properties

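As required for any valid discrete probability distribution, the probabilities should add up to 1.

$$
\sum_{y=1}^{\infty} P(Y = y) = p \sum_{y=1}^{\infty} (1-p)^{y-1} = p \cdot \frac{1}{1 - (1-p)} = 1.
$$

Here we used the sum of the **geometric sequence**

$$
\sum_{k=0}^{\infty} r^k = \frac{1}{1-r}, \quad |r| < 1.
$$

For the **expectation** of $Y$, we have

$$
\begin{aligned}
E[Y] &= \sum_{y=1}^{\infty} y (1-p)^{y-1} p \\
&= \sum_{y=1}^{\infty} (y-1)(1-p)^{y-1} p + \sum_{y=1}^{\infty} (1-p)^{y-1} p \\
&= (1-p) \sum_{j=1}^{\infty} j (1-p)^{j-1} p + 1 \\
&= (1-p) E[Y] + 1.
\end{aligned}
$$

Rearranging terms would give us

$$
E[Y] = \frac{1}{p}.
$$

With the expectation of $Y$ known, it’s easy to get the **variance** of $Y$ once we find $E[Y^2]$:

$$
\begin{aligned}
E[Y^2] &= \sum_{y=1}^{\infty} \big((y-1) + 1\big)^2 (1-p)^{y-1} p \\
&= (1-p) E[Y^2] + 2(1-p) E[Y] + 1,
\end{aligned}
$$

so that $p \, E[Y^2] = \dfrac{2(1-p)}{p} + 1$ and $E[Y^2] = \dfrac{2-p}{p^2}$. The variance is thus

$$
\text{Var}(Y) = E[Y^2] - (E[Y])^2 = \frac{2-p}{p^2} - \frac{1}{p^2} = \frac{1-p}{p^2}.
$$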

The geometric probability distribution is often used to model waiting times. For example, if the probability of an engine malfunction during any randomly observed time interval is $p$, then the number of intervals until the first malfunction can be modeled using the geometric distribution.
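The geometric properties (total probability 1, $E[Y] = 1/p$, $\text{Var}(Y) = (1-p)/p^2$) can be checked numerically with a truncated sum. This is only a sketch: `p = 0.2` and the truncation point `1000` are arbitrary assumptions, chosen so that the neglected tail is far below floating-point precision.

```r
p <- 0.2                      # arbitrary illustrative value
y <- 1:1000                   # truncation point; the tail beyond it is negligible
pmf <- (1 - p)^(y - 1) * p    # P(Y = y), first success on trial y

total  <- sum(pmf)
mean_y <- sum(y * pmf)
var_y  <- sum(y^2 * pmf) - mean_y^2

c(total = total, mean = mean_y, variance = var_y)
c(1, 1 / p, (1 - p) / p^2)    # closed-form values for comparison
```

The two printed vectors should agree up to floating-point error.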

### The Poisson probability distribution

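The Poisson random variable has a range of infinite size. It takes values in the set $\{0, 1, 2, \dots\}$. We’ll learn about this random variable through an example.

Suppose we’re performing quality control for a mobile phone company. Each phone made has a **small chance** of being defective. The average number of defective phones produced per day is $\lambda$. Find the probability of producing $k$ defective phones on a usual day.

First, we may assume that the production of each phone is a random variable with two outcomes, 0 or 1: $X_i \sim \text{Bern}(p)$, where $X_i = 1$ means that the $i$-th phone is defective, which happens with probability $p$. In addition, we assume that there are $n$ phones produced per day, and that the phones are produced independently and share the same defect probability $p$. Define

$$
X = \sum_{i=1}^{n} X_i \sim \text{Bin}(n, p).
$$

Then the probability of producing $k$ defective phones can be calculated using the probability mass function of $X$ at $k$:

$$
P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}.
$$

Our problem now is that we don’t know $n$ and $p$ explicitly. But what we do know is that on average there are $\lambda$ defective phones produced per day, which gives

$$
E[X] = np = \lambda \quad \Longrightarrow \quad p = \frac{\lambda}{n}.
$$

Now we can replace the variable $p$ in the equation with $\lambda / n$, and write out the probability mass function as

$$
P(X = k) = \binom{n}{k} \left(\frac{\lambda}{n}\right)^k \left(1 - \frac{\lambda}{n}\right)^{n-k}.
$$

Though we don’t know $n$ exactly, it’s reasonable to assume that it’s a very large number, so we can let $n \to \infty$ to study it in an asymptotic way.

$$
\begin{aligned}
\lim_{n \to \infty} P(X = k) &= \lim_{n \to \infty} \frac{n!}{k!\,(n-k)!} \cdot \frac{\lambda^k}{n^k} \left(1 - \frac{\lambda}{n}\right)^{n} \left(1 - \frac{\lambda}{n}\right)^{-k} \\
&= \frac{\lambda^k}{k!} \lim_{n \to \infty} \frac{n(n-1)\cdots(n-k+1)}{n^k} \left(1 - \frac{\lambda}{n}\right)^{n} \left(1 - \frac{\lambda}{n}\right)^{-k},
\end{aligned}
$$

where the limit is found using the following limits

$$
\lim_{n \to \infty} \frac{n(n-1)\cdots(n-k+1)}{n^k} = 1, \qquad
\lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{n} = e^{-\lambda}, \qquad
\lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{-k} = 1.
$$

So as $n \to \infty$,

$$
P(X = k) \to \frac{\lambda^k e^{-\lambda}}{k!}.
$$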
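The limiting argument can be illustrated numerically: with $p = \lambda / n$, the binomial PMF should approach $\lambda^k e^{-\lambda} / k!$ as $n$ grows. In this sketch, `lambda = 3`, the range of `k`, the values of `n`, and the helper name `max_gap()` are all arbitrary assumptions made for illustration.

```r
lambda <- 3       # arbitrary illustrative average number of defective phones per day
k <- 0:10         # numbers of defective phones at which the PMFs are compared

# Largest absolute gap between Bin(n, lambda / n) and the limiting PMF
max_gap <- function(n) {
  binom_pmf <- dbinom(k, size = n, prob = lambda / n)
  limit_pmf <- lambda^k * exp(-lambda) / factorial(k)
  max(abs(binom_pmf - limit_pmf))
}

n_values <- c(10, 100, 1000, 10000)
gaps <- sapply(n_values, max_gap)
data.frame(n = n_values, max_abs_gap = gaps)
```

The gap should shrink as `n` increases, which is the practical content of the limit.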

Formally speaking, for a discrete random variable $X$ whose probability mass function satisfies

$$
P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \quad k = 0, 1, 2, \dots,
$$

we say $X$ is a **Poisson** random variable with parameter $\lambda > 0$, or $X \sim \text{Pois}(\lambda)$. The Poisson distribution provides a good model for the probability distribution of the number of rare events that occur in space, time or any other dimension, where $\lambda$ is the average number of events.

#### Properties

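As always, we first check if the total probability is 1.

$$
\sum_{k=0}^{\infty} P(X = k) = e^{-\lambda} \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} = e^{-\lambda} e^{\lambda} = 1,
$$

where we’ve used the Taylor series

$$
e^{\lambda} = \sum_{k=0}^{\infty} \frac{\lambda^k}{k!}.
$$

Next, we prove that the **expectation** of $X$ is $\lambda$.

$$
E[X] = \sum_{k=0}^{\infty} k \, \frac{\lambda^k e^{-\lambda}}{k!} = \lambda e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!} = \lambda e^{-\lambda} e^{\lambda} = \lambda.
$$

The **variance** is also $\lambda$. This doesn’t happen often, as the expectation and the variance generally have different units, but it’s not a problem for the Poisson distribution as it is used to model counts, which are unitless.

### The negative binomial distribution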

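Similar to the geometric distribution, suppose we have a sequence of i.i.d. Bernoulli trials with the same probability of success $p$. We’re interested in the number of the trial on which the $r$-th success occurs ($r = 1, 2, 3, \dots$). Denote this trial number by $Y$.

Let $r$ and $y$ be fixed values, and consider events $A$: {the first $y-1$ trials contain $r-1$ successes} and $B$: {trial $y$ results in a success}. We’ve assumed that $P(B) = p$ and that $A$ and $B$ are independent, so

$$
P(Y = y) = P(A \cap B) = P(A)\,P(B).
$$

Using results from the binomial distribution, we can easily find

$$
P(A) = \binom{y-1}{r-1} p^{r-1} (1-p)^{y-r}, \quad y \ge r.
$$

A random variable $Y$ is said to have a **negative binomial probability distribution** if and only if

$$
P(Y = y) = \binom{y-1}{r-1} p^{r} (1-p)^{y-r}, \quad y = r, r+1, r+2, \dots,
$$

where $0 < p \le 1$, and we can denote it as $Y \sim \text{NB}(r, p)$. Here $Y - r$ is the random number of failures. Its name originates from the fact that

$$
\binom{y-1}{r-1} = \binom{y-1}{y-r} = (-1)^{y-r} \binom{-r}{y-r},
$$

where the last binomial coefficient has a negative upper index.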
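The "negative" in the name is usually traced to the identity $\binom{y-1}{r-1} = (-1)^{y-r}\binom{-r}{y-r}$, which involves a binomial coefficient with a negative upper index. A small check in R, relying on the fact that base R’s `choose()` accepts a negative first argument; the value `r = 3` and the range of `y` are arbitrary illustrative choices.

```r
r <- 3            # arbitrary illustrative number of successes
y <- r:(r + 8)    # trial numbers on which the r-th success could occur
k <- y - r        # corresponding numbers of failures

lhs <- choose(y - 1, r - 1)
# choose() handles a negative first argument as a generalized binomial coefficient
rhs <- (-1)^k * choose(-r, k)

all.equal(lhs, rhs)
```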

In the field of bioinformatics, the NB distribution is very frequently used to model RNA-Seq data. Simply put, in an RNA-Seq experiment we map the sequencing reads to a reference genome and count the number of reads within each gene. There tend to be millions of reads in total, but the number of reads on each gene is usually within the thousands, with great variability.

The Poisson distribution was used to model this in the beginning, but it assumes that the mean and variance are the same, which is not the case in RNA-Seq. The variance of the counts is in general much greater than the mean, especially for highly expressed genes. The negative binomial distribution’s other formulation, the gamma-Poisson mixture distribution, has a dispersion parameter that accommodates this extra variance.
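To make the overdispersion point concrete, here is a small simulation sketch. The numbers are hypothetical and not real RNA-Seq data: `mu = 100` stands in for the mean read count of one gene and `size = 2` for a dispersion parameter. `rnbinom()` with the `mu` and `size` arguments uses the mean/dispersion parameterization of the negative binomial.

```r
set.seed(42)
n_draws <- 10000
mu   <- 100      # hypothetical mean read count for one gene
size <- 2        # hypothetical dispersion parameter (smaller = more overdispersed)

pois_counts <- rpois(n_draws, lambda = mu)
# Negative binomial draws with the same mean but extra variance
nb_counts <- rnbinom(n_draws, size = size, mu = mu)

c(mean = mean(pois_counts), variance = var(pois_counts))
c(mean = mean(nb_counts), variance = var(nb_counts))
```

Under these assumptions the Poisson sample has variance close to its mean, while the negative binomial sample has a variance far larger than its mean.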

#### Properties

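First we want to show that the probabilities add up to 1. Write $q = 1 - p$. We first play with the binomial part:

$$
\binom{y-1}{r-1} = \frac{(y-1)!}{(r-1)!\,(y-r)!} = \frac{(y-1)(y-2)\cdots(y-r+1)}{(r-1)!}.
$$

By constructing the function

$$
f(q) = \sum_{y=1}^{\infty} q^{y-1} = \frac{1}{1-q}, \quad |q| < 1,
$$

and differentiating it $r - 1$ times term by term, we can express the sum of binomial coefficients as

$$
\frac{d^{\,r-1}}{dq^{\,r-1}} f(q) = \sum_{y=r}^{\infty} (y-1)(y-2)\cdots(y-r+1)\, q^{y-r} = (r-1)! \sum_{y=r}^{\infty} \binom{y-1}{r-1} q^{y-r}.
$$

When $r = 1$, no differentiation is needed and, after multiplying by $p$, the equation is reduced to the sum of probabilities of a geometric distribution

$$
\sum_{y=1}^{\infty} q^{y-1} p = \frac{p}{1-q} = 1.
$$

So we can show that

$$
\sum_{y=r}^{\infty} \binom{y-1}{r-1} q^{y-r} = \frac{1}{(r-1)!} \cdot \frac{d^{\,r-1}}{dq^{\,r-1}} \frac{1}{1-q} = \frac{1}{(1-q)^{r}}.
$$

If we set $q = 1 - p$ and multiply both sides by $p^r$, we have

$$
\sum_{y=r}^{\infty} \binom{y-1}{r-1} p^{r} (1-p)^{y-r} = \frac{p^r}{p^r} = 1.
$$

We can now use this property to calculate the **expectation**. Using $y \binom{y-1}{r-1} = r \binom{y}{r}$,

$$
E[Y] = \sum_{y=r}^{\infty} y \binom{y-1}{r-1} p^{r} (1-p)^{y-r} = \frac{r}{p} \sum_{y=r}^{\infty} \binom{y}{r} p^{r+1} (1-p)^{y-r}.
$$

Let $z = y + 1$ and $s = r + 1$, we have

$$
E[Y] = \frac{r}{p} \sum_{z=s}^{\infty} \binom{z-1}{s-1} p^{s} (1-p)^{z-s} = \frac{r}{p},
$$

because the remaining sum is the total probability of an $\text{NB}(s, p)$ distribution, which equals 1.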
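A numerical sketch of these negative binomial properties, assuming the "trial number of the $r$-th success" PMF $P(Y = y) = \binom{y-1}{r-1} p^r (1-p)^{y-r}$. The helper `nb_pmf()`, the values `r = 3` and `p = 0.4`, and the truncation point `500` are hypothetical choices for illustration. Base R’s `dnbinom()` counts failures before the $r$-th success, hence the shift by `r`.

```r
r <- 3            # arbitrary illustrative values
p <- 0.4

# PMF with Y = number of the trial on which the r-th success occurs
nb_pmf <- function(y, r, p) choose(y - 1, r - 1) * p^r * (1 - p)^(y - r)

y   <- r:500      # truncation point; the tail beyond it is negligible
pmf <- nb_pmf(y, r, p)

# dnbinom() is parameterized by the number of failures, y - r
all.equal(pmf, dnbinom(y - r, size = r, prob = p))

total  <- sum(pmf)
mean_y <- sum(y * pmf)
c(total = total, mean = mean_y, r_over_p = r / p)
```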

### Moments and moment generating functions

In the previous sections, we’ve shown the expected values and variances for multiple random variables. In the calculations, we often have to calculate the expected values of some power functions of the random variable, such as $E[X^2]$ for the variances.

In general, it would be of interest to calculate $E[X^k]$ for some positive integer $k$. This expectation is called the $k$-th **moment** of $X$.

#### Moment generating function

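The moment generating function can be used to systematically calculate the moments of a random variable. For a random variable $X$, its **moment generating function** is defined as

$$
M_X(t) = E\left[e^{tX}\right],
$$

where $t$ is a parameter. Note that $M_X(t)$ is not random! We call $M_X(t)$ the moment generating function of $X$ because all the moments of $X$ can be obtained by successively differentiating $M_X(t)$ and then evaluating the result at $t = 0$. For example, consider the first order derivative of $M_X(t)$.

$$
M_X'(t) = \frac{d}{dt} E\left[e^{tX}\right] = E\left[\frac{d}{dt} e^{tX}\right] = E\left[X e^{tX}\right],
$$

and at $t = 0$, we have $M_X'(0) = E[X]$. Similarly, if we take the second order derivative of $M_X(t)$,

$$
M_X''(t) = \frac{d}{dt} E\left[X e^{tX}\right] = E\left[X^2 e^{tX}\right], \qquad M_X''(0) = E[X^2].
$$

In general, we can summarize the $k$-th derivative of $M_X(t)$ as

$$
M_X^{(k)}(t) = E\left[X^k e^{tX}\right].
$$

Then we evaluate this derivative at $t = 0$, which yields

$$
M_X^{(k)}(0) = E[X^k].
$$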
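The idea of reading moments off the derivatives of $M_X(t) = E[e^{tX}]$ at $t = 0$ can be sketched numerically with finite differences. Everything here is illustrative: the helper `mgf()`, the step size `h`, and the choice of a Bernoulli variable with $p = 0.3$ (for which every moment $E[X^k]$ equals $0.3$) are assumptions, not part of the derivation.

```r
# MGF of a discrete random variable, given its support and PMF
mgf <- function(t, support, pmf) sum(exp(t * support) * pmf)

# Illustrative example: X ~ Bern(0.3), so E[X] = E[X^2] = 0.3
support <- c(0, 1)
pmf     <- c(0.7, 0.3)

h <- 1e-4   # step size for the finite differences (a tuning choice)

# Central difference approximating M'(0) = E[X]
m1 <- (mgf(h, support, pmf) - mgf(-h, support, pmf)) / (2 * h)

# Second central difference approximating M''(0) = E[X^2]
m2 <- (mgf(h, support, pmf) - 2 * mgf(0, support, pmf) +
         mgf(-h, support, pmf)) / h^2

c(first_moment = m1, second_moment = m2)
```

Finite differences are only approximate; for distributions with a closed-form MGF, exact differentiation is preferable.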

So for a given random variable, if we know its moment generating function, we can take advantage of this property to calculate all the moments of this random variable. The correspondence between a distribution and its **MGF** is one-to-one whenever the MGF exists in a neighborhood of $t = 0$, so the MGF acts as an ID for a distribution.

#### Binomial distribution

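Find the MGF for $X \sim \text{Bin}(n, p)$.

$$
M_X(t) = E\left[e^{tX}\right] = \sum_{x=0}^{n} e^{tx} \binom{n}{x} p^x (1-p)^{n-x} = \sum_{x=0}^{n} \binom{n}{x} \left(p e^t\right)^x (1-p)^{n-x} = \left(p e^t + 1 - p\right)^n.
$$

Taking the first derivative, we have

$$
M_X'(t) = n \left(p e^t + 1 - p\right)^{n-1} p e^t, \qquad E[X] = M_X'(0) = np.
$$

Similarly we can get the variance by taking the second derivative:

$$
\begin{aligned}
M_X''(t) &= n(n-1) \left(p e^t + 1 - p\right)^{n-2} \left(p e^t\right)^2 + n \left(p e^t + 1 - p\right)^{n-1} p e^t, \\
E[X^2] &= M_X''(0) = n(n-1)p^2 + np, \\
\text{Var}(X) &= E[X^2] - (E[X])^2 = n(n-1)p^2 + np - n^2 p^2 = np(1-p).
\end{aligned}
$$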
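The differentiation can also be delegated to R’s symbolic derivative function `D()`. This sketch starts from the standard closed form of the binomial MGF, $(p e^t + 1 - p)^n$, and evaluates its first two derivatives at $t = 0$; the parameter values `n = 10` and `p = 0.3` are arbitrary illustrative choices.

```r
# Binomial MGF as an unevaluated R expression in t
mgf_expr <- quote((p * exp(t) + 1 - p)^n)

d1 <- D(mgf_expr, "t")   # first derivative with respect to t
d2 <- D(d1, "t")         # second derivative with respect to t

# Arbitrary illustrative parameters; moments are the derivatives at t = 0
vals <- list(n = 10, p = 0.3, t = 0)
m1 <- eval(d1, vals)     # E[X]
m2 <- eval(d2, vals)     # E[X^2]

c(mean = m1, variance = m2 - m1^2)
c(10 * 0.3, 10 * 0.3 * 0.7)   # np and np(1 - p) for comparison
```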

#### Poisson distribution

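Find $E[X]$ and $\text{Var}(X)$ for $X \sim \text{Pois}(\lambda)$.

$$
M_X(t) = E\left[e^{tX}\right] = \sum_{k=0}^{\infty} e^{tk} \frac{\lambda^k e^{-\lambda}}{k!} = e^{-\lambda} \sum_{k=0}^{\infty} \frac{\left(\lambda e^t\right)^k}{k!}.
$$

Using $e^x = \sum_{k=0}^{\infty} \dfrac{x^k}{k!}$, we have

$$
M_X(t) = e^{-\lambda} e^{\lambda e^t} = e^{\lambda (e^t - 1)}.
$$

Taking the first derivative of the MGF yields

$$
M_X'(t) = \lambda e^t e^{\lambda (e^t - 1)}, \qquad E[X] = M_X'(0) = \lambda.
$$

And the second derivative:

$$
M_X''(t) = \lambda e^t \left(1 + \lambda e^t\right) e^{\lambda (e^t - 1)}, \qquad E[X^2] = M_X''(0) = \lambda + \lambda^2.
$$

The variance can be found by

$$
\text{Var}(X) = E[X^2] - (E[X])^2 = \lambda + \lambda^2 - \lambda^2 = \lambda.
$$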
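As a sketch of how the Poisson result can be checked, the code below compares the standard closed form $e^{\lambda(e^t - 1)}$ with a direct (truncated) evaluation of $E[e^{tX}]$, and then recovers the mean and variance from symbolic derivatives at $t = 0$. The rate `lambda = 4`, the grid of `t` values, and the truncation point `200` are arbitrary assumptions.

```r
lambda <- 4                       # arbitrary illustrative rate
t_grid <- c(-1, -0.5, 0, 0.5, 1)

# E[exp(tX)] computed directly from the PMF, truncating the infinite sum
k <- 0:200
mgf_series <- sapply(t_grid, function(t) sum(exp(t * k) * dpois(k, lambda)))

# Closed form of the Poisson MGF
mgf_closed <- exp(lambda * (exp(t_grid) - 1))
all.equal(mgf_series, mgf_closed)

# Moments from symbolic derivatives of the closed form at t = 0
d1 <- D(quote(exp(lambda * (exp(t) - 1))), "t")
d2 <- D(d1, "t")
m1 <- eval(d1, list(lambda = lambda, t = 0))   # E[X]
m2 <- eval(d2, list(lambda = lambda, t = 0))   # E[X^2]

c(mean = m1, variance = m2 - m1^2)   # both should equal lambda
```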

And that’s pretty much it for discrete random variables. Obviously we haven’t covered everything, but this should be good enough for now. Next, we’re going to talk about .