In this chapter, we restrict ourselves to studying a special type of random variable: discrete random variables.

When an experiment is performed, we are often interested in some functions of the outcomes, rather than the outcomes themselves. For example, when we toss two six-sided dice, we may care about the sum of the two dice. These real-valued functions defined on the sample space are known as **random variables**.

### Probability mass function

A random variable is said to be **discrete** if it can take a finite or countably infinite number of distinct values. In practice, we usually use an uppercase letter, say $X$, to denote a random variable, and a lowercase letter, say $x$, to denote a possible value of this random variable. When we say $X = x$, we're referring to the *set of outcomes* on the sample space such that $X(\omega) = x$ holds. As an example,

$$\{X = x\} = \{\omega \in S : X(\omega) = x\}.$$

Then we can assign a probability to each of these events. The **probability mass function (PMF)** of a discrete random variable $X$ at a given value $x$ is denoted

$$P(X = x).$$

The **probability distribution** of a discrete random variable $X$ is the collection of its probability mass functions over all its possible values. In other words, the collection of $P(X = x)$ for all possible values $x$.

#### Coin flip example

Consider an experiment of tossing two fair coins. Denote $X$ as the number of heads. Find the probability distribution of $X$.

Our sample space is

$$S = \{HH, HT, TH, TT\}.$$

For the random variable $X$: the number of heads, if we view $X$ as a function of the outcomes in $S$,

$$X(HH) = 2, \quad X(HT) = X(TH) = 1, \quad X(TT) = 0,$$

so $X$ can take three possible values: 0, 1 and 2. The probability mass functions are

$$P(X = 0) = P(\{TT\}) = \frac{1}{4}, \quad P(X = 1) = P(\{HT, TH\}) = \frac{1}{2}, \quad P(X = 2) = P(\{HH\}) = \frac{1}{4}.$$

So the probability distribution of $X$ is given by

| $x$        | 0     | 1     | 2     |
|------------|-------|-------|-------|
| $P(X = x)$ | $1/4$ | $1/2$ | $1/4$ |
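As a quick sanity check, we can enumerate the sample space in R and tabulate $X$ directly (a minimal sketch; the variable names are illustrative):

```r
# The four equally likely outcomes of tossing two fair coins
outcomes <- c("HH", "HT", "TH", "TT")
# X maps each outcome to its number of heads
X <- sapply(strsplit(outcomes, ""), function(s) sum(s == "H"))
# Every outcome has probability 1/4, so tabulating X gives the PMF
pmf <- table(X) / length(outcomes)
pmf  # P(X = 0) = 0.25, P(X = 1) = 0.50, P(X = 2) = 0.25
```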

Often a bar plot is used to show the probability distribution of $X$, with the possible values of $X$ on the x-axis, and $P(X = x)$ on the y-axis.

```r
library(ggpubr)

dat <- data.frame(
  X = c(0, 1, 2),
  P = c(0.25, 0.5, 0.25)
)
ggbarplot(dat, "X", "P", ylab = "P(X)", width = 0.5)
```

#### Properties

The PMF must satisfy certain properties. Suppose $X$ is a discrete random variable with possible values $x_1, x_2, \ldots$.

- $P(X = x_i) > 0$ for $i = 1, 2, \ldots$ and $P(X = x) = 0$ for all other values $x$, which is to say the probability of each possible value must be greater than 0.

- $\sum_{i} P(X = x_i) = 1$.

- For all the possible outcomes of a random variable where $x_i \neq x_j$, $P(\{X = x_i\} \cap \{X = x_j\}) = 0$.

- For all the possible outcomes we also have $P(\{X = x_i\} \cup \{X = x_j\}) = P(X = x_i) + P(X = x_j)$.

For properties 3 and 4, we can think of each of the events $\{X = x_i\}$ as a simple event for the sample space $S$.
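We can verify these properties numerically for the two-coin example (a small sketch):

```r
# PMF of X = number of heads in two fair coin flips
p <- c(`0` = 0.25, `1` = 0.50, `2` = 0.25)
# Property 1: each possible value has positive probability
all(p > 0)        # TRUE
# Property 2: the probabilities sum to 1
sum(p)            # 1
# Properties 3 and 4: {X = 0} and {X = 1} are disjoint simple events,
# so P(X = 0 or X = 1) is just the sum of the two probabilities
unname(p["0"] + p["1"])  # 0.75
```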

### Cumulative distribution function

Besides the PMF, which gives us the probability of one possible value of a random variable, we may want to calculate the probability for multiple values of a random variable. In the coin flip example, we may ask for the probability of getting at most one head, $P(X \le 1)$.

The **cumulative distribution function (CDF)** is defined as

$$F(x) = P(X \le x) = \sum_{x_i \le x} P(X = x_i)$$

in the discrete case. In the example above,

$$P(X \le 1) = P(X = 0) + P(X = 1) = \frac{1}{4} + \frac{1}{2} = \frac{3}{4}.$$

In addition, we can write out the whole cumulative distribution function for all possible values of $x$:

$$F(x) = \begin{cases} 0, & x < 0 \\ \frac{1}{4}, & 0 \le x < 1 \\ \frac{3}{4}, & 1 \le x < 2 \\ 1, & x \ge 2. \end{cases}$$
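In R, the same piecewise CDF can be built from cumulative sums of the PMF; `stepfun()` then gives a right-continuous step function we can evaluate anywhere (a minimal sketch):

```r
x <- c(0, 1, 2)
p <- c(0.25, 0.50, 0.25)
# F(x) = P(X <= x) at the jump points: cumulative sums of the PMF
F_vals <- cumsum(p)          # 0.25 0.75 1.00
# Right-continuous step function: 0 before the first jump, then F_vals
cdf <- stepfun(x, c(0, F_vals))
cdf(1.5)  # 0.75, since P(X <= 1.5) = P(X <= 1)
```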

The figure for the CDF of a discrete random variable is a non-decreasing step function of $x$. Given a data set in R, we can use the `geom_step()` function to visualize the CDF, or `stat_ecdf()` for the empirical cumulative distribution function (ECDF). Visualizing the true CDF when it is known involves a bit more manual labor. The following R code generates the visualization of the CDF.

```r
library(ggplot2)
library(dplyr)

dat <- data.frame(
  a = rep(0:2, each = 2),
  CDF = c(0, 1/4, 1/4, 3/4, 3/4, 1),
  Group = rep(c("Empty", "CDF"), 3)
)
ggplot(dat) +
  geom_point(aes(a, CDF, fill = Group), shape = 21) +
  scale_fill_manual(values = c("black", "white")) +
  geom_segment(aes(
    x = lag(a), y = lag(CDF),
    xend = a, yend = CDF, lty = Group
  )) +
  scale_linetype_manual(values = c("dashed", "solid")) +
  geom_segment(aes(x = -2, xend = 0, y = 0, yend = 0)) +
  geom_segment(aes(x = 2, xend = 4, y = 1, yend = 1)) +
  theme_minimal() +
  theme(legend.position = "none")
```

### Expected value

After we've learned about the PMF and the CDF, we can move one step further to probably one of the most important concepts in probability theory: the expectation of a random variable.

If $X$ is a discrete random variable with probability mass function $P(X)$, its **expectation** is

$$E(X) = \sum_{i} x_i P(X = x_i).$$

Intuitively, it is the long-run average value of repetitions of the same experiment the random variable represents. In our case, the expected value of a discrete random variable is the probability-weighted average of all its possible values.
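For the two-coin example, the probability-weighted average is easy to compute by hand or in R (a quick sketch):

```r
x <- c(0, 1, 2)
p <- c(0.25, 0.50, 0.25)
# E(X) = sum of x_i * P(X = x_i)
EX <- sum(x * p)
EX  # 1: on average we see one head in two flips
```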

Recall that random variables are real-valued functions that map the outcomes in a sample space to real numbers. So a real-valued function of a random variable is also a real-valued function on the same sample space, and hence is itself a random variable.

Define $Y = g(X)$ with $g$ being a real-valued function. This means that everything we can perform on $X$ can be done in a similar fashion on $Y$. For example, calculating the expected value of $Y$:

$$E(Y) = E[g(X)] = \sum_{i} g(x_i) P(X = x_i).$$

The only difference between $E(X)$ and $E[g(X)]$ is that we replace $x_i$ by $g(x_i)$ in the summation. Note how $P(X = x_i)$ wasn't changed!
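For instance, with $g(x) = x^2$ and the two-coin PMF, we replace each $x_i$ by $x_i^2$ but keep the same probabilities (a quick sketch):

```r
x <- c(0, 1, 2)
p <- c(0.25, 0.50, 0.25)
# E[g(X)] with g(x) = x^2: substitute g(x_i), keep P(X = x_i) unchanged
EgX <- sum(x^2 * p)
EgX  # 1.5
```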

Since the expected value is a linear function, given that $X$ is a random variable, $g$ and $h$ are two real-valued functions, and $a$ and $b$ are two constants,

$$E[a \, g(X) + b \, h(X)] = a \, E[g(X)] + b \, E[h(X)].$$
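We can check linearity of expectation numerically; the particular choices of $g$, $h$, $a$ and $b$ below are purely illustrative:

```r
x <- c(0, 1, 2)
p <- c(0.25, 0.50, 0.25)
g <- function(t) t^2        # illustrative choices
h <- function(t) 2 * t + 1
a <- 3
b <- -2
lhs <- sum((a * g(x) + b * h(x)) * p)         # E[a g(X) + b h(X)]
rhs <- a * sum(g(x) * p) + b * sum(h(x) * p)  # a E[g(X)] + b E[h(X)]
all.equal(lhs, rhs)  # TRUE
```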

#### Variance

The **variance** of a random variable $X$ is often denoted $\mathrm{Var}(X)$, and can be written as the expectation of a function of $X$:

$$\mathrm{Var}(X) = E\big[(X - E(X))^2\big] = E(X^2) - [E(X)]^2.$$

We'll explain the two definitions through an example. Let $X$ be a discrete random variable with three possible values $x_1, x_2, x_3$, and probability mass function $P(X = x_i)$, $i = 1, 2, 3$.

Find $E(X)$ and $\mathrm{Var}(X)$.

By definition, we have

$$E(X) = \sum_{i=1}^{3} x_i P(X = x_i).$$

Let $g(X) = (X - E(X))^2$, we have

$$\mathrm{Var}(X) = E[g(X)] = \sum_{i=1}^{3} \big(x_i - E(X)\big)^2 P(X = x_i) = E(X^2) - [E(X)]^2.$$
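As a concrete numeric illustration, we can reuse the two-coin distribution from earlier and evaluate both forms of the variance (a quick sketch):

```r
x <- c(0, 1, 2)
p <- c(0.25, 0.50, 0.25)
EX <- sum(x * p)              # E(X) = 1
# Definition 1: expectation of g(X) = (X - E(X))^2
var1 <- sum((x - EX)^2 * p)
# Definition 2: E(X^2) - [E(X)]^2
var2 <- sum(x^2 * p) - EX^2
c(var1, var2)  # both equal 0.5
```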

Next, we introduce some commonly seen discrete probability distributions.