In this chapter, we restrict ourselves to studying a special type of random variable: discrete random variables.
When an experiment is performed, we are often interested in some functions of the outcomes, rather than the outcomes themselves. For example, when we toss two six-sided dice, we may care about the sum of the two dice. These real-valued functions defined on the sample space are known as random variables.
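To make the dice example concrete, the distribution of the sum can be tabulated by enumerating all 36 equally likely outcomes (a small sketch, not part of the original text):

```r
# All 36 equally likely outcomes of two six-sided dice, summed
sums <- outer(1:6, 1:6, "+")

# Relative frequency of each possible sum, i.e. P(sum = s)
prop.table(table(sums))
```

For example, the sum 7 occurs in 6 of the 36 outcomes, so its probability is $6/36 = 1/6$.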
A random variable is said to be discrete if it can take a finite or countably infinite number of distinct values. In practice, we usually use an uppercase letter, say $X$, to denote a random variable, and a lowercase letter, say $x$, to denote a possible value of this random variable. When we say $X = x$, we're referring to the set of outcomes $\omega$ in the sample space such that $X(\omega) = x$ holds. As an example,

$$\{X = x\} = \{\omega \in \Omega : X(\omega) = x\}.$$
Then we can assign a probability to each of these events. The probability mass function (PMF) of a discrete random variable $X$ at a given value $x$ is denoted

$$p(x) = P(X = x).$$
The probability distribution of a discrete random variable $X$ is the collection of its probability mass functions over all its possible values. In other words, it is the collection of $P(X = x)$ for all possible values $x$.
Consider an experiment of tossing two fair coins. Denote by $X$ the number of heads. Find the probability distribution of $X$.
Our sample space is

$$\Omega = \{HH, HT, TH, TT\}.$$
For the random variable $X$: the number of heads, if we view $X$ as a function of the outcome $\omega \in \Omega$,

$$X(HH) = 2, \quad X(HT) = X(TH) = 1, \quad X(TT) = 0.$$
$X$ can take three possible values: 0, 1 and 2. The probability mass functions are

$$P(X = 0) = \frac{1}{4}, \quad P(X = 1) = \frac{1}{2}, \quad P(X = 2) = \frac{1}{4}.$$
So the probability distribution of $X$ is given by

| $x$ | 0 | 1 | 2 |
|-----|-----|-----|-----|
| $P(X = x)$ | $1/4$ | $1/2$ | $1/4$ |
Often a bar plot is used to show the probability distribution of $X$, with the possible values of $X$ on the x-axis, and $P(X = x)$ on the y-axis.
```r
library(ggpubr)

dat <- data.frame(
  X = c(0, 1, 2),
  P = c(0.25, 0.5, 0.25)
)
ggbarplot(dat, "X", "P", ylab = "P(X)", width = 0.5)
```
The PMF must satisfy certain properties. Suppose $X$ is a discrete random variable with possible values $x_1, x_2, \ldots, x_n$.
- $P(X = x_i) > 0$ for $i = 1, \ldots, n$, which is to say the probability of each possible value must be greater than 0.
- $P(X = x) = 0$ for all other $x$, i.e. any $x$ that is not a possible value of $X$.
- For all the possible outcomes of a random variable, $\sum_{i=1}^{n} P(X = x_i) = 1$.
- For all the possible outcomes we also have $0 \le P(X = x_i) \le 1$.

For properties 3 and 4, we can think of each $\{X = x_i\}$ as a simple event of the sample space $\Omega$.
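These properties are easy to check numerically. A quick sketch for the two-coin example, using the PMF values derived above:

```r
# PMF of X = number of heads in two fair coin tosses
p <- c(1/4, 1/2, 1/4)

all(p > 0)    # every possible value has positive probability
sum(p)        # the probabilities sum to 1
all(p <= 1)   # each probability is at most 1
```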
Besides the PMF, which gives us the probability of one possible value of a random variable, we may want to calculate the probability over multiple values of a random variable. In the coin flip example, we may ask for the probability of getting at most one head, $P(X \le 1)$.
The cumulative distribution function (CDF) is defined as

$$F(x) = P(X \le x) = \sum_{x_i \le x} P(X = x_i)$$

in the discrete case. In the example above,

$$F(1) = P(X \le 1) = P(X = 0) + P(X = 1) = \frac{1}{4} + \frac{1}{2} = \frac{3}{4}.$$
In addition, we can write out the whole cumulative distribution function for all possible values of $x$:

$$F(x) = \begin{cases} 0, & x < 0 \\ \dfrac{1}{4}, & 0 \le x < 1 \\ \dfrac{3}{4}, & 1 \le x < 2 \\ 1, & x \ge 2 \end{cases}$$
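If we want to evaluate this CDF at arbitrary points, base R's `stepfun()` can build the step function from its jump points (a sketch for the two-coin example):

```r
# CDF of X = number of heads: jumps at 0, 1, 2;
# cumulative probabilities 0, 1/4, 3/4, 1.
# The default right = FALSE gives a right-continuous step
# function, which is what a CDF requires.
F <- stepfun(c(0, 1, 2), c(0, 1/4, 3/4, 1))

F(-1)   # 0
F(0.5)  # 0.25
F(1)    # 0.75
F(3)    # 1
```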
The figure for the CDF of a discrete random variable is a non-decreasing step function of $x$. Given a data set in R, we can use the `geom_step()` function to visualize the CDF, or `stat_ecdf()` for the empirical cumulative distribution function (ECDF). Visualizing a known, true CDF involves a bit more manual labor.
R code to generate the visualization of the CDF.
```r
library(ggplot2)
library(dplyr)

dat <- data.frame(
  a = rep(0:2, each = 2),
  CDF = c(0, 1/4, 1/4, 3/4, 3/4, 1),
  Group = rep(c("Empty", "CDF"), 3)
)

ggplot(dat) +
  geom_point(aes(a, CDF, fill = Group), shape = 21) +
  scale_fill_manual(values = c("black", "white")) +
  geom_segment(aes(
    x = lag(a), y = lag(CDF),
    xend = a, yend = CDF, lty = Group
  )) +
  scale_linetype_manual(values = c("dashed", "solid")) +
  geom_segment(aes(x = -2, xend = 0, y = 0, yend = 0)) +
  geom_segment(aes(x = 2, xend = 4, y = 1, yend = 1)) +
  theme_minimal() +
  theme(legend.position = "none")
```
After we've learnt the PMF and the CDF, we can move one step further to one of the most important concepts in probability theory: the expectation of a random variable.
If $X$ is a discrete random variable with probability mass function $P(X = x)$, its expectation is

$$E(X) = \sum_{x} x P(X = x).$$
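For the two-coin example above, the expectation can be computed directly as the probability-weighted sum (a sketch):

```r
# X = number of heads in two fair coin tosses
x <- c(0, 1, 2)
p <- c(1/4, 1/2, 1/4)

# E(X) = sum over x of x * P(X = x)
sum(x * p)  # 1
```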
Intuitively, it is the long-run average value of repetitions of the same experiment the random variable represents. In our case, the expected value of a discrete random variable is the probability-weighted average of all possible values.
Recall that random variables are some real-valued functions that map the outcomes in a sample space to some real numbers. So a real-valued function of a random variable is also a real-valued function on the same sample space, and hence is also a random variable.
Define $Y = g(X)$ with $g$ being a real-valued function. This means that everything we can perform on $X$ can be done in a similar fashion on $Y$. For example, calculating the expected value of $Y$:

$$E(Y) = E[g(X)] = \sum_{x} g(x) P(X = x).$$
The only difference between $E(X)$ and $E[g(X)]$ is that we replace $x$ with $g(x)$ in the summation. Note how $P(X = x)$ wasn't changed!
Since the expected value is linear, given that $X$ is a random variable, $g$ and $h$ are two real-valued functions, and $a$ and $b$ are two constants,

$$E[a\,g(X) + b\,h(X)] = a\,E[g(X)] + b\,E[h(X)].$$
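Both facts can be checked numerically on the coin-toss PMF, taking for illustration $g(x) = x^2$ and $h(x) = x$ with constants $a = 2$ and $b = -3$ (a sketch; these choices are not from the text):

```r
x <- c(0, 1, 2)
p <- c(1/4, 1/2, 1/4)

# E[g(X)]: replace x with g(x) in the summation; P(X = x) is unchanged
Eg <- sum(x^2 * p)  # E(X^2) = 1.5
Eh <- sum(x * p)    # E(X)   = 1

# Linearity: E[a g(X) + b h(X)] equals a E[g(X)] + b E[h(X)]
a <- 2; b <- -3
all.equal(sum((a * x^2 + b * x) * p), a * Eg + b * Eh)  # TRUE
```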
The variance of a random variable $X$ is often denoted $\mathrm{Var}(X)$, and can be written as the expectation of a function of $X$:

$$\mathrm{Var}(X) = E\big[(X - E(X))^2\big] = E(X^2) - [E(X)]^2.$$
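For the coin-toss example, both forms of the variance give the same number (a sketch):

```r
x <- c(0, 1, 2)
p <- c(1/4, 1/2, 1/4)
EX <- sum(x * p)

# Form 1: expected squared deviation from the mean
sum((x - EX)^2 * p)   # 0.5

# Form 2: E(X^2) - [E(X)]^2
sum(x^2 * p) - EX^2   # 0.5
```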
We'll explain the two definitions through an example. Let $X$ be a discrete random variable with three possible values $x_1, x_2, x_3$ and probability mass function $P(X = x_i)$ for $i = 1, 2, 3$.

Find $E(X)$ and $\mathrm{Var}(X)$.

By definition, we have

$$E(X) = \sum_{i=1}^{3} x_i P(X = x_i).$$

Let $Y = X^2$; we have

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \sum_{i=1}^{3} x_i^2 P(X = x_i) - \left[\sum_{i=1}^{3} x_i P(X = x_i)\right]^2.$$
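As a worked numerical illustration, the same computation can be sketched in R with a hypothetical three-value PMF (the values below are made up for illustration, not taken from the text):

```r
# Hypothetical PMF: P(X = -1) = 0.2, P(X = 0) = 0.5, P(X = 2) = 0.3
x <- c(-1, 0, 2)
p <- c(0.2, 0.5, 0.3)

EX   <- sum(x * p)      # E(X)   = 0.4
EX2  <- sum(x^2 * p)    # E(X^2) = 1.4
VarX <- EX2 - EX^2      # Var(X) = 1.24
```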
Next, we introduce some commonly seen discrete probability distributions.