### The uniform distribution

In the discrete case, if the outcomes of a random experiment are equally likely, then calculating the probability of an event is straightforward. This "equally likely" idea can also be applied to continuous random variables.

For a continuous random variable $X$, we say $X$ is **uniformly distributed** over the interval $(\alpha, \beta)$ if the density function of $X$ is

$$f(x) = \begin{cases} \dfrac{1}{\beta - \alpha}, & \alpha < x < \beta \\ 0, & \text{otherwise} \end{cases}$$

In other words, the density function of $X$ is a constant over a given interval and zero otherwise. The constant must be $\frac{1}{\beta - \alpha}$ because

$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{\alpha}^{\beta} \frac{1}{\beta - \alpha}\,dx = 1$$

The plot for this function is a horizontal line at height $\frac{1}{\beta - \alpha}$ on the interval $(\alpha, \beta)$, and 0 otherwise. From $f(x)$, we can figure out the cumulative distribution function to be

$$F(a) = \int_{-\infty}^{a} f(x)\,dx = \begin{cases} 0, & a \le \alpha \\ \dfrac{a - \alpha}{\beta - \alpha}, & \alpha < a < \beta \\ 1, & a \ge \beta \end{cases}$$

Plotting $F(a)$ against $a$ would result in a line connecting $(\alpha, 0)$ and $(\beta, 1)$, and staying at 1 onward.

#### Bus example

Buses arrive at a bus stop at 15-minute intervals starting at 7:00 a.m. Suppose that a passenger arrives at this stop at a time that is uniformly distributed between 7:00 and 7:30. Find the probability that he needs to wait less than 5 minutes until the next bus arrives.

We know that buses arrive at 7:00, 7:15 and 7:30 in this time interval. Let $X$ be the number of minutes after 7:00 at which the passenger arrives, so $X \sim \text{Unif}(0, 30)$. There are two scenarios in which the passenger waits less than 5 minutes: arriving between 7:10 and 7:15, or between 7:25 and 7:30. Define events $A = \{10 < X < 15\}$ and $B = \{25 < X < 30\}$.

Since $X \sim \text{Unif}(0, 30)$, we have

$$P(A) = P(B) = \int_{10}^{15} \frac{1}{30}\,dx = \frac{5}{30} = \frac{1}{6}$$

So $P(A) + P(B) = \dfrac{1}{3}$.
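As a sanity check on the bus example, the same probability can be computed numerically. This is a small sketch of our own; the `unif_prob` helper is not part of the text:

```python
from fractions import Fraction

def unif_prob(a, b, lo, hi):
    """P(a < X < b) for X ~ Uniform(lo, hi), clipping (a, b) to the support."""
    a, b = max(a, lo), min(b, hi)
    if b <= a:
        return Fraction(0)
    return Fraction(b - a, hi - lo)

# Passenger arrival X ~ Uniform(0, 30) minutes after 7:00.
# Wait < 5 minutes iff X falls in (10, 15) or (25, 30).
p = unif_prob(10, 15, 0, 30) + unif_prob(25, 30, 0, 30)
print(p)  # 1/3
```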

#### Properties

We can get the **expectation** of $X$ directly from the definition:

$$E[X] = \int_{\alpha}^{\beta} \frac{x}{\beta - \alpha}\,dx = \frac{\beta^2 - \alpha^2}{2(\beta - \alpha)} = \frac{\alpha + \beta}{2}$$

Then we calculate $E[X^2]$ to find $Var(X)$:

$$E[X^2] = \int_{\alpha}^{\beta} \frac{x^2}{\beta - \alpha}\,dx = \frac{\beta^3 - \alpha^3}{3(\beta - \alpha)} = \frac{\alpha^2 + \alpha\beta + \beta^2}{3}$$

$$Var(X) = E[X^2] - (E[X])^2 = \frac{\alpha^2 + \alpha\beta + \beta^2}{3} - \frac{(\alpha + \beta)^2}{4} = \frac{(\beta - \alpha)^2}{12}$$

So we've found the expected value for the uniform distribution is $\frac{\alpha + \beta}{2}$, and the variance is $\frac{(\beta - \alpha)^2}{12}$. We also want to find the moment generating function for the uniform distribution:

$$M(t) = E[e^{tX}] = \int_{\alpha}^{\beta} \frac{e^{tx}}{\beta - \alpha}\,dx = \frac{e^{t\beta} - e^{t\alpha}}{t(\beta - \alpha)}, \quad t \neq 0$$
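These closed forms are easy to confirm by simulation. The sketch below is our own illustration, with arbitrary $\alpha = 2$, $\beta = 10$:

```python
import random

random.seed(42)
alpha, beta = 2.0, 10.0

# Theoretical values for Uniform(alpha, beta)
mean_theory = (alpha + beta) / 2         # (alpha + beta) / 2
var_theory = (beta - alpha) ** 2 / 12    # (beta - alpha)^2 / 12

# Monte Carlo estimates from 200,000 draws
xs = [random.uniform(alpha, beta) for _ in range(200_000)]
mean_mc = sum(xs) / len(xs)
var_mc = sum((x - mean_mc) ** 2 for x in xs) / len(xs)

print(round(mean_theory, 3), round(mean_mc, 3))  # both ≈ 6.0
print(round(var_theory, 3), round(var_mc, 3))    # both ≈ 5.333
```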

### The normal distribution

Another very important type of continuous random variable is the **normal random variable**. The probability distribution of a normal random variable is called the **normal distribution**.

The normal distribution is *the most important* probability distribution in both theory and practice. In the real world, many random phenomena obey, or at least approximate, a normal distribution: for example, the distribution of the velocities of molecules in a gas, or the number of birds in a flock. Another reason for the popularity of the normal distribution is its nice mathematical properties. We first study the simplest case of the normal distribution, the standard normal distribution.

#### Standard normal distribution

A continuous random variable $Z$ is said to follow a **standard normal distribution** if the density function of $Z$ is given by

$$f(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}, \quad -\infty < x < \infty$$

Conventionally, we use $Z$ to denote a standard normal random variable, and $N(0, 1)$ to denote the standard normal distribution.

The density function of a normal random variable is a bell-shaped curve which is symmetric about its mean (in this case, 0). The symmetry can be easily proven by checking that $f(-x) = f(x)$. We can also find the maximum of the density, $f(0) = \frac{1}{\sqrt{2\pi}}$. Next we prove this density function satisfies the properties of PDFs. The first property we want to prove is

$$\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}\,dx = 1$$

which is equivalent to proving

$$I = \int_{-\infty}^{\infty} e^{-\frac{x^2}{2}}\,dx = \sqrt{2\pi}$$

It's easier to prove $I^2 = 2\pi$, as shown below:

$$I^2 = \int_{-\infty}^{\infty} e^{-\frac{x^2}{2}}\,dx \int_{-\infty}^{\infty} e^{-\frac{y^2}{2}}\,dy = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-\frac{x^2 + y^2}{2}}\,dx\,dy$$

Intuitively, we can think of this as transforming from Cartesian coordinates to polar coordinates, $x = r\cos\theta$, $y = r\sin\theta$.

In the Cartesian coordinate system the increase in area can be thought of as the area of a small rectangle, $dx\,dy$. In the polar coordinate system, the increase in area can be approximated by a rectangle with one side being $dr$ and the other being the length of the arc when the angle increases by $d\theta$ at radius $r$, which is $r\,d\theta$. Thus, $dx\,dy = r\,dr\,d\theta$. Now we have, since $x^2 + y^2 = r^2$,

$$e^{-\frac{x^2 + y^2}{2}} = e^{-\frac{r^2}{2}}$$

So we can rewrite $I^2$ as

$$I^2 = \int_0^{2\pi} \int_0^{\infty} e^{-\frac{r^2}{2}}\,r\,dr\,d\theta = \int_0^{2\pi} \left[ -e^{-\frac{r^2}{2}} \right]_0^{\infty} d\theta = \int_0^{2\pi} 1\,d\theta = 2\pi$$

Formally speaking, the $r$ in $r\,dr\,d\theta$ is the **Jacobian determinant**, and can be found with

$$J = \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\[6pt] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{vmatrix}$$

Knowing that $x = r\cos\theta$ and $y = r\sin\theta$, we can find the determinant of the Jacobian matrix

$$J = \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix} = r\cos^2\theta + r\sin^2\theta = r$$
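The claim $I = \sqrt{2\pi}$ can also be confirmed numerically, without polar coordinates, by simple midpoint-rule quadrature. This is an illustrative sketch of our own; the truncation at $\pm 10$ is our choice (the tails beyond it are negligible):

```python
import math

# Midpoint-rule approximation of I = ∫ exp(-x²/2) dx over [-10, 10]
n, lo, hi = 100_000, -10.0, 10.0
h = (hi - lo) / n
I = sum(math.exp(-((lo + (i + 0.5) * h) ** 2) / 2) for i in range(n)) * h

# Compare with the analytical value sqrt(2*pi)
print(round(I, 6), round(math.sqrt(2 * math.pi), 6))
```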

#### Expectation and variance

We first find the **expected value** of a standard normal random variable:

$$E[Z] = \int_{-\infty}^{\infty} \frac{x}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}\,dx = \left[ -\frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}} \right]_{-\infty}^{\infty} = 0$$

We can also get this result directly from symmetry, since the integrand is an odd function. To find the **variance** of $Z$, we just need to find $E[Z^2]$ since the expectation of $Z$ is 0:

$$E[Z^2] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} x^2 e^{-\frac{x^2}{2}}\,dx$$

Recall integration by parts: $\int u\,dv = uv - \int v\,du$. In our case $u = x$ and $dv = x e^{-\frac{x^2}{2}}\,dx$, so $v = -e^{-\frac{x^2}{2}}$ and

$$E[Z^2] = \frac{1}{\sqrt{2\pi}} \left( \left[ -x e^{-\frac{x^2}{2}} \right]_{-\infty}^{\infty} + \int_{-\infty}^{\infty} e^{-\frac{x^2}{2}}\,dx \right)$$

When $x \to \infty$, $x$ goes to $\infty$ linearly, but $e^{-\frac{x^2}{2}}$ goes to 0 exponentially, so the product goes to 0. Similarly the term goes to 0 when $x \to -\infty$. So we have

$$Var(Z) = E[Z^2] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{x^2}{2}}\,dx = \frac{\sqrt{2\pi}}{\sqrt{2\pi}} = 1$$

#### Cumulative probability function

The cumulative distribution function of a standard normal random variable is also very useful:

$$\Phi(a) = P(Z \le a) = \int_{-\infty}^{a} \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}\,dx$$

Unfortunately, the integral has no analytical solution. People developed the *standard normal table* that can be used to find $\Phi(a)$ for a given $a$. The table is referred to as the $Z$ table or the $\Phi$ table. In R, we can use the `pnorm()` function.

For example, to find $\Phi(0.32)$ we first look for the row 0.3 and then find the column for 0.02. Since the standard normal distribution is symmetric around 0, $\Phi(-a)$ is equivalent to $1 - \Phi(a)$. There are also tables designed to describe $1 - \Phi(a)$ through the right tail area.
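If neither R nor a printed table is at hand, $\Phi$ can be evaluated with the error function, since $\Phi(a) = \frac{1}{2}\left(1 + \operatorname{erf}(a/\sqrt{2})\right)$. A minimal Python sketch (the name `Phi` is our own):

```python
import math

def Phi(a: float) -> float:
    """Standard normal CDF, the analogue of R's pnorm(a)."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

print(round(Phi(0.32), 4))   # 0.6255, the Z-table entry at row 0.3, column 0.02
print(round(Phi(-0.32), 4))  # 0.3745, by symmetry equal to 1 - Phi(0.32)
```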

In R, we can use the `qnorm()` function to find the corresponding quantiles.

```r
library(ggplot2)

ggplot(NULL, aes(x = c(-3, 3))) +
  stat_function(fun = pnorm, geom = "line") +
  stat_function(fun = pnorm, geom = "area",
                xlim = c(-3, qnorm(0.32)), fill = "#101010", alpha = 0.3) +
  labs(x = "a", y = expression(Phi(a))) +
  ggpubr::theme_pubr()
```

#### General normal distribution

So far we've been discussing the case $Z \sim N(0, 1)$, the standard normal distribution. Now we extend our findings to a general case: $X \sim N(\mu, \sigma^2)$. The density function is given by

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty$$

The reason we focused on the standard normal distribution is that it's easy to analyze. In addition, the properties of the general normal random variable can be derived from those of the standard normal random variable. If we consider a linear function of the standard normal random variable

$$X = \sigma Z + \mu$$

we can easily find

$$E[X] = \sigma E[Z] + \mu = \mu, \quad Var(X) = \sigma^2 Var(Z) = \sigma^2$$

In that sense, $X \sim N(\mu, \sigma^2)$.

We can use this linear function to express $Z$ with $X$ and find the cumulative distribution function and probability density function of $X$:

$$F_X(x) = P(X \le x) = P\left( Z \le \frac{x - \mu}{\sigma} \right) = \Phi\left( \frac{x - \mu}{\sigma} \right)$$

$$f_X(x) = \frac{d}{dx}\,\Phi\left( \frac{x - \mu}{\sigma} \right) = \frac{1}{\sigma}\,f\left( \frac{x - \mu}{\sigma} \right) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

Here we apply the chain rule to calculate the derivative. In general, to standardize we let

$$Z = \frac{X - \mu}{\sigma}$$

and the cumulative distribution function is

$$F_X(x) = \Phi\left( \frac{x - \mu}{\sigma} \right)$$

where the exact value can be found in the $Z$ table. For example, for any interval $(a, b)$ we have

$$P(a < X < b) = \Phi\left( \frac{b - \mu}{\sigma} \right) - \Phi\left( \frac{a - \mu}{\sigma} \right)$$
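The standardization step can be wrapped in a small helper. The sketch below is our own illustration, with hypothetical numbers $X \sim N(2, 9)$, i.e. $\mu = 2$, $\sigma = 3$:

```python
import math

def Phi(a):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

def normal_prob(a, b, mu, sigma):
    """P(a < X < b) for X ~ N(mu, sigma^2), by standardizing to Z."""
    return Phi((b - mu) / sigma) - Phi((a - mu) / sigma)

# Hypothetical example: X ~ N(2, 9), so P(-1 < X < 5) = Phi(1) - Phi(-1)
print(round(normal_prob(-1, 5, 2, 3), 4))  # 0.6827
```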

#### Moment generating function

Again, we start with the simpler case of the standard normal distribution:

$$M_Z(t) = E\left[ e^{tZ} \right] = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{tx - \frac{x^2}{2}}\,dx = e^{\frac{t^2}{2}} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{(x - t)^2}{2}}\,dx = e^{\frac{t^2}{2}}$$

where the last integral is 1 because the integrand is the density of $N(t, 1)$. We can validate this by calculating $E[Z]$ and $Var(Z)$ with the MGF. For the expectation,

$$E[Z] = M_Z'(0) = \left. t e^{\frac{t^2}{2}} \right|_{t=0} = 0$$

And for the variance,

$$E[Z^2] = M_Z''(0) = \left. \left( e^{\frac{t^2}{2}} + t^2 e^{\frac{t^2}{2}} \right) \right|_{t=0} = 1, \quad Var(Z) = E[Z^2] - (E[Z])^2 = 1$$

Deriving the MGF of a general normal distribution requires the following theorem.

**Theorem.** If $Y = aX + b$, we have

$$M_Y(t) = e^{bt} M_X(at)$$

The proof is given as follows:

$$M_Y(t) = E\left[ e^{t(aX + b)} \right] = e^{bt} E\left[ e^{atX} \right] = e^{bt} M_X(at)$$

If we apply this to the MGF with $X = \sigma Z + \mu$, we have $M_X(t) = e^{\mu t} M_Z(\sigma t)$, so

$$M_X(t) = \exp\left( \mu t + \frac{\sigma^2 t^2}{2} \right)$$
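A quick Monte Carlo check of the closed form, using arbitrary values $\mu = 1$, $\sigma = 2$, $t = 0.5$ of our own choosing:

```python
import math
import random

random.seed(0)
mu, sigma, t = 1.0, 2.0, 0.5

# Closed form: M_X(t) = exp(mu*t + sigma^2 * t^2 / 2)
mgf_closed = math.exp(mu * t + sigma ** 2 * t ** 2 / 2)

# Monte Carlo estimate of E[exp(tX)] with X ~ N(mu, sigma^2)
n = 400_000
mgf_mc = sum(math.exp(t * random.gauss(mu, sigma)) for _ in range(n)) / n

print(round(mgf_closed, 3), round(mgf_mc, 3))
```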

#### Approximation to the binomial

Another very useful property of the normal distribution is that it can be used to approximate a binomial distribution when *n* is very large.

Let $X \sim \text{Bin}(n, p)$. Denote $\mu$ and $\sigma$ the mean and standard deviation of $X$. Then we have the **normal approximation to the binomial**

$$P\left( a \le \frac{X - \mu}{\sigma} \le b \right) \to \Phi(b) - \Phi(a) \quad \text{as } n \to \infty$$

where $\Phi$ is the distribution function of $Z$, i.e. $N(0, 1)$. When $n \to \infty$, the standardized variable $\frac{X - \mu}{\sigma}$ converges in distribution to $Z$. We also know that

$$\mu = np, \quad \sigma = \sqrt{np(1 - p)}$$

so we can rewrite the equation as

$$P\left( a \le \frac{X - np}{\sqrt{np(1 - p)}} \le b \right) \approx \Phi(b) - \Phi(a)$$

The proof of this theorem is a special case of the central limit theorem, and we’ll discuss it in a later chapter.

There's another approximation to the binomial distribution: the **Poisson approximation**, whose conditions are complementary to those of the normal approximation. The Poisson approximation requires $n$ to be large and $p$ to be small such that $\lambda = np$ is a fixed, moderate number. The normal approximation is reasonable when $n$ is very large and $np$ and $n(1 - p)$ are not too small.

We'll demonstrate the approximation with an example. Suppose that we flip a fair coin 40 times, and let $X$ denote the number of heads observed. We want to calculate $P(X = 20)$.

We have $X \sim \text{Bin}(40, 0.5)$. We can calculate the probability directly:

$$P(X = 20) = \binom{40}{20} \left( \frac{1}{2} \right)^{40} \approx 0.1254$$

We can also apply the normal approximation. We have

$$\mu = np = 20, \quad \sigma = \sqrt{np(1 - p)} = \sqrt{10}$$

Since we're approximating a discrete distribution with a continuous one, we need a range instead of an exact value:

$$P(X = 20) \approx P(19.5 < X < 20.5) = \Phi\left( \frac{20.5 - 20}{\sqrt{10}} \right) - \Phi\left( \frac{19.5 - 20}{\sqrt{10}} \right) \approx 0.1256$$

The numbers turn out to be fairly close. The normal approximation becomes much more useful when the binomial probability can't easily be calculated directly.
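The comparison in this example can be reproduced by machine; in this sketch the erf-based `Phi` is our stand-in for a $Z$ table:

```python
import math

n, p, k = 40, 0.5, 20

# Exact binomial probability P(X = 20)
exact = math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Normal approximation with continuity correction:
# P(19.5 < X < 20.5) where mu = np and sigma = sqrt(np(1-p))
mu, sigma = n * p, math.sqrt(n * p * (1 - p))

def Phi(a):
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

approx = Phi((k + 0.5 - mu) / sigma) - Phi((k - 0.5 - mu) / sigma)

print(round(exact, 4), round(approx, 4))  # 0.1254 0.1256
```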

Suppose the ideal class size of a college is 150 students. The college knows from past experience that, on average, only 30% of students will accept their offers. If the college gives out 450 offers, what is the probability that more than 150 students will actually attend?

For simplicity, we assume the students' decisions are independent of each other. Let $X$ be the number of students who accept the offer, so $X \sim \text{Bin}(450, 0.3)$.

We can see that it's hard to calculate this probability by hand. Again, we can apply the normal approximation. We have

$$\mu = np = 135, \quad \sigma = \sqrt{np(1 - p)} = \sqrt{94.5}$$

and with the continuity correction,

$$P(X > 150) \approx P(X \ge 150.5) = 1 - \Phi\left( \frac{150.5 - 135}{\sqrt{94.5}} \right) \approx 1 - \Phi(1.59) \approx 0.0559$$
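The exact binomial tail is tedious by hand but trivial by machine, so we can compare it with the approximation. This is our own check; the erf-based `Phi` again stands in for the $Z$ table, and the unrounded $z$-score makes the result slightly more precise than the table-rounded value above:

```python
import math

n, p = 450, 0.3
mu = n * p                           # 135
sigma = math.sqrt(n * p * (1 - p))   # sqrt(94.5)

# Exact tail P(X > 150), summing the binomial PMF from 151 to 450
exact = sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
            for k in range(151, n + 1))

# Normal approximation with continuity correction: P(X >= 150.5)
def Phi(a):
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

approx = 1.0 - Phi((150.5 - mu) / sigma)

print(round(exact, 4), round(approx, 4))
```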

### The exponential distribution

A continuous random variable *X* is an **exponential random variable** with parameter $\lambda$ if the density function is given by

$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

where $\lambda > 0$. In practice, the exponential distribution often arises as the distribution of the amount of time until some specific event occurs: for example, the lifetime of a new mobile phone, the amount of time from now until an earthquake occurs, or the survival time of a patient.

#### Properties

If $X$ is an exponential random variable with parameter $\lambda$, its cumulative distribution function is

$$F(x) = P(X \le x) = \int_0^x \lambda e^{-\lambda t}\,dt = \left[ -e^{-\lambda t} \right]_0^x = 1 - e^{-\lambda x}, \quad x \ge 0$$

Cleaning this up a bit, we have the tail probability of $X$ as $P(X > x) = e^{-\lambda x}$ for $x \ge 0$, which we'll use again below.

To find the expectation of $X$, we first prove that for $n > 0$,

$$E[X^n] = \int_0^{\infty} x^n \lambda e^{-\lambda x}\,dx = \left[ -x^n e^{-\lambda x} \right]_0^{\infty} + \int_0^{\infty} n x^{n-1} e^{-\lambda x}\,dx = \frac{n}{\lambda} E[X^{n-1}]$$

Now we can easily find $E[X]$ by setting $n = 1$:

$$E[X] = \frac{1}{\lambda} E[X^0] = \frac{1}{\lambda}$$

We also have, setting $n = 2$,

$$E[X^2] = \frac{2}{\lambda} E[X] = \frac{2}{\lambda^2}$$

so the variance of *X* is

$$Var(X) = E[X^2] - (E[X])^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$$

Finally, the moment generating function of *X* is given by

$$M(t) = E\left[ e^{tX} \right] = \int_0^{\infty} \lambda e^{-(\lambda - t)x}\,dx = \frac{\lambda}{\lambda - t}, \quad t < \lambda$$
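The mean and variance formulas are easy to confirm by simulation. A small sketch of our own, with an arbitrary rate $\lambda = 2$:

```python
import random

random.seed(1)
lam = 2.0

# Draw X ~ Exp(lam); expovariate takes the rate parameter lambda
xs = [random.expovariate(lam) for _ in range(300_000)]

mean_mc = sum(xs) / len(xs)
var_mc = sum((x - mean_mc) ** 2 for x in xs) / len(xs)

print(round(mean_mc, 3))  # ≈ 1/lambda = 0.5
print(round(var_mc, 3))   # ≈ 1/lambda^2 = 0.25
```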

#### Memoryless

The exponential probability is usually defined as a probability associated with the time counted from now onward. So why don't we care about what happened in the past?

For example, when we consider the time from now to the next earthquake, why don't we care about how long it has been since the last earthquake? This leads to a nice property the exponential distribution has. We say a non-negative random variable $X$ is **memoryless** if

$$P(X > s + t \mid X > t) = P(X > s), \quad \text{for all } s, t \ge 0$$

Going back to the earthquake example, the memoryless property means that the probability we will *not* observe an earthquake for $s$ days given that it has already been $t$ days without an earthquake is the same as the probability of not observing an earthquake in the first $s$ days. The condition is equivalent to

$$P(X > s + t) = P(X > s)\,P(X > t)$$

if $X$ is memoryless. For the exponential distribution,

$$P(X > s + t) = e^{-\lambda(s + t)} = e^{-\lambda s}\,e^{-\lambda t} = P(X > s)\,P(X > t)$$

so the exponential distribution is memoryless.
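The memoryless identity can be checked directly from the tail probability $P(X > x) = e^{-\lambda x}$. A minimal sketch, with arbitrary $\lambda$, $s$, $t$ of our own choosing:

```python
import math

lam = 0.5

def survival(x):
    # Tail probability P(X > x) for X ~ Exp(lam)
    return math.exp(-lam * x)

s, t = 3.0, 7.0
lhs = survival(s + t) / survival(t)  # P(X > s + t | X > t)
rhs = survival(s)                    # P(X > s)

print(round(lhs, 6), round(rhs, 6))  # the two probabilities coincide
```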

There are various continuous distributions that are very important but not included here, such as the **beta distribution** and the **gamma distribution**. In fact, the exponential distribution and the chi-square distribution are both special cases of the gamma distribution. Proofs and properties of these distributions might be included in a later chapter.