This module introduces the basic concepts of probability distributions: the probability mass function (pmf), the cumulative distribution function (cdf), and the probability density function (pdf).
The distribution $P_X$ of a random variable $X$ is simply a probability measure which assigns probabilities to events on the real line. The distribution $P_X$ answers questions of the form:
What is the probability that $X$ lies in some subset $F$ of the real line?
In practice we summarize $P_X$ by its
Probability Mass Function - pmf (for discrete variables only),
Probability Density Function - pdf (mainly for continuous variables), or
Cumulative Distribution Function - cdf (for either discrete or continuous variables).
Probability mass function (pmf)
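Suppose the discrete random variable $X$ can take a set of $M$ real values $\{x_1, \ldots, x_M\}$, then the pmf is defined as:
$$p_X(x_i) = \Pr(X = x_i) = P_X(\{x_i\})$$
where $\sum_{i=1}^{M} p_X(x_i) = 1$. For example, for a normal 6-sided die, $M = 6$ and $p_X(x_i) = \frac{1}{6}$. For a pair of dice being thrown, with $X$ taken as the total of the two dice, $M = 11$ and the pmf is as shown in (a) of the accompanying figure.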
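As a minimal sketch of the two-dice example, the pmf can be built by counting the 36 equally likely ordered outcomes. The function name `two_dice_pmf` is illustrative and not part of the original module, and the sketch assumes that the random variable is the total shown by two fair dice.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_pmf():
    """Return {total: probability} for the total of two fair 6-sided dice."""
    # Count how many of the 6 * 6 ordered outcomes give each total.
    counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    n_outcomes = 36
    # Exact fractions avoid floating-point rounding in the probabilities.
    return {total: Fraction(count, n_outcomes)
            for total, count in sorted(counts.items())}


pmf = two_dice_pmf()
for total, prob in pmf.items():
    print(total, prob)
```

The dictionary has 11 entries (totals 2 to 12), and its values sum to exactly 1, as the definition of a pmf requires.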
Cumulative distribution function (cdf)
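The cdf can describe discrete, continuous or mixed distributions of $X$ and is defined as:
$$F_X(x) = \Pr(X \le x) = P_X((-\infty, x])$$
For discrete $X$:
$$F_X(x) = \sum_{i \,:\, x_i \le x} p_X(x_i)$$
giving step-like cdfs as in the example of (b) of the accompanying figure.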
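Properties follow directly from the Axioms of Probability:
$0 \le F_X(x) \le 1$,
$F_X(-\infty) = 0$ and $F_X(\infty) = 1$,
$F_X(x)$ is non-decreasing as $x$ increases,
$\Pr(x_1 < X \le x_2) = F_X(x_2) - F_X(x_1)$.
Where there is no ambiguity we will often drop the subscript $X$ and refer to the cdf as $F(x)$.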
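The discrete cdf and its basic properties can be checked numerically. The sketch below again assumes that the random variable is the total of two fair dice. The helper names `cdf` and `prob_interval` are hypothetical. The cdf sums the pmf over all values not exceeding $x$, and the interval probability uses the identity $\Pr(x_1 < X \le x_2) = F(x_2) - F(x_1)$.

```python
from fractions import Fraction
from itertools import product

# pmf of the total of two fair dice, rebuilt here so the block is self-contained.
pmf = {}
for a, b in product(range(1, 7), repeat=2):
    pmf[a + b] = pmf.get(a + b, Fraction(0)) + Fraction(1, 36)


def cdf(x):
    """F(x) = Pr(X <= x): sum the pmf over all values x_i <= x."""
    return sum((p for xi, p in pmf.items() if xi <= x), Fraction(0))


def prob_interval(x1, x2):
    """Pr(x1 < X <= x2) = F(x2) - F(x1)."""
    return cdf(x2) - cdf(x1)


print(cdf(1))               # 0: below the smallest possible total
print(cdf(7))               # 7/12
print(cdf(12))              # 1: at the largest possible total
print(prob_interval(4, 7))  # 5/12, i.e. Pr(X is 5, 6 or 7)
```

Because the cdf only changes at the values $x_i$, evaluating it between two totals (for example at 6.5) returns the same value as at the lower total, which is the step-like behaviour described above.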
Probability density function (pdf)
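The pdf of $X$ is defined as the derivative of the cdf:
$$f_X(x) = \frac{d}{dx} F_X(x)$$
The pdf can also be interpreted in derivative form as $\delta x \to 0$:
$$f_X(x)\,\delta x = F_X(x + \delta x) - F_X(x) = \Pr(x < X \le x + \delta x)$$
For a discrete random variable with pmf given by $p_X(x_i)$:
$$f_X(x) = \sum_{i=1}^{M} p_X(x_i)\,\delta(x - x_i)$$
where $\delta(\cdot)$ denotes the Dirac delta function.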
An example of the pdf of the 2-dice discrete random process is shown in (c) of the accompanying figure. (Strictly the delta functions should extend vertically to infinity, but we show them only reaching the values of their areas, $p_X(x_i)$.)
The pdf and cdf of a continuous distribution (in this case the normal or Gaussian distribution) are shown in (d) and (e) of the accompanying figure. The cdf is the integral of the pdf and should always go from zero to unity for a valid probability distribution.
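The relationship between the pdf and cdf of a Gaussian can be illustrated numerically. The following is only a sketch under stated assumptions: it uses a standard normal (zero mean, unit variance), the function names `normal_pdf`, `normal_cdf` and `trapezoid` are illustrative, the lower integration limit of -10 stands in for minus infinity, and the cdf is written in closed form using the error function from Python's `math` module.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian pdf with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Gaussian cdf expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def trapezoid(f, a, b, n=20000):
    """Approximate the integral of f over [a, b] with n trapezoids."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h


# The total area under the pdf should be very close to 1.
area = trapezoid(normal_pdf, -10.0, 10.0)

# Integrating the pdf up to x = 1 should reproduce the cdf at x = 1.
partial = trapezoid(normal_pdf, -10.0, 1.0)

# A small forward difference of the cdf should approximate the pdf.
dx = 1e-5
slope = (normal_cdf(1.0 + dx) - normal_cdf(1.0)) / dx

print(area)
print(partial, normal_cdf(1.0))
print(slope, normal_pdf(1.0))
```

The forward difference mirrors the derivative form of the pdf, $f(x)\,\delta x \approx F(x + \delta x) - F(x)$, so its accuracy improves as `dx` shrinks until floating-point rounding starts to dominate.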
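Properties of pdfs:
$f_X(x) \ge 0$,
$\int_{-\infty}^{\infty} f_X(x)\,dx = 1$,
$F_X(x) = \int_{-\infty}^{x} f_X(\alpha)\,d\alpha$,
$\Pr(x_1 < X \le x_2) = \int_{x_1}^{x_2} f_X(\alpha)\,d\alpha$.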
As for the cdf, we will often drop the subscript $X$ and refer simply to $f(x)$ when no confusion can arise.