Copied from Wikipedia.

In probability theory and statistics, the beta distribution is a family of continuous probability distributions defined on the interval [0, 1] and parameterized by two positive shape parameters, typically denoted by α and β.


Probability density function

The probability density function of the beta distribution is

f(x;\alpha,\beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{\int_0^1 u^{\alpha-1} (1-u)^{\beta-1}\, du}
= \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, x^{\alpha-1}(1-x)^{\beta-1}
= \frac{1}{\mathrm{B}(\alpha,\beta)}\, x^{\alpha-1}(1-x)^{\beta-1}

where \Gamma is the gamma function. The beta function, B, appears as a normalization constant to ensure that the total probability integrates to unity.
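
As an illustration, the density can be evaluated with only the log-gamma function. The following is a minimal sketch, not a reference implementation; the function name beta_pdf is hypothetical, and treating the endpoints 0 and 1 as having zero density is a simplifying assumption.

```python
import math


def beta_pdf(x, alpha, beta):
    """Density of Beta(alpha, beta) at x, evaluated in log space."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if not (0.0 < x < 1.0):
        # Simplifying assumption: endpoints and values outside (0, 1) get density 0.
        return 0.0
    # ln B(alpha, beta) = ln Gamma(alpha) + ln Gamma(beta) - ln Gamma(alpha + beta)
    log_beta_fn = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    log_density = (alpha - 1) * math.log(x) + (beta - 1) * math.log1p(-x) - log_beta_fn
    return math.exp(log_density)
```

Working in log space avoids overflow in the gamma function when α or β is large.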

Cumulative distribution function

The cumulative distribution function is

F(x;\alpha,\beta) = \frac{\mathrm{B}_x(\alpha,\beta)}{\mathrm{B}(\alpha,\beta)} = I_x(\alpha,\beta)

where \mathrm{B}_x(\alpha,\beta) is the incomplete beta function and I_x(\alpha,\beta) is the regularized incomplete beta function.
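
The ratio above can be approximated by numerical integration. The sketch below uses composite Simpson's rule on the incomplete beta integral and assumes α ≥ 1 and β ≥ 1, so that the integrand stays bounded at the endpoints; the function name beta_cdf and the steps parameter are hypothetical. Production code would normally rely on a dedicated routine for the regularized incomplete beta function instead.

```python
import math


def beta_cdf(x, alpha, beta, steps=2000):
    """Approximate I_x(alpha, beta) with composite Simpson's rule."""
    if alpha < 1 or beta < 1:
        raise ValueError("this sketch assumes alpha >= 1 and beta >= 1")
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals

    def kernel(u):
        # Unnormalized density u^(alpha-1) * (1-u)^(beta-1)
        return u ** (alpha - 1) * (1.0 - u) ** (beta - 1)

    h = x / steps
    total = kernel(0.0) + kernel(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * kernel(k * h)
    incomplete = total * h / 3.0  # approximates B_x(alpha, beta)
    complete = math.exp(math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta))
    return incomplete / complete
```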



The expected value and variance of a beta random variable X with parameters α and β are given by the formulae:

\operatorname{E}(X) = \frac{\alpha}{\alpha+\beta}
\operatorname{Var}(X) = \frac{\alpha \beta}{(\alpha+\beta)^2(\alpha+\beta+1)}
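
A direct transcription of these two formulae, given as a minimal sketch with a hypothetical function name:

```python
def beta_mean_var(alpha, beta):
    """Return (mean, variance) of Beta(alpha, beta)."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total ** 2 * (total + 1))
    return mean, variance
```

For example, Beta(2, 3) has mean 2/5 and variance 6/150 = 0.04.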

The skewness is

\frac{2 (\beta - \alpha) \sqrt{\alpha + \beta + 1}}{(\alpha + \beta + 2) \sqrt{\alpha \beta}}.

The excess kurtosis is

\frac{6\left[(\alpha-\beta)^2(\alpha+\beta+1) - \alpha\beta(\alpha+\beta+2)\right]}{\alpha \beta (\alpha+\beta+2) (\alpha+\beta+3)}.
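
The higher moments can be computed the same way. The sketch below evaluates the skewness and the excess kurtosis in its standard closed form, whose numerator is 6[(α − β)²(α + β + 1) − αβ(α + β + 2)]; the function name is hypothetical.

```python
import math


def beta_skew_kurt(alpha, beta):
    """Return (skewness, excess kurtosis) of Beta(alpha, beta)."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    s = alpha + beta
    skewness = 2 * (beta - alpha) * math.sqrt(s + 1) / ((s + 2) * math.sqrt(alpha * beta))
    numerator = 6 * ((alpha - beta) ** 2 * (s + 1) - alpha * beta * (s + 2))
    excess_kurtosis = numerator / (alpha * beta * (s + 2) * (s + 3))
    return skewness, excess_kurtosis
```

As a sanity check, the uniform case Beta(1, 1) gives skewness 0 and excess kurtosis −6/5.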

Quantities of information

Given two beta-distributed random variables, X ~ Beta(α, β) and Y ~ Beta(α', β'), the information entropy of X is

H(X) = \ln\mathrm{B}(\alpha,\beta)-(\alpha-1)\psi(\alpha)-(\beta-1)\psi(\beta)+(\alpha+\beta-2)\psi(\alpha+\beta)

where \psi is the digamma function.

The cross entropy is

H(X,Y) = \ln\mathrm{B}(\alpha',\beta')-(\alpha'-1)\psi(\alpha)-(\beta'-1)\psi(\beta)+(\alpha'+\beta'-2)\psi(\alpha+\beta).

It follows that the Kullback-Leibler divergence between these two beta distributions is

D_{\mathrm{KL}}(X,Y) = \ln\frac{\mathrm{B}(\alpha',\beta')}{\mathrm{B}(\alpha,\beta)} - (\alpha'-\alpha)\psi(\alpha) - (\beta'-\beta)\psi(\beta) + (\alpha'-\alpha+\beta'-\beta)\psi(\alpha+\beta).
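
These quantities can be evaluated numerically once a digamma routine is available. The Python standard library does not provide one, so the sketch below includes a simple approximation (upward recurrence followed by an asymptotic series); that helper, and all function names, are assumptions of this example rather than part of any library. The divergence is computed as the cross entropy minus the entropy.

```python
import math


def digamma(x):
    """Approximate psi(x) for x > 0 via recurrence and an asymptotic series."""
    if x <= 0:
        raise ValueError("x must be positive")
    result = 0.0
    while x < 6.0:
        # psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # ln x - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def log_beta_fn(a, b):
    """ln B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_entropy(a, b):
    """Differential entropy H(X) of X ~ Beta(a, b)."""
    return (log_beta_fn(a, b) - (a - 1) * digamma(a) - (b - 1) * digamma(b)
            + (a + b - 2) * digamma(a + b))


def beta_cross_entropy(a, b, a2, b2):
    """Cross entropy H(X, Y) for X ~ Beta(a, b), Y ~ Beta(a2, b2)."""
    return (log_beta_fn(a2, b2) - (a2 - 1) * digamma(a) - (b2 - 1) * digamma(b)
            + (a2 + b2 - 2) * digamma(a + b))


def beta_kl(a, b, a2, b2):
    """Kullback-Leibler divergence from Beta(a, b) to Beta(a2, b2)."""
    return beta_cross_entropy(a, b, a2, b2) - beta_entropy(a, b)
```

The uniform distribution Beta(1, 1) has entropy 0, and the divergence of any beta distribution from itself is 0.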


The beta density function can take on different shapes depending on the values of the two parameters:

  • \alpha < 1,\ \beta < 1 is U-shaped (red plot)
  • \alpha < 1,\ \beta \geq 1 or \alpha = 1,\ \beta > 1 is strictly decreasing (blue plot)
  • \alpha = 1,\ \beta = 1 is the uniform distribution
  • \alpha = 1,\ \beta < 1 or \alpha > 1,\ \beta \leq 1 is strictly increasing (green plot)
    • \alpha > 2,\ \beta = 1 is strictly convex
    • \alpha = 2,\ \beta = 1 is a straight line
    • 1 < \alpha < 2,\ \beta = 1 is strictly concave
  • \alpha > 1,\ \beta > 1 is unimodal (purple & black plots)

Moreover, if \alpha = \beta then the density function is symmetric about 1/2 (red & purple plots).
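
The case analysis above can be written as a small lookup. This is only an illustrative sketch with a hypothetical function name; it returns the top-level shape categories and does not distinguish the convex, linear and concave sub-cases.

```python
def beta_shape(alpha, beta):
    """Classify the shape of the Beta(alpha, beta) density."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if alpha == 1 and beta == 1:
        return "uniform"
    if alpha < 1 and beta < 1:
        return "U-shaped"
    if alpha <= 1 and beta >= 1:
        # alpha < 1, beta >= 1  or  alpha = 1, beta > 1
        return "strictly decreasing"
    if alpha >= 1 and beta <= 1:
        # alpha = 1, beta < 1  or  alpha > 1, beta <= 1
        return "strictly increasing"
    return "unimodal"  # alpha > 1 and beta > 1


def is_symmetric(alpha, beta):
    """The density is symmetric about 1/2 exactly when alpha equals beta."""
    return alpha == beta
```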

Parameter estimation


Let

\bar{x} = \frac{1}{N}\sum_{i=1}^N x_i

be the sample mean and

v = \frac{1}{N}\sum_{i=1}^N (x_i - \bar{x})^2

be the sample variance. The method-of-moments estimates of the parameters are

\alpha = \bar{x} \left(\frac{\bar{x} (1 - \bar{x})}{v} - 1 \right),
\beta = (1-\bar{x}) \left(\frac{\bar{x} (1 - \bar{x})}{v} - 1 \right).

If the distribution is required over an interval other than [0, 1], say [l, h], then replace \bar{x} with \frac{\bar{x}-l}{h-l} and v with \frac{v}{(h-l)^2} in the above equations [1] [2].
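
The estimation procedure, including the rescaling to a general interval, might be sketched as follows. The function name and the low/high parameters are hypothetical; the check that the common factor is positive is an added safeguard, since the estimates are only valid when v < x̄(1 − x̄) after rescaling.

```python
def fit_beta_moments(samples, low=0.0, high=1.0):
    """Method-of-moments estimates (alpha, beta) for data on [low, high]."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n  # divides by N, as in the text

    # Rescale the mean and variance to the unit interval.
    width = high - low
    mean = (mean - low) / width
    var = var / width ** 2

    if var <= 0:
        raise ValueError("sample variance must be positive")
    common = mean * (1 - mean) / var - 1
    if common <= 0:
        raise ValueError("variance too large for a beta distribution")
    return mean * common, (1 - mean) * common
```

For the two-point sample 0.25, 0.75 the mean is 0.5 and the variance is 0.0625, which gives α = β = 1.5.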

Related distributions


Beta(i, j) with integer values of i and j is the distribution of the i-th order statistic (the i-th smallest value) of a sample of i + j − 1 independent random variables uniformly distributed between 0 and 1. The cumulative probability from 0 to x is thus the probability that the i-th smallest value is less than x; in other words, it is the probability that at least i of the random variables are less than x, which is obtained by summing the binomial distribution with its p parameter set to x. This shows the close connection between the beta distribution and the binomial distribution.
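
For integer parameters, this relationship gives a way to compute the cumulative distribution function without any special functions. The sketch below sums the binomial probabilities of at least i successes out of n = i + j − 1 trials; the function name is hypothetical.

```python
import math


def beta_cdf_integer(x, i, j):
    """CDF at x for integer shape parameters i and j, via a binomial tail sum."""
    if i < 1 or j < 1:
        raise ValueError("i and j must be positive integers")
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    n = i + j - 1  # sample size of the uniform order-statistic model
    # P(at least i of n uniform variables fall below x)
    return sum(math.comb(n, k) * x ** k * (1 - x) ** (n - k) for k in range(i, n + 1))
```

With i = j = 1 the sum reduces to x, the uniform distribution function.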

Beta distributions are used extensively in Bayesian statistics, since they provide a family of conjugate prior distributions for binomial (including Bernoulli) and geometric distributions. The Beta(0, 0) distribution is an improper prior that is sometimes used to represent ignorance of parameter values.
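
For the binomial case, conjugacy means that a Beta(α, β) prior combined with s observed successes and f observed failures yields a Beta(α + s, β + f) posterior. A minimal sketch of that update, with hypothetical function names:

```python
def update_beta_prior(alpha, beta, successes, failures):
    """Posterior parameters after binomial data under a Beta(alpha, beta) prior."""
    if successes < 0 or failures < 0:
        raise ValueError("counts must be non-negative")
    return alpha + successes, beta + failures


def posterior_mean(alpha, beta, successes, failures):
    """Mean of the posterior beta distribution."""
    a, b = update_beta_prior(alpha, beta, successes, failures)
    return a / (a + b)
```

Starting from the uniform prior Beta(1, 1), observing 7 successes and 3 failures gives Beta(8, 4), whose mean is 2/3.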

The beta distribution can be used to model events that are constrained to take place within an interval defined by a minimum and a maximum value. For this reason, the beta distribution, along with the triangular distribution, is used extensively in PERT, the critical path method (CPM), and other project management and control systems to describe the time to completion of a task. In project management, shorthand computations are widely used to estimate the mean and standard deviation of the beta distribution:

\mathrm{mean}(X) = \operatorname{E}(X) = \frac{a + 4b + c}{6},
\mathrm{s.d.}(X) = \frac{c-a}{6},

where a is the minimum, c is the maximum, and b is the most likely value.
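
These shorthand estimates are simple enough to compute directly. A minimal sketch follows, with a hypothetical function name; the ordering check a ≤ b ≤ c is an added safeguard.

```python
def pert_estimate(a, b, c):
    """Shorthand PERT mean and standard deviation.

    a is the minimum, b the most likely value, c the maximum.
    """
    if not (a <= b <= c):
        raise ValueError("expected a <= b <= c")
    mean = (a + 4 * b + c) / 6
    std_dev = (c - a) / 6
    return mean, std_dev
```

For a task with minimum 2, most likely value 5 and maximum 14 (in any time unit), the estimates are a mean of 6 and a standard deviation of 2.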
