# Posterior Distribution Derivation

- Author  : Payam Parvazmanesh
- Contact : payam.manesh@gmail.com
- Pattern Recognition

We assume the following:


- Prior Distribution for $\mu$: $\mu \sim \mathcal{N}(\mu_0, \sigma_0^2)$, a normal distribution with mean $\mu_0$ and variance $\sigma_0^2$.

- Likelihood Function: The data $\mathbf{D} = \{x_1, x_2, \dots, x_n\}$ are assumed to be independent draws from a normal distribution with unknown mean $\mu$ and known variance $\sigma^2$, i.e., $x_i \sim \mathcal{N}(\mu, \sigma^2)$.


Now, we want to find the posterior distribution of $\mu$ given the data.

### 1. Prior Distribution
The prior distribution of $\mu$ is:

$$
p(\mu) = \frac{1}{\sqrt{2\pi \sigma_0^2}} \exp\left( -\frac{(\mu - \mu_0)^2}{2\sigma_0^2} \right)
$$

### 2. Likelihood Function
Since the data points $x_1, x_2, \dots, x_n$ are independent, the likelihood is the product of the normal densities of the individual data points:

$$
p(\mathbf{D} | \mu) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right)
$$

The likelihood function simplifies to:

$$
p(\mathbf{D} | \mu) = \frac{1}{(2\pi \sigma^2)^{n/2}} \exp\left( -\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2 \right)
$$
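As a quick sanity check on this simplification, the sketch below evaluates the likelihood both as an explicit product and in the single-exponential form. The data values and parameter settings are made up for illustration, and the function names are hypothetical.

```python
import math

def likelihood_product(data, mu, sigma2):
    """Likelihood as an explicit product of per-point normal densities."""
    result = 1.0
    for x in data:
        density = math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)
        result *= density
    return result

def likelihood_closed_form(data, mu, sigma2):
    """Same likelihood using the simplified single-exponential form."""
    n = len(data)
    sum_sq = sum((x - mu) ** 2 for x in data)
    return (2 * math.pi * sigma2) ** (-n / 2) * math.exp(-sum_sq / (2 * sigma2))

data = [1.2, 0.7, 2.1, 1.5]  # made-up observations
product_value = likelihood_product(data, mu=1.0, sigma2=0.5)
closed_value = likelihood_closed_form(data, mu=1.0, sigma2=0.5)
print(product_value, closed_value)
```

The two printed values should agree up to floating-point rounding. For large $n$ the product underflows, so in practice one would usually work with the log-likelihood instead.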

### 3. Posterior Distribution
The posterior distribution is proportional to the product of the prior and the likelihood:

$$
p(\mu | \mathbf{D}) \propto p(\mu) p(\mathbf{D} | \mu)
$$

Substituting the expressions for $p(\mu)$ and $p(\mathbf{D} | \mu)$, and dropping the normalizing constants that do not depend on $\mu$:

$$
p(\mu | \mathbf{D}) \propto \exp\left( -\frac{(\mu - \mu_0)^2}{2\sigma_0^2} \right) \exp\left( -\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2 \right)
$$

We can combine the terms inside the exponentials:

$$
p(\mu | \mathbf{D}) \propto \exp\left( -\frac{1}{2} \left( \frac{(\mu - \mu_0)^2}{\sigma_0^2} + \frac{1}{\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2 \right) \right)
$$

### 4. Completing the Square (Detailed Derivation)

The goal of completing the square is to rewrite the posterior $p(\mu | \mathbf{D})$ in the form of a normal density in $\mu$, from which the posterior mean and variance can be read off directly.

#### 4.1 Expanding Terms in the Exponent
Recall the posterior is proportional to:
$$
p(\mu | \mathbf{D}) \propto \exp\left( -\frac{1}{2} \left( \frac{(\mu - \mu_0)^2}{\sigma_0^2} + \frac{1}{\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2 \right) \right).
$$

First, expand the terms in the exponent:

1. Expand $\frac{(\mu - \mu_0)^2}{\sigma_0^2}$:
   $$
   \frac{(\mu - \mu_0)^2}{\sigma_0^2} = \frac{\mu^2}{\sigma_0^2} - \frac{2\mu\mu_0}{\sigma_0^2} + \frac{\mu_0^2}{\sigma_0^2}.
   $$

2. Expand $\frac{1}{\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2$, starting with the sum:
   $$
   \sum_{i=1}^{n} (x_i - \mu)^2 = \sum_{i=1}^{n} x_i^2 - 2\mu \sum_{i=1}^{n} x_i + n\mu^2.
   $$
   
   Substituting this back:
   $$
   \frac{1}{\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2 = \frac{1}{\sigma^2} \left( \sum_{i=1}^{n} x_i^2 - 2\mu \sum_{i=1}^{n} x_i + n\mu^2 \right).
   $$

Distributing the factor $1/\sigma^2$ over the three terms gives:
$$
\frac{1}{\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2 = \frac{n\mu^2}{\sigma^2} - \frac{2\mu}{\sigma^2} \sum_{i=1}^{n} x_i + \frac{\sum_{i=1}^{n} x_i^2}{\sigma^2}.
$$
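The expansion can be checked numerically. The following is a minimal sketch, assuming made-up data and an arbitrary value of $\sigma^2$; the function names are hypothetical. It compares the term-by-term sum against the expanded expression at several values of $\mu$.

```python
import math

def scaled_sum_direct(data, mu, sigma2):
    """(1 / sigma^2) * sum_i (x_i - mu)^2, computed term by term."""
    return sum((x - mu) ** 2 for x in data) / sigma2

def scaled_sum_expanded(data, mu, sigma2):
    """Same quantity as n*mu^2/sigma^2 - 2*mu*sum(x)/sigma^2 + sum(x^2)/sigma^2."""
    n = len(data)
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    return n * mu ** 2 / sigma2 - 2 * mu * sum_x / sigma2 + sum_x2 / sigma2

data = [1.2, 0.7, 2.1, 1.5]  # made-up observations
for mu in (-1.0, 0.0, 0.8, 3.5):
    direct = scaled_sum_direct(data, mu, sigma2=0.5)
    expanded = scaled_sum_expanded(data, mu, sigma2=0.5)
    print(mu, direct, expanded)
```

Note that the expanded form depends on the data only through $\sum x_i$ and $\sum x_i^2$, which is why those two sums are all that survive into the later steps.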

#### 4.2 Combine All Terms
Now, substitute both expansions back into the posterior:

$$
p(\mu | \mathbf{D}) \propto \exp\left( -\frac{1}{2} \left( \frac{\mu^2}{\sigma_0^2} - \frac{2\mu\mu_0}{\sigma_0^2} + \frac{\mu_0^2}{\sigma_0^2} + \frac{n\mu^2}{\sigma^2} - \frac{2\mu}{\sigma^2} \sum_{i=1}^{n} x_i + \frac{\sum_{i=1}^{n} x_i^2}{\sigma^2} \right) \right).
$$

Group terms involving $\mu^2$, $\mu$, and constants:

1. Coefficient of $\mu^2$:
   $$
   \frac{1}{\sigma_0^2} + \frac{n}{\sigma^2}.
   $$

2. Coefficient of $\mu$:
   $$
   -\frac{2\mu_0}{\sigma_0^2} - \frac{2}{\sigma^2} \sum_{i=1}^{n} x_i.
   $$

3. Constant terms:
   $$
   \frac{\mu_0^2}{\sigma_0^2} + \frac{\sum_{i=1}^{n} x_i^2}{\sigma^2}.
   $$

Thus, the posterior becomes:
$$
p(\mu | \mathbf{D}) \propto \exp\left( -\frac{1}{2} \left( \left( \frac{1}{\sigma_0^2} + \frac{n}{\sigma^2} \right) \mu^2 - 2\left( \frac{\mu_0}{\sigma_0^2} + \frac{\sum_{i=1}^{n} x_i}{\sigma^2} \right)\mu + \text{(constant terms)} \right) \right).
$$

#### 4.3 Completing the Square for $\mu$
To rewrite this as a standard quadratic form, complete the square for $\mu$.

The quadratic expression is:
$$
a\mu^2 - 2b\mu + \text{(constant terms)},
$$
where:
$$
a = \frac{1}{\sigma_0^2} + \frac{n}{\sigma^2}, \quad b = \frac{\mu_0}{\sigma_0^2} + \frac{\sum_{i=1}^{n} x_i}{\sigma^2}.
$$

Complete the square:
$$
a\mu^2 - 2b\mu = a\left( \mu^2 - \frac{2b}{a}\mu \right) = a\left( \left( \mu - \frac{b}{a} \right)^2 - \left( \frac{b}{a} \right)^2 \right).
$$

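Substitute this back. The term $-b^2/a$ and the other constant terms do not depend on $\mu$, so they are absorbed into the proportionality constant: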
$$
p(\mu | \mathbf{D}) \propto \exp\left( -\frac{1}{2} a \left( \mu - \frac{b}{a} \right)^2 \right).
$$
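The completed-square identity is easy to verify numerically. This is a small sketch under assumed values: the data, prior mean, and variances below are illustrative only, and the helper names are hypothetical.

```python
import math

data = [1.2, 0.7, 2.1, 1.5]               # made-up observations
mu0, sigma0_sq, sigma_sq = 0.0, 1.0, 0.5  # assumed prior mean, prior variance, data variance

# Coefficients of the quadratic a*mu^2 - 2*b*mu in the exponent.
a = 1 / sigma0_sq + len(data) / sigma_sq
b = mu0 / sigma0_sq + sum(data) / sigma_sq

def quadratic(mu):
    """Original form: a*mu^2 - 2*b*mu."""
    return a * mu ** 2 - 2 * b * mu

def completed_square(mu):
    """Completed square: a*(mu - b/a)^2 - b^2/a."""
    return a * (mu - b / a) ** 2 - b ** 2 / a

for mu in (-2.0, 0.0, 1.0, 4.0):
    print(mu, quadratic(mu), completed_square(mu))
```

The two columns should match at every $\mu$. The offset $-b^2/a$ is the same for all $\mu$, which is what allows it to be dropped under the proportionality sign.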

#### 4.4 Identify the Posterior Mean and Variance
From the completed square, the posterior distribution is normal:
$$
\mu | \mathbf{D} \sim \mathcal{N}\left( \frac{b}{a}, \frac{1}{a} \right).
$$

Substitute $a$ and $b$, writing $\sum_{i=1}^{n} x_i = n\bar{x}$ where $\bar{x}$ is the sample mean:

1. **Posterior Mean**:
   $$
   \mu_{\text{post}} = \frac{b}{a} = \frac{\frac{\mu_0}{\sigma_0^2} + \frac{\sum_{i=1}^{n} x_i}{\sigma^2}}{\frac{1}{\sigma_0^2} + \frac{n}{\sigma^2}} = \frac{\sigma^2 \mu_0 + n \sigma_0^2 \bar{x}}{\sigma^2 + n \sigma_0^2}.
   $$

2. **Posterior Variance**:
   $$
   \sigma^2_{\text{post}} = \frac{1}{a} = \frac{1}{\frac{1}{\sigma_0^2} + \frac{n}{\sigma^2}} = \frac{\sigma^2 \sigma_0^2}{\sigma^2 + n \sigma_0^2}.
   $$
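The two formulas translate directly into code. The sketch below is one possible implementation, not a reference one: the function name `posterior_params` and the example inputs are assumptions made for illustration. It computes the posterior through $a$ and $b$ and then compares against the simplified closed forms.

```python
import math

def posterior_params(data, mu0, sigma0_sq, sigma_sq):
    """Return (posterior mean, posterior variance) for a normal mean with known variance."""
    n = len(data)
    a = 1 / sigma0_sq + n / sigma_sq            # coefficient of mu^2
    b = mu0 / sigma0_sq + sum(data) / sigma_sq  # half the negated coefficient of mu
    return b / a, 1 / a

data = [1.2, 0.7, 2.1, 1.5]               # made-up observations
mu0, sigma0_sq, sigma_sq = 0.0, 1.0, 0.5  # assumed prior mean, prior variance, data variance

post_mean, post_var = posterior_params(data, mu0, sigma0_sq, sigma_sq)

# Simplified closed forms in terms of the sample mean.
n = len(data)
x_bar = sum(data) / n
closed_mean = (sigma_sq * mu0 + n * sigma0_sq * x_bar) / (sigma_sq + n * sigma0_sq)
closed_var = sigma_sq * sigma0_sq / (sigma_sq + n * sigma0_sq)

print(post_mean, closed_mean)
print(post_var, closed_var)
```

With these inputs, $a = 9$ and $b = 11$, so both routes should give a posterior mean of $11/9$ and a posterior variance of $1/9$ up to rounding.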
### 5. Resulting Posterior Distribution
After completing the square, the posterior distribution of $\mu$ is a normal distribution with the following mean and variance:

- Mean: 
  $$
  \mu_{\text{post}} = \frac{\sigma^2 \mu_0 + n \sigma_0^2 \bar{x}}{\sigma^2 + n \sigma_0^2}
  $$
  where $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ is the sample mean.

- Variance:
  $$
  \sigma^2_{\text{post}} = \frac{\sigma^2 \sigma_0^2}{\sigma^2 + n \sigma_0^2}
  $$

Therefore, the posterior distribution of $\mu$ given the data $\mathbf{D}$ is:

$$
\mu | \mathbf{D} \sim \mathcal{N}\left( \frac{\sigma^2 \mu_0 + n \sigma_0^2 \bar{x}}{\sigma^2 + n \sigma_0^2}, \frac{\sigma^2 \sigma_0^2}{\sigma^2 + n \sigma_0^2} \right)
$$
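One way to gain confidence in the closed-form result is to bypass the algebra: evaluate the unnormalized posterior (prior times likelihood) on a grid, normalize it numerically, and compare its mean and variance with the formulas. The sketch below does this under assumed inputs; the grid range and step size are arbitrary choices that happen to cover the bulk of this particular posterior.

```python
import math

data = [1.2, 0.7, 2.1, 1.5]               # made-up observations
mu0, sigma0_sq, sigma_sq = 0.0, 1.0, 0.5  # assumed prior mean, prior variance, data variance

def unnormalized_posterior(mu):
    """exp(-1/2 * [(mu - mu0)^2 / sigma0^2 + sum_i (x_i - mu)^2 / sigma^2])."""
    prior_term = (mu - mu0) ** 2 / sigma0_sq
    data_term = sum((x - mu) ** 2 for x in data) / sigma_sq
    return math.exp(-0.5 * (prior_term + data_term))

# Uniform grid over [-5, 5]; the constant step cancels when normalizing.
step = 0.001
grid = [-5.0 + step * i for i in range(10001)]
weights = [unnormalized_posterior(mu) for mu in grid]
total = sum(weights)

grid_mean = sum(mu * w for mu, w in zip(grid, weights)) / total
grid_var = sum((mu - grid_mean) ** 2 * w for mu, w in zip(grid, weights)) / total

# Closed-form posterior mean and variance.
n = len(data)
x_bar = sum(data) / n
exact_mean = (sigma_sq * mu0 + n * sigma0_sq * x_bar) / (sigma_sq + n * sigma0_sq)
exact_var = sigma_sq * sigma0_sq / (sigma_sq + n * sigma0_sq)

print(grid_mean, exact_mean)
print(grid_var, exact_var)
```

The grid estimates should agree closely with the closed-form values. A grid like this only works for a one-dimensional parameter and a range chosen to contain the posterior mass, so it is a check on the derivation rather than a general method.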

This result shows that the posterior distribution of $\mu$ is normal, with a mean that is a weighted average of the prior mean $\mu_0$ and the sample mean $\bar{x}$. The weight on $\bar{x}$ is $n\sigma_0^2 / (\sigma^2 + n\sigma_0^2)$, which approaches 1 as the sample size $n$ grows or as the prior variance $\sigma_0^2$ increases. The posterior precision $1/\sigma^2_{\text{post}}$ is the sum of the prior precision $1/\sigma_0^2$ and the data precision $n/\sigma^2$, so the posterior variance is smaller than both $\sigma_0^2$ and $\sigma^2/n$.
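The weighted-average reading can be made concrete by holding the sample mean fixed and varying the sample size. In the sketch below, the weight on $\bar{x}$ is written as $w = n\sigma_0^2 / (\sigma^2 + n\sigma_0^2)$, which follows from the posterior mean formula; the function name and all numeric inputs are illustrative assumptions.

```python
import math

def posterior_from_summary(x_bar, n, mu0, sigma0_sq, sigma_sq):
    """Return (weight on x_bar, posterior mean, posterior variance)."""
    weight = n * sigma0_sq / (sigma_sq + n * sigma0_sq)
    mean = (1 - weight) * mu0 + weight * x_bar
    variance = sigma_sq * sigma0_sq / (sigma_sq + n * sigma0_sq)
    return weight, mean, variance

sample_sizes = (1, 10, 100, 1000)
results = [
    posterior_from_summary(x_bar=1.375, n=n, mu0=0.0, sigma0_sq=1.0, sigma_sq=0.5)
    for n in sample_sizes
]

for n, (weight, mean, variance) in zip(sample_sizes, results):
    print(f"n={n:5d}  weight={weight:.4f}  mean={mean:.4f}  variance={variance:.6f}")
```

As $n$ increases, the weight moves toward 1, the posterior mean moves from the prior mean toward the sample mean, and the posterior variance shrinks toward 0.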


