The distribution of wind speeds in a city follows a mixture of
log-normal distributions: If we denote the wind speed (in miles per
hour) by \(X\), we have \[\begin{equation*}
\log(X) \sim \mathcal{N}(\mu_K, \sigma_K^2),
\end{equation*}\] where \(K\) is
random with \[\begin{equation*}
P(K = 1) = 0.1, \quad
P(K = 2) = 0.4, \quad
P(K = 3) = 0.5,
\end{equation*}\] and the means and standard deviations are given
by \[\begin{equation*}
\mu = \begin{pmatrix}
2.1 \\ 1.6 \\ -0.5
\end{pmatrix},
\quad
\sigma = \begin{pmatrix}
0.55 \\ 0.7 \\ 0.6
\end{pmatrix}.
\end{equation*}\] The density of the log-normal distribution with
parameters \(\mu\in\mathbb{R}\) and
\(\sigma > 0\) is \[
\varphi(x; \mu, \sigma)
= \frac{1}{x\sqrt{2\pi\sigma^2}}
\exp\left(-\frac{\left(\log(x)-\mu\right)^2}{2\sigma^2}\right)
\] for all \(x > 0\). This function is available in R as dlnorm(). You can generate samples from the wind speed distribution as follows:
wind_speed <- function(n, w, mu, sigma) {
  k <- length(w)                               # number of mixture components
  i <- sample(k, n, replace = TRUE, prob = w)  # pick a component for each sample
  rlnorm(n, mu[i], sigma[i])                   # draw from the chosen log-normal component
}
w <- c(0.1, 0.4, 0.5)
mu <- c(2.1, 1.6, -0.5)
sigma <- c(0.55, 0.7, 0.6)
wind_speed(5, w, mu, sigma)
## [1] 0.4719161 0.1363222 1.4373320 6.0327490 10.6146743
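As a quick sanity check (a sketch with arbitrarily chosen values x0, m and s), dlnorm() evaluates exactly the density \(\varphi\) given above:
x0 <- 3; m <- 2.1; s <- 0.55               # illustrative values, not part of the exercise
by_hand <- exp(-(log(x0) - m)^2 / (2 * s^2)) / (x0 * sqrt(2 * pi * s^2))
c(by_hand, dlnorm(x0, meanlog = m, sdlog = s))   # the two values agree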
We are interested in the probability that the wind speed is larger than 40 miles per hour.
Use basic Monte Carlo estimation to estimate this probability.
Determine the root mean squared error of your estimate.
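A minimal sketch of such an estimate is shown below (the sample size N is an arbitrary choice; it uses the wind_speed() function and the vectors w, mu and sigma defined above). The indicator of the event \(\{X > 40\}\) is a Bernoulli variable, so the RMSE of this unbiased estimator can be estimated from the binomial variance:
N <- 1e6                                   # number of Monte Carlo samples (arbitrary choice)
x <- wind_speed(N, w, mu, sigma)
p_hat <- mean(x > 40)                      # proportion of samples above 40 mph
rmse <- sqrt(p_hat * (1 - p_hat) / N)      # RMSE of a binomial proportion estimate
c(estimate = p_hat, RMSE = rmse)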
Our aim here is to find a more efficient estimator for the probability from Task 1.
Use importance sampling to estimate the probability that the wind speed is larger than 40 miles per hour.
Compare the root mean squared error of your estimate with the one from Task 1.
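One possible importance sampling sketch is given below. The proposal (a single log-normal centred at \(\log 40\) with sdlog = 1) and the helper names f_mix, p_hat_is and rmse_is are our own choices, not prescribed by the exercise; any proposal that puts substantial mass above 40 mph and dominates the tail of the mixture would work:
f_mix <- function(x) {                            # density of the wind-speed mixture
  w[1] * dlnorm(x, mu[1], sigma[1]) +
    w[2] * dlnorm(x, mu[2], sigma[2]) +
    w[3] * dlnorm(x, mu[3], sigma[3])
}
N <- 1e6
y <- rlnorm(N, meanlog = log(40), sdlog = 1)      # proposal puts about half its mass above 40
h <- (y > 40) * f_mix(y) / dlnorm(y, meanlog = log(40), sdlog = 1)
p_hat_is <- mean(h)                               # importance sampling estimate
rmse_is <- sd(h) / sqrt(N)                        # estimated RMSE
c(estimate = p_hat_is, RMSE = rmse_is)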
Now, for simplicity, assume that the wind speed follows a single log-normal distribution with parameters \(\mu\) and \(\sigma\): \[\begin{equation*} \log(X) \sim \mathcal{N}(\mu, \sigma^2). \end{equation*}\] Furthermore, we assume that \(\mu=0\) is known and \(\sigma\) is unknown, random and exponentially distributed: \(\sigma \sim \mathrm{Exp}(1)\).
On five different days we have observed the wind speeds \(x_1=0.50\), \(x_2=1.67\), \(x_3=2.22\), \(x_4=0.22\) and \(x_5=4.36\). Using these data we want to learn about the unknown value \(\sigma\). From Bayes' rule we know that the conditional distribution of \(\sigma\), given the data \((x_1, \ldots, x_5)\), has density \[\begin{equation*} f(s) = \frac{1}{Z} \tilde f(s), \quad \tilde f(s) = \pi(s) \prod_{i=1}^5 \varphi(x_i; 0, s), \end{equation*}\] where \(Z\) is the normalising constant, \(\pi(s)\) is the density of the \(\mathrm{Exp}(1)\) distribution, and \(\varphi(x_i; 0, s)\) is the density of the log-normal distribution with parameters \(\mu=0\) and \(\sigma=s\). In Bayesian statistics, \(\pi\) is called the prior density and \(f\) is called the posterior density.
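The unnormalised posterior \(\tilde f\) translates directly into R (a sketch; the names x_obs and f_tilde are our own):
x_obs <- c(0.50, 1.67, 2.22, 0.22, 4.36)   # the five observed wind speeds
f_tilde <- function(s) {                   # unnormalised posterior density
  dexp(s, rate = 1) *
    sapply(s, function(si) prod(dlnorm(x_obs, meanlog = 0, sdlog = si)))
}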
We want to sample from the posterior density \(f\) using envelope rejection sampling.
Produce a plot of \(\tilde f(s)\) as a function of \(s\).
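A plotting sketch, assuming the f_tilde() function above; the grid range is chosen by eye so that it covers the region where \(\tilde f\) has visible mass:
s_grid <- seq(0.01, 5, length.out = 500)   # grid of sigma values (range chosen by eye)
plot(s_grid, f_tilde(s_grid), type = "l",
     xlab = "s", ylab = expression(tilde(f)(s)))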
Choose an appropriate density \(g(s)\) for the proposals for the rejection sampling method. Explain why your choice is appropriate.
Show a plot of \(\tilde f(s) / g(s)\), and from this determine an appropriate value for the constant \(c\) in the envelope rejection sampling algorithm.
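As one possible sketch, suppose the \(\mathrm{Exp}(1)\) prior itself is used as the proposal \(g\); then \(\tilde f(s)/g(s)\) is just the likelihood, which is bounded in \(s\), and \(c\) can be read off from the plot. The names g_dens and c_env are our own, and the maximum over the grid only approximates the true maximum, hence the small safety margin:
g_dens <- function(s) dexp(s, rate = 1)    # proposal density (here: the Exp(1) prior)
ratio <- f_tilde(s_grid) / g_dens(s_grid)
plot(s_grid, ratio, type = "l", xlab = "s", ylab = "f_tilde(s) / g(s)")
c_env <- 1.01 * max(ratio)                 # maximum over the grid plus a small safety margin
c_env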
Implement the envelope rejection sampling algorithm and generate \(N = 10{,}000\) samples from the posterior density \(f\). Use these samples to plot a histogram.
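A minimal sketch of the envelope rejection sampler, assuming f_tilde(), g_dens() and c_env from above:
N <- 10000
samples <- numeric(0)
while (length(samples) < N) {
  m <- N - length(samples)                 # how many samples are still missing
  s_prop <- rexp(m, rate = 1)              # proposals from g
  u <- runif(m)
  accept <- u <= f_tilde(s_prop) / (c_env * g_dens(s_prop))
  samples <- c(samples, s_prop[accept])
}
hist(samples, breaks = 50, freq = FALSE, xlab = "s",
     main = "Samples from the posterior")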
Compare the posterior distribution to the prior distribution, and comment on the differences you see.
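To make the comparison concrete, one can overlay the \(\mathrm{Exp}(1)\) prior density on the histogram from the previous part and compare, for example, the means (a sketch):
curve(dexp(x, rate = 1), add = TRUE, lwd = 2)     # prior density on top of the posterior histogram
c(prior_mean = 1, posterior_mean = mean(samples))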