Applied Statistical Methods - Exercise 11

Author

Peter von Rohr

Published

May 20, 2024

Problem 1: Animal Model

Use the same dataset as in Exercise 10 for the sire model and predict breeding values for all animals in the dataset using an animal model. The dataset is available at

https://charlotte-ngs.github.io/asmasss2024/data/asm_ped_sim_data.csv

Hints

The variance component $\sigma_u^2$ of the breeding values can be assumed to be $9$.
The variance component $\sigma_e^2$ of the random residuals is $36$.
Sex is modelled as a fixed effect.
The inverse numerator relationship matrix can be computed using the function getAInv() from the pedigreemm package.

Solution

Specify the model

\[y = ... + ... + ...\]

with vectors

$y$ of length $n$ containing known phenotypic observations
$b$ of length $p$ containing unknown fixed effects
$u$ of length $q$ containing unknown random breeding values for all animals
$e$ of length $n$ containing unknown random residuals

Known design matrices

$X$ of dimension $n\times p$ linking fixed effects to observations and
$Z$ of dimension $n\times q$ linking random breeding values to observations
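With the vectors and design matrices defined above, one way to fill in the model template is the usual single-trait animal model. This is a sketch that assumes the standard textbook form; the source leaves the terms open.

\[y = Xb + Zu + e\]

Each observation in $y$ is thus the sum of a fixed part $Xb$ (here the sex effect), the breeding value of the animal selected by the corresponding row of $Z$, and a residual.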

The expected values and covariance matrices of the random vectors are

\[E\left[\begin{array}{c} y \\ u \\ e \end{array}\right] = \left[\begin{array}{c} ... \\ ... \\ ... \end{array}\right]\]

\[var\left[\begin{array}{c} y \\ u \\ e \end{array}\right] = \left[\begin{array}{ccc} ... & ... & ...\\ ... & ... & ...\\ ... & ... & ... \end{array}\right]\]

with $R = … $, $G = … $ and $V = … $.
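One possible completion of these templates is sketched here. It assumes the standard animal model conventions: residuals are uncorrelated with constant variance, breeding values have a covariance proportional to the numerator relationship matrix $A$, and $u$ and $e$ are independent. None of these assumptions is stated explicitly in the source.

\[E\left[\begin{array}{c} y \\ u \\ e \end{array}\right] = \left[\begin{array}{c} Xb \\ 0 \\ 0 \end{array}\right]\]

\[var\left[\begin{array}{c} y \\ u \\ e \end{array}\right] = \left[\begin{array}{ccc} V & ZG & R \\ GZ^T & G & 0 \\ R & 0 & R \end{array}\right]\]

Under these assumptions, $R = I\sigma_e^2$, $G = A\sigma_u^2$ and $V = ZGZ^T + R$.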

Read the data
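A minimal sketch of this step in R. The column names ID, SIRE, DAM, SEX and P and the comma separator are assumptions, since the source does not show the file layout. To keep the sketch runnable offline, it writes a small hypothetical file and reads it back. For the exercise, s_data_path would be set to the URL given in the problem statement.

```r
# Hypothetical toy file standing in for the course dataset.
# Column names (ID, SIRE, DAM, SEX, P) and the comma separator are assumptions.
s_toy_path <- tempfile(fileext = ".csv")
writeLines(c("ID,SIRE,DAM,SEX,P",
             "1,NA,NA,m,105",
             "2,NA,NA,f,98",
             "3,1,2,f,101",
             "4,1,2,m,106"),
           s_toy_path)

# For the exercise, replace s_toy_path by the dataset URL.
s_data_path <- s_toy_path
df_data <- read.csv(s_data_path, stringsAsFactors = FALSE)

# Inspect column types; unknown parents are read as NA.
str(df_data)
```

Checking the output of str() before going on helps to confirm which columns hold the pedigree, the sex and the phenotype.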

Inverse Numerator Relationship Matrix
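The hint names getAInv() from pedigreemm. The sketch below shows how it might be used on a hypothetical four-animal pedigree (two unrelated founders and two full sibs). In the exercise, the sire, dam and label vectors would come from the pedigree columns of the dataset, whose names are not given in the source. pedigreemm expects parents to be listed before their offspring.

```r
library(pedigreemm)

# Hypothetical pedigree: animals 1 and 2 are founders, 3 and 4 are full sibs.
ped <- pedigree(sire  = c(NA, NA, 1, 1),
                dam   = c(NA, NA, 2, 2),
                label = 1:4)

# Inverse numerator relationship matrix, converted from sparse to dense.
mat_A_inv <- as.matrix(getAInv(ped))
print(mat_A_inv)

# Sanity check: the product with A from getA() should be the identity matrix.
mat_A <- as.matrix(getA(ped))
print(round(mat_A_inv %*% mat_A, 10))
```

For larger pedigrees it may be preferable to keep the sparse matrix returned by getAInv() instead of converting it with as.matrix().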

Setup mixed model equations

\[ \left[ \begin{array}{cc} ... & ... \\ ... & ... \end{array} \right] \left[ \begin{array}{c} ... \\ ... \end{array} \right] = \left[ \begin{array}{c} ... \\ ... \end{array} \right] \]
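A possible completion of the template above, assuming the standard form of Henderson's mixed model equations for an animal model with $G = A\sigma_u^2$ and $R = I\sigma_e^2$, where $\lambda_u = \sigma_e^2/\sigma_u^2$:

\[ \left[ \begin{array}{cc} X^TX & X^TZ \\ Z^TX & Z^TZ + \lambda_u A^{-1} \end{array} \right] \left[ \begin{array}{c} \widehat{b} \\ \widehat{u} \end{array} \right] = \left[ \begin{array}{c} X^Ty \\ Z^Ty \end{array} \right] \]

With the variance components from the hints, $\lambda_u = 36/9 = 4$.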

Get the known components from the data into the mixed-model equations

Design matrix $X$
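A sketch of how $X$ might be built with model.matrix(), using hypothetical toy records and the assumed column names SEX and P. With R's default treatment contrasts, the first column is the intercept (the level f) and the second column codes the difference of m from f.

```r
# Hypothetical records; column names are assumptions.
df_data <- data.frame(ID  = 1:4,
                      SEX = c("m", "f", "f", "m"),
                      P   = c(105, 98, 101, 106))

# Sex as a factor with levels f and m (alphabetical order).
df_data$SEX <- factor(df_data$SEX)

# n x p design matrix: intercept plus indicator for sex m.
mat_X <- model.matrix(~ SEX, data = df_data)
print(mat_X)
```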

Design matrix $Z$
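A sketch of one way to construct $Z$: each row has a single 1 in the column of the animal that produced the record. The ID vectors are hypothetical. If every animal in the pedigree has exactly one record and the records are in pedigree order, $Z$ reduces to the identity matrix.

```r
# Hypothetical IDs: five animals in the pedigree, records only for 3, 4 and 5.
vec_ped_id <- 1:5
vec_obs_id <- c(3, 4, 5)

n_obs <- length(vec_obs_id)
n_ani <- length(vec_ped_id)

# n x q incidence matrix linking each record to its animal.
mat_Z <- matrix(0, nrow = n_obs, ncol = n_ani)
mat_Z[cbind(seq_len(n_obs), match(vec_obs_id, vec_ped_id))] <- 1
print(mat_Z)
```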

Variance ratio $\lambda_u = \sigma_e^2/\sigma_u^2$

Mixed model equations
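A self-contained sketch of how the pieces might be assembled and solved in base R, assuming the standard form of the mixed model equations with $Z^TZ + \lambda A^{-1}$ in the lower right block. All data are hypothetical toy values. Only the variance components 9 and 36 come from the hints. In the exercise, mat_A_inv would come from getAInv() rather than from solve().

```r
# Hypothetical toy data: four animals, each with one record.
vec_y <- c(105, 98, 101, 106)
mat_X <- cbind(1, c(1, 0, 0, 1))   # intercept and indicator for sex m
mat_Z <- diag(4)                   # one record per animal, in pedigree order

# Relationship matrix for two founders (1, 2) and two full sibs (3, 4).
mat_A <- matrix(c(1.0, 0.0, 0.5, 0.5,
                  0.0, 1.0, 0.5, 0.5,
                  0.5, 0.5, 1.0, 0.5,
                  0.5, 0.5, 0.5, 1.0), nrow = 4, byrow = TRUE)
mat_A_inv <- solve(mat_A)

# Variance ratio from the hints: sigma_e^2 / sigma_u^2 = 36 / 9.
lambda <- 36 / 9

# Coefficient matrix and right-hand side of the mixed model equations.
mat_coef <- rbind(cbind(crossprod(mat_X), crossprod(mat_X, mat_Z)),
                  cbind(crossprod(mat_Z, mat_X),
                        crossprod(mat_Z) + lambda * mat_A_inv))
mat_rhs <- rbind(crossprod(mat_X, vec_y), crossprod(mat_Z, vec_y))

# Solve for fixed effect estimates and predicted breeding values.
vec_sol <- solve(mat_coef, mat_rhs)
vec_b_hat <- vec_sol[1:2]
vec_u_hat <- vec_sol[-(1:2)]
print(vec_b_hat)
print(vec_u_hat)
```

Using solve(mat_coef, mat_rhs) avoids forming the inverse of the coefficient matrix explicitly, which is sufficient when only the solutions are needed.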

Results

The first two numbers of the solution vector are the estimates $\widehat{b}$, which contain the intercept and the difference between the group means of sex f and m. The remaining numbers are the predicted breeding values of all animals in the dataset. At this point, the numeric values of the predicted breeding values are not of interest. What we are interested in is the ranking of the animals according to their predicted breeding values. This is obtained by
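A minimal sketch of such a ranking in R, using order() on a hypothetical named vector of predicted breeding values. The values are made up for illustration and are not results from the dataset.

```r
# Hypothetical predicted breeding values, named by animal ID.
vec_u_hat <- c("1" = 0.8, "2" = -1.2, "3" = 0.1, "4" = 0.3)

# Animal IDs sorted from highest to lowest predicted breeding value.
vec_rank_id <- names(vec_u_hat)[order(vec_u_hat, decreasing = TRUE)]
print(vec_rank_id)
```

Here order() returns the positions that sort the vector, so indexing the names with it gives the animals from best to worst.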