Applied Statistical Methods - Exercise 11
Problem 1: Animal Model
Use the same dataset as in Exercise 10 for the sire model and predict breeding values for all animals in the dataset using an animal model. The dataset is available at
https://charlotte-ngs.github.io/asmasss2024/data/asm_ped_sim_data.csv
Hints
- The variance component \(\sigma_u^2\) of the random breeding values can be assumed to be \(9\).
- The variance component \(\sigma_e^2\) of the random residuals is \(36\).
- Sex is modelled as a fixed effect.
- The inverse numerator relationship matrix can be computed using the function getAInv() from the pedigreemm package.
Solution
Specify the model
\[y = ... + ... + ...\]
with vectors
- \(y\) of length \(n\) containing known phenotypic observations
- \(b\) of length \(p\) containing unknown fixed effects
- \(u\) of length \(q\) containing unknown random breeding values for all animals
- \(e\) of length \(n\) containing unknown random residuals
Known design matrices
- \(X\) of dimension \(n\times p\) linking fixed effects to observations and
- \(Z\) of dimension \(n\times q\) linking random breeding values to observations
The expected values and co-variance matrices of the random effects are
\[E\left[\begin{array}{c} y \\ u \\ e \end{array}\right] = \left[\begin{array}{c} ... \\ ... \\ ... \end{array}\right]\]
\[var\left[\begin{array}{c} y \\ u \\ e \end{array}\right] = \left[\begin{array}{ccc} ... & ... & ...\\ ... & ... & ...\\ ... & ... & ... \end{array}\right]\]
with \(R = ...\), \(G = ...\) and \(V = ...\).
Read the data
- Inverse Numerator Relationship Matrix
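As a rough illustration of this step, the following sketch calls getAInv() on a small made-up pedigree rather than on the exercise data. The objects `ped_toy` and `mat_ainv_toy` and the pedigree itself are hypothetical; only `pedigree()` and `getAInv()` come from the pedigreemm package. For the exercise, the sire, dam and animal columns would have to be taken from the data file, whose column names are not shown here.

```r
# Toy pedigree (hypothetical, not the exercise data): animals 1 and 2 are
# founders, animals 3 and 4 are full sibs out of sire 1 and dam 2.
library(pedigreemm)

ped_toy <- pedigree(sire  = c(NA, NA, 1, 1),
                    dam   = c(NA, NA, 2, 2),
                    label = as.character(1:4))

# getAInv() returns a sparse symmetric matrix; as.matrix() converts it to
# an ordinary dense matrix that can be used with base R matrix operations.
mat_ainv_toy <- as.matrix(getAInv(ped = ped_toy))
print(mat_ainv_toy)
```

Note that `pedigree()` expects parents to appear before their offspring and unknown parents to be coded as NA.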
Set up the mixed model equations
\[ \left[ \begin{array}{cc} ... & ... \\ ... & ... \end{array} \right] \left[ \begin{array}{c} ... \\ ... \end{array} \right] = \left[ \begin{array}{c} ... \\ ... \end{array} \right] \]
Get the known components from the data into the mixed-model equations
- Design matrix \(X\)
- Design matrix \(Z\)
- Variance ratio \(\lambda_u = \sigma_e^2/\sigma_u^2\)
- Mixed model equations
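The steps listed above can be sketched in base R on a small made-up dataset. Everything in `tbl_toy` (animals, sexes, phenotypes) is hypothetical and only serves to show how the pieces fit together; the variance components 9 and 36 are the ones given in the hints. To keep the sketch free of package dependencies, the relationship matrix is built with the tabular method and inverted with `solve()`, which is a stand-in for getAInv() and only practical for small pedigrees.

```r
# Hypothetical toy data: parents are coded by row index, NA = unknown
tbl_toy <- data.frame(id   = 1:4,
                      sire = c(NA, NA, 1, 1),
                      dam  = c(NA, NA, 2, 2),
                      sex  = c("m", "f", "f", "m"),
                      y    = c(105, 98, 101, 110))
n_ani <- nrow(tbl_toy)

# Numerator relationship matrix A by the tabular method
mat_a <- diag(n_ani)
for (i in seq_len(n_ani)) {
  s <- tbl_toy$sire[i]
  d <- tbl_toy$dam[i]
  for (j in seq_len(i - 1)) {
    a_js <- if (is.na(s)) 0 else mat_a[j, s]
    a_jd <- if (is.na(d)) 0 else mat_a[j, d]
    mat_a[i, j] <- mat_a[j, i] <- 0.5 * (a_js + a_jd)
  }
  if (!is.na(s) && !is.na(d)) mat_a[i, i] <- 1 + 0.5 * mat_a[s, d]
}
mat_ainv <- solve(mat_a)

# Design matrices: intercept plus sex contrast, one record per animal
mat_x <- model.matrix(y ~ sex, data = tbl_toy)
mat_z <- diag(n_ani)

# Variance ratio from the hints: sigma_e^2 / sigma_u^2
lambda <- 36 / 9

# Coefficient matrix and right-hand side of the mixed model equations
mat_lhs <- rbind(cbind(crossprod(mat_x), crossprod(mat_x, mat_z)),
                 cbind(crossprod(mat_z, mat_x),
                       crossprod(mat_z) + lambda * mat_ainv))
vec_rhs <- rbind(crossprod(mat_x, tbl_toy$y),
                 crossprod(mat_z, tbl_toy$y))

# Solutions: first the fixed effects, then the predicted breeding values
vec_sol <- solve(mat_lhs, vec_rhs)
print(vec_sol)
```

With the real data, \(Z\) is still an identity matrix only if every animal in the pedigree has exactly one record; animals without observations need a column of zeros in \(Z\).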
Results
The first two numbers of the solutions correspond to the estimates \(\widehat{b}\), which contain the intercept and the difference between the group means of sex f and m. The remaining numbers in the solutions are the predicted breeding values of all animals in the dataset. At this point the numeric values of the predicted breeding values are not of interest. What we are interested in is the ranking of the animals according to their predicted breeding values. This is obtained by sorting the animals in decreasing order of their predicted breeding values.
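A minimal sketch of such a ranking in base R is given below. The solution vector `vec_sol`, its values and the animal labels are made up for illustration; in the exercise the vector would be the solution of the mixed model equations and the labels would be the animal IDs from the pedigree.

```r
# Hypothetical solution vector: two fixed effects followed by four
# predicted breeding values (values made up for illustration)
vec_sol <- c(100.0, 5.0, 0.8, -1.2, 0.3, 1.5)
n_fix <- 2

# Drop the fixed-effect estimates and label the remaining solutions
vec_pbv <- vec_sol[-seq_len(n_fix)]
names(vec_pbv) <- paste0("animal_", seq_along(vec_pbv))

# Rank animals from highest to lowest predicted breeding value
vec_rank <- sort(vec_pbv, decreasing = TRUE)
print(vec_rank)
```

Using `order(vec_pbv, decreasing = TRUE)` instead of `sort()` returns the positions of the animals, which is convenient when the ranking should be applied to a data frame.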