Applied Statistical Methods - Solution 10
Problem 1: Sire Model
Use the dataset available from the address shown below to predict sire breeding values using a sire model.
https://charlotte-ngs.github.io/asmasss2024/data/asm_ped_sim_data.csv
Hints
- The variance component \(\sigma_s^2\) of the sire effect can be assumed to be \(2.25\).
- The variance component \(\sigma_e^2\) of the random residuals is \(36\).
- Sex is modelled as a fixed effect.
- The inverse sire relationship matrix can be computed using the function getAInv() from the pedigreemm package.
Solution
Specify the model
\[y = Xb + Zs + e\]
with vectors
- \(y\) of length \(n\) containing known phenotypic observations
- \(b\) of length \(p\) containing unknown fixed effects
- \(s\) of length \(q\) containing unknown random sire breeding values
- \(e\) of length \(n\) containing unknown random residuals
Known design matrices
- \(X\) of dimension \(n\times p\) linking fixed effects to observations and
- \(Z\) of dimension \(n\times q\) linking random breeding values to observations
The expected values and the covariance matrix of the random vectors are
\[E\left[\begin{array}{c} y \\ s \\ e \end{array}\right] = \left[\begin{array}{c} Xb \\ 0 \\ 0 \end{array}\right]\]
\[var\left[\begin{array}{c} y \\ s \\ e \end{array}\right] = \left[\begin{array}{ccc} V & ZG_s & R\\ G_sZ^T & G_s & 0\\ R & 0 & R \end{array}\right]\]
with \(R = I \sigma_e^2\), \(G_s = A_s \sigma_s^2\) and \(V = ZG_sZ^T + R\), where \(I\) is the identity matrix and \(A_s\) is the sire relationship matrix.
Read the data
- Inverse Sire Relationship Matrix
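To illustrate what a function such as getAInv() returns, the following is a minimal Python sketch of Henderson's rules for building the inverse relationship matrix directly from a pedigree. It is not the pedigreemm implementation: it ignores inbreeding, and the three-sire pedigree with labels A, B and C is hypothetical and unrelated to the exercise data.

```python
def a_inverse(pedigree):
    """Inverse relationship matrix via Henderson's rules, ignoring inbreeding.

    pedigree: list of (animal, sire, dam) tuples with parents listed before
    their offspring and None for an unknown parent.
    """
    index = {animal: k for k, (animal, _, _) in enumerate(pedigree)}
    n = len(pedigree)
    a_inv = [[0.0] * n for _ in range(n)]
    for animal, sire, dam in pedigree:
        parents = [index[p] for p in (sire, dam) if p is not None]
        # Inverse of the Mendelian sampling variance:
        # 1, 4/3 or 2 for zero, one or two known parents.
        alpha = 1.0 / (1.0 - 0.25 * len(parents))
        i = index[animal]
        a_inv[i][i] += alpha
        for p in parents:
            a_inv[i][p] -= alpha / 2
            a_inv[p][i] -= alpha / 2
            for q in parents:
                a_inv[p][q] += alpha / 4
    return a_inv


# Hypothetical sire pedigree: A and B are founders, C is a son of A.
toy_pedigree = [("A", None, None), ("B", None, None), ("C", "A", None)]
toy_a_inv = a_inverse(toy_pedigree)
for row in toy_a_inv:
    print([round(value, 3) for value in row])
```

Because each animal only contributes to its own row and column and to those of its parents, the inverse can be assembled in one pass over the pedigree without ever forming the relationship matrix itself.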
Setup mixed model equations
\[ \left[ \begin{array}{cc} X^TX & X^TZ \\ Z^TX & Z^TZ + \lambda_s * A_s^{-1} \end{array} \right] \left[ \begin{array}{c} \hat{b} \\ \hat{s} \end{array} \right] = \left[ \begin{array}{c} X^Ty \\ Z^Ty \end{array} \right] \]
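As a short derivation sketch, this form follows from the general mixed model equations when the definitions of \(R\) and \(G_s\) given above are inserted. The general form is

\[ \left[ \begin{array}{cc} X^TR^{-1}X & X^TR^{-1}Z \\ Z^TR^{-1}X & Z^TR^{-1}Z + G_s^{-1} \end{array} \right] \left[ \begin{array}{c} \hat{b} \\ \hat{s} \end{array} \right] = \left[ \begin{array}{c} X^TR^{-1}y \\ Z^TR^{-1}y \end{array} \right] \]

Inserting \(R^{-1} = I / \sigma_e^2\) and \(G_s^{-1} = A_s^{-1} / \sigma_s^2\) and multiplying both sides by \(\sigma_e^2\) gives the simplified equations with the variance ratio taken from the hints:

\[\lambda_s = \frac{\sigma_e^2}{\sigma_s^2} = \frac{36}{2.25} = 16\]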
Get the known components from the data into the mixed-model equations
- Design matrix \(X\)
- Design matrix \(Z\)
- Variance ratio \(\lambda_s = \sigma_e^2/\sigma_s^2\)
- Mixed model equations
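The steps listed above can be sketched in plain Python on a small made-up example. Everything about the data here is an assumption for illustration: the six records, the sire labels A, B and C, and the use of an identity matrix in place of \(A_s^{-1}\) (that is, unrelated sires). Only the variance components come from the hints. In the actual solution, the matrix returned by getAInv() would take the place of the identity matrix.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


# Hypothetical toy records (phenotype, sex, sire); not the exercise data.
records = [(110.0, "f", "A"), (118.0, "m", "A"), (105.0, "f", "B"),
           (121.0, "m", "B"), (112.0, "f", "C"), (125.0, "m", "C")]
sires = ["A", "B", "C"]
p, q = 2, len(sires)

y = [phenotype for phenotype, _, _ in records]
# X: intercept column plus an indicator for sex m (sex f is the reference).
X = [[1.0, 1.0 if sex == "m" else 0.0] for _, sex, _ in records]
# Z: one indicator column per sire.
Z = [[1.0 if sire == s else 0.0 for s in sires] for _, _, sire in records]
W = [x_row + z_row for x_row, z_row in zip(X, Z)]  # combined matrix [X Z]

lam = 36 / 2.25  # variance ratio from the hints
# Assumption: unrelated sires, so the inverse relationship matrix is I.
a_inv = [[1.0 if i == j else 0.0 for j in range(q)] for i in range(q)]

# Coefficient matrix: W^T W with lam * A_s^{-1} added to the sire block.
lhs = [[sum(w[i] * w[j] for w in W) for j in range(p + q)] for i in range(p + q)]
for i in range(q):
    for j in range(q):
        lhs[p + i][p + j] += lam * a_inv[i][j]
# Right-hand side: W^T y.
rhs = [sum(w[i] * obs for w, obs in zip(W, y)) for i in range(p + q)]

solution = solve(lhs, rhs)
b_hat, s_hat = solution[:p], solution[p:]
print("fixed effects:", b_hat)
print("sire breeding values:", dict(zip(sires, s_hat)))
```

Without the term \(\lambda_s A_s^{-1}\) the coefficient matrix would be singular, because the columns of \(Z\) sum to the intercept column of \(X\). Adding the term makes the system solvable and shrinks the sire solutions towards zero.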
Results
The first two numbers of the solutions correspond to the estimates \(\widehat{b}\), which contain the intercept and the difference between the group means of sex f and m. The remaining numbers in the solutions are the predicted breeding values of the three sires 1, 2 and 8. At this point, the numeric values of the predicted breeding values are not of interest. What we are interested in is the ranking of the sires according to their predicted breeding values. This is obtained by sorting the sires by their predicted breeding values.
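A minimal Python sketch of such a ranking step is shown here. The sire labels and the predicted breeding values are placeholders chosen for illustration, not the solutions for the exercise data, and the descending order assumes that a larger breeding value is the more desirable one.

```python
# Placeholder predicted breeding values keyed by hypothetical sire labels.
predicted = {"A": -0.13, "B": -0.24, "C": 0.37}

# Rank the sires from the highest to the lowest predicted breeding value.
ranking = sorted(predicted, key=predicted.get, reverse=True)
for rank, sire in enumerate(ranking, start=1):
    print(rank, sire, predicted[sire])
```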