Applied Statistical Methods - Solution 10

Author

Peter von Rohr

Published

May 13, 2024

Problem 1: Sire Model

Use the dataset available at the address shown below to predict sire breeding values with a sire model.

https://charlotte-ngs.github.io/asmasss2024/data/asm_ped_sim_data.csv

Hints

  • The variance component \(\sigma_s^2\) of the sire effect can be assumed to be \(2.25\).
  • The variance component \(\sigma_e^2\) of the random residuals is \(36\).
  • Sex is modelled as a fixed effect.
  • The inverse sire relationship matrix can be computed using the function getAInv() from the pedigreemm package.

Solution

Specify the model

\[y = Xb + Zs + e\]

with vectors

  • \(y\) of length \(n\) containing known phenotypic observations
  • \(b\) of length \(p\) containing unknown fixed effects
  • \(s\) of length \(q\) containing unknown random sire breeding values
  • \(e\) of length \(n\) containing unknown random residuals

Known design matrices

  • \(X\) of dimension \(n\times p\) linking fixed effects to observations and
  • \(Z\) of dimension \(n\times q\) linking random breeding values to observations

The expected values and covariance matrices of the random vectors \(y\), \(s\) and \(e\) are

\[E\left[\begin{array}{c} y \\ s \\ e \end{array}\right] = \left[\begin{array}{c} Xb \\ 0 \\ 0 \end{array}\right]\]

\[var\left[\begin{array}{c} y \\ s \\ e \end{array}\right] = \left[\begin{array}{ccc} V & ZG_s & R\\ G_sZ^T & G_s & 0\\ R & 0 & R \end{array}\right]\]

with \(R = I \sigma_e^2\), \(G_s = A_s \sigma_s^2\) and \(V = ZG_sZ^T + R\).
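As a short worked check, the blocks of the covariance matrix follow directly from the model equation \(y = Xb + Zs + e\), using that \(b\) is fixed and that \(s\) and \(e\) are uncorrelated (the zero blocks above):

\[var(y) = var(Zs + e) = Z\,var(s)\,Z^T + var(e) = ZG_sZ^T + R = V\]

\[cov(y, s^T) = cov(Zs + e, s^T) = Z\,var(s) = ZG_s\]

\[cov(y, e^T) = cov(Zs + e, e^T) = var(e) = R\]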

Read the data

  • Inverse Sire Relationship Matrix
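The data can be read with `read.csv()` from the address given in the problem statement, and the hint suggests `getAInv()` from the pedigreemm package for the inverse sire relationship matrix. Because the column names of the dataset are not shown here, the following is only a minimal base-R sketch of what that inverse contains. It uses a hypothetical toy pedigree of three sires (labels `S1`, `S2`, `S3` and the helper `compute_A()` are assumptions for illustration, not part of the course data or of pedigreemm). The relationship matrix is built with the tabular method and then inverted directly, which is only sensible for very small pedigrees.

```r
# Toy sire pedigree (hypothetical, NOT the course data):
# S1 and S2 are unrelated founders, S3 is a son of S1.
# Parents are given as row indices; NA marks an unknown parent.
ped <- data.frame(
  id   = c("S1", "S2", "S3"),
  sire = c(NA_integer_, NA_integer_, 1L),
  dam  = c(NA_integer_, NA_integer_, NA_integer_)
)

# Tabular method for the numerator relationship matrix.
# Assumes parents appear before their offspring in the pedigree.
compute_A <- function(sire, dam) {
  n <- length(sire)
  A <- matrix(0, nrow = n, ncol = n)
  for (i in seq_len(n)) {
    s <- sire[i]
    d <- dam[i]
    # Diagonal: 1 plus half the relationship between the parents (if both known)
    inbreeding <- 0
    if (!is.na(s) && !is.na(d)) inbreeding <- 0.5 * A[s, d]
    A[i, i] <- 1 + inbreeding
    # Off-diagonals: average relationship of animal j with the parents of i
    for (j in seq_len(i - 1)) {
      a_ij <- 0
      if (!is.na(s)) a_ij <- a_ij + 0.5 * A[j, s]
      if (!is.na(d)) a_ij <- a_ij + 0.5 * A[j, d]
      A[i, j] <- a_ij
      A[j, i] <- a_ij
    }
  }
  A
}

A_s <- compute_A(ped$sire, ped$dam)
dimnames(A_s) <- list(ped$id, ped$id)

# Direct inversion; fine for a toy example of this size
A_s_inv <- solve(A_s)
print(round(A_s_inv, 3))
```

For the real pedigree, the function named in the hint should be preferred, since it builds the inverse directly from the pedigree without forming \(A_s\) first.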

Setup mixed model equations

\[ \left[ \begin{array}{cc} X^TX & X^TZ \\ Z^TX & Z^TZ + \lambda_s A_s^{-1} \end{array} \right] \left[ \begin{array}{c} \hat{b} \\ \hat{s} \end{array} \right] = \left[ \begin{array}{c} X^Ty \\ Z^Ty \end{array} \right] \]
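As a reminder of where the variance ratio comes from, the general form of the mixed model equations is

\[ \left[ \begin{array}{cc} X^TR^{-1}X & X^TR^{-1}Z \\ Z^TR^{-1}X & Z^TR^{-1}Z + G_s^{-1} \end{array} \right] \left[ \begin{array}{c} \hat{b} \\ \hat{s} \end{array} \right] = \left[ \begin{array}{c} X^TR^{-1}y \\ Z^TR^{-1}y \end{array} \right] \]

Inserting \(R^{-1} = I \sigma_e^{-2}\) and \(G_s^{-1} = A_s^{-1} \sigma_s^{-2}\) and multiplying both sides by \(\sigma_e^2\) gives the simplified form, because

\[ \sigma_e^2 G_s^{-1} = \frac{\sigma_e^2}{\sigma_s^2} A_s^{-1} = \lambda_s A_s^{-1} \]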

Get the known components from the data into the mixed-model equations

  • Design matrix \(X\)
  • Design matrix \(Z\)
  • Variance ratio \(\lambda_s = \sigma_e^2/\sigma_s^2\)
  • Mixed model equations
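The steps listed above can be sketched in base R as follows. This is an illustration under stated assumptions, not the course solution: the data frame `toy`, its column names and the sire labels `S1` to `S3` are hypothetical, and the inverse relationship matrix is hard-coded for a toy pedigree in which `S3` is a son of `S1`. Only the variance components \(\sigma_s^2 = 2.25\) and \(\sigma_e^2 = 36\) are taken from the hints.

```r
# Hypothetical toy data, NOT the course dataset
toy <- data.frame(
  y    = c(12.1, 10.4, 11.8, 9.9, 13.0, 10.7),
  sex  = factor(c("f", "m", "f", "m", "f", "m")),
  sire = factor(c("S1", "S1", "S2", "S2", "S3", "S3"))
)

# Design matrix X: intercept plus the contrast of sex m against sex f
X <- model.matrix(~ sex, data = toy)

# Design matrix Z: one indicator column per sire
Z <- model.matrix(~ 0 + sire, data = toy)

# Variance ratio from the variance components given in the hints
sigma_s2 <- 2.25
sigma_e2 <- 36
lambda_s <- sigma_e2 / sigma_s2

# Inverse sire relationship matrix for the toy pedigree
# (S1 and S2 unrelated founders, S3 a son of S1); an assumption
A_s_inv <- matrix(c( 4/3, 0, -2/3,
                     0,   1,  0,
                    -2/3, 0,  4/3), nrow = 3, byrow = TRUE)

# Coefficient matrix and right-hand side of the mixed model equations
y <- toy$y
lhs <- rbind(
  cbind(crossprod(X),    crossprod(X, Z)),
  cbind(crossprod(Z, X), crossprod(Z) + lambda_s * A_s_inv)
)
rhs <- rbind(crossprod(X, y), crossprod(Z, y))

# Solutions: the first ncol(X) entries are b-hat, the remaining ones s-hat
sol <- solve(lhs, rhs)
print(sol)
```

Using `solve(lhs, rhs)` rather than inverting `lhs` explicitly avoids computing an inverse that is not needed when only the solutions are of interest.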

Results

The first two numbers of the solution vector are the estimates \(\hat{b}\), which contain the intercept and the difference between the group means of sex f and m. The remaining numbers are the predicted breeding values of the three sires 1, 2 and 8. At this point the numeric values of the predicted breeding values are not of interest. What we are interested in is the ranking of the sires according to their predicted breeding values. This is obtained by ordering the sires by their predicted breeding values.
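A minimal sketch of such a ranking in R is shown below. The predicted breeding values and the sire labels are hypothetical placeholders, not results from the course data. In practice the vector would be taken from the sire part of the solution vector.

```r
# Hypothetical predicted sire breeding values, NOT results from the course data
s_hat <- c(S1 = 0.12, S2 = -0.30, S3 = 0.18)

# Rank the sires from highest to lowest predicted breeding value
ranking <- names(s_hat)[order(s_hat, decreasing = TRUE)]
print(ranking)
```

Sorting in decreasing order assumes that larger trait values are desirable. For a trait where smaller values are better, `decreasing = FALSE` would be used instead.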