Disclaimer

The dataset with one locus is used to compute breeding values based on one locus. This requires to first estimate the genotypic values.

Read the dataset

The dataset is read, with an explict specification of column datatypes.

s_gen_phen_data <- ifelse(b_online, 
                          "https://charlotte-ngs.github.io/lbgfs2022/data/p1_mrk_one_locus.csv",
                          file.path(here::here(), "docs", "data", "p1_mrk_one_locus.csv"))
library(readr)
tbl_gen_phen <- read_delim(file = s_gen_phen_data,
                           delim = ";",
                           col_types = cols(
                             ID = col_integer(),
                             Genotype = col_integer(),
                             Phenotype = col_double()
                           ))
head(tbl_gen_phen)

Descriptive Statistics for Genotypes

What are the available genotypes

library(dplyr)
tbl_gen_phen %>% 
  group_by(Genotype) %>%
  summarize(geno_count = n())

Genotypic Values

The genotypic values are estimated with a regression through the homozygous genotypes. For that, we create a new tibble with only the homozygous genotypes

tbl_homozygous_gen <- tbl_gen_phen %>%
  filter(Genotype < 3)
head(tbl_homozygous_gen)

Checking with the counts

tbl_homozygous_gen %>%
  group_by(Genotype) %>%
  summarise(geno_count = n())

Whether, we need a regression with or without intercept can be determined by the average value of all animals with Genotype \(0\).

Because, the mean of all animals with genotype \(0\) is different from \(0\), we use a regression with intercept.

The regression is modelled via lm()

lm_geno_val <- lm(Phenotype ~ Genotype, data = tbl_homozygous_gen)
(sry_geno_val <- summary(lm_geno_val))

The genotypic value \(a\) for the homozygous animals is given by the estimate of the regression coefficient and is

(geno_val_a <- sry_geno_val$coefficients["Genotype", "Estimate"])
[1] 0.04541911

Assigning genotypic values to the homozygous genotypes requires to subtract the intercept and the genotypic value \(a\) from the current mean phenotypes for the two genoypes

geno_val_hom0 <- tbl_hom_gen_mean$mean_phen[tbl_hom_gen_mean$Genotype == 0] - 
  sry_geno_val$coefficients["(Intercept)", "Estimate"] -
  geno_val_a
geno_val_hom0
[1] -0.04541911

For genotype \(2\)

geno_val_hom2 <- tbl_hom_gen_mean$mean_phen[tbl_hom_gen_mean$Genotype == 2] - 
  sry_geno_val$coefficients["(Intercept)", "Estimate"] -
  geno_val_a
geno_val_hom2
[1] 0.04541911

The genotypic value \(d\) is determined with the heterozygous animals. First we select them into a separate tibble

tbl_heterozygous_gen <- tbl_gen_phen %>%
  filter(Genotype > 2)
head(tbl_heterozygous_gen)

The genotype of the heterozygous animals is re-assiged to be \(1\) and no longer \(3\) or \(4\)

tbl_heterozygous_gen$Genotype <- 1L
head(tbl_heterozygous_gen)

The mean of the heterozygous animals is

n_mean_phen_het <- mean(tbl_heterozygous_gen$Phenotype)
n_mean_phen_het

The meaan of the heterozygous animals must also corrected for the intercept and for the genotypic value \(a\).

Hence the value for \(d\) is

(gen_val_d <- n_mean_phen_het - - 
  sry_geno_val$coefficients["(Intercept)", "Estimate"] -
  geno_val_a)
[1] 0.2190996

Plots

Try some plots which might be instructive

library(ggplot2)
p_hom <- ggplot(data = tbl_homozygous_gen, aes(x = Genotype, y = Phenotype)) + 
  geom_point()
p_hom

All genotypes

tbl_geno_recode <- bind_rows(tbl_homozygous_gen, tbl_heterozygous_gen)
p_all <- ggplot(data = tbl_geno_recode, mapping = aes(x = Genotype, y = Phenotype)) +
  geom_point()
p_all

Adding the regression line to the plot

p_all_reg <- p_all + 
  geom_abline(intercept = sry_geno_val$coefficients["(Intercept)", "Estimate"], 
              slope = sry_geno_val$coefficients["Genotype", "Estimate"])
p_all_reg
LS0tCnRpdGxlOiAiU2luZ2xlIExvY3VzIEJyZWVkaW5nIFZhbHVlcyIKZGF0ZTogIjIwMjItMTEwLTA3IgpvdXRwdXQ6IGh0bWxfbm90ZWJvb2sKLS0tCgpgYGB7ciBzZXR1cCwgaW5jbHVkZT1GQUxTRX0KYl9vbmxpbmUgPC0gRkFMU0UKYGBgCgojIyBEaXNjbGFpbWVyClRoZSBkYXRhc2V0IHdpdGggb25lIGxvY3VzIGlzIHVzZWQgdG8gY29tcHV0ZSBicmVlZGluZyB2YWx1ZXMgYmFzZWQgb24gb25lIGxvY3VzLiBUaGlzIHJlcXVpcmVzIHRvIGZpcnN0IGVzdGltYXRlIHRoZSBnZW5vdHlwaWMgdmFsdWVzLiAKCgojIyBSZWFkIHRoZSBkYXRhc2V0ClRoZSBkYXRhc2V0IGlzIHJlYWQsIHdpdGggYW4gZXhwbGljdCBzcGVjaWZpY2F0aW9uIG9mIGNvbHVtbiBkYXRhdHlwZXMuCgpgYGB7cn0Kc19nZW5fcGhlbl9kYXRhIDwtIGlmZWxzZShiX29ubGluZSwgCiAgICAgICAgICAgICAgICAgICAgICAgICAgImh0dHBzOi8vY2hhcmxvdHRlLW5ncy5naXRodWIuaW8vbGJnZnMyMDIyL2RhdGEvcDFfbXJrX29uZV9sb2N1cy5jc3YiLAogICAgICAgICAgICAgICAgICAgICAgICAgIGZpbGUucGF0aChoZXJlOjpoZXJlKCksICJkb2NzIiwgImRhdGEiLCAicDFfbXJrX29uZV9sb2N1cy5jc3YiKSkKbGlicmFyeShyZWFkcikKdGJsX2dlbl9waGVuIDwtIHJlYWRfZGVsaW0oZmlsZSA9IHNfZ2VuX3BoZW5fZGF0YSwKICAgICAgICAgICAgICAgICAgICAgICAgICAgZGVsaW0gPSAiOyIsCiAgICAgICAgICAgICAgICAgICAgICAgICAgIGNvbF90eXBlcyA9IGNvbHMoCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgSUQgPSBjb2xfaW50ZWdlcigpLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgIEdlbm90eXBlID0gY29sX2ludGVnZXIoKSwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICBQaGVub3R5cGUgPSBjb2xfZG91YmxlKCkKICAgICAgICAgICAgICAgICAgICAgICAgICAgKSkKaGVhZCh0YmxfZ2VuX3BoZW4pCmBgYAoKCiMjIERlc2NyaXB0aXZlIFN0YXRpc3RpY3MgZm9yIEdlbm90eXBlcwpXaGF0IGFyZSB0aGUgYXZhaWxhYmxlIGdlbm90eXBlcwoKYGBge3J9CmxpYnJhcnkoZHBseXIpCnRibF9nZW5fcGhlbiAlPiUgCiAgZ3JvdXBfYnkoR2Vub3R5cGUpICU+JQogIHN1bW1hcml6ZShnZW5vX2NvdW50ID0gbigpKQpgYGAKCgojIyBHZW5vdHlwaWMgVmFsdWVzClRoZSBnZW5vdHlwaWMgdmFsdWVzIGFyZSBlc3RpbWF0ZWQgd2l0aCBhIHJlZ3Jlc3Npb24gdGhyb3VnaCB0aGUgaG9tb3p5Z291cyBnZW5vdHlwZXMuIEZvciB0aGF0LCB3ZSBjcmVhdGUgYSBuZXcgdGliYmxlIHdpdGggb25seSB0aGUgaG9tb3p5Z291cyBnZW5vdHlwZXMKCmBgYHtyfQp0YmxfaG9tb3p5Z291c19nZW4gPC0gdGJsX2dlbl9waGVuICU+JQogIGZpbHRlcihHZW5vdHlwZSA8IDMpCmhlYWQodGJsX2hvbW96eWdvdXNfZ2VuKQpgYGAKCkNoZWNraW5nIHdpdGggdGhlIGNvdW50cwoKYGBge3J9CnRibF9ob21venlnb3VzX2dlbiAlPiUKICBncm91cF9ieShHZW5vdHlwZSkgJT4lCiAgc3VtbWFyaXNlKGdlbm9fY291bnQgPSBuKCkpCmBgYAoKV2hldGhlciwgd2UgbmVlZCBhIHJlZ3Jlc3Npb24gd2l0aCBvciB3aXRob3V0IGludGVyY2VwdCBjYW4gYmUgZGV0ZXJtaW5lZCBieSB0aGUgYXZlcmFnZSB2YWx1ZSBvZiBhbGwgYW5pbWFscyB3aXRoIEdlbm90eXBlICQwJC4gCgpgYGB7cn0KKHRibF9ob21fZ2VuX21lYW4gPC0gdGJsX2hvbW96eWdvdXNfZ2VuICU+JQogIGdyb3VwX2J5KEdlbm90eXBlKSAlPiUKICBzdW1tYXJpc2UobWVhbl9waGVuID0gbWVhbihQaGVub3R5cGUpKSkKYGBgCgpCZWNhdXNlLCB0aGUgbWVhbiBvZiBhbGwgYW5pbWFscyB3aXRoIGdlbm90eXBlICQwJCBpcyBkaWZmZXJlbnQgZnJvbSAkMCQsIHdlIHVzZSBhIHJlZ3Jlc3Npb24gd2l0aCBpbnRlcmNlcHQuCgpUaGUgcmVncmVzc2lvbiBpcyBtb2RlbGxlZCB2aWEgYGxtKClgCgpgYGB7cn0KbG1fZ2Vub192YWwgPC0gbG0oUGhlbm90eXBlIH4gR2Vub3R5cGUsIGRhdGEgPSB0YmxfaG9tb3p5Z291c19nZW4pCihzcnlfZ2Vub192YWwgPC0gc3VtbWFyeShsbV9nZW5vX3ZhbCkpCmBgYAoKVGhlIGdlbm90eXBpYyB2YWx1ZSAkYSQgZm9yIHRoZSBob21venlnb3VzIGFuaW1hbHMgaXMgZ2l2ZW4gYnkgdGhlIGVzdGltYXRlIG9mIHRoZSByZWdyZXNzaW9uIGNvZWZmaWNpZW50IGFuZCBpcyAKCmBgYHtyfQooZ2Vub192YWxfYSA8LSBzcnlfZ2Vub192YWwkY29lZmZpY2llbnRzWyJHZW5vdHlwZSIsICJFc3RpbWF0ZSJdKQpgYGAKCkFzc2lnbmluZyBnZW5vdHlwaWMgdmFsdWVzIHRvIHRoZSBob21venlnb3VzIGdlbm90eXBlcyByZXF1aXJlcyB0byBzdWJ0cmFjdCB0aGUgaW50ZXJjZXB0IGFuZCB0aGUgZ2Vub3R5cGljIHZhbHVlICRhJCBmcm9tIHRoZSBjdXJyZW50IG1lYW4gcGhlbm90eXBlcyBmb3IgdGhlIHR3byBnZW5veXBlcwoKYGBge3J9Cmdlbm9fdmFsX2hvbTAgPC0gdGJsX2hvbV9nZW5fbWVhbiRtZWFuX3BoZW5bdGJsX2hvbV9nZW5fbWVhbiRHZW5vdHlwZSA9PSAwXSAtIAogIHNyeV9nZW5vX3ZhbCRjb2VmZmljaWVudHNbIihJbnRlcmNlcHQpIiwgIkVzdGltYXRlIl0gLQogIGdlbm9fdmFsX2EKZ2Vub192YWxfaG9tMApgYGAKCkZvciBnZW5vdHlwZSAkMiQKCmBgYHtyfQpnZW5vX3ZhbF9ob20yIDwtIHRibF9ob21fZ2VuX21lYW4kbWVhbl9waGVuW3RibF9ob21fZ2VuX21lYW4kR2Vub3R5cGUgPT0gMl0gLSAKICBzcnlfZ2Vub192YWwkY29lZmZpY2llbnRzWyIoSW50ZXJjZXB0KSIsICJFc3RpbWF0ZSJdIC0KICBnZW5vX3ZhbF9hCmdlbm9fdmFsX2hvbTIKYGBgCgoKVGhlIGdlbm90eXBpYyB2YWx1ZSAkZCQgaXMgZGV0ZXJtaW5lZCB3aXRoIHRoZSBoZXRlcm96eWdvdXMgYW5pbWFscy4gRmlyc3Qgd2Ugc2VsZWN0IHRoZW0gaW50byBhIHNlcGFyYXRlIHRpYmJsZQoKYGBge3J9CnRibF9oZXRlcm96eWdvdXNfZ2VuIDwtIHRibF9nZW5fcGhlbiAlPiUKICBmaWx0ZXIoR2Vub3R5cGUgPiAyKQpoZWFkKHRibF9oZXRlcm96eWdvdXNfZ2VuKQpgYGAKClRoZSBnZW5vdHlwZSBvZiB0aGUgaGV0ZXJvenlnb3VzIGFuaW1hbHMgaXMgcmUtYXNzaWdlZCB0byBiZSAkMSQgYW5kIG5vIGxvbmdlciAkMyQgb3IgJDQkCgpgYGB7cn0KdGJsX2hldGVyb3p5Z291c19nZW4kR2Vub3R5cGUgPC0gMUwKaGVhZCh0YmxfaGV0ZXJvenlnb3VzX2dlbikKYGBgCgpUaGUgbWVhbiBvZiB0aGUgaGV0ZXJvenlnb3VzIGFuaW1hbHMgaXMgCgpgYGB7cn0Kbl9tZWFuX3BoZW5faGV0IDwtIG1lYW4odGJsX2hldGVyb3p5Z291c19nZW4kUGhlbm90eXBlKQpuX21lYW5fcGhlbl9oZXQKYGBgCgpUaGUgbWVhYW4gb2YgdGhlIGhldGVyb3p5Z291cyBhbmltYWxzIG11c3QgYWxzbyBjb3JyZWN0ZWQgZm9yIHRoZSBpbnRlcmNlcHQgYW5kIGZvciB0aGUgZ2Vub3R5cGljIHZhbHVlICRhJC4gCgpIZW5jZSB0aGUgdmFsdWUgZm9yICRkJCBpcwoKYGBge3J9CihnZW5fdmFsX2QgPC0gbl9tZWFuX3BoZW5faGV0IC0gLSAKICBzcnlfZ2Vub192YWwkY29lZmZpY2llbnRzWyIoSW50ZXJjZXB0KSIsICJFc3RpbWF0ZSJdIC0KICBnZW5vX3ZhbF9hKQpgYGAKCgoKCiMjIFBsb3RzClRyeSBzb21lIHBsb3RzIHdoaWNoIG1pZ2h0IGJlIGluc3RydWN0aXZlCgpgYGB7cn0KbGlicmFyeShnZ3Bsb3QyKQpwX2hvbSA8LSBnZ3Bsb3QoZGF0YSA9IHRibF9ob21venlnb3VzX2dlbiwgYWVzKHggPSBHZW5vdHlwZSwgeSA9IFBoZW5vdHlwZSkpICsgCiAgZ2VvbV9wb2ludCgpCnBfaG9tCmBgYAoKQWxsIGdlbm90eXBlcwoKYGBge3J9CnRibF9nZW5vX3JlY29kZSA8LSBiaW5kX3Jvd3ModGJsX2hvbW96eWdvdXNfZ2VuLCB0YmxfaGV0ZXJvenlnb3VzX2dlbikKcF9hbGwgPC0gZ2dwbG90KGRhdGEgPSB0YmxfZ2Vub19yZWNvZGUsIG1hcHBpbmcgPSBhZXMoeCA9IEdlbm90eXBlLCB5ID0gUGhlbm90eXBlKSkgKwogIGdlb21fcG9pbnQoKQpwX2FsbApgYGAKCkFkZGluZyB0aGUgcmVncmVzc2lvbiBsaW5lIHRvIHRoZSBwbG90CgpgYGB7cn0KcF9hbGxfcmVnIDwtIHBfYWxsICsgCiAgZ2VvbV9hYmxpbmUoaW50ZXJjZXB0ID0gc3J5X2dlbm9fdmFsJGNvZWZmaWNpZW50c1siKEludGVyY2VwdCkiLCAiRXN0aW1hdGUiXSwgCiAgICAgICAgICAgICAgc2xvcGUgPSBzcnlfZ2Vub192YWwkY29lZmZpY2llbnRzWyJHZW5vdHlwZSIsICJFc3RpbWF0ZSJdKQpwX2FsbF9yZWcKYGBgCgo=