Disclaimer
The dataset with one locus is used to compute breeding values based
on one locus. This requires to first estimate the genotypic values.
Read the dataset
The dataset is read, with an explict specification of column
datatypes.
s_gen_phen_data <- ifelse(b_online,
"https://charlotte-ngs.github.io/lbgfs2022/data/p1_mrk_one_locus.csv",
file.path(here::here(), "docs", "data", "p1_mrk_one_locus.csv"))
library(readr)
tbl_gen_phen <- read_delim(file = s_gen_phen_data,
delim = ";",
col_types = cols(
ID = col_integer(),
Genotype = col_integer(),
Phenotype = col_double()
))
head(tbl_gen_phen)
Descriptive Statistics for Genotypes
What are the available genotypes
library(dplyr)
tbl_gen_phen %>%
group_by(Genotype) %>%
summarize(geno_count = n())
Genotypic Values
The genotypic values are estimated with a regression through the
homozygous genotypes. For that, we create a new tibble with only the
homozygous genotypes
tbl_homozygous_gen <- tbl_gen_phen %>%
filter(Genotype < 3)
head(tbl_homozygous_gen)
Checking with the counts
tbl_homozygous_gen %>%
group_by(Genotype) %>%
summarise(geno_count = n())
Whether, we need a regression with or without intercept can be
determined by the average value of all animals with Genotype \(0\).
Because, the mean of all animals with genotype \(0\) is different from \(0\), we use a regression with
intercept.
The regression is modelled via lm()
lm_geno_val <- lm(Phenotype ~ Genotype, data = tbl_homozygous_gen)
(sry_geno_val <- summary(lm_geno_val))
The genotypic value \(a\) for the
homozygous animals is given by the estimate of the regression
coefficient and is
(geno_val_a <- sry_geno_val$coefficients["Genotype", "Estimate"])
[1] 0.04541911
Assigning genotypic values to the homozygous genotypes requires to
subtract the intercept and the genotypic value \(a\) from the current mean phenotypes for
the two genoypes
geno_val_hom0 <- tbl_hom_gen_mean$mean_phen[tbl_hom_gen_mean$Genotype == 0] -
sry_geno_val$coefficients["(Intercept)", "Estimate"] -
geno_val_a
geno_val_hom0
[1] -0.04541911
For genotype \(2\)
geno_val_hom2 <- tbl_hom_gen_mean$mean_phen[tbl_hom_gen_mean$Genotype == 2] -
sry_geno_val$coefficients["(Intercept)", "Estimate"] -
geno_val_a
geno_val_hom2
[1] 0.04541911
The genotypic value \(d\) is
determined with the heterozygous animals. First we select them into a
separate tibble
tbl_heterozygous_gen <- tbl_gen_phen %>%
filter(Genotype > 2)
head(tbl_heterozygous_gen)
The genotype of the heterozygous animals is re-assiged to be \(1\) and no longer \(3\) or \(4\)
tbl_heterozygous_gen$Genotype <- 1L
head(tbl_heterozygous_gen)
The mean of the heterozygous animals is
n_mean_phen_het <- mean(tbl_heterozygous_gen$Phenotype)
n_mean_phen_het
The meaan of the heterozygous animals must also corrected for the
intercept and for the genotypic value \(a\).
Hence the value for \(d\) is
(gen_val_d <- n_mean_phen_het - -
sry_geno_val$coefficients["(Intercept)", "Estimate"] -
geno_val_a)
[1] 0.2190996
Plots
Try some plots which might be instructive
library(ggplot2)
p_hom <- ggplot(data = tbl_homozygous_gen, aes(x = Genotype, y = Phenotype)) +
geom_point()
p_hom
All genotypes
tbl_geno_recode <- bind_rows(tbl_homozygous_gen, tbl_heterozygous_gen)
p_all <- ggplot(data = tbl_geno_recode, mapping = aes(x = Genotype, y = Phenotype)) +
geom_point()
p_all
Adding the regression line to the plot
p_all_reg <- p_all +
geom_abline(intercept = sry_geno_val$coefficients["(Intercept)", "Estimate"],
slope = sry_geno_val$coefficients["Genotype", "Estimate"])
p_all_reg
LS0tCnRpdGxlOiAiU2luZ2xlIExvY3VzIEJyZWVkaW5nIFZhbHVlcyIKZGF0ZTogIjIwMjItMTEwLTA3IgpvdXRwdXQ6IGh0bWxfbm90ZWJvb2sKLS0tCgpgYGB7ciBzZXR1cCwgaW5jbHVkZT1GQUxTRX0KYl9vbmxpbmUgPC0gRkFMU0UKYGBgCgojIyBEaXNjbGFpbWVyClRoZSBkYXRhc2V0IHdpdGggb25lIGxvY3VzIGlzIHVzZWQgdG8gY29tcHV0ZSBicmVlZGluZyB2YWx1ZXMgYmFzZWQgb24gb25lIGxvY3VzLiBUaGlzIHJlcXVpcmVzIHRvIGZpcnN0IGVzdGltYXRlIHRoZSBnZW5vdHlwaWMgdmFsdWVzLiAKCgojIyBSZWFkIHRoZSBkYXRhc2V0ClRoZSBkYXRhc2V0IGlzIHJlYWQsIHdpdGggYW4gZXhwbGljdCBzcGVjaWZpY2F0aW9uIG9mIGNvbHVtbiBkYXRhdHlwZXMuCgpgYGB7cn0Kc19nZW5fcGhlbl9kYXRhIDwtIGlmZWxzZShiX29ubGluZSwgCiAgICAgICAgICAgICAgICAgICAgICAgICAgImh0dHBzOi8vY2hhcmxvdHRlLW5ncy5naXRodWIuaW8vbGJnZnMyMDIyL2RhdGEvcDFfbXJrX29uZV9sb2N1cy5jc3YiLAogICAgICAgICAgICAgICAgICAgICAgICAgIGZpbGUucGF0aChoZXJlOjpoZXJlKCksICJkb2NzIiwgImRhdGEiLCAicDFfbXJrX29uZV9sb2N1cy5jc3YiKSkKbGlicmFyeShyZWFkcikKdGJsX2dlbl9waGVuIDwtIHJlYWRfZGVsaW0oZmlsZSA9IHNfZ2VuX3BoZW5fZGF0YSwKICAgICAgICAgICAgICAgICAgICAgICAgICAgZGVsaW0gPSAiOyIsCiAgICAgICAgICAgICAgICAgICAgICAgICAgIGNvbF90eXBlcyA9IGNvbHMoCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgSUQgPSBjb2xfaW50ZWdlcigpLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgIEdlbm90eXBlID0gY29sX2ludGVnZXIoKSwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICBQaGVub3R5cGUgPSBjb2xfZG91YmxlKCkKICAgICAgICAgICAgICAgICAgICAgICAgICAgKSkKaGVhZCh0YmxfZ2VuX3BoZW4pCmBgYAoKCiMjIERlc2NyaXB0aXZlIFN0YXRpc3RpY3MgZm9yIEdlbm90eXBlcwpXaGF0IGFyZSB0aGUgYXZhaWxhYmxlIGdlbm90eXBlcwoKYGBge3J9CmxpYnJhcnkoZHBseXIpCnRibF9nZW5fcGhlbiAlPiUgCiAgZ3JvdXBfYnkoR2Vub3R5cGUpICU+JQogIHN1bW1hcml6ZShnZW5vX2NvdW50ID0gbigpKQpgYGAKCgojIyBHZW5vdHlwaWMgVmFsdWVzClRoZSBnZW5vdHlwaWMgdmFsdWVzIGFyZSBlc3RpbWF0ZWQgd2l0aCBhIHJlZ3Jlc3Npb24gdGhyb3VnaCB0aGUgaG9tb3p5Z291cyBnZW5vdHlwZXMuIEZvciB0aGF0LCB3ZSBjcmVhdGUgYSBuZXcgdGliYmxlIHdpdGggb25seSB0aGUgaG9tb3p5Z291cyBnZW5vdHlwZXMKCmBgYHtyfQp0YmxfaG9tb3p5Z291c19nZW4gPC0gdGJsX2dlbl9waGVuICU+JQogIGZpbHRlcihHZW5vdHlwZSA8IDMpCmhlYWQodGJsX2hvbW96eWdvdXNfZ2VuKQpgYGAKCkNoZWNraW5nIHdpdGggdGhlIGNvdW50cwoKYGBge3J9CnRibF9ob21venlnb3VzX2dlbiAlPiUKICBncm91cF9ieShHZW5vdHlwZSkgJT4lCiAgc3VtbWFyaXNlKGdlbm9fY291bnQgPSBuKCkpCmBgYAoKV2hldGhlciwgd2UgbmVlZCBhIHJlZ3Jlc3Npb24gd2l0aCBvciB3aXRob3V0IGludGVyY2VwdCBjYW4gYmUgZGV0ZXJtaW5lZCBieSB0aGUgYXZlcmFnZSB2YWx1ZSBvZiBhbGwgYW5pbWFscyB3aXRoIEdlbm90eXBlICQwJC4gCgpgYGB7cn0KKHRibF9ob21fZ2VuX21lYW4gPC0gdGJsX2hvbW96eWdvdXNfZ2VuICU+JQogIGdyb3VwX2J5KEdlbm90eXBlKSAlPiUKICBzdW1tYXJpc2UobWVhbl9waGVuID0gbWVhbihQaGVub3R5cGUpKSkKYGBgCgpCZWNhdXNlLCB0aGUgbWVhbiBvZiBhbGwgYW5pbWFscyB3aXRoIGdlbm90eXBlICQwJCBpcyBkaWZmZXJlbnQgZnJvbSAkMCQsIHdlIHVzZSBhIHJlZ3Jlc3Npb24gd2l0aCBpbnRlcmNlcHQuCgpUaGUgcmVncmVzc2lvbiBpcyBtb2RlbGxlZCB2aWEgYGxtKClgCgpgYGB7cn0KbG1fZ2Vub192YWwgPC0gbG0oUGhlbm90eXBlIH4gR2Vub3R5cGUsIGRhdGEgPSB0YmxfaG9tb3p5Z291c19nZW4pCihzcnlfZ2Vub192YWwgPC0gc3VtbWFyeShsbV9nZW5vX3ZhbCkpCmBgYAoKVGhlIGdlbm90eXBpYyB2YWx1ZSAkYSQgZm9yIHRoZSBob21venlnb3VzIGFuaW1hbHMgaXMgZ2l2ZW4gYnkgdGhlIGVzdGltYXRlIG9mIHRoZSByZWdyZXNzaW9uIGNvZWZmaWNpZW50IGFuZCBpcyAKCmBgYHtyfQooZ2Vub192YWxfYSA8LSBzcnlfZ2Vub192YWwkY29lZmZpY2llbnRzWyJHZW5vdHlwZSIsICJFc3RpbWF0ZSJdKQpgYGAKCkFzc2lnbmluZyBnZW5vdHlwaWMgdmFsdWVzIHRvIHRoZSBob21venlnb3VzIGdlbm90eXBlcyByZXF1aXJlcyB0byBzdWJ0cmFjdCB0aGUgaW50ZXJjZXB0IGFuZCB0aGUgZ2Vub3R5cGljIHZhbHVlICRhJCBmcm9tIHRoZSBjdXJyZW50IG1lYW4gcGhlbm90eXBlcyBmb3IgdGhlIHR3byBnZW5veXBlcwoKYGBge3J9Cmdlbm9fdmFsX2hvbTAgPC0gdGJsX2hvbV9nZW5fbWVhbiRtZWFuX3BoZW5bdGJsX2hvbV9nZW5fbWVhbiRHZW5vdHlwZSA9PSAwXSAtIAogIHNyeV9nZW5vX3ZhbCRjb2VmZmljaWVudHNbIihJbnRlcmNlcHQpIiwgIkVzdGltYXRlIl0gLQogIGdlbm9fdmFsX2EKZ2Vub192YWxfaG9tMApgYGAKCkZvciBnZW5vdHlwZSAkMiQKCmBgYHtyfQpnZW5vX3ZhbF9ob20yIDwtIHRibF9ob21fZ2VuX21lYW4kbWVhbl9waGVuW3RibF9ob21fZ2VuX21lYW4kR2Vub3R5cGUgPT0gMl0gLSAKICBzcnlfZ2Vub192YWwkY29lZmZpY2llbnRzWyIoSW50ZXJjZXB0KSIsICJFc3RpbWF0ZSJdIC0KICBnZW5vX3ZhbF9hCmdlbm9fdmFsX2hvbTIKYGBgCgoKVGhlIGdlbm90eXBpYyB2YWx1ZSAkZCQgaXMgZGV0ZXJtaW5lZCB3aXRoIHRoZSBoZXRlcm96eWdvdXMgYW5pbWFscy4gRmlyc3Qgd2Ugc2VsZWN0IHRoZW0gaW50byBhIHNlcGFyYXRlIHRpYmJsZQoKYGBge3J9CnRibF9oZXRlcm96eWdvdXNfZ2VuIDwtIHRibF9nZW5fcGhlbiAlPiUKICBmaWx0ZXIoR2Vub3R5cGUgPiAyKQpoZWFkKHRibF9oZXRlcm96eWdvdXNfZ2VuKQpgYGAKClRoZSBnZW5vdHlwZSBvZiB0aGUgaGV0ZXJvenlnb3VzIGFuaW1hbHMgaXMgcmUtYXNzaWdlZCB0byBiZSAkMSQgYW5kIG5vIGxvbmdlciAkMyQgb3IgJDQkCgpgYGB7cn0KdGJsX2hldGVyb3p5Z291c19nZW4kR2Vub3R5cGUgPC0gMUwKaGVhZCh0YmxfaGV0ZXJvenlnb3VzX2dlbikKYGBgCgpUaGUgbWVhbiBvZiB0aGUgaGV0ZXJvenlnb3VzIGFuaW1hbHMgaXMgCgpgYGB7cn0Kbl9tZWFuX3BoZW5faGV0IDwtIG1lYW4odGJsX2hldGVyb3p5Z291c19nZW4kUGhlbm90eXBlKQpuX21lYW5fcGhlbl9oZXQKYGBgCgpUaGUgbWVhYW4gb2YgdGhlIGhldGVyb3p5Z291cyBhbmltYWxzIG11c3QgYWxzbyBjb3JyZWN0ZWQgZm9yIHRoZSBpbnRlcmNlcHQgYW5kIGZvciB0aGUgZ2Vub3R5cGljIHZhbHVlICRhJC4gCgpIZW5jZSB0aGUgdmFsdWUgZm9yICRkJCBpcwoKYGBge3J9CihnZW5fdmFsX2QgPC0gbl9tZWFuX3BoZW5faGV0IC0gLSAKICBzcnlfZ2Vub192YWwkY29lZmZpY2llbnRzWyIoSW50ZXJjZXB0KSIsICJFc3RpbWF0ZSJdIC0KICBnZW5vX3ZhbF9hKQpgYGAKCgoKCiMjIFBsb3RzClRyeSBzb21lIHBsb3RzIHdoaWNoIG1pZ2h0IGJlIGluc3RydWN0aXZlCgpgYGB7cn0KbGlicmFyeShnZ3Bsb3QyKQpwX2hvbSA8LSBnZ3Bsb3QoZGF0YSA9IHRibF9ob21venlnb3VzX2dlbiwgYWVzKHggPSBHZW5vdHlwZSwgeSA9IFBoZW5vdHlwZSkpICsgCiAgZ2VvbV9wb2ludCgpCnBfaG9tCmBgYAoKQWxsIGdlbm90eXBlcwoKYGBge3J9CnRibF9nZW5vX3JlY29kZSA8LSBiaW5kX3Jvd3ModGJsX2hvbW96eWdvdXNfZ2VuLCB0YmxfaGV0ZXJvenlnb3VzX2dlbikKcF9hbGwgPC0gZ2dwbG90KGRhdGEgPSB0YmxfZ2Vub19yZWNvZGUsIG1hcHBpbmcgPSBhZXMoeCA9IEdlbm90eXBlLCB5ID0gUGhlbm90eXBlKSkgKwogIGdlb21fcG9pbnQoKQpwX2FsbApgYGAKCkFkZGluZyB0aGUgcmVncmVzc2lvbiBsaW5lIHRvIHRoZSBwbG90CgpgYGB7cn0KcF9hbGxfcmVnIDwtIHBfYWxsICsgCiAgZ2VvbV9hYmxpbmUoaW50ZXJjZXB0ID0gc3J5X2dlbm9fdmFsJGNvZWZmaWNpZW50c1siKEludGVyY2VwdCkiLCAiRXN0aW1hdGUiXSwgCiAgICAgICAgICAgICAgc2xvcGUgPSBzcnlfZ2Vub192YWwkY29lZmZpY2llbnRzWyJHZW5vdHlwZSIsICJFc3RpbWF0ZSJdKQpwX2FsbF9yZWcKYGBgCgo=