Problem 1: Linear Regression on Genomic Information

Use the following dataset, which is also available at

https://charlotte-ngs.github.io/asmss2023/data/asm_flem_genomic_data.csv

to estimate marker effects for the single loci using a linear regression model.

Animal  SNP G       SNP H       Observation
1       \(G_1G_1\)  \(H_1H_2\)  510
2       \(G_1G_2\)  \(H_1H_1\)  528
3       \(G_1G_2\)  \(H_1H_1\)  505
4       \(G_1G_1\)  \(H_2H_2\)  539
5       \(G_1G_1\)  \(H_1H_1\)  530
6       \(G_1G_2\)  \(H_1H_2\)  489
7       \(G_1G_2\)  \(H_2H_2\)  486
8       \(G_2G_2\)  \(H_1H_1\)  485
9       \(G_1G_2\)  \(H_2H_2\)  478
10      \(G_2G_2\)  \(H_1H_2\)  479
11      \(G_1G_1\)  \(H_1H_2\)  520
12      \(G_1G_1\)  \(H_1H_1\)  521
13      \(G_2G_2\)  \(H_1H_2\)  473
14      \(G_2G_2\)  \(H_1H_2\)  457
15      \(G_1G_2\)  \(H_1H_1\)  497
16      \(G_1G_2\)  \(H_1H_2\)  516
17      \(G_1G_1\)  \(H_1H_2\)  524
18      \(G_1G_1\)  \(H_1H_2\)  502
19      \(G_1G_1\)  \(H_2H_2\)  508
20      \(G_1G_2\)  \(H_1H_2\)  506

Your Solution

  • Read the data using read.csv()

  • Re-code the genotypes to numeric values

  • Fit the multiple regression to the data
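
The three steps above could be sketched as follows. This is only one possible approach, not the official solution. To keep the example self-contained, the data are typed in from the table above and passed to read.csv() through its text argument instead of the URL. The column names SNP_G and SNP_H, the plain-text genotype labels such as "G1G2", the helper count_allele2() and the choice to code a genotype as its number of copies of allele 2 are all assumptions made for illustration; the CSV file may use different names and encodings.

```r
# Data copied from the table above; "G1G2" stands for the rendered G_1G_2
# (assumed plain-text encoding, the real CSV may differ).
s_data <- "Animal,SNP_G,SNP_H,Observation
1,G1G1,H1H2,510
2,G1G2,H1H1,528
3,G1G2,H1H1,505
4,G1G1,H2H2,539
5,G1G1,H1H1,530
6,G1G2,H1H2,489
7,G1G2,H2H2,486
8,G2G2,H1H1,485
9,G1G2,H2H2,478
10,G2G2,H1H2,479
11,G1G1,H1H2,520
12,G1G1,H1H1,521
13,G2G2,H1H2,473
14,G2G2,H1H2,457
15,G1G2,H1H1,497
16,G1G2,H1H2,516
17,G1G1,H1H2,524
18,G1G1,H1H2,502
19,G1G1,H2H2,508
20,G1G2,H1H2,506
"

# Step 1: read the data with read.csv()
df_geno <- read.csv(text = s_data, stringsAsFactors = FALSE)

# Step 2: re-code genotypes to numeric values.
# Hypothetical helper: counts the copies of allele 2 in a genotype label,
# e.g. "G1G1" -> 0, "G1G2" -> 1, "G2G2" -> 2.
count_allele2 <- function(genotype) {
  alleles <- gsub("[^12]", "", genotype)  # keep only the allele digits
  nchar(gsub("1", "", alleles))           # number of remaining "2" characters
}
df_geno$G <- count_allele2(df_geno$SNP_G)
df_geno$H <- count_allele2(df_geno$SNP_H)

# Step 3: fit the multiple regression of the observation on both loci
lm_geno <- lm(Observation ~ G + H, data = df_geno)
summary(lm_geno)
```

With this coding, the slopes for G and H estimate the effect of replacing one copy of allele 1 by allele 2 at the respective locus. Counting allele 1 instead would flip the sign of the slopes and shift the intercept, but leave the fitted values unchanged.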

Problem 2: Regression On Dummy Variables

Use the dataset in which a breed is assigned to every animal to determine the influence of the breed on the response variable body weight. The data is available from

https://charlotte-ngs.github.io/asmss2023/data/asm_bw_flem.csv

Start by fitting a linear model with Breed as the only factor in the model, that is, ignore the other independent variables such as Breast Circumference, BCS and HEI.

Your Solution

  • Read the data

  • Fit a linear model with breed as the only factor
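
A minimal sketch of these two steps is given below. The header of the CSV file is not shown in this document, so the column names "Body.Weight" and "Breed" in the usage comments are assumptions that should be checked with str() after reading the data. The helper fit_one_factor() is a hypothetical name introduced here; it takes the column names as arguments so that it does not depend on that guess.

```r
# URL as given in the problem statement
s_bw_path <- "https://charlotte-ngs.github.io/asmss2023/data/asm_bw_flem.csv"

# Hypothetical helper: fit a linear model with a single factor.
#   df       data frame holding the data
#   response name of the response column (character)
#   fact     name of the column to be used as the only factor (character)
fit_one_factor <- function(df, response, fact) {
  df[[fact]] <- as.factor(df[[fact]])          # make sure lm() builds dummy variables
  lm(reformulate(fact, response = response), data = df)
}

# Usage (needs network access; the column names are assumed, check them first):
# df_bw <- read.csv(s_bw_path)
# str(df_bw)
# lm_bw <- fit_one_factor(df_bw, response = "Body.Weight", fact = "Breed")
# summary(lm_bw)
```

In the summary of such a fit, lm() reports an intercept and one coefficient per breed except the first factor level, which is the starting point for Problem 3.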

Problem 3: Estimable Function

Use matrix-vector notation to set up the model for a regression on dummy variables with the data on breeds and body weight used in Problem 2. The aim of this problem is to find the estimable functions used in the output of lm().

The model is given by

\[\mathbf{y} = \mathbf{Xb} + \mathbf{e}\]

Set up the least squares normal equations. Find a solution \(\mathbf{b}^0\) and construct the estimable function that is used in the output of lm().
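
As a reminder of the general form, and not as a worked solution for this dataset, the normal equations and one way of writing a solution are sketched below. The generalized inverse \((\mathbf{X}^T\mathbf{X})^-\) and the vectors \(\mathbf{q}\) and \(\mathbf{t}\) are notation introduced here for illustration. Because \(\mathbf{X}\) contains an intercept column plus one dummy column per breed, \(\mathbf{X}^T\mathbf{X}\) is singular and the solution is not unique.

\[\mathbf{X}^T\mathbf{X}\mathbf{b}^0 = \mathbf{X}^T\mathbf{y}\]

\[\mathbf{b}^0 = (\mathbf{X}^T\mathbf{X})^-\mathbf{X}^T\mathbf{y}\]

A linear function \(\mathbf{q}^T\mathbf{b}\) is estimable if \(\mathbf{q}^T = \mathbf{t}^T\mathbf{X}\) for some vector \(\mathbf{t}\). In that case \(\mathbf{q}^T\mathbf{b}^0\) takes the same value for every solution \(\mathbf{b}^0\) of the normal equations.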

Your Solution

  • Define the elements of the least squares normal equations

  • Find a solution for \(\mathbf{b}^0\)

  • Construct the estimable function. As a hint, assume the effect of the factor level that is missing from the output of lm() to be zero.
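
The steps above can be sketched in R as follows. This is one possible route under stated assumptions, not the official solution. The function name solve_normal_equations() and its arguments y and breed are hypothetical. The Moore-Penrose inverse from MASS::ginv() is used as the generalized inverse, which is only one of many valid choices. The rows of K encode the assumption from the hint: the intercept reported by lm() corresponds to the general mean plus the effect of the first breed level, and every other coefficient to the difference between a breed level and the first one.

```r
# Hypothetical helper: solves the normal equations for a one-factor model
# and evaluates the estimable functions that match the output of lm().
#   y      numeric vector with the response (e.g. body weight)
#   breed  vector with the breed of every animal
solve_normal_equations <- function(y, breed) {
  breed <- as.factor(breed)
  n_lev <- nlevels(breed)

  # Over-parameterised design matrix: intercept plus one dummy column per level
  X <- cbind(1, model.matrix(~ 0 + breed))
  colnames(X) <- c("mu", levels(breed))

  # Elements of the normal equations X'X b0 = X'y
  XtX <- crossprod(X)      # singular, rank n_lev
  Xty <- crossprod(X, y)

  # One solution via a generalized inverse (Moore-Penrose, from package MASS)
  b0 <- MASS::ginv(XtX) %*% Xty
  rownames(b0) <- colnames(X)

  # Estimable functions: row 1 is mu + first level,
  # row i + 1 is level (i + 1) minus first level
  K <- matrix(0, nrow = n_lev, ncol = n_lev + 1)
  K[1, 1:2] <- 1
  for (i in seq_len(n_lev - 1)) {
    K[i + 1, 2] <- -1
    K[i + 1, i + 2] <- 1
  }

  list(XtX = XtX, Xty = Xty, b0 = b0, estimable = K %*% b0)
}
```

Applied to the response and breed columns of the Problem 2 data, the element estimable can be compared with coef() of the corresponding lm() fit. The individual entries of b0 depend on the chosen generalized inverse, whereas the values in estimable do not.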


Latest Changes: 2023-03-12 15:28:42 (pvr)
