Problem 1: Model Selection

The dataset contains body weight as the response, together with several other variables and factors. The columns Breed and BCS (Body Condition Score) are treated as factors. All other columns are treated as predictor variables. The column Animal is not used in any model. Use model selection to find the relevant predictor variables and factors for the best linear fixed effects model. Use Mallows' \(C_p\), an estimate of the mean squared error, as the quality measure for each linear model. The dataset to be analysed can be obtained from

https://charlotte-ngs.github.io/asmss2022/data/asm_bw_mod_sel.csv 
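The column handling described above can be sketched in R as follows. The helper name `read_bw_data` is hypothetical, and the sketch assumes that the file is a comma-separated CSV with a header row and that the columns are named `Breed`, `BCS` and `Animal` exactly as in the text.

```r
# Hypothetical helper: read the CSV and prepare the columns as described above.
# Assumes a comma-separated file with a header row.
read_bw_data <- function(path) {
  df <- read.csv(path, stringsAsFactors = FALSE)
  df$Breed <- as.factor(df$Breed)   # Breed is treated as a factor
  df$BCS <- as.factor(df$BCS)       # BCS is treated as a factor
  df$Animal <- NULL                 # Animal is not used in any model
  df
}
```

Calling `read_bw_data()` with the URL above should return a data frame in which every remaining column other than the body weight response is a candidate predictor or factor.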

Your Tasks

  • Run a forward selection for the given dataset to find the best model
  • Do a backward elimination for the given dataset to find the best model
  • Compare the two models and check whether they include the same set of predictor variables and factors.

Your Solution

Because we need the residual standard deviation of the full model, and backward elimination starts with the full model, we start with backward elimination.
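The reason the full model is needed first becomes clear from the definition of \(C_p\). One common form of Mallows' \(C_p\) for a submodel \(M\) is given below. The symbols are assumptions of this sketch: \(SSE(M)\) is the residual sum of squares of \(M\), \(|M|\) is the number of estimated coefficients in \(M\) including the intercept, \(n\) is the number of observations and \(\hat{\sigma}^2\) is the residual variance estimated from the full model.

\[
C_p(M) = \frac{SSE(M)}{\hat{\sigma}^2} - n + 2\,|M|
\]

The same \(\hat{\sigma}^2\) enters the \(C_p\) value of every candidate model, so it has to be available before any submodel can be evaluated. Variants of the formula that differ only by a positive scaling factor rank the models in the same order.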

Start with Backward Elimination

  • Start with the full model considering all variables and factors
  • Eliminate the variable that increases the residual sum of squares the least and compute \(C_p\) for the resulting model
  • Repeat the above step until all variables and factors are eliminated
  • Select the model with the smallest \(C_p\) value
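The steps above can be sketched in R as follows. This is a minimal sketch rather than a definitive implementation: the function names `compute_cp` and `backward_cp` are hypothetical, \(C_p\) is computed in the common form \(SSE/\hat{\sigma}^2 - n + 2p\) with \(\hat{\sigma}^2\) taken from the full model, and every column of the data frame other than the response is assumed to be a candidate term.

```r
# Mallows' Cp of a fitted lm, using the residual variance of the full model.
compute_cp <- function(fit, sigma2_full, n) {
  sse <- sum(residuals(fit)^2)
  sse / sigma2_full - n + 2 * fit$rank  # rank = number of estimated coefficients
}

# Backward elimination over all columns of df except the response.
backward_cp <- function(df, response) {
  current <- setdiff(names(df), response)
  n <- nrow(df)
  fit_full <- lm(reformulate(current, response = response), data = df)
  sigma2_full <- summary(fit_full)$sigma^2
  path <- data.frame(terms = paste(current, collapse = " + "),
                     cp = compute_cp(fit_full, sigma2_full, n),
                     stringsAsFactors = FALSE)
  while (length(current) > 0) {
    # Residual sum of squares after dropping each remaining term in turn
    sse_drop <- sapply(current, function(v) {
      rhs <- setdiff(current, v)
      if (length(rhs) == 0) rhs <- "1"
      sum(residuals(lm(reformulate(rhs, response = response), data = df))^2)
    })
    # Drop the term whose removal increases the SSE the least
    current <- setdiff(current, names(which.min(sse_drop)))
    rhs <- if (length(current) == 0) "1" else current
    fit <- lm(reformulate(rhs, response = response), data = df)
    path <- rbind(path,
                  data.frame(terms = paste(rhs, collapse = " + "),
                             cp = compute_cp(fit, sigma2_full, n),
                             stringsAsFactors = FALSE))
  }
  list(path = path, best = path[which.min(path$cp), ])
}
```

The element `path` holds one row per visited model, from the full model down to the intercept-only model, and `best` is the row with the smallest \(C_p\). A factor is dropped as a whole term, so all of its levels leave the model together.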

Forward Selection

  • Forward selection starts with a minimal model containing only an intercept
  • Add the variable that reduces the residual sum of squares the most and compute \(C_p\) for that model
  • Repeat the step above until all variables are added
  • Select from all the models the one with the smallest \(C_p\) value
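The forward steps above can be sketched in R in the same spirit. The function names `compute_cp` and `forward_cp` are hypothetical, and the sketch assumes that \(C_p\) is computed as \(SSE/\hat{\sigma}^2 - n + 2p\), with \(\hat{\sigma}^2\) taken from the full model, and that every column other than the response is a candidate term.

```r
# Mallows' Cp of a fitted lm, using the residual variance of the full model.
compute_cp <- function(fit, sigma2_full, n) {
  sse <- sum(residuals(fit)^2)
  sse / sigma2_full - n + 2 * fit$rank  # rank = number of estimated coefficients
}

# Forward selection over all columns of df except the response.
forward_cp <- function(df, response) {
  candidates <- setdiff(names(df), response)
  n <- nrow(df)
  # The full model is fitted once to obtain the residual variance used in Cp
  fit_full <- lm(reformulate(candidates, response = response), data = df)
  sigma2_full <- summary(fit_full)$sigma^2
  # Start with the intercept-only model
  fit_null <- lm(reformulate("1", response = response), data = df)
  path <- data.frame(terms = "1",
                     cp = compute_cp(fit_null, sigma2_full, n),
                     stringsAsFactors = FALSE)
  selected <- character(0)
  while (length(candidates) > 0) {
    # Residual sum of squares after adding each remaining candidate in turn
    sse_add <- sapply(candidates, function(v) {
      fit_v <- lm(reformulate(c(selected, v), response = response), data = df)
      sum(residuals(fit_v)^2)
    })
    # Add the term that reduces the SSE the most
    best_var <- names(which.min(sse_add))
    selected <- c(selected, best_var)
    candidates <- setdiff(candidates, best_var)
    fit <- lm(reformulate(selected, response = response), data = df)
    path <- rbind(path,
                  data.frame(terms = paste(selected, collapse = " + "),
                             cp = compute_cp(fit, sigma2_full, n),
                             stringsAsFactors = FALSE))
  }
  list(path = path, best = path[which.min(path$cp), ])
}
```

The `terms` entry of `best` lists the selected variables and factors in the order in which they entered the model, which makes it easy to compare the result with the model found by backward elimination.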

Problem 2: Verification of Model Selection Results

Use the R package olsrr to verify the results of Problem 1. Have a look at the documentation of olsrr at https://github.com/rsquaredacademy/olsrr. As a first step, we read the data from

https://charlotte-ngs.github.io/asmss2022/data/asm_bw_mod_sel.csv 

Your Solution

  • Based on the documentation, we are going to use the function ols_step_best_subset
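A minimal sketch of the call pattern is shown below. It is demonstrated on the built-in `mtcars` data rather than on the exercise dataset, because the name of the body weight column is not given in the text. For the exercise, `lm_full` would be replaced by the full model fitted to the body weight data. The sketch assumes that olsrr is installed.

```r
# Call pattern of ols_step_best_subset(), shown on the built-in mtcars data
# (an illustrative stand-in, not the exercise dataset).
if (requireNamespace("olsrr", quietly = TRUE)) {
  lm_full <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
  # Fits all subsets of the predictors in lm_full and reports the best model
  # of each size together with selection criteria
  best_subsets <- olsrr::ols_step_best_subset(lm_full)
  print(best_subsets)
} else {
  message("Package olsrr is not installed; run install.packages('olsrr') first.")
}
```

The printed summary should report Mallows' \(C_p\) among the criteria for each subset size, so the subset with the smallest value can be compared with the models found in Problem 1. The model is fitted at the top level with an explicit formula here, as in the package examples.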

Latest Changes: 2022-04-15 07:46:41 (pvr)
