Applied Statistical Methods - Exercise 8
Problem 1: Interactions
Use the following dataset on Breed, Breast.Circumference and Body.Weight and fit a linear fixed effects model with Body.Weight as response and Breed and Breast.Circumference as predictors, including an interaction term between the two predictors. Compute the expected difference in Body.Weight for two animals which differ in Breast.Circumference by \(1\) cm for every Breed.
The dataset is available under https://charlotte-ngs.github.io/asmasss2024/data/asm_bw_flem.csv
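One way to write the model with interaction is sketched here. The symbols \(\beta_0\), \(\beta_1\), \(\alpha_j\) and \(\gamma_j\) are notation assumed for illustration and do not appear in the exercise; \(y_{ij}\) stands for Body.Weight and \(x_{ij}\) for Breast.Circumference of animal \(i\) in breed \(j\). With the default treatment contrasts in R, \(\alpha_j\) and \(\gamma_j\) are zero for the reference breed.

```latex
\begin{align*}
E[y_{ij}] &= \beta_0 + \alpha_j + (\beta_1 + \gamma_j)\, x_{ij} \\
E[y_{ij} \mid x_{ij} = x + 1] - E[y_{ij} \mid x_{ij} = x] &= \beta_1 + \gamma_j
\end{align*}
```

The expected difference per centimeter is therefore the common slope plus the breed-specific interaction effect.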
Tasks
- Read the data for fitting the linear model
- Fit the linear model
- Expected difference in body weight for the three breeds:
Angus: Angus is the reference level of Breed, so the expected difference in body weight (in kg) for a one-centimeter increase in breast circumference corresponds to the regression coefficient of Breast.Circumference.
Limousin: For the breed Limousin there is an interaction effect, so we have to add the interaction effect Breast.Circumference:BreedLimousin to the regression coefficient of Breast.Circumference.
Simmental: As for Limousin, we add the interaction effect Breast.Circumference:BreedSimmental to the regression coefficient of Breast.Circumference.
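The three tasks can be sketched in R as follows. This is a minimal sketch, not necessarily the original solution: the helper name slopes_by_breed() is hypothetical, and the column names Breed, Breast.Circumference and Body.Weight are taken from the problem description.

```r
# URL of the dataset as given in the exercise
s_data_url <- "https://charlotte-ngs.github.io/asmasss2024/data/asm_bw_flem.csv"

# Hypothetical helper: fit the model with interaction and return the
# expected change in Body.Weight per cm of Breast.Circumference for each breed
slopes_by_breed <- function(tbl) {
  tbl$Breed <- factor(tbl$Breed)
  fit <- lm(Body.Weight ~ Breast.Circumference * Breed, data = tbl)
  cf <- coef(fit)
  sapply(levels(tbl$Breed), function(b) {
    nm <- paste0("Breast.Circumference:Breed", b)
    # the reference breed has no interaction coefficient, so nothing is added
    cf[["Breast.Circumference"]] + if (nm %in% names(cf)) cf[[nm]] else 0
  })
}
```

With the exercise data, the helper could be called as slopes_by_breed(read.csv(s_data_url)), assuming the file is comma-separated. The result is a named vector with one slope per breed.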
Problem 2: Simulation
Use the following values for intercept and regression slope of Body.Weight on Breast.Circumference to simulate a dataset of size \(N\). How large does \(N\) have to be so that the regression analysis of the simulated data reproduces the true regression slope?
The true values are:
- Intercept: \(-1070\)
- Regression slope: \(8.7\)
- Residual standard error: \(12\)
Hints
- Start with \(N=10\), simulate a dataset and analyse the data with lm()
- If the estimated slope (rounded to one digit after the decimal point) is not the same as the true value, double the size of the dataset, i.e. use \(N=20\)
- Continue until you get close to the true value.
- Assume that the random residuals follow a normal distribution with mean zero and standard deviation equal to \(12\)
- Take breast circumference to be normally distributed with a mean of \(180\) and a standard deviation of \(2.6\)
- Use a linear regression model with an intercept to model expected body weight based on breast circumference.
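As a rough guide to how large \(N\) has to be, the standard error of the estimated slope in a simple linear regression can be approximated from the values given above. This is a sketch under the assumption that the sum of squares of breast circumference is close to \((N-1) \cdot 2.6^2\); the symbols \(\hat{b}_1\), \(\sigma\) and \(s_x\) are notation introduced here.

```latex
\begin{align*}
SE(\hat{b}_1) &= \frac{\sigma}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2}}
  \approx \frac{\sigma}{s_x \sqrt{N-1}}
  = \frac{12}{2.6 \sqrt{N-1}} \approx \frac{4.62}{\sqrt{N-1}} \\
N = 10: \quad SE(\hat{b}_1) &\approx \frac{4.62}{3} \approx 1.54 \\
SE(\hat{b}_1) \approx 0.1: \quad N - 1 &\approx \left(\frac{4.62}{0.1}\right)^2 \approx 2130
\end{align*}
```

With only ten observations the slope estimate is expected to miss the true value by more than one unit, and on the order of a few thousand observations are needed before deviations below \(0.1\) become typical. A single simulated dataset can reach the tolerance earlier or later by chance.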
Tasks
- Assign the numbers given in the problem description to variables
- Start with \(N=10\) and first generate the matrix \(X\) which consists of a column of all ones and a column of breast circumference values in centimeters taken from the given normal distribution. Whenever we generate random numbers, it is important to first set the seed with the function set.seed(), to which an integer number is passed. This makes sure that the same results are generated when the simulation is repeated.
- Simulate observations of Body.Weight
- Analyse the simulated data with a regression model
- Compute the absolute deviation between the estimated regression slope and the true slope used in the simulation
- Use a loop to iteratively increase the number of observations until the absolute deviation of the estimated slope from the true value becomes smaller than 0.1.
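The tasks above can be sketched in R as follows. This is a minimal sketch under stated assumptions: the variable names, the helper estimate_slope(), the seed value 1234 and the upper bound on \(N\) are choices made here for illustration and are not part of the exercise.

```r
# True values and distribution parameters from the problem description
n_intercept <- -1070
n_slope <- 8.7
n_res_sd <- 12
n_bc_mean <- 180
n_bc_sd <- 2.6
n_tol <- 0.1

# Hypothetical helper: simulate n_obs records and return the estimated slope
estimate_slope <- function(n_obs) {
  # design matrix X: column of ones and column of breast circumference (cm)
  mat_x <- cbind(1, rnorm(n_obs, mean = n_bc_mean, sd = n_bc_sd))
  # body weight = X %*% (intercept, slope) + normally distributed residuals
  vec_y <- as.vector(mat_x %*% c(n_intercept, n_slope)) +
    rnorm(n_obs, mean = 0, sd = n_res_sd)
  tbl_sim <- data.frame(Body.Weight = vec_y,
                        Breast.Circumference = mat_x[, 2])
  fit <- lm(Body.Weight ~ Breast.Circumference, data = tbl_sim)
  coef(fit)[["Breast.Circumference"]]
}

set.seed(1234)  # arbitrary seed, only there to make the run reproducible
n_obs <- 10
n_dev <- abs(estimate_slope(n_obs) - n_slope)
# double N until the deviation is below the tolerance (with a safety bound)
while (n_dev >= n_tol && n_obs < 1e6) {
  n_obs <- 2 * n_obs
  n_dev <- abs(estimate_slope(n_obs) - n_slope)
}
cat("N =", n_obs, " absolute deviation =", n_dev, "\n")
```

Because each step draws a new random dataset, the value of \(N\) at which the loop stops depends on the seed. Repeating the run with different seeds gives an impression of how variable that stopping point is.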