Stat302_Assignment1_Solutions.html

School

Simon Fraser University

Course

302

Subject

Statistics

Date

Apr 3, 2024

Uploaded by divyakapooruk on coursehero.com

Stat 302 - Assignment 1 Solutions - Spring 2024 - (44 Points)

In [1]:
### Call R package Stat2Data
#install.packages("Stat2Data") - For those who haven't installed the package
library(Stat2Data) # (1 point)

In [2]:
# Import "LongJumpOlympics2016" data
data("LongJumpOlympics2016") # (1 point)
# dimensions of the dataset
dim(LongJumpOlympics2016)

1. 28
2. 2

In [3]:
# First six rows of the dataset
head(LongJumpOlympics2016)

A data.frame: 6 × 2
   Year   Gold
   <int>  <dbl>
1  1900   7.185
2  1904   7.340
3  1906   7.200
4  1908   7.480
5  1912   7.600
6  1920   7.150
In [4]:
# Regression output for predicting the "Olympic long jump length" (response) from "Year" (explanatory variable)
reg_model1 <- lm(Gold~Year, data=LongJumpOlympics2016) # (1 point)
summary(reg_model1)

Call:
lm(formula = Gold ~ Year, data = LongJumpOlympics2016)

Residuals:
     Min       1Q   Median       3Q      Max
-0.39610 -0.15495 -0.00137  0.11606  0.75349

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -16.470194   2.666282  -6.177 1.56e-06 ***
Year          0.012508   0.001361   9.191 1.19e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2595 on 26 degrees of freedom
Multiple R-squared: 0.7646, Adjusted R-squared: 0.7556
F-statistic: 84.47 on 1 and 26 DF, p-value: 1.192e-09

Q1: 1.4 - (1 point)
Slope of the least squares regression line: $\hat{\beta}_1 = 0.012508$

Q2: 1.6 - (1 point)
Intercept of the least squares regression line: $\hat{\beta}_0 = -16.470194$

Q3: 1.8 - (1 point)
Interpret the slope coefficient: For each additional year, the gold-medal-winning long jump distance is predicted to increase by 0.012508 meters, on average.
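The slope, intercept, and residual standard error in the output above come from closed-form least squares formulas, and it can help to see them computed by hand. The sketch below is only an illustration: it uses the six rows shown by head() rather than the full 28-row dataset, so its numbers will not match summary(reg_model1). The names long_jump, b0, b1, sse, and s are made up for this example.

```r
# First six rows of LongJumpOlympics2016, copied from the head() output above.
# Assumption: this 6-row subset is for illustration only, so the estimates
# will NOT match the full 28-row fit reported by summary(reg_model1).
long_jump <- data.frame(
  Year = c(1900, 1904, 1906, 1908, 1912, 1920),
  Gold = c(7.185, 7.340, 7.200, 7.480, 7.600, 7.150)
)

x <- long_jump$Year
y <- long_jump$Gold
n <- length(y)

# Closed-form least squares estimates
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope
b0 <- mean(y) - b1 * mean(x)                                     # intercept

# Regression standard error: S = sqrt(SSE / (n - 2))
sse <- sum((y - (b0 + b1 * x))^2)
s <- sqrt(sse / (n - 2))

# Cross-check the hand calculation against lm()
fit <- lm(Gold ~ Year, data = long_jump)
print(c(intercept = b0, slope = b1, S = s))
print(coef(fit))         # should agree with b0 and b1
print(sigma(fit))        # should agree with s
print(df.residual(fit))  # n - 2 degrees of freedom
```

With the full dataset loaded from Stat2Data, the same formulas applied to all 28 rows should reproduce the coefficients and the 26 degrees of freedom reported in the summary.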
Q4: 1.10 - (1 point)
Size of the typical error: $\hat{\sigma} = S = \sqrt{\frac{SSE}{n-2}} = 0.259522$

Q5: 1.12 - (1 point)
Degrees of freedom of the regression standard error: $28 - 2 = 26$

Q6: 1.14 - (2 points)
$\hat{Y} = 78 - 0.5X$
$\hat{Y}_1 = 78 - 0.5(30) = 63$
Residual: $Y_1 - \hat{Y}_1 = 60 - 63 = -3$

Q7: 1.28 - (20 points)

In [5]:
# Import datafile
data("SeaIce")
head(SeaIce)

A data.frame: 6 × 4
   Year   Extent  Area   t
   <int>  <dbl>   <dbl>  <int>
1  1979   7.22    4.54   1
2  1980   7.86    4.83   2
3  1981   7.25    4.38   3
4  1982   7.45    4.38   4
5  1983   7.54    4.64   5
6  1984   7.11    4.04   6

Part A - (3 points)

In [6]:
# Scatterplot - 2 points
plot(SeaIce$t, SeaIce$Extent, pch=16, cex=1.2, col='blue', xlab='t', ylab='Extent', main='SeaIce Extent over time')

Pattern: There is a strong, negative, non-linear association between SeaIce Extent and time. (1 point)

Note to the marker: give full marks even if a student states that there is a strong, negative linear association, since the non-linearity is not clearly evident.

Part B - (3 points)

In [7]:
# Regression model for predicting Extent from time
reg_model2 <- lm(Extent~t, data=SeaIce)
# Residual vs. Fit plot
plot(reg_model2, 1)

Comment: The residuals-versus-fits plot also shows curvature. In fact, it is somewhat easier to see in this plot. Given this amount of curvature, we should not fit a linear model to these data.

Note to the marker: give full marks even if a student states that linearity is satisfied because the residuals are randomly distributed with no departures from linearity. Furthermore, constant variance is also satisfied except for the point at the top of the plot.

Part C - (3 points)

In [8]:
# Scatterplot - 2 points
plot(SeaIce$t, SeaIce$Extent^2, pch=16, cex=1.2, col='blue', xlab='t', ylab='Extent^2', main='SeaIce Extent over time')
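Parts B and C rely on two ideas: plot(model, 1) graphs residuals against fitted values, and squaring the response is one way to try to straighten a curved relationship. The sketch below shows how those quantities could be pulled out for both the original and the squared response. It rests on some assumptions: it uses only the six SeaIce rows shown by head(), which are far too few to judge curvature, and fitting a regression to the squared response is assumed to be the natural next step rather than something shown in the source. The names sea_ice, fit_raw, fit_sq, diag_raw, and diag_sq are made up for this example.

```r
# First six rows of SeaIce, copied from the head() output above.
# Assumption: a 6-row subset for illustration only; curvature cannot be
# judged reliably from so few points.
sea_ice <- data.frame(
  t      = 1:6,
  Extent = c(7.22, 7.86, 7.25, 7.45, 7.54, 7.11)
)

# Fit the original response and the squared response on time
fit_raw <- lm(Extent ~ t, data = sea_ice)
fit_sq  <- lm(I(Extent^2) ~ t, data = sea_ice)

# plot(model, 1) graphs residuals against fitted values (with a smoother);
# these are the underlying numbers for each model
diag_raw <- data.frame(fitted = fitted(fit_raw), resid = resid(fit_raw))
diag_sq  <- data.frame(fitted = fitted(fit_sq),  resid = resid(fit_sq))

print(diag_raw)
print(diag_sq)
```

With the full SeaIce data, calling plot(fit_sq, 1) and comparing it to plot(reg_model2, 1) would be one way to check whether the transformation reduces the curvature.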