-
Notifications
You must be signed in to change notification settings - Fork 0
/
code_along_answers.txt
80 lines (42 loc) · 2.77 KB
/
code_along_answers.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
Code along answers
## CODE ALONG 1
1. Model `y` as a function of `z` using `lm(y ~ z, data = d)` and save to an object called `mod2`. View the summary. What formula does the model return? What is the estimated standard deviation of the Normally distributed noise?
mod2 <- lm(y ~ z, data = d)
summary(mod2)
2. Use the model to simulate density histograms and compare to the original density histogram of `d$y`.
hist(d$y, freq = FALSE)
lines(density(d$y))
lines(density(73.0003 + d$z*0.2312 + rnorm(25, 0, sd = 36.36)), col = "red")
## CODE ALONG 2
1. Insert a code chunk below and model `log(totalvalue)` as function of `fullbath` and `finsqft.` Call your model `m3`
m3 <- lm(log(totalvalue) ~ fullbath + finsqft, data = homes)
2. Insert a code chunk below and check the diagnostic plots
plot(m3)
3. How do we interpret the fullbath coefficient?
round(exp(coef(m3)), 3)
Adding a full bath increases value of house by about 17%
4. Insert a code chunk below and simulate data from the model and compare to the observed `totalvalue`. Does this look like a good model?
sim_m3 <- simulate(m3, nsim = 50)
plot(density(log(homes$totalvalue)))
for(i in 1:50)lines(density(sim_m3[[i]]), lty = 2, col = "grey80")
## CODE ALONG 3
1. Insert a code chunk below and model `log(totalvalue)` as function of `fullbath`, `finsqft` and `cooling.` Call your model `m5`.
m5 <- lm(log(totalvalue) ~ fullbath + finsqft + cooling, data = homes)
2. What is the interpretation of cooling?
round(exp(coef(m5)), 3)
Expected value of a home with No Central Air is about 19-20% lower than an equivalent home with Central Air.
## CODE ALONG 4
1. Insert a code chunk below and model `log(totalvalue)` as function of `fullbath`, `finsqft`, `cooling`, and the interaction of `finsqft` and `cooling`. Call your model `m7`. Is the interaction warranted?
m7 <- lm(log(totalvalue) ~ fullbath + finsqft + cooling + finsqft:cooling,
data = homes)
anova(m7)
2. Visualize the interaction using the `ggpredict` function. Perhaps use `[1000:4000 by=500]` to set the range of `finsqft` on the x-axis.
plot(ggpredict(m7, terms = c("finsqft[1000:4000 by=500]", "cooling")))
## CODE ALONG 5
1. Insert a code chunk below and model `log(totalvalue)` as function of `finsqft` with a natural spline of 5 `df`, `cooling`, and the interaction of `cooling` and `finsqft` (natural spline of 5 `df`). Call your model `nlm4`.
nlm4 <- lm(log(totalvalue) ~ ns(finsqft, df = 5) + cooling +
ns(finsqft, df = 5):cooling, data = homes)
2. Use the `anova` function to check whether the interaction appears necessary. What do you think?
anova(nlm4)
3. Create an effect plot of `finsqft` by `cooling`. Maybe try `[1000:5000 by=250]` for the range of values for `finsqft`.
plot(ggpredict(nlm4, terms = c("finsqft[1000:5000 by=250]", "cooling")))