forked from ModelOriented/DALEX-docs
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path04-epilogue.Rmd
30 lines (20 loc) · 1.96 KB
/
04-epilogue.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
# Epilogue
Let's summarize what has happened in the previous sections.
* Section \@ref(useCaseApartmetns) shows two models with equal performance for `apartments` dataset.
* Section \@ref(modelPerformance) shows that in general the random forest model has smaller residuals than the linear model but there is a small fraction of very large residuals.
* Section \@ref(outlierDetection) shows that the random forest model under-predicts expensive apartments. It is not a model that we would like to employ.
* Section \@ref(featureImportance) shows that `construction_year` is important for the random forest model.
* Section \@ref(variableResponse) shows that the relation between `construction_year` and the price of square meter is non linear.
In this section we showed how to improve the basic linear model by feature engineering of `construction_year`. Findings from the random forest models will help to create a new feature for the linear model.
```{r final_model, message=FALSE, warning=FALSE, fig.height=2, fig.cap="Distribution of residuals for the new improved linear model"}
library("DALEX")
apartments_lm_model_improved <- lm(m2.price ~ I(construction.year < 1935 | construction.year > 1995) + surface + floor +
no.rooms + district, data = apartments)
explainer_lm_improved <- explain(apartments_lm_model_improved,
data = apartmentsTest[,2:6], y = apartmentsTest$m2.price)
mp_lm_improved <- model_performance(explainer_lm_improved)
plot(mp_lm_improved, geom = "boxplot")
```
In conclusion, the results presented above prove that the `apartments_lm_model_improved` model is much better than the two initial models introduced in Chapter \@ref(modelUnderstanding).
In this use-case we showed that explainers implemented in DALEX help to better understand the model and that this knowledge may be used to create a better final model.
Find more examples, vignietts and cheatsheets at DALEX website https://github.com/pbiecek/DALEX.