# Lab 3: Explore **gapminder** with **ggplot2** and **dplyr** {#lab03}
In this lab, you will explore the **gapminder** dataset by making summary tables and visualizations.
## Getting started
Make a new R Markdown script for this lab.
In the *setup* chunk at the top of your script,
load the packages needed for this lab.
These include **tidyverse** and **gapminder**.
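
A minimal sketch of what the body of the setup chunk might contain, assuming both packages are already installed (the `glimpse()` call is only an optional check, not a requirement of the lab):

```r
# Load the packages used throughout this lab.
library(tidyverse)  # attaches dplyr, ggplot2, and the other core tidyverse packages
library(gapminder)  # provides the `gapminder` data frame

# Optional: take a quick look at the data to confirm it loaded.
glimpse(gapminder)
```
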
## Exercise 1: Basic **dplyr**
Use **dplyr** functions to achieve the following.
For each exercise, print your result.
### 1.1
Use `filter()` to subset the `gapminder` data to three countries of your choice in the 1970s.
### 1.2
Start with the original `gapminder` data and use the pipe operator `|>` to first do the above filter and then select the `country` and `gdpPercap` variables.
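
As a rough illustration of chaining with `|>`, here is the same filter-then-select pattern on the built-in `mtcars` data rather than `gapminder`. The columns and the filter condition are arbitrary choices for the example, and the native pipe assumes R 4.1 or later.

```r
library(dplyr)

# Each step receives the result of the previous step as its first argument.
small_cars <- mtcars |>
  filter(cyl == 4) |>  # keep only the 4-cylinder cars
  select(mpg, hp)      # then keep two columns

small_cars
```
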
### 1.3
Make a new variable in `gapminder` for the change in life expectancy from the previous measurement **for that country**.
Filter this table to show all of the entries that have experienced a **drop** in life expectancy.
Save this result as a new object.
__Hint__: you might find the `lag()` or `diff()` functions useful.
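
To see how the two hinted functions differ, here is a small sketch on a made-up vector (the values are invented for illustration and are not from `gapminder`):

```r
library(dplyr)

x <- c(3, 5, 4, 8)  # made-up values, assumed to be in time order

# lag() shifts the vector by one position, so the result keeps the same
# length and starts with NA (the first value has no predecessor).
change_lag <- x - lag(x)

# diff() returns successive differences, so it is one element shorter.
change_diff <- diff(x)

change_lag   # NA  2 -1  4
change_diff  #  2 -1  4
```

Because a new column must have one value per row, the length difference matters inside `mutate()`. After `group_by()`, `lag()` is computed separately within each group, provided the rows are in time order within the group.
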
### 1.4
Filter `gapminder` so that it shows the maximum GDP per capita experienced **by each country**.
__Hint__: you might find the `max()` function useful here.
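
One possible pattern for a per-group maximum, sketched on the built-in `mtcars` data instead of `gapminder` (the grouping variable and column are arbitrary stand-ins):

```r
library(dplyr)

# For each number of cylinders, keep the row(s) with the highest mpg.
best_mpg <- mtcars |>
  group_by(cyl) |>
  filter(mpg == max(mpg)) |>  # max() is evaluated within each group
  ungroup()

best_mpg
```

This keeps whole rows, so ties within a group would all be returned. If only the maximum value itself is needed, `summarize()` after `group_by()` is an alternative.
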
### 1.5
Produce a scatterplot of Canada's life expectancy vs. GDP per capita using **ggplot2**.
In your plot, put GDP per capita on a **log scale**.
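
A minimal sketch of a scatterplot with one axis on a log scale, using the built-in `mtcars` data as a stand-in (the variables are arbitrary and unrelated to the exercise):

```r
library(ggplot2)

# Scatterplot with the x-axis on a log10 scale.
p <- ggplot(mtcars, aes(x = disp, y = mpg)) +
  geom_point() +
  scale_x_log10()  # spaces the axis logarithmically; tick labels stay in original units

p
```
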
## Exercise 2: Explore two variables with **dplyr** and **ggplot2**
Use `gapminder`, `palmerpenguins::penguins`, or another dataset of your choice.
(Check out a dataset from the **datasets** or **psych** R package if you want!)
### 2.1
Pick two **quantitative** variables to explore.
* Make a summary table of descriptive statistics for these variables using `summarize()`.
    - Include whatever statistics you feel are appropriate (mean, median, sd, range, etc.).
* Make a scatterplot of these variables using `ggplot()`.
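
A summary table of this kind might look roughly like the following, shown for a single variable from the built-in `mtcars` data (the column names on the left are arbitrary labels chosen for the example):

```r
library(dplyr)

# One row of descriptive statistics for mpg.
# If your data contain missing values (e.g., penguins), add na.rm = TRUE.
mpg_summary <- mtcars |>
  summarize(
    mpg_mean   = mean(mpg),
    mpg_median = median(mpg),
    mpg_sd     = sd(mpg),
    mpg_min    = min(mpg),
    mpg_max    = max(mpg)
  )

mpg_summary
```
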
### 2.2
Pick one *categorical* variable and one *quantitative* variable to explore.
* Make a summary table giving the sample size (hint: `n()`) and descriptive statistics for the quantitative variable by group.
* Make one or more useful plots to visualize these variables.
    - Try to make a raincloud plot with points,
      boxplots, and densities for each group.
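
As a starting point, here is a sketch of a grouped summary and a simple layered plot on the built-in `mtcars` data. The plot uses only **ggplot2** and a full violin, so it approximates a raincloud plot rather than reproducing one; half-densities usually come from an add-on package such as **ggdist** (for example `stat_halfeye()`), which is a suggestion here and not something the lab requires.

```r
library(dplyr)
library(ggplot2)

# Sample size and descriptive statistics for mpg within each cylinder group.
by_cyl <- mtcars |>
  group_by(cyl) |>
  summarize(
    n        = n(),
    mpg_mean = mean(mpg),
    mpg_sd   = sd(mpg)
  )

# Layered plot: density shape, boxplot, and raw points for each group.
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_violin(alpha = 0.3) +
  geom_boxplot(width = 0.15, outlier.shape = NA) +  # hide outliers; points are drawn below
  geom_jitter(width = 0.05, height = 0)             # jitter horizontally only

by_cyl
p
```
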
## Bonus Exercise: Recycling (Optional)
Evaluate this code and describe the result.
The goal was to get the data for Rwanda and Afghanistan.
Does this work? Why or why not?
If not, what is the correct way to do this?
```
filter(gapminder, country == c("Rwanda", "Afghanistan"))
```
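
Some background that may help when describing the result: R recycles the shorter vector in element-wise operations. A small sketch on made-up vectors (unrelated to `gapminder`):

```r
# The shorter vector is repeated to match the length of the longer one.
sums <- c(1, 2, 3, 4) + c(10, 20)
sums
# 11 22 13 24

# The same rule applies to comparisons: elements are compared pairwise,
# position by position, after recycling.
matches <- c("a", "a", "b", "b") == c("a", "b")
matches
# TRUE FALSE FALSE TRUE
```
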