session_info()
diff --git a/.nojekyll b/.nojekyll index 71e5030..ac3ca98 100644 --- a/.nojekyll +++ b/.nojekyll @@ -1 +1 @@ -10166c3e \ No newline at end of file +466e1c70 \ No newline at end of file diff --git a/lectures.html b/lectures.html index 20e0e14..9c414f4 100644 --- a/lectures.html +++ b/lectures.html @@ -208,7 +208,7 @@
This lecture as the rest of the course is adapted from the version Stephanie C. Hicks designed and maintained in 2021 - 2022. Check the recent changes to this file through the GitHub history.
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
Welcome! I am very excited to have you in our one-term (i.e. half a semester) course on Statistical Computing (course number 140.776), offered by the Department of Biostatistics at the Johns Hopkins Bloomberg School of Public Health.
This course is designed for ScM and PhD students at the Johns Hopkins Bloomberg School of Public Health. I am pretty flexible about permitting outside students, but I want everyone to be aware of the goals and assumptions so that no one is surprised by how the class works.
Feel free to submit typos, errors, etc. via the GitHub repository associated with the class: https://github.com/lcolladotor/jhustatcomputing2023. You will have the thanks of your grateful instructor!
-session_info()
This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
@@ -756,7 +757,7 @@ font-style: inherit;">install_github()There are only two kinds of languages: the ones people complain about and the ones nobody uses. —Bjarne Stroustrup
You may or may not see a short message on the screen. Some packages show messages when you load them, and others do not.
This was a quick overview of R packages. We will use a lot of them, so you will get used to them rather quickly.
@@ -835,9 +836,62 @@ Tip[‘Water Colours’ from Danielle Navarro https://art.djnavarro.net]
+ + +options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
[‘Flametree’ from Danielle Navarro https://art.djnavarro.net]
+options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
@@ -1328,9 +1437,87 @@ Questions + + +An article about computational results is advertising, not scholarship. The actual scholarship is the full software environment, code and data, that produced the result. —Claerbout and Karrenbach (1992)
options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.3.0)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ dplyr 1.1.2 2023-04-20 [1] CRAN (R 4.3.0)
+ emojifont 0.5.5 2021-04-20 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fansi 1.0.4 2023-01-22 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0)
+ ggplot2 3.4.3 2023-08-14 [1] CRAN (R 4.3.1)
+ glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0)
+ gtable 0.3.3 2023-03-21 [1] CRAN (R 4.3.0)
+ here * 1.0.1 2020-12-13 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0)
+ magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0)
+ munsell 0.5.0 2018-06-12 [1] CRAN (R 4.3.0)
+ pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0)
+ pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0)
+ proto 1.0.0 2016-10-29 [1] CRAN (R 4.3.0)
+ R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.3.0)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ scales 1.2.1 2022-08-20 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ showtext 0.9-6 2023-05-03 [1] CRAN (R 4.3.0)
+ showtextdb 3.0 2020-06-04 [1] CRAN (R 4.3.0)
+ sysfonts 0.8.8 2022-03-13 [1] CRAN (R 4.3.0)
+ tibble 3.2.1 2023-03-20 [1] CRAN (R 4.3.0)
+ tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0)
+ utf8 1.2.3 2023-01-31 [1] CRAN (R 4.3.0)
+ vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
if (condition) {
-
-}
-
- else {
-
-}
-
-## Case 1
+} else if (condition) {
-
-}
fun
to create a function
options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
What do they do? How might they be helpful to you in terms of reference management?
[Add here.]
+options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
“When writing code, you’re always collaborating with future-you; and past-you doesn’t respond to emails”. —Hadley Wickham
@@ -2521,13 +2821,37 @@ font-style: inherit;">setwd()
The problem is, if I want to use his code, I will need to go and hand-edit every single one of those paths (C:\Users\Brian\path\only\that\Brian\has
) to the path that I want to use on my computer or wherever I saved the folder on my computer (e.g. /Users/Stephanie/Documents/path/only/I/have
).
The problem is, if I want to use his code, I will need to go and hand-edit every single one of those paths (C:\Users\Brian\path\only\that\Brian\has
) to the path that I want to use on my computer or wherever I saved the folder on my computer (e.g. /Users/leocollado/Documents/path/only/I/have
).
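A minimal sketch of the alternative to hand-edited absolute paths, assuming the here package is installed and the code runs inside a project folder. The folder and file names are hypothetical and only illustrate how a path is built relative to the project root:

```r
# Assumes the 'here' package is installed; folder and file names are hypothetical.
library(here)

# Build a path relative to the project root instead of hard-coding a
# user-specific absolute path such as C:\Users\Brian\...
csv_path <- here("data", "team_standings.csv")

# Base R's file.path() joins the same components into a relative path
rel_path <- file.path("data", "team_standings.csv")

print(csv_path)
print(rel_path)
```

Because the root is resolved on whichever computer runs the code, the same script should work for Brian and for anyone else who copies the project folder.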
if(if (!"my", "relative", "path"))){
- "path"))) {
+ dir.create(::here('functions.R'))
Rows: 1 Columns: 3
── Column specification ────────────────────────────────────────────────────────
@@ -3441,11 +3766,12 @@ font-style: inherit;"> x,y,z
1,2,3",
- comment = "#")
Rows: 1 Columns: 3
── Column specification ────────────────────────────────────────────────────────
@@ -3494,12 +3820,13 @@ font-style: inherit;">here("data", "team_standings.csv"),
- "team_standings.csv"),
+ col_types = "cc")
Note that the col_types argument accepts a compact representation. Here "cc" indicates that the first column is character and the second column is character (there are only two columns). Using the col_types argument is useful because often it is not easy to automatically figure out the type of a column by looking at a few rows (especially if a column has many missing values).
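The compact and the long form of col_types can be compared on a small inline example. This is a sketch that assumes readr 2.0 or later (for literal data wrapped in I()); the column names and values are made up and stand in for a real two-column file:

```r
# Assumes the 'readr' package (2.0 or later) is installed; the data are made up.
library(readr)

# Inline literal data standing in for a two-column CSV file
csv_text <- I("Standing,Team\n1,Team A\n2,Team B\n")

# Compact form: one letter per column, "c" = character
teams <- read_csv(csv_text, col_types = "cc")

# The same specification written out in full
teams_long <- read_csv(
    csv_text,
    col_types = cols(Standing = col_character(), Team = col_character())
)

print(teams)
```

Without col_types, readr would guess that Standing is numeric; stating the types up front removes the guessing.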
Rows: 10 Columns: 10
── Column specification ────────────────────────────────────────────────────────
@@ -3559,18 +3887,19 @@ font-style: inherit;">here("data", "2016-07-19.csv.bz2"),
- "2016-07-19.csv.bz2"),
+ col_types = "ccicccccci",
- "ccicccccci",
+ n_max = 10)
-logs
# A tibble: 10 × 10
date time size r_version r_arch r_os package version country ip_id
@@ -3601,8 +3930,8 @@ font-style: inherit;">here("data", "2016-07-19.csv.bz2"),
- "2016-07-19.csv.bz2"),
+ col_types = date = col_date()),
- n_max = 10)
-logdates
# A tibble: 10 × 1
date
@@ -3694,9 +4024,83 @@ Tip
options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ bit 4.0.5 2022-11-15 [1] CRAN (R 4.3.0)
+ bit64 4.0.5 2020-08-30 [1] CRAN (R 4.3.0)
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ crayon 1.5.2 2022-09-29 [1] CRAN (R 4.3.0)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fansi 1.0.4 2023-01-22 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0)
+ here * 1.0.1 2020-12-13 [1] CRAN (R 4.3.0)
+ hms 1.1.3 2023-03-21 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0)
+ magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0)
+ pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0)
+ pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0)
+ R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0)
+ readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.3.0)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ tibble 3.2.1 2023-03-20 [1] CRAN (R 4.3.0)
+ tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0)
+ tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.3.0)
+ utf8 1.2.3 2023-01-31 [1] CRAN (R 4.3.0)
+ vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.3.0)
+ vroom 1.6.3 2023-04-28 [1] CRAN (R 4.3.0)
+ withr 2.5.0 2022-03-03 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
tibble(
- a = :5,
- b = :10,
- c = 1,
- z = (a tibble(
- `:5,
- `= "numeric",
- `<- tibble(
- a = :5,
- b = :10,
- c = 1,
- z = (a head(chicago)
background-color: null;
font-style: inherit;">head(transmute(chicago,
- transmute(chicago,
+ pm10detrend = pm10tmean2 na.rm = TRUE),
- o3detrend = o3tmean2 mean(o3tmean2, na.rm = TRUE)))
+font-style: inherit;">TRUE)
+))
# A tibble: 6 × 2
pm10detrend o3detrend
@@ -4909,7 +5315,8 @@ font-style: inherit;">group_by(chicago, year)
summarize(years, summarize(years,
+ pm25 = mean(pm25, na.rm = TRUE),
- TRUE),
+ o3 = max(o3tmean2, na.rm = TRUE),
- TRUE),
+ no2 = median(no2tmean2, na.rm = TRUE))
# A tibble: 19 × 4
year pm25 o3 no2
@@ -5018,7 +5426,8 @@ font-style: inherit;">group_by(chicago, pm25.quint)
summarize(quint, summarize(quint,
+ o3 = mean(o3tmean2, na.rm = TRUE),
- TRUE),
+ no2 = mean(no2tmean2, na.rm = TRUE))
# A tibble: 6 × 3
pm25.quint o3 no2
@@ -5073,9 +5483,15 @@ font-style: inherit;">first(x)))
chicago %>%
- %>%
+ mutate(+ 1900) %>%
- %>%
+ group_by(year) %>%
- %>%
+ summarize(summarize(
+ pm25 = mean(pm25, na.rm = TRUE),
- TRUE),
+ o3 = max(o3tmean2, na.rm = TRUE),
- TRUE),
+ no2 = median(no2tmean2, na.rm = TRUE))
# A tibble: 19 × 4
year pm25 o3 no2
@@ -5200,15 +5618,16 @@ font-style: inherit;">+ 1) %>%
- %>%
+ group_by(month) %>%
- %>%
+ summarize(summarize(
+ pm25 = mean(pm25, na.rm = TRUE),
- TRUE),
+ o3 = max(o3tmean2, na.rm = TRUE),
- TRUE),
+ no2 = median(no2tmean2, na.rm = TRUE))
# A tibble: 12 × 4
month pm25 o3 no2
@@ -5285,16 +5705,16 @@ font-style: inherit;">10)
# A tibble: 10 × 11
city tmpd dewpoint date pm25 pm10tmean2 o3tmean2 no2tmean2
<chr> <dbl> <dbl> <date> <dbl> <dbl> <dbl> <dbl>
- 1 chic 62 45.3 2001-05-08 7.3 51.5 26.5 27.6
- 2 chic 36 36.8 1991-11-28 NA 10 11.7 16.6
- 3 chic 29 19.6 2005-03-14 19.6 51 9.93 39.9
- 4 chic 20 11.2 2004-02-13 24.5 17.5 21.8 23.3
- 5 chic 32.5 20.4 1997-03-23 NA 14.2 25.4 19.0
- 6 chic 68.5 64.1 1996-07-27 NA 21 19.6 22.4
- 7 chic 28.5 18.2 1997-11-11 NA 24.5 3.94 28.1
- 8 chic 45.5 44.1 1991-04-13 NA 25 13.0 15.4
- 9 chic 67 49.3 2000-10-14 19.4 54.5 24.9 31.0
-10 chic 71 48 1994-09-21 NA 82 30.5 48.5
+ 1 chic 49 40.2 2000-09-25 6.6 7 17.2 15.5
+ 2 chic 35 24.1 1989-11-02 NA 25 8.83 17.3
+ 3 chic 63.5 54.4 1996-04-18 NA 54 30.5 26.7
+ 4 chic 70 65.9 1997-06-19 NA 60.5 32.4 39.9
+ 5 chic 54 50.6 2005-11-05 27.2 32 11.5 18.2
+ 6 chic 86.5 73.4 1990-07-04 NA 60.6 52.2 12.8
+ 7 chic 74 74.6 1987-08-14 NA 49.5 24.2 18.6
+ 8 chic 34.5 29.1 1995-11-27 NA 25 6.57 29.3
+ 9 chic 73 61.2 1995-09-13 NA 46 25.3 26.5
+10 chic 79 64.6 2005-07-31 20.8 29.5 40.8 20.2
# ℹ 3 more variables: pm25detrend <dbl>, year <dbl>, pm25.quint <fct>
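The group_by() and summarize() pattern used in the yearly and monthly summaries earlier in this lecture can be sketched on a tiny data frame. The values below are made up for illustration and are not taken from the chicago data set; only the column names follow it:

```r
# Assumes the 'dplyr' package is installed; the data values are made up.
library(dplyr)

air <- tibble(
    year = c(2000, 2000, 2001, 2001),
    pm25 = c(10, NA, 14, 16),
    o3tmean2 = c(20, 30, 25, 15)
)

# One output row per year; na.rm = TRUE drops missing values before summarizing
by_year <- air %>%
    group_by(year) %>%
    summarize(
        pm25 = mean(pm25, na.rm = TRUE),
        o3 = max(o3tmean2, na.rm = TRUE)
    )

print(by_year)
```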
options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.3.0)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ dplyr * 1.1.2 2023-04-20 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fansi 1.0.4 2023-01-22 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.0)
+ generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0)
+ ggplot2 * 3.4.3 2023-08-14 [1] CRAN (R 4.3.1)
+ glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0)
+ gtable 0.3.3 2023-03-21 [1] CRAN (R 4.3.0)
+ here * 1.0.1 2020-12-13 [1] CRAN (R 4.3.0)
+ hms 1.1.3 2023-03-21 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0)
+ lubridate * 1.9.2 2023-02-10 [1] CRAN (R 4.3.0)
+ magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0)
+ munsell 0.5.0 2018-06-12 [1] CRAN (R 4.3.0)
+ pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0)
+ pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0)
+ purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.0)
+ R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0)
+ readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.3.0)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ scales 1.2.1 2022-08-20 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ stringi 1.7.12 2023-01-11 [1] CRAN (R 4.3.0)
+ stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.3.0)
+ tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.0)
+ tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.3.0)
+ tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0)
+ tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.0)
+ timechange 0.2.0 2023-01-11 [1] CRAN (R 4.3.0)
+ tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.3.0)
+ utf8 1.2.3 2023-01-31 [1] CRAN (R 4.3.0)
+ vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.3.0)
+ withr 2.5.0 2022-03-03 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
“Happy families are all alike; every unhappy family is unhappy in its own way.” —Leo Tolstoy
@@ -5579,7 +6085,7 @@ font-style: inherit;">library(tidyverse) relig_income %>% - pivot_longer("respondents") %>% - mutate(# Gather everything EXCEPT religion to tidy datarelig_income %>% - pivot_longer(relig_income %>% - pivot_longer("respondents") %>% - mutate(income = factor(income)) %>% - %>% + group_by(income) %>% - %>% + summarize(sum(respondents)) %>% - pivot_wider(pivot_wider( + names_from = "income", - "income", + values_from = "total_respondents") "total_respondents" + ) %>% - knitr knitr::<- tibble( - "company" :3, each=each = 4), - 4), + "year" "year" = 2009, 3), - "Q1" "Q1" = size = 12), - "Q2" "Q2" = size = 12), - "Q3" "Q3" = size = 12), - "Q4" "Q4" = 12),
+ 1 1 2006 99 6 54 47 + 2 1 2007 28 79 90 9 + 3 1 2008 7 72 69 24 + 4 1 2009 16 56 6 100 + 5 2 2006 42 58 75 25 + 6 2 2007 64 1 100 6 + 7 2 2008 43 88 37 77 + 8 2 2009 95 74 17 44 + 9 3 2006 34 47 77 38 +10 3 2007 73 31 31 54 +11 3 2008 4 49 93 0 +12 3 2009 57 4 45 96# A tibble: 12 × 6 company year Q1 Q2 Q3 Q4 <int> <int> <int> <int> <int> <int> - 1 1 2006 34 7 70 7 - 2 1 2007 72 26 96 64 - 3 1 2008 62 68 45 98 - 4 1 2009 45 48 42 92 - 5 2 2006 51 13 75 36 - 6 2 2007 49 71 34 93 - 7 2 2008 100 83 22 71 - 8 2 2009 91 67 28 80 - 9 3 2006 19 28 85 1 -10 3 2007 61 38 65 75 -11 3 2008 32 57 47 51 -12 3 2009 4 58 63 0
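A short sketch of reshaping quarterly columns like the ones above, assuming the tidyr package is installed. The numbers are made up, only two quarters are shown, and the column names quarter and revenue are hypothetical choices for this example:

```r
# Assumes the 'tidyr' package is installed; values and new column names are illustrative.
library(tidyr)

sales <- tibble(
    company = c(1, 1, 2, 2),
    year = c(2006, 2007, 2006, 2007),
    Q1 = c(10, 20, 30, 40),
    Q2 = c(11, 21, 31, 41)
)

# Wide to long: one row per company, year, and quarter
sales_long <- sales %>%
    pivot_longer(c(Q1, Q2), names_to = "quarter", values_to = "revenue")

# Long back to wide: one column per quarter again
sales_wide <- sales_long %>%
    pivot_wider(names_from = "quarter", values_from = "revenue")

print(sales_long)
```

In this example pivot_wider() undoes pivot_longer(), which is a convenient way to check that no rows were lost along the way.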
@@ -6082,22 +6590,24 @@ font-style: inherit;"># try it yourself+font-style: inherit;"># try it yourself# try it yourself
+font-style: inherit;">"_" + )gapminder %>% - %>% + unite(unite( + col=col = "country_continent_year", - country"country_continent_year", + country:year, - :year, + sep=sep = "_")
# A tibble: 1,704 × 4 country_continent_year lifeExp pop gdpPercap @@ -6119,34 +6629,37 @@ font-style: inherit;">"_")
+font-style: inherit;">"_" + )gapminder %>% - %>% + unite(unite( + col=col = "country_continent_year", - country"country_continent_year", + country:year, - :year, + sep=sep = "_") "_" + ) %>% - %>% + separate(separate( + col=col = "country_continent_year", - "country_continent_year", + into=into = c("country", "continent", "year"), - "year"), + sep=sep = "_")
# A tibble: 1,704 × 6 country continent year lifeExp pop gdpPercap @@ -6213,8 +6727,8 @@ font-style: inherit;">"d,e,f,g", "h,i,j")) %>% - %>% + separate(x, "d,e", "f,g,i")) %>% - %>% + separate(x, +
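The unite() and separate() round trip above can be tried on a couple of rows. This sketch assumes the tidyr package is installed and uses a small hand-written tibble in the shape of gapminder (illustrative values, not the real data set):

```r
# Assumes the 'tidyr' package is installed; rows are illustrative, not real gapminder data.
library(tidyr)

df <- tibble(
    country = c("Afghanistan", "Albania"),
    continent = c("Asia", "Europe"),
    year = c(1952L, 1957L),
    lifeExp = c(28.8, 59.3)
)

# Paste three columns into one, separated by "_"
united <- df %>%
    unite(col = "country_continent_year", country:year, sep = "_")

# Split the combined column back into its three parts
separated <- united %>%
    separate(
        col = "country_continent_year",
        into = c("country", "continent", "year"),
        sep = "_"
    )

print(separated)
```

Note that year comes back as a character column after separate(), because the combined column was text.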
R session information
++- ]]> @@ -6293,7 +6891,7 @@ Tip+options(width = 120) +sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.3.0)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ dplyr * 1.1.2 2023-04-20 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fansi 1.0.4 2023-01-22 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.0)
+ gapminder * 1.0.0 2023-03-10 [1] CRAN (R 4.3.0)
+ generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0)
+ ggplot2 * 3.4.3 2023-08-14 [1] CRAN (R 4.3.1)
+ glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0)
+ gtable 0.3.3 2023-03-21 [1] CRAN (R 4.3.0)
+ hms 1.1.3 2023-03-21 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0)
+ lubridate * 1.9.2 2023-02-10 [1] CRAN (R 4.3.0)
+ magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0)
+ munsell 0.5.0 2018-06-12 [1] CRAN (R 4.3.0)
+ pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0)
+ pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0)
+ purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.0)
+ R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0)
+ readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ scales 1.2.1 2022-08-20 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ stringi 1.7.12 2023-01-11 [1] CRAN (R 4.3.0)
+ stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.3.0)
+ tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.0)
+ tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.3.0)
+ tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0)
+ tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.0)
+ timechange 0.2.0 2023-01-11 [1] CRAN (R 4.3.0)
+ tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.3.0)
+ utf8 1.2.3 2023-01-31 [1] CRAN (R 4.3.0)
+ vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.3.0)
+ withr 2.5.0 2022-03-03 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
here tidyverse https://lcolladotor.github.io/jhustatcomputing2023/posts/09-tidy-data-and-the-tidyverse/index.html -Fri, 18 Aug 2023 01:10:34 GMT +Fri, 18 Aug 2023 01:51:49 GMT - tibble( - id = each = 3), - visit = 2, 3), - outcome = print(outcomes)
10 - Joining data in R @@ -6304,6 +6902,7 @@ Tip +This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
Pre-lecture materials
@@ -6432,7 +7031,7 @@ background-color: null; font-style: inherit;"><-+1 a 0 3.07 +2 a 1 3.25 +3 a 2 3.93 +4 b 0 2.18 +5 b 1 2.91 +6 b 2 2.83 +7 c 0 1.49 +8 c 1 2.56 +9 c 2 1.46# A tibble: 9 × 3 id visit outcome <chr> <int> <dbl> -1 a 0 1.54 -2 a 1 3.39 -3 a 2 3.03 -4 b 0 0.309 -5 b 1 2.52 -6 b 2 3.03 -7 c 0 2.13 -8 c 1 3.12 -9 c 2 3.99
Note that subjects are labeled by a unique identifier in the
@@ -6506,7 +7105,7 @@ background-color: null; font-style: inherit;"><- tibble( - id = "b", "c"), - house =id
column.subjects
@@ -6654,15 +7253,15 @@ font-style: inherit;">"id")+1 a 0 3.07 detached +2 a 1 3.25 detached +3 a 2 3.93 detached +4 b 0 2.18 rowhouse +5 b 1 2.91 rowhouse +6 b 2 2.83 rowhouse +7 c 0 1.49 rowhouse +8 c 1 2.56 rowhouse +9 c 2 1.46 rowhouse# A tibble: 9 × 4 id visit outcome house <chr> <int> <dbl> <chr> -1 a 0 1.54 detached -2 a 1 3.39 detached -3 a 2 3.03 detached -4 b 0 0.309 rowhouse -5 b 1 2.52 rowhouse -6 b 2 3.03 rowhouse -7 c 0 2.13 rowhouse -8 c 1 3.12 rowhouse -9 c 2 3.99 rowhouse
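The left join shown above can be reproduced with fixed numbers. This sketch assumes the dplyr package is installed; the outcome values are copied from the printed table rather than regenerated at random, so the result is the same on every run:

```r
# Assumes the 'dplyr' package is installed.
library(dplyr)

# Outcome values copied from the printed output above instead of being simulated
outcomes <- tibble(
    id = rep(c("a", "b", "c"), each = 3),
    visit = rep(0:2, 3),
    outcome = c(3.07, 3.25, 3.93, 2.18, 2.91, 2.83, 1.49, 2.56, 1.46)
)

subjects <- tibble(
    id = c("a", "b", "c"),
    house = c("detached", "rowhouse", "rowhouse")
)

# Every row of outcomes is kept; house is looked up through the shared id column
joined <- left_join(outcomes, subjects, by = "id")

print(joined)
```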
@@ -6687,7 +7286,7 @@ background-color: null; font-style: inherit;"><- tibble( - id = "b", "c"), - visit = 1, 0), - house = "visit"))+1 a 0 3.07 detached +2 a 1 3.25 <NA> +3 a 2 3.93 <NA> +4 b 0 2.18 <NA> +5 b 1 2.91 rowhouse +6 b 2 2.83 <NA> +7 c 0 1.49 rowhouse +8 c 1 2.56 <NA> +9 c 2 1.46 <NA># A tibble: 9 × 4 id visit outcome house <chr> <dbl> <dbl> <chr> -1 a 0 1.54 detached -2 a 1 3.39 <NA> -3 a 2 3.03 <NA> -4 b 0 0.309 <NA> -5 b 1 2.52 rowhouse -6 b 2 3.03 <NA> -7 c 0 2.13 rowhouse -8 c 1 3.12 <NA> -9 c 2 3.99 <NA>
@@ -6786,7 +7385,7 @@ background-color: null; font-style: inherit;"><- tibble( - id = "b", "c"), - visit = 1, 0), - house = "visit"))+1 a 0 3.07 <NA> +2 a 1 3.25 <NA> +3 a 2 3.93 <NA> +4 b 0 2.18 <NA> +5 b 1 2.91 rowhouse +6 b 2 2.83 <NA> +7 c 0 1.49 rowhouse +8 c 1 2.56 <NA> +9 c 2 1.46 <NA># A tibble: 9 × 4 id visit outcome house <chr> <dbl> <dbl> <chr> -1 a 0 1.54 <NA> -2 a 1 3.39 <NA> -3 a 2 3.03 <NA> -4 b 0 0.309 <NA> -5 b 1 2.52 rowhouse -6 b 2 3.03 <NA> -7 c 0 2.13 rowhouse -8 c 1 3.12 <NA> -9 c 2 3.99 <NA>
@@ -6897,8 +7496,8 @@ font-style: inherit;">"visit"))+1 b 1 2.91 rowhouse +2 c 0 1.49 rowhouse @@ -6925,8 +7524,8 @@ font-style: inherit;">"visit"))# A tibble: 2 × 4 id visit outcome house <chr> <dbl> <dbl> <chr> -1 b 1 2.52 rowhouse -2 c 0 2.13 rowhouse
+1 b 1 2.91 rowhouse +2 c 0 1.49 rowhouse @@ -6966,7 +7565,8 @@ font-style: inherit;"># Create first example data frame background-color: null; font-style: inherit;"><- data.frame(data.frame( + ID = :3, - X1 = "a1", "a2", "a3")) -"a3") +) +# Create second example data frame -df2 df2 <- data.frame(data.frame( + ID = 2:4, - 4, + X2 = "b1", "b2", "b3")) +font-style: inherit;">"b3") +)# A tibble: 2 × 4 id visit outcome house <chr> <dbl> <dbl> <chr> -1 b 1 2.52 rowhouse -2 c 0 2.13 rowhouse
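The three joins can be compared side by side on the two example data frames. This is a sketch assuming the dplyr package is installed and that the first data frame uses ID = 1:3 (the start of that range is not legible in the listing above, so it is an assumption):

```r
# Assumes the 'dplyr' package is installed; ID = 1:3 for df1 is an assumption.
library(dplyr)

df1 <- data.frame(ID = 1:3, X1 = c("a1", "a2", "a3"))
df2 <- data.frame(ID = 2:4, X2 = c("b1", "b2", "b3"))

# Rows with a match in both tables, with columns from both
inner <- inner_join(df1, df2, by = "ID")

# Rows of df1 that have a match in df2, keeping only df1's columns
semi <- semi_join(df1, df2, by = "ID")

# Rows of df1 that have no match in df2
anti <- anti_join(df1, df2, by = "ID")

print(inner)
print(semi)
print(anti)
```

semi_join() and anti_join() are filtering joins: they never add columns from the second table, they only decide which rows of the first one to keep.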
- Try changing the order from the above, e.g. inner_join(df2, df1), semi_join(df2, df1), and anti_join(df2, df1). What changed? What did not change?
@@ -7039,9 +7642,92 @@ Tip
+ ]]>
@@ -7053,7 +7739,7 @@ Tip
R session information
++-+options(width = 120) +sessioninfo::session_info()
+++─ Session info ─────────────────────────────────────────────────────────────────────────────────────────────────────── + setting value + version R version 4.3.1 (2023-06-16) + os macOS Ventura 13.5 + system aarch64, darwin20 + ui X11 + language (EN) + collate en_US.UTF-8 + ctype en_US.UTF-8 + tz America/New_York + date 2023-08-17 + pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown) + +─ Packages ─────────────────────────────────────────────────────────────────────────────────────────────────────────── + package * version date (UTC) lib source + cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0) + colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd) + colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.3.0) + digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0) + dplyr * 1.1.2 2023-04-20 [1] CRAN (R 4.3.0) + evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0) + fansi 1.0.4 2023-01-22 [1] CRAN (R 4.3.0) + fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0) + forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.0) + generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0) + ggplot2 * 3.4.3 2023-08-14 [1] CRAN (R 4.3.1) + glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0) + gtable 0.3.3 2023-03-21 [1] CRAN (R 4.3.0) + hms 1.1.3 2023-03-21 [1] CRAN (R 4.3.0) + htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0) + htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0) + jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0) + knitr * 1.43 2023-05-25 [1] CRAN (R 4.3.0) + lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0) + lubridate * 1.9.2 2023-02-10 [1] CRAN (R 4.3.0) + magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0) + munsell 0.5.0 2018-06-12 [1] CRAN (R 4.3.0) + pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0) + pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0) + purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.0) + R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0) + readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.0) + rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0) + rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1) + rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0) + scales 1.2.1 2022-08-20 [1] CRAN (R 
4.3.0) + sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0) + stringi 1.7.12 2023-01-11 [1] CRAN (R 4.3.0) + stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.3.0) + tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.0) + tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.3.0) + tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0) + tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.0) + timechange 0.2.0 2023-01-11 [1] CRAN (R 4.3.0) + tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.3.0) + utf8 1.2.3 2023-01-31 [1] CRAN (R 4.3.0) + vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.3.0) + withr 2.5.0 2022-03-03 [1] CRAN (R 4.3.0) + xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0) + yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0) + + [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library + +──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
here tidyverse https://lcolladotor.github.io/jhustatcomputing2023/posts/10-joining-data-in-r/index.html -Fri, 18 Aug 2023 01:10:34 GMT +Fri, 18 Aug 2023 01:51:49 GMT
This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
@@ -7165,10 +7852,10 @@ font-style: inherit;">data(airquality) with(airquality, { - plot(Temp, Ozone) - lines(data(airquality) with(airquality, { - plot(Temp, Ozone, main = "my plot") - lines(data(mpg) mpg %>% - ggplot(aes(displ, hwy)) + - + + geom_point()The data may not contain the answer. And, if you torture the data long enough, it will tell you anything. —John W. Tukey
There are additional functions in ggplot2
that allow you to make arbitrarily sophisticated plots.
We will discuss more about this in the next lecture.
+There are additional functions in ggplot2
that allow you to make arbitrarily sophisticated plots.
We will discuss more about this in the next lecture.
+ + +options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.3.0)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ dplyr * 1.1.2 2023-04-20 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fansi 1.0.4 2023-01-22 [1] CRAN (R 4.3.0)
+ farver 2.1.1 2022-07-06 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.0)
+ generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0)
+ ggplot2 * 3.4.3 2023-08-14 [1] CRAN (R 4.3.1)
+ glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0)
+ gtable 0.3.3 2023-03-21 [1] CRAN (R 4.3.0)
+ hms 1.1.3 2023-03-21 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ labeling 0.4.2 2020-10-20 [1] CRAN (R 4.3.0)
+ lattice * 0.21-8 2023-04-05 [1] CRAN (R 4.3.1)
+ lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0)
+ lubridate * 1.9.2 2023-02-10 [1] CRAN (R 4.3.0)
+ magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0)
+ munsell 0.5.0 2018-06-12 [1] CRAN (R 4.3.0)
+ pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0)
+ pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0)
+ purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.0)
+ R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0)
+ readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ scales 1.2.1 2022-08-20 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ stringi 1.7.12 2023-01-11 [1] CRAN (R 4.3.0)
+ stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.3.0)
+ tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.0)
+ tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.3.0)
+ tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0)
+ tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.0)
+ timechange 0.2.0 2023-01-11 [1] CRAN (R 4.3.0)
+ tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.3.0)
+ utf8 1.2.3 2023-01-31 [1] CRAN (R 4.3.0)
+ vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.3.0)
+ withr 2.5.0 2022-03-03 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
@@ -7583,11 +8357,11 @@ Example“The greatest value of a picture is when it forces us to notice what we never expected to see.” —John Tukey
with(airquality, {
- with(airquality, {
+ plot(Temp, Ozone)
- lines(library(tidyverse)
airquality %>%
    ggplot(aes(Temp, Ozone)) +
    geom_point() +
    geom_smooth(
        method = "loess",
        se = FALSE
    ) +
    theme_minimal()
penguins %>%
- %>%
+ count(species)
options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ bit 4.0.5 2022-11-15 [1] CRAN (R 4.3.0)
+ bit64 4.0.5 2020-08-30 [1] CRAN (R 4.3.0)
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.3.0)
+ crayon 1.5.2 2022-09-29 [1] CRAN (R 4.3.0)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ dplyr * 1.1.2 2023-04-20 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fansi 1.0.4 2023-01-22 [1] CRAN (R 4.3.0)
+ farver 2.1.1 2022-07-06 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.0)
+ generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0)
+ ggplot2 * 3.4.3 2023-08-14 [1] CRAN (R 4.3.1)
+ glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0)
+ gtable 0.3.3 2023-03-21 [1] CRAN (R 4.3.0)
+ here * 1.0.1 2020-12-13 [1] CRAN (R 4.3.0)
+ hms 1.1.3 2023-03-21 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ labeling 0.4.2 2020-10-20 [1] CRAN (R 4.3.0)
+ lattice 0.21-8 2023-04-05 [1] CRAN (R 4.3.1)
+ lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0)
+ lubridate * 1.9.2 2023-02-10 [1] CRAN (R 4.3.0)
+ magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0)
+ Matrix 1.6-1 2023-08-14 [1] CRAN (R 4.3.0)
+ mgcv 1.9-0 2023-07-11 [1] CRAN (R 4.3.0)
+ munsell 0.5.0 2018-06-12 [1] CRAN (R 4.3.0)
+ nlme 3.1-163 2023-08-09 [1] CRAN (R 4.3.0)
+ palmerpenguins * 0.1.1 2022-08-15 [1] CRAN (R 4.3.0)
+ pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0)
+ pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0)
+ purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.0)
+ R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0)
+ readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.3.0)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ scales 1.2.1 2022-08-20 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ stringi 1.7.12 2023-01-11 [1] CRAN (R 4.3.0)
+ stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.3.0)
+ tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.0)
+ tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.3.0)
+ tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0)
+ tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.0)
+ timechange 0.2.0 2023-01-11 [1] CRAN (R 4.3.0)
+ tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.3.0)
+ utf8 1.2.3 2023-01-31 [1] CRAN (R 4.3.0)
+ vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.3.0)
+ vroom 1.6.3 2023-04-28 [1] CRAN (R 4.3.0)
+ withr 2.5.0 2022-03-03 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
# A tibble: 517 × 4
logpm25 logno2_new bmicat NocturnalSympt
@@ -8943,13 +9817,15 @@ font-style: inherit;"><- ggplot(maacs, aes(aes(
+ x = logpm25,
- x = logpm25,
+ y = NocturnalSympt))
-y = NocturnalSympt
+))
+summary(g)
g +
    geom_point() +
    geom_smooth()
g +
- +
+ geom_point() +
- +
+ geom_smooth(# try it yourself
library(palmerpenguins)
-penguins
# A tibble: 344 × 8
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -9149,13 +10025,13 @@ Example
g +
- +
+ geom_point() +
- +
+ geom_smooth("lm") +
- facet_grid(. ~ bmicat)
+font-style: inherit;">~ bmicat)
@@ -9257,9 +10133,9 @@ font-style: inherit;">4, alpha = 11 // 2)
@@ -9280,52 +10156,55 @@ font-style: inherit;">2)
g +
    geom_point(aes(color = bmicat),
        size = 2,
        alpha = 1 / 2
    ) +
    geom_smooth(
        size = 4,
        linetype = 3,
        method = "lm",
        se = FALSE
    )
Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` instead.
@@ -9368,8 +10247,8 @@ Note
g +
- +
+ geom_point(aes(color = bmicat)) +
- +
+ theme_bw(# try it yourself
library(palmerpenguins)
-penguins
+penguins
# A tibble: 344 × 8
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -9460,8 +10339,8 @@ Note
g +
- +
+ geom_point(aes(color = bmicat)) +
- +
+ labs(title = "MAACS Cohort") +
- +
+ labs(labs(
+ x = "log " * PM[2.5]),
- 2.5]),
+ y = "Nocturnal Symptoms")
+font-style: inherit;">"Nocturnal Symptoms"
+ )
background-color: null;
font-style: inherit;"><- data.frame(data.frame(
+ x = 1:100,
- 100,
+ y = rnorm(100))
-testdat[100)
+)
+testdat[50,50, 2] <- 100 100 ## Outlier!
-plot(testdat$x,
- testdat$x,
+ testdat$y,
- type = "l",
- "l",
+ ylim = c(-3,3, 3))
+font-style: inherit;">3)
+)
g +
- +
+ geom_line() +
- +
+ ylim(3)
g +
- +
+ geom_line() +
- +
+ coord_cartesian(<- maacs %>%
- ggplot(geom_point(alpha = 11 // 3) +
- +
+ facet_grid(bmicat ~ no2tert) +
- +
+ geom_smooth(method=method = "lm", se=se = FALSE, col=col = "steelblue") +
- +
+ theme_bw(base_size = 10) +
- +
+ labs(* PM[2.5])) +
- +
+ labs(y = "Nocturnal Symptoms") +
- +
+ labs(
+R session information
+
+options(width = 120)
+sessioninfo::session_info()
+
+─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ bit 4.0.5 2022-11-15 [1] CRAN (R 4.3.0)
+ bit64 4.0.5 2020-08-30 [1] CRAN (R 4.3.0)
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.3.0)
+ crayon 1.5.2 2022-09-29 [1] CRAN (R 4.3.0)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ dplyr * 1.1.2 2023-04-20 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fansi 1.0.4 2023-01-22 [1] CRAN (R 4.3.0)
+ farver 2.1.1 2022-07-06 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.0)
+ generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0)
+ ggplot2 * 3.4.3 2023-08-14 [1] CRAN (R 4.3.1)
+ glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0)
+ gtable 0.3.3 2023-03-21 [1] CRAN (R 4.3.0)
+ here * 1.0.1 2020-12-13 [1] CRAN (R 4.3.0)
+ hms 1.1.3 2023-03-21 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ labeling 0.4.2 2020-10-20 [1] CRAN (R 4.3.0)
+ lattice 0.21-8 2023-04-05 [1] CRAN (R 4.3.1)
+ lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0)
+ lubridate * 1.9.2 2023-02-10 [1] CRAN (R 4.3.0)
+ magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0)
+ Matrix 1.6-1 2023-08-14 [1] CRAN (R 4.3.0)
+ mgcv 1.9-0 2023-07-11 [1] CRAN (R 4.3.0)
+ munsell 0.5.0 2018-06-12 [1] CRAN (R 4.3.0)
+ nlme 3.1-163 2023-08-09 [1] CRAN (R 4.3.0)
+ palmerpenguins * 0.1.1 2022-08-15 [1] CRAN (R 4.3.0)
+ pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0)
+ pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0)
+ purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.0)
+ R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0)
+ readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.3.0)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ scales 1.2.1 2022-08-20 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ stringi 1.7.12 2023-01-11 [1] CRAN (R 4.3.0)
+ stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.3.0)
+ tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.0)
+ tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.3.0)
+ tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0)
+ tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.0)
+ timechange 0.2.0 2023-01-11 [1] CRAN (R 4.3.0)
+ tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.3.0)
+ utf8 1.2.3 2023-01-31 [1] CRAN (R 4.3.0)
+ vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.3.0)
+ vroom 1.6.3 2023-04-28 [1] CRAN (R 4.3.0)
+ withr 2.5.0 2022-03-03 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+
+
-
]]>
@@ -9922,7 +10902,7 @@ Tip
ggplot2
data viz
https://lcolladotor.github.io/jhustatcomputing2023/posts/13-ggplot2-plotting-system-part-2/index.html
- Fri, 18 Aug 2023 01:10:34 GMT
+ Fri, 18 Aug 2023 01:51:49 GMT
-
14 - R Nuts and Bolts
@@ -9933,6 +10913,7 @@ Tip
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
Pre-lecture materials
@@ -11572,9 +12553,93 @@ Tip
+
+
+
+R session information
+
+options(width = 120)
+sessioninfo::session_info()
+
+─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.3.0)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ dplyr * 1.1.2 2023-04-20 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fansi 1.0.4 2023-01-22 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.0)
+ generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0)
+ ggplot2 * 3.4.3 2023-08-14 [1] CRAN (R 4.3.1)
+ glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0)
+ gtable 0.3.3 2023-03-21 [1] CRAN (R 4.3.0)
+ hms 1.1.3 2023-03-21 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0)
+ lubridate * 1.9.2 2023-02-10 [1] CRAN (R 4.3.0)
+ magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0)
+ munsell 0.5.0 2018-06-12 [1] CRAN (R 4.3.0)
+ palmerpenguins * 0.1.1 2022-08-15 [1] CRAN (R 4.3.0)
+ pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0)
+ pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0)
+ purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.0)
+ R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0)
+ readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ scales 1.2.1 2022-08-20 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ stringi 1.7.12 2023-01-11 [1] CRAN (R 4.3.0)
+ stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.3.0)
+ tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.0)
+ tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.3.0)
+ tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0)
+ tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.0)
+ timechange 0.2.0 2023-01-11 [1] CRAN (R 4.3.0)
+ tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.3.0)
+ utf8 1.2.3 2023-01-31 [1] CRAN (R 4.3.0)
+ vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.3.0)
+ withr 2.5.0 2022-03-03 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+
+
-
]]>
@@ -11583,7 +12648,7 @@ Tip
R
programming
https://lcolladotor.github.io/jhustatcomputing2023/posts/14-r-nuts-and-bolts/index.html
- Fri, 18 Aug 2023 01:10:34 GMT
+ Fri, 18 Aug 2023 01:51:49 GMT
-
15 - Control Structures
@@ -11594,6 +12659,7 @@ Tip
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
Pre-lecture materials
@@ -11751,20 +12817,20 @@
x <- runif(n = 1, min = 0, max = 10)
x
-[1] 1.907048
+[1] 3.521267
Then, we can write an if-else statement that tests whether x is greater than 3 or not.
@@ -11775,14 +12841,14 @@
x > 3
-[1] FALSE
+[1] TRUE
If x is greater than 3, then the first branch runs and y is set to 10. If x is not greater than 3, then the else branch runs and y is set to 0.
if (x > 3) {
    y <- 10
} else {
    y <- 0
}
Finally, we can auto print y
to see what the value is.
y
[1] 0
+[1] 10
This expression can also be written a different (but equivalent!) way in R.
@@ -11815,7 +12881,7 @@ font-style: inherit;">0 background-color: null; font-style: inherit;"><- if(x if (x > 3) { 10 - } } else { +font-style: inherit;">else { 0 - } +} y[1] 0
+[1] 10
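The two equivalent forms can be compared side by side. This is a minimal sketch that assumes a fixed value for x (instead of the random draw used above) so the result is reproducible; y1 and y2 are illustrative names, not from the lecture.

```r
## Assumption: fix 'x' rather than drawing it with runif(),
## so that the comparison below is reproducible
x <- 5

## Form 1: assign inside each branch
if (x > 3) {
    y1 <- 10
} else {
    y1 <- 0
}

## Form 2: if/else is an expression that returns a value,
## so the result can be assigned directly
y2 <- if (x > 3) {
    10
} else {
    0
}

## Both forms produce the same value
identical(y1, y2)
```

The second form avoids repeating the assignment target in every branch, which can help when there are many branches.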
# A tibble: 344 × 8
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -11928,7 +12994,7 @@ font-style: inherit;">library(palmerpenguins)
for (i in 1:10) {
    print(i)
}
@@ -11979,7 +13045,7 @@ background-color: null;
## create for loop
for (i in 1:4) {
    ## Print out each element of 'x'
    print(x[i])
}
[1] "a"
@@ -12026,7 +13092,7 @@ background-color: null;
## create for loop
for (i in 1:4) {
    ## Print out just 'i'
    print(i)
}
@@ -12072,12 +13138,12 @@ background-color: null;
## Generate a sequence based on length of 'x'
for (i in seq_along(x)) {
    print(x[i])
}
for (babyshark in x) {
    print(babyshark)
}
for (candyisgreat in x) {
    print(candyisgreat)
}
for (RememberToVote in x) {
    print(RememberToVote)
}
for (1999 in x) {
    print(1999)
}
Error: <text>:1:5: unexpected numeric constant
-1: for(1999
- ^
+Error: <text>:1:6: unexpected numeric constant
+1: for (1999
+ ^
For one-line loops, the curly braces are not strictly necessary.
for(i for (i in 3)
for (i in seq_len(nrow(x))) {
    for (j in seq_len(ncol(x))) {
        print(x[i, j])
    }
}
[1] 1
@@ -12304,15 +13370,15 @@ background-color: null;
font-style: inherit;">0
while(count while (count < 10) {
- print(count)
- count count <- count 1)
while(z while (z >= <= 10) {
- coin coin <- 1, 0.5)
-
-
+ if(coin if (coin == 1) { 1) { ## random walk
- z z <- z + 1
- } } else {
- z z <- z - 1
- }
+ }
}
1e-8
repeat {
- x1 x1 <- computeEstimate()
-
-
+ if(if (abs(x1 - x0) < tol) { < tol) { ## Close enough?
- break
- } } else {
- x0 x0 <- x1
- }
+ }
}
@@ -12608,7 +13674,7 @@ Pro-tip
for (i in 1:100) {
    if (i <= 20) {
        ## Skip the first 20 iterations
        next
    }
    ## Do something here
}
@@ -12640,7 +13706,7 @@ font-style: italic;">## Do something here
for (i in 1:100) {
    print(i)
    if (i > 20) {
        ## Stop loop after 20 iterations
        break
    }
}
@@ -12721,9 +13787,93 @@ Tip
+
+
+
+R session information
+
+options(width = 120)
+sessioninfo::session_info()
+
+─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.3.0)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ dplyr * 1.1.2 2023-04-20 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fansi 1.0.4 2023-01-22 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.0)
+ generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0)
+ ggplot2 * 3.4.3 2023-08-14 [1] CRAN (R 4.3.1)
+ glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0)
+ gtable 0.3.3 2023-03-21 [1] CRAN (R 4.3.0)
+ hms 1.1.3 2023-03-21 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0)
+ lubridate * 1.9.2 2023-02-10 [1] CRAN (R 4.3.0)
+ magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0)
+ munsell 0.5.0 2018-06-12 [1] CRAN (R 4.3.0)
+ palmerpenguins * 0.1.1 2022-08-15 [1] CRAN (R 4.3.0)
+ pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0)
+ pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0)
+ purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.0)
+ R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0)
+ readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ scales 1.2.1 2022-08-20 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ stringi 1.7.12 2023-01-11 [1] CRAN (R 4.3.0)
+ stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.3.0)
+ tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.0)
+ tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.3.0)
+ tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0)
+ tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.0)
+ timechange 0.2.0 2023-01-11 [1] CRAN (R 4.3.0)
+ tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.3.0)
+ utf8 1.2.3 2023-01-31 [1] CRAN (R 4.3.0)
+ vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.3.0)
+ withr 2.5.0 2022-03-03 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+
+
-
]]>
@@ -12732,7 +13882,7 @@ Tip
R
programming
https://lcolladotor.github.io/jhustatcomputing2023/posts/15-control-structures/index.html
- Fri, 18 Aug 2023 01:10:34 GMT
+ Fri, 18 Aug 2023 01:51:49 GMT
-
16 - Functions
@@ -12743,6 +13893,7 @@ Tip
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
Pre-lecture materials
@@ -12854,7 +14005,7 @@
f <- function() {
    ## This is an empty function
}
@@ -12863,7 +14014,7 @@ background-color: null;
font-style: italic;">## Functions have their own class
class(f)
[1] "function"
NULL
[1] 1.110707
+[1] 1.014286
## Specify 'x' argument by name, default for 'na.rm'
background-color: null;
font-style: inherit;">sd(x = mydata)
[1] 1.110707
+[1] 1.014286
x = mydata, na.rm = FALSE)
[1] 1.110707
+[1] 1.014286
[1] 1.110707
+[1] 1.014286
You can mix positional matching with matching by name.
@@ -13331,7 +14482,7 @@ font-style: inherit;">na.rm = FALSE, mydata)[1] 1.110707
+[1] 1.014286
Here, the mydata
object is assigned to the x
argument, because it’s the only argument not yet specified.
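The matching rules can be checked directly. The sketch below assumes a small made-up vector for mydata (not the data used in the lecture) and relies only on the fact that sd() has the formal arguments x and na.rm.

```r
## Assumption: 'mydata' is a small made-up vector, not the lecture's data
mydata <- c(2, 4, 4, 4, 5, 5, 7, 9)

## The formal arguments of sd(), in order
names(formals(sd))

## Positional matching: 'mydata' fills the first formal argument, 'x'
by_position <- sd(mydata)

## Matching by name works in any order
by_name <- sd(na.rm = FALSE, x = mydata)

## Mixed: 'na.rm' is matched by name first, so the unnamed 'mydata'
## is assigned to the only argument still unmatched, 'x'
mixed <- sd(na.rm = FALSE, mydata)

## All three calls compute the same standard deviation
all.equal(by_position, by_name)
all.equal(by_position, mixed)
```

Mixing the two styles is valid, but naming every argument after the first usually makes calls easier to read.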
function (x, ...)
UseMethod("mean")
-<bytecode: 0x138e33de8>
+<bytecode: 0x1075ea1e8>
<environment: namespace:base>
In many programming languages, this would be an error, because y
is not defined inside the function.
In R, this is valid code because R uses rules called lexical scoping to find the value associated with a name.
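A small sketch of this lookup rule follows. The function add_y() and the values used are hypothetical and chosen only for illustration.

```r
## Hypothetical function: 'y' is a free variable here, meaning it is
## neither an argument nor defined inside the function body
add_y <- function(x) {
    x + y
}

## R looks for 'y' in the environment where add_y() was defined
y <- 100
first_result <- add_y(10)

## The lookup happens when the function runs, not when it is created,
## so changing 'y' changes the result of the next call
y <- 1000
second_result <- add_y(10)

c(first_result, second_result)
```

Relying on free variables like this can make functions harder to reason about, so passing values as explicit arguments is often the safer design.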
@@ -13828,7 +14979,7 @@ background-color: null; font-style: inherit;"><- function(x, y) { - if (< 0.1) { - sum(x, y) - } } else { - sum(x, y) * 1.1 - } + } } 2))
3 3.3
- 95 905
+ 82 918
options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
[1] 1 4 9 16 25 36 49 64 81 100
lapply(x, mean)
Notice that here we are passing the mean()
function as an argument to the lapply()
function.
[[1]]
-[1] 0.4687761
+[1] 0.5924944
[[2]]
-[1] 0.9249996 0.3011933
+[1] 0.8660588 0.3277243
[[3]]
-[1] 0.5811661 0.1755092 0.5232761
+[1] 0.5009080 0.2951163 0.6264905
[[4]]
-[1] 0.6459540 0.3708483 0.6723211 0.7998949
+[1] 0.04282267 0.14951908 0.82034538 0.64614463
[[1]]
-[1] 8.291326
+[1] 5.653385
[[2]]
-[1] 8.893872 9.878169
+[1] 8.325503 7.234466
[[3]]
-[1] 5.5325986 0.4374242 7.2026176
+[1] 5.968981 9.174316 7.920678
[[4]]
-[1] 1.6807689 0.2755822 8.5226424 9.5019399
+[1] 9.491500 3.023649 2.990945 8.757496
So now, instead of the random numbers being between 0 and 1 (the default), they are all between 0 and 10.
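The mechanism is that any extra arguments given to lapply() after the function are passed on to that function for every element. A minimal sketch, assuming a seed that is added here only for reproducibility and an illustrative object name draws:

```r
## Assumption: the seed is added only to make this sketch reproducible
set.seed(1)

## 'min' and 'max' are not arguments of lapply(); they are passed
## through to runif() on each call, while the elements 1:4 fill 'n'
draws <- lapply(1:4, runif, min = 0, max = 10)

## Element i holds i random numbers
lengths(draws)

## Every value lies between 0 and 10
range(unlist(draws))
```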
@@ -14495,7 +15700,7 @@ font-style: inherit;">6, 3, 2)) +font-style: inherit;">2)) x$a
@@ -14516,9 +15721,11 @@ $b
background-color: null;
font-style: inherit;">lapply(x, function(elt) { elt[,function(elt) {
+ elt[, 1] })
$a
[1] 1 2
@@ -14538,7 +15745,7 @@ background-color: null;
font-style: inherit;"><- function(elt) {
- elt[, elt[, 1]
}
@@ -14621,13 +15828,13 @@ font-style: inherit;">lapply(x, mean)
Notice that lapply()
returns a list (as usual), but that each element of the list has length 1.
sapply(x, mean)
a b c d
- 2.5000000 -0.3561419 1.0788156 5.0209365
+ 2.5000000 -0.1478465 0.8197940 4.9544836
Because the result of lapply()
was a list where each element had length 1, sapply()
collapsed the output into a numeric vector, which is often more useful than a list.
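The difference in return types can be seen with a small example. The list below is made up for illustration and is not the one used in the lecture.

```r
## Assumption: a small made-up list, not the lecture's 'x'
x <- list(a = 1:4, b = c(10, 20), c = c(0, 5, 10))

## lapply() always returns a list, here with one length-1 element per name
res_list <- lapply(x, mean)

## sapply() simplifies that result to a named numeric vector
res_vec <- sapply(x, mean)

class(res_list)
class(res_vec)
res_vec
```

When the shape of the result matters, vapply(x, mean, numeric(1)) is a stricter alternative, because it raises an error if any element does not return a single number.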
$`1`
- [1] -0.88306749 -1.86719488 0.63289913 1.05916422 -0.55471433 0.14180641
- [7] 0.07777047 -0.09623353 0.80288817 -0.07352678
+ [1] 0.78541247 -0.06267966 -0.89713180 0.11796725 0.66689447 -0.02523006
+ [7] -0.19081948 0.44974528 -0.51005146 -0.08103298
$`2`
- [1] 0.52710414 0.78458044 0.85538500 0.11115802 0.43938934 0.30846324
- [7] 0.12611702 0.92352094 0.07062165 0.61957181
+ [1] 0.29977033 0.31873253 0.53182993 0.85507540 0.21585775 0.89867742
+ [7] 0.78109747 0.06887742 0.79661568 0.60022565
$`3`
- [1] -0.67639542 0.72492785 0.10007215 0.29327660 0.85127149 0.50446636
- [7] 0.05115469 2.29881193 -0.63035160 2.09792647
+ [1] -0.38262045 0.06294368 0.41768485 1.57972821 1.17555228 1.47374130
+ [7] 1.79199913 2.25569283 1.55226509 -1.51811384
A common idiom is split
followed by an lapply
.
$`1`
-[1] -0.07602086
+[1] 0.0253074
$`2`
-[1] 0.4765912
+[1] 0.536676
$`3`
-[1] 0.5615161
+[1] 0.8408873
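The split-then-`lapply` idiom can be sketched as follows. The data here are simulated under an arbitrary seed, so the group means will differ from the ones printed above.

```r
set.seed(3) # arbitrary seed for reproducibility
x <- c(rnorm(10), runif(10), rnorm(10, 1))
f <- gl(3, 10) # factor with 3 levels, each repeated 10 times

# split() returns a list with one vector per level of f;
# lapply() then summarizes each piece separately.
group_means <- lapply(split(x, f), mean)
names(group_means)
```

For a simple summary like this, `tapply(x, f, mean)` gives the same numbers as a named vector rather than a list.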
5 6 7 8 9
Ozone 23.61538 29.44444 59.115385 59.961538 31.44828
@@ -14970,7 +16178,7 @@ font-style: inherit;">gl(3, 10)
+font-style: inherit;">10)
f
[1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3
@@ -14980,8 +16188,8 @@ Levels: 1 2 3
background-color: null;
font-style: inherit;">tapply(x, f, mean) 1 2 3
-0.03546858 0.50033323 1.23684289
+ 1 2 3
+0.3554738 0.5195466 0.6764006
$`1`
-[1] -1.597023 1.582242
+[1] -1.431912 2.695089
$`2`
-[1] 0.01799498 0.98731564
+[1] 0.1263379 0.8959040
$`3`
-[1] -0.1673642 2.8815083
+[1] -1.207741 1.696309
[,1] [,2] [,3] [,4] [,5] [,6]
-[1,] -0.01270296 0.12521307 -0.35347017 -0.288597192 0.4754956 -1.4952687
-[2,] -1.76025729 -0.36661801 1.57260727 0.909927684 -0.8722067 2.4145309
-[3,] -0.04541822 -0.08756584 0.09477815 0.587649433 -0.2839712 -0.3948512
-[4,] -0.79873007 2.33988787 0.04433525 -0.043574962 1.8351096 -1.4161750
-[5,] 0.57385840 0.22221005 -1.15025884 0.002239365 -1.1274753 0.2699411
-[6,] -0.79337310 0.15304664 0.05230485 2.088306453 -2.5307486 1.0901328
- [,7] [,8] [,9] [,10]
-[1,] -0.06995917 -0.970955222 -0.6081838 0.36135088
-[2,] 0.98219144 1.226671950 0.7388203 0.99107134
-[3,] 0.36028126 1.080908318 -1.4657096 -0.83599160
-[4,] -0.46741177 -0.341382567 0.6639626 0.90447006
-[5,] -0.63266831 -0.828562584 -0.5595121 -0.51470923
-[6,] 0.44488488 -0.005120275 -1.2554960 -0.09944684
+ [,1] [,2] [,3] [,4] [,5] [,6]
+[1,] 1.589728 0.7733454 -1.3311072 -0.77084025 -0.1947478 0.1748546
+[2,] 2.395088 0.3243910 -1.5133366 0.09199955 0.3850993 0.1851718
+[3,] 1.039643 -2.1721402 -0.9933217 -1.89261272 0.1748050 1.0563987
+[4,] -1.580978 -0.9884235 -1.4976744 -0.51011200 -2.7512079 0.5547477
+[5,] 1.264799 -2.0551874 0.4483417 -3.08561764 -0.1549359 -0.8384706
+[6,] 1.756973 0.9244522 0.2740854 -0.61441465 -1.0661350 1.4497808
+ [,7] [,8] [,9] [,10]
+[1,] 0.7163086 -0.01817166 0.2193225 -0.3346788
+[2,] 0.7606851 0.42082416 0.1099027 0.2834439
+[3,] -1.1218204 -1.17000278 0.4302792 -0.5684986
+[4,] 0.6082452 0.46763465 -0.3481830 -0.1765517
+[5,] -0.7460224 -0.01123782 1.8116342 -0.1033175
+[6,] 1.0160202 -0.82361401 -0.1616471 -0.1628032
apply(x, 2, mean) 2, mean) ## Take the mean of each column
[1] -0.24958041 0.14629702 -0.14633652 -0.26691102 -0.15595976 0.07473874
- [7] 0.05314485 0.07476061 -0.30001733 0.14398756
+ [1] 0.083759441 -0.134507982 -0.246473461 -0.371270102 -0.078433882
+ [6] -0.101665531 -0.007126106 -0.003193726 0.114767264 0.070612124
[1] -2.8370777 5.8367390 -0.9898905 2.7204911 -3.7449375 -0.8555091
- [7] 2.4826554 0.9494142 -3.9096827 0.2117756 0.3672752 -2.7321397
-[13] 2.4937133 -2.7042877 -4.6029774 -6.2231452 -1.9386089 0.5097158
-[19] -2.2691720 4.7181237
+ [1] 0.82401382 3.44326903 -5.21727094 -6.22250299 -3.47001414 2.59269751
+ [7] -1.76049948 -0.54534465 1.26993157 -0.05660623 1.89101638 2.60154094
+[13] -0.80804188 1.96321614 -2.68869045 0.56525640 0.44214056 -4.25890694
+[19] -3.02509115 -1.01075274
[,1] [,2] [,3] [,4] [,5] [,6]
-[1,] -1.09759334 -0.58191082 -0.6190918 0.7545051 -1.6708063 -1.2382435
-[2,] -0.04952269 0.50872978 1.6895949 0.1657323 1.7746160 1.7427081
-[3,] 0.45414643 1.22539326 0.6284307 0.2973018 1.0887260 0.4581224
-[4,] -0.03995540 0.23679937 -0.7905091 0.6370128 0.7911886 -0.2637556
-[5,] 0.12208387 -1.41751608 1.2769118 0.8510867 -0.4888010 -0.1692706
-[6,] -1.31501439 -0.08597665 -0.7616683 0.7553028 1.1584617 -2.0701933
- [,7] [,8] [,9] [,10]
-[1,] -1.1974074 1.22719350 -0.32231319 1.16291606
-[2,] -0.6335309 0.95729514 -0.84747657 0.91182060
-[3,] -0.7138229 -1.88743158 0.07026544 -2.01649459
-[4,] -0.2273346 1.76161541 -1.26793435 -1.89014826
-[5,] 0.3346429 -0.75236320 0.31607231 0.09632038
-[6,] -1.0845780 0.02416961 0.50295930 1.93484470
+ [,1] [,2] [,3] [,4] [,5] [,6]
+[1,] 0.58654399 -0.502546440 1.1493478 0.6257709 -0.02866237 1.490139530
+[2,] -0.14969248 0.327632870 0.0202589 0.2889600 -0.16552218 -0.829703298
+[3,] 1.12561766 0.707836011 0.6038607 -0.6722613 0.85092968 0.550785886
+[4,] -1.71719604 0.554424755 0.4229181 0.1484968 0.22134369 0.258853355
+[5,] 0.31827641 1.555568589 0.8971850 -0.7742244 0.45459793 -0.043814576
+[6,] -0.08429415 0.001737282 0.1906608 1.1145869 0.54156791 -0.004889302
+ [,7] [,8] [,9] [,10]
+[1,] -0.7879713 1.02206400 -1.0420765 -1.2779945
+[2,] 1.7217146 0.06728039 0.6408182 -0.3551929
+[3,] -0.2439192 -0.71553120 -0.8273868 0.2559954
+[4,] -0.1085818 -0.28763268 1.9010457 1.7950971
+[5,] -1.4082747 -1.07621679 0.5428189 0.4538626
+[6,] -1.0644006 -0.04186614 -0.8150566 1.0490749
c(0.25, 0.75))
[,1] [,2] [,3] [,4] [,5] [,6]
-25% -1.1724539 0.004291043 -0.5178008 -0.6588207 -0.4089184 -1.0038506
-75% 0.4853005 1.506519993 0.5858536 0.5369595 0.3300002 0.6922169
+ [,1] [,2] [,3] [,4] [,5] [,6]
+25% -0.7166151 -0.1615648 -0.5651758 -0.04431213 -0.5916219 -0.07368714
+75% 0.9229907 0.3179646 0.6818422 0.52154809 0.5207637 0.45384114
[,7] [,8] [,9] [,10] [,11] [,12]
-25% -0.9842272 -1.0220842 -0.7082846 -0.8992771 -0.3444137 -0.4086714
-75% 0.2951763 0.6737552 0.1853825 1.0853115 0.6014494 0.3695608
- [,13] [,14] [,15] [,16] [,17] [,18]
-25% -1.1790230 -0.7932644 -0.002708936 -0.5149016 -0.83974314 -0.7881085
-75% 0.1577916 0.9562642 1.100022074 0.4498309 -0.04954139 0.2352183
+25% -0.4355993 -0.1313015 -0.8149658 -0.9260982 0.02077709 -0.1343613
+75% 1.5985929 0.8889319 0.2213238 0.3661333 0.82424899 0.4156328
+ [,13] [,14] [,15] [,16] [,17] [,18]
+25% -0.1281593 -0.6691927 -0.2824997 -0.6574923 0.06421797 -0.7905708
+75% 1.3073689 1.2450340 0.5072401 0.5023885 1.08294108 0.4653062
[,19] [,20]
-25% 0.03656589 -0.7393304
-75% 0.35820288 0.5060296
+25% -0.5826196 -0.6965163
+75% 0.1313324 0.6849689
Notice that I had to pass the probs = c(0.25, 0.75)
argument to quantile()
via the ...
argument to apply()
.
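A hedged sketch of that call, assuming a 20 by 10 matrix of standard normal draws (the seed and the name `q` are illustrative):

```r
set.seed(4) # arbitrary seed for reproducibility
x <- matrix(rnorm(200), nrow = 20, ncol = 10)

# MARGIN = 1 applies quantile() to each row; `probs` is not an argument
# of apply(), so it travels through `...` to quantile().
q <- apply(x, 1, quantile, probs = c(0.25, 0.75))

dim(q)      # one row per requested quantile, one column per row of x
rownames(q)
```

Because each call to `quantile()` returns two values, `apply()` binds the results column-wise into a matrix.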
[1] 201.5111
+[1] 248.8765
However, passing a vector of mu
s or sigma
s won’t work with this function because it’s not vectorized.
[1] 121.9851
+[1] 119.3071
[1] 201.5111 127.6611 113.3086 108.0569 105.5217 104.0882 103.1900 102.5851
- [9] 102.1553 101.8371
+ [1] 248.8765 146.5055 124.7964 116.2695 111.8983 109.2945 107.5867 106.3890
+ [9] 105.5067 104.8318
Pretty cool, right?
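The vectorization step can be sketched as below. The function `sumsq()` and the small data vector are assumptions made for illustration; they are not necessarily the exact objects used to produce the numbers above.

```r
# Hypothetical scalar function: sum of squared standardized deviations.
# It expects a single `mu` and a single `sigma`.
sumsq <- function(mu, sigma, x) {
    sum(((x - mu) / sigma)^2)
}

x <- c(1, 2, 3, 4)

# Vectorize() wraps the function in mapply() so that `mu` and `sigma`
# may be vectors; `x` is passed whole to every call.
vsumsq <- Vectorize(sumsq, c("mu", "sigma"))
vsumsq(mu = 1:3, sigma = 1, x = x)

# The same computation written explicitly with mapply():
mapply(sumsq, mu = 1:3, sigma = 1, MoreArgs = list(x = x))
```

Both calls evaluate `sumsq()` once per value of `mu`, recycling `sigma` as needed.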
@@ -15427,9 +16635,62 @@ Tip +options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
This function is simple:
@@ -15736,42 +16998,43 @@ background-color: null; font-style: inherit;"><- function(x) { - if(if (is.na(x)) - is.na(x)) { + print("x is a missing value!") - } else if(x if (x > 0) - 0) { + print("x is greater than zero") - } else - else { + print("x is less than or equal to zero") - } + invisible(x) -} +}Now we can run the following.
Error in if (is.na(x)) print("x is a missing value!") else if (x > 0) print("x is greater than zero") else print("x is less than or equal to zero"): the condition has length > 1
+Error in if (is.na(x)) {: the condition has length > 1
Now what? Why are we getting this error?
@@ -15828,54 +17091,56 @@ background-color: null; font-style: inherit;"><- function(x) { - if(if (length(x) > 1L) - > 1L) { + stop("'x' has length > 1") - } + if(if (is.na(x)) - is.na(x)) { + print("x is a missing value!") - } else if(x if (x > 0) - 0) { + print("x is greater than zero") - } else - else { + print("x is less than or equal to zero") - } + invisible(x) -} +}Now when we pass print_message3()
a vector, we should get an error.
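Written out in full, the check described in the hunk above might look like the sketch below. Wrapping the call in `tryCatch()` is an addition for illustration only, so that the error message can be inspected without halting the script.

```r
print_message3 <- function(x) {
    # Fail early with a clear message instead of R's generic
    # "the condition has length > 1" error.
    if (length(x) > 1L) {
        stop("'x' has length > 1")
    }
    if (is.na(x)) {
        print("x is a missing value!")
    } else if (x > 0) {
        print("x is greater than zero")
    } else {
        print("x is less than or equal to zero")
    }
    invisible(x)
}

# Capture the error message rather than stopping execution.
res <- tryCatch(print_message3(1:2), error = function(e) conditionMessage(e))
res
```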
options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ fs 1.6.3 2023-07-20 [1] CRAN (R 4.3.0)
+ glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0)
+ reprex * 2.0.2 2022-08-17 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ withr 2.5.0 2022-03-03 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
This function takes an expression as an argument and tries to evaluate it. If the expression can be evaluated without any errors or warnings then the result of the expression is returned and the message Finally done!
is printed to the R console. If an error or warning is generated, then the function provided to the matching error
or warning
argument is called to handle it. Let’s try this function out with a few examples.
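A minimal sketch of such a wrapper is shown here. The name `try_report()` and the exact handler messages are hypothetical and chosen only for illustration; the structure (an expression, an `error` handler, a `warning` handler and a `finally` block) follows the description above.

```r
# Hypothetical helper modeled on the description above.
try_report <- function(expr) {
    tryCatch(expr,
        error = function(e) {
            paste("error:", conditionMessage(e))
        },
        warning = function(w) {
            paste("warning:", conditionMessage(w))
        },
        finally = {
            # Runs whether or not a condition was raised.
            message("Finally done!")
        }
    )
}

ok <- try_report(2 + 2)                 # no condition: the value is returned
failed <- try_report(stop("boom"))      # handled by the error function
warned <- try_report(as.numeric("two")) # coercion warning is handled
```

Because R evaluates arguments lazily, `expr` is only evaluated inside `tryCatch()`, which is what lets the handlers intercept the condition.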
beera({
- 2 2
beera({
- "two" beera({
- as.numeric(<- function(n){
- n function(n) {
+ n %% "two")
background-color: null;
font-style: inherit;"><- function(n){
- function(n) {
+ tryCatch(n == 0,
- error = function(e){
- function(e) {
+ FALSE
- })
-}
-
- }
+ )
+}
+
+is_even_error("eight")
options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
library(tidyverse)
library(lubridate)
[1] "2023-08-17 17:32:07 EDT"
+[1] "2023-08-17 21:47:51 EDT"
Otherwise, there are three ways you are likely to create a date/time:
@@ -17496,7 +18876,7 @@ font-style: inherit;">20170131) background-color: null; font-style: inherit;">ymd("2016-09-13") "2016-09-13") ## International standardflights %>%
- %>%
+ select(year, month, day) %>%
- %>%
+ mutate(
flights %>%
- %>%
+ select(year, month, day, hour, minute)
@@ -17696,7 +19078,7 @@ font-style: inherit;">today())
background-color: null;
font-style: inherit;">now()
[1] "2023-08-17 17:32:08 EDT"
+[1] "2023-08-17 21:47:52 EDT"
"1970-01-01 01:00")
class(x)
[1] "POSIXct" "POSIXt"
Error in `+.POSIXt`(x, y): binary '+' is not defined for "POSIXt" objects
[1] "2011-01-10"
storm_sub %>%
- ggplot(x = begin, y = deaths)) +
- +
+ geom_point()
storm_sub %>%
- filter(6) %>%
- ggplot(aes(begin, deaths)) +
- +
+ geom_point()
storm_sub %>%
- filter(16) %>%
- ggplot(aes(begin, deaths)) +
- +
+ geom_point()
options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ bit 4.0.5 2022-11-15 [1] CRAN (R 4.3.0)
+ bit64 4.0.5 2020-08-30 [1] CRAN (R 4.3.0)
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.3.0)
+ crayon 1.5.2 2022-09-29 [1] CRAN (R 4.3.0)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ dplyr * 1.1.2 2023-04-20 [1] CRAN (R 4.3.0)
+ emojifont 0.5.5 2021-04-20 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fansi 1.0.4 2023-01-22 [1] CRAN (R 4.3.0)
+ farver 2.1.1 2022-07-06 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.0)
+ generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0)
+ ggplot2 * 3.4.3 2023-08-14 [1] CRAN (R 4.3.1)
+ glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0)
+ gtable 0.3.3 2023-03-21 [1] CRAN (R 4.3.0)
+ here * 1.0.1 2020-12-13 [1] CRAN (R 4.3.0)
+ hms 1.1.3 2023-03-21 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ labeling 0.4.2 2020-10-20 [1] CRAN (R 4.3.0)
+ lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0)
+ lubridate * 1.9.2 2023-02-10 [1] CRAN (R 4.3.0)
+ magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0)
+ munsell 0.5.0 2018-06-12 [1] CRAN (R 4.3.0)
+ nycflights13 * 1.0.2 2021-04-12 [1] CRAN (R 4.3.0)
+ pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0)
+ pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0)
+ proto 1.0.0 2016-10-29 [1] CRAN (R 4.3.0)
+ purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.0)
+ R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0)
+ readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.3.0)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ scales 1.2.1 2022-08-20 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ showtext 0.9-6 2023-05-03 [1] CRAN (R 4.3.0)
+ showtextdb 3.0 2020-06-04 [1] CRAN (R 4.3.0)
+ stringi 1.7.12 2023-01-11 [1] CRAN (R 4.3.0)
+ stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.3.0)
+ sysfonts 0.8.8 2022-03-13 [1] CRAN (R 4.3.0)
+ tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.0)
+ tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.3.0)
+ tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0)
+ tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.0)
+ timechange 0.2.0 2023-01-11 [1] CRAN (R 4.3.0)
+ tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.3.0)
+ utf8 1.2.3 2023-01-31 [1] CRAN (R 4.3.0)
+ vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.3.0)
+ vroom 1.6.3 2023-04-28 [1] CRAN (R 4.3.0)
+ withr 2.5.0 2022-03-03 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture as the rest of the course is adapted from the version Stephanie C. Hicks designed and maintained in 2021 - 2022. Check the recent changes to this file through the GitHub history.
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
Welcome! I am very excited to have you in our one-term (i.e. half a semester) course on Statistical Computing (course number 140.776), offered by the Department of Biostatistics at the Johns Hopkins Bloomberg School of Public Health.
This course is designed for ScM and PhD students at Johns Hopkins Bloomberg School of Public Health. I am pretty flexible about permitting outside students, but I want everyone to be aware of the goals and assumptions so no one feels like they are surprised by how the class works.
Feel free to submit typos, errors, and other corrections via the GitHub repository associated with the class: https://github.com/lcolladotor/jhustatcomputing2023. You will have the thanks of your grateful instructor!
-options(width = 120)
sessioninfo::session_info()
This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
@@ -575,7 +577,7 @@There are only two kinds of languages: the ones people complain about and the ones nobody uses. —Bjarne Stroustrup
We will not do that now, but it is quite likely that at one point later in this course we will.
You only need to install a package once, unless you upgrade/re-install R. Once installed, you still need to load the package before you can use it. That has to happen every time you start a new R session. You do that using the library()
command. For instance, to load the ggplot2
package, type
library('ggplot2')
library("ggplot2")
You may or may not see a short message on the screen. Some packages show messages when you load them, and others do not.
This was a quick overview of R packages. We will use a lot of them, so you will get used to them rather quickly.
@@ -654,9 +656,52 @@[‘Water Colours’ from Danielle Navarro https://art.djnavarro.net]
+ + +options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
[‘Flametree’ from Danielle Navarro https://art.djnavarro.net]
+options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
@@ -500,9 +536,77 @@An article about computational results is advertising, not scholarship. The actual scholarship is the full software environment, code and data, that produced the result. —Claerbout and Karrenbach (1992)
options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.3.0)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ dplyr 1.1.2 2023-04-20 [1] CRAN (R 4.3.0)
+ emojifont 0.5.5 2021-04-20 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fansi 1.0.4 2023-01-22 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0)
+ ggplot2 3.4.3 2023-08-14 [1] CRAN (R 4.3.1)
+ glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0)
+ gtable 0.3.3 2023-03-21 [1] CRAN (R 4.3.0)
+ here * 1.0.1 2020-12-13 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0)
+ magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0)
+ munsell 0.5.0 2018-06-12 [1] CRAN (R 4.3.0)
+ pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0)
+ pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0)
+ proto 1.0.0 2016-10-29 [1] CRAN (R 4.3.0)
+ R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.3.0)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ scales 1.2.1 2022-08-20 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ showtext 0.9-6 2023-05-03 [1] CRAN (R 4.3.0)
+ showtextdb 3.0 2020-06-04 [1] CRAN (R 4.3.0)
+ sysfonts 0.8.8 2022-03-13 [1] CRAN (R 4.3.0)
+ tibble 3.2.1 2023-03-20 [1] CRAN (R 4.3.0)
+ tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0)
+ utf8 1.2.3 2023-01-31 [1] CRAN (R 4.3.0)
+ vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
if (condition) {
-
-
- }
-else {
-
-
- }
-else if (condition) {
-
- }
fun
to create a function<- function(variables) {
- name
+
}
for (variable in vector) {
-
+
}
options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
Install the following packages:
install.packages(c("bibtex", "RefManageR")
install.packages(c("bibtex", "RefManageR"))
What do they do? How might they be helpful to you in terms of reference management?
[Add here.]
+options(width = 120)
+sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
+ setting value
+ version R version 4.3.1 (2023-06-16)
+ os macOS Ventura 13.5
+ system aarch64, darwin20
+ ui X11
+ language (EN)
+ collate en_US.UTF-8
+ ctype en_US.UTF-8
+ tz America/New_York
+ date 2023-08-17
+ pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)
+
+─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
+ package * version date (UTC) lib source
+ cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+ colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)
+ digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+ evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)
+ fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+ htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)
+ htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)
+ jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)
+ knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)
+ rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+ sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+ xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)
+ yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
+
+ [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
+
+──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
“When writing code, you’re always collaborating with future-you; and past-you doesn’t respond to emails”. —Hadley Wickham
@@ -365,9 +367,9 @@Relative versus absolute paths
function being used, which explicitly tells R which absolute path (absolute location) to use as the working directory.
For example, say I want to clone a GitHub repo from my colleague Brian, which has 100 R script files, and in every one of those files at the top is:
--+setwd("C:\Users\Brian\path\only\that\Brian\has")
setwd("C:\\Users\\Brian\\path\\only\\that\\Brian\\has")
The problem is, if I want to use his code, I will need to go and hand-edit every single one of those paths (
+C:\Users\Brian\path\only\that\Brian\has
) to the path that I want to use on my computer or wherever I saved the folder on my computer (e.g./Users/Stephanie/Documents/path/only/I/have
).The problem is, if I want to use his code, I will need to go and hand-edit every single one of those paths (
C:\Users\Brian\path\only\that\Brian\has
) to the path that I want to use on my computer or wherever I saved the folder on my computer (e.g./Users/leocollado/Documents/path/only/I/have
).
- This is an unsustainable practice.
- I can go in and manually edit the path, but this assumes I know how to set a working directory. Not everyone does.
@@ -472,8 +474,8 @@Finding
- List the files in the path.
-@@ -584,7 +586,7 @@if(!file.exists(here("my", "relative", "path"))){ -dir.create(here("my", "relative", "path")) +
if (!file.exists(here("my", "relative", "path"))) { +dir.create(here("my", "relative", "path")) }list.files(here("my", "relative", "path"))
R code
For example, it might be something like this:
-+source(here::here('functions.R'))
source(here::here("functions.R"))
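The same idea applies to any file in the project, not just scripts. As a minimal sketch (the data folder and x.Rda file name are reused from this lecture; the path that gets printed depends on where the project lives on your computer), here() builds the full path from the project root at run time:

```r
library(here)

# here() joins its arguments onto the project root, so no absolute
# path is typed into the script
rel_path <- here("data", "x.Rda")

# The result is an absolute path, but it is computed on whichever
# computer runs the code
print(rel_path)

# The part below the project root is the same on every computer
endsWith(rel_path, "data/x.Rda")
```

Because only the part below the project root is written in the script, a colleague who clones the repository should not need to edit any paths.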
@@ -650,9 +652,9 @@ Example
Let’s try an example. Let’s save a vector of length 5 into the two file formats.
+save(x, file = here("data", "x.Rda")) +saveRDS(x, file = here("data", "x.Rds")) +list.files(path = here("data"))<- 1:5 - x save(x, file=here("data", "x.Rda")) -saveRDS(x, file=here("data", "x.Rds")) -list.files(path=here("data"))
+skip = 2 + )[1] "2016-07-19.csv.bz2" "b_lyrics.RDS" [3] "bmi_pm25_no2_sim.csv" "chicago.rds" @@ -720,7 +722,7 @@
Example
<- 1:5 x <- x^2 - y save(x,y, file=here("data", "x.Rda")) +save(x, y, file = here("data", "x.Rda")) <- load(here("data", "x.Rda")) new_x2
When you are done:
@@ -963,7 +965,8 @@Example
The second line of metadata x,y,z 1,2,3", -skip = 2)Rows: 1 Columns: 3 ── Column specification ──────────────────────────────────────────────────────── @@ -985,7 +988,8 @@
Example
+comment = "#" + )read_csv("# A comment I want to skip x,y,z 1,2,3", -comment = "#")
Rows: 1 Columns: 3 ── Column specification ──────────────────────────────────────────────────────── @@ -1024,8 +1028,9 @@
Example
Here is an example of how to specify the column types explicitly:
-+<- read_csv(here("data", "team_standings.csv"), - teams col_types = "cc")
<- read_csv(here("data", "team_standings.csv"), + teams col_types = "cc" + )
Note that the col_types argument accepts a compact representation. Here "cc" indicates that the first column is character and the second column is character (there are only two columns). Using the col_types
argument is useful because often it is not easy to automatically figure out the type of a column by looking at a few rows (especially if a column has many missing values).@@ -1044,8 +1049,9 @@Example
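To make the compact col_types representation concrete, here is a small sketch using made-up inline data rather than the team_standings.csv file (the column names team and standing are assumptions for illustration). Each letter in the string describes one column, and the same specification can be spelled out with cols():

```r
library(readr)

# Hypothetical inline data; I() marks the string as literal data
# rather than a file path
csv_text <- I("team,standing\nSpain,1\nNetherlands,2\n")

# Compact form: one letter per column ("c" = character, "i" = integer)
compact <- read_csv(csv_text, col_types = "ci")

# Explicit form of the same specification
explicit <- read_csv(
    csv_text,
    col_types = cols(team = col_character(), standing = col_integer())
)

# Both calls give the same column classes
sapply(compact, class)
sapply(explicit, class)
```

The compact string is quicker to type, while the explicit form is easier to read back later when a file has many columns.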
The following call reads a bzip2-compressed CSV file containing download logs from the RStudio CRAN mirror.
-+<- read_csv(here("data", "2016-07-19.csv.bz2"), - logs n_max = 10)
<- read_csv(here("data", "2016-07-19.csv.bz2"), + logs n_max = 10 + )
Rows: 10 Columns: 10 ── Column specification ──────────────────────────────────────────────────────── @@ -1061,10 +1067,11 @@
Example
Note that the warnings indicate that read_csv() may have had some difficulty identifying the type of each column. This can be solved by using the col_types
argument.-+<- read_csv(here("data", "2016-07-19.csv.bz2"), - logs col_types = "ccicccccci", - n_max = 10) - logs
<- read_csv(here("data", "2016-07-19.csv.bz2"), + logs col_types = "ccicccccci", + n_max = 10 + + ) logs
# A tibble: 10 × 10 date time size r_version r_arch r_os package version country ip_id @@ -1085,10 +1092,11 @@
Example
For example, in the log data above, the first column is actually a date, so it might make more sense to read it in as a Date object.
If we wanted to read in just that first column, we could do
-+ + ++<- read_csv(here("data", "2016-07-19.csv.bz2"), - logdates col_types = cols_only(date = col_date()), - n_max = 10) - logdates
<- read_csv(here("data", "2016-07-19.csv.bz2"), + logdates col_types = cols_only(date = col_date()), + n_max = 10 + + ) logdates
# A tibble: 10 × 1 date @@ -1166,9 +1174,73 @@
Additional Resources<
+ diff --git a/posts/08-managing-data-frames-with-tidyverse/index.html b/posts/08-managing-data-frames-with-tidyverse/index.html index 8c08a31..284f7fb 100644 --- a/posts/08-managing-data-frames-with-tidyverse/index.html +++ b/posts/08-managing-data-frames-with-tidyverse/index.html @@ -276,6 +276,7 @@R session information
++-+options(width = 120) +::session_info() sessioninfo
+++─ Session info ─────────────────────────────────────────────────────────────────────────────────────────────────────── + setting value + version R version 4.3.1 (2023-06-16) + os macOS Ventura 13.5 + system aarch64, darwin20 + ui X11 + language (EN) + collate en_US.UTF-8 + ctype en_US.UTF-8 + tz America/New_York + date 2023-08-17 + pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown) + +─ Packages ─────────────────────────────────────────────────────────────────────────────────────────────────────────── + package * version date (UTC) lib source + bit 4.0.5 2022-11-15 [1] CRAN (R 4.3.0) + bit64 4.0.5 2020-08-30 [1] CRAN (R 4.3.0) + cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0) + colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd) + crayon 1.5.2 2022-09-29 [1] CRAN (R 4.3.0) + digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0) + evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0) + fansi 1.0.4 2023-01-22 [1] CRAN (R 4.3.0) + fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0) + glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0) + here * 1.0.1 2020-12-13 [1] CRAN (R 4.3.0) + hms 1.1.3 2023-03-21 [1] CRAN (R 4.3.0) + htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0) + htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0) + jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0) + knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0) + lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0) + magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0) + pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0) + pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0) + R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0) + readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.0) + rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0) + rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1) + rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.3.0) + rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0) + sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0) + tibble 3.2.1 2023-03-20 [1] CRAN (R 4.3.0) + tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0) + tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.3.0) + utf8 1.2.3 2023-01-31 [1] CRAN (R 4.3.0) + vctrs 0.6.3 
2023-06-14 [1] CRAN (R 4.3.0) + vroom 1.6.3 2023-04-28 [1] CRAN (R 4.3.0) + withr 2.5.0 2022-03-03 [1] CRAN (R 4.3.0) + xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0) + yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0) + + [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library + +──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Table of contents
Final Questions Additional Resources +R session information @@ -289,6 +290,7 @@Table of contents
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
Pre-lecture materials
@@ -464,8 +466,8 @@Want to see
Here, we again display the chicago data.frame as a tibble but specify that we would only like to see 5 rows. The width = Inf
argument specifies that we would like to see all the possible columns. Here, there are only 8, but for larger datasets, this can be helpful to specify.-+as_tibble(chicago) %>% -print(n = 5, width = Inf)
as_tibble(chicago) %>% +print(n = 5, width = Inf)
# A tibble: 6,940 × 8 city tmpd dptp date pm25tmean2 pm10tmean2 o3tmean2 no2tmean2 @@ -497,10 +499,10 @@
tibble()
In the example here, we see that the column
c
will contain the value ‘1’ across all rows.tibble( -a = 1:5, - b = 6:10, - c = 1, - z = (a + b)^2 + c + a = 1:5, + b = 6:10, + c = 1, + z = (a + b)^2 + c )
# A tibble: 5 × 4 @@ -531,9 +533,9 @@
tibble()
Note that to refer to such columns in other tidyverse packages, you will continue to use backticks surrounding the variable name.
tibble( -`two words` = 1:5, - `12` = "numeric", - `:)` = "smile", + `two words` = 1:5, + `12` = "numeric", + `:)` = "smile", )
# A tibble: 5 × 3 @@ -561,10 +563,10 @@
Subsetting tibbles
For example:
<- tibble( - df a = 1:5, - b = 6:10, - c = 1, - z = (a + b)^2 + c + a = 1:5, + b = 6:10, + c = 1, + z = (a + b)^2 + c ) # Extract by name using $ or [[]] @@ -1043,9 +1045,10 @@
mutate()
There is also the related transmute() function, which does the same thing as mutate() but then drops all non-transformed variables.
Here, we de-trend the PM10 and ozone (O3) variables.
-+head(transmute(chicago, -pm10detrend = pm10tmean2 - mean(pm10tmean2, na.rm = TRUE), - o3detrend = o3tmean2 - mean(o3tmean2, na.rm = TRUE)))
head(transmute(chicago, +pm10detrend = pm10tmean2 - mean(pm10tmean2, na.rm = TRUE), + o3detrend = o3tmean2 - mean(o3tmean2, na.rm = TRUE) + ))
# A tibble: 6 × 2 pm10detrend o3detrend @@ -1103,9 +1106,11 @@
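The difference between mutate() and transmute() is easier to see on a tiny data frame. This is only a sketch: the toy data frame and its values are made up for illustration and are not part of the chicago dataset.

```r
library(dplyr)

# Hypothetical toy data with one missing value
toy <- data.frame(day = 1:4, pm10 = c(10, 20, NA, 30))

# mutate() keeps the existing columns and adds the new one
kept <- mutate(toy, pm10detrend = pm10 - mean(pm10, na.rm = TRUE))
names(kept)

# transmute() keeps only the columns created in the call
dropped <- transmute(toy, pm10detrend = pm10 - mean(pm10, na.rm = TRUE))
names(dropped)
```

Both calls compute the same de-trended values; they differ only in which columns are returned.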
group_by()
Finally, we compute summary statistics for each year in the data frame with the
summarize()
function.-+summarize(years, pm25 = mean(pm25, na.rm = TRUE), -o3 = max(o3tmean2, na.rm = TRUE), - no2 = median(no2tmean2, na.rm = TRUE))
summarize(years, +pm25 = mean(pm25, na.rm = TRUE), + o3 = max(o3tmean2, na.rm = TRUE), + no2 = median(no2tmean2, na.rm = TRUE) + )
# A tibble: 19 × 4 year pm25 o3 no2 @@ -1156,8 +1161,10 @@
group_by()
Finally, we can compute the mean of o3 and no2 within quintiles of pm25
.-+summarize(quint, o3 = mean(o3tmean2, na.rm = TRUE), -no2 = mean(no2tmean2, na.rm = TRUE))
summarize(quint, +o3 = mean(o3tmean2, na.rm = TRUE), + no2 = mean(no2tmean2, na.rm = TRUE) + )
# A tibble: 6 × 3 pm25.quint o3 no2 @@ -1185,7 +1192,9 @@
%>%
This nesting is not a natural way to think about a sequence of operations.
The
%>%
operator allows you to string operations in a left-to-right fashion, i.e.-+first(x) %>% second %>% third
first(x) %>% +second() %>% + third()
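As a quick check that the piped form and the nested form are the same computation, here is a sketch with base R functions standing in for first(), second() and third() (the vector x is made up for illustration):

```r
library(magrittr)

x <- c(1, 4, 9)

# Nested form: read from the inside out
nested <- round(mean(sqrt(x)), 1)

# Piped form: read from left to right
piped <- x %>%
    sqrt() %>%
    mean() %>%
    round(1)

identical(nested, piped)
```

Each step in the pipe receives the result of the previous step as its first argument, which is why round(1) here means round(<previous result>, 1).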
@@ -1200,12 +1209,14 @@+ + +
%>%
Take the example that we just did in the last section.
That can be done with the following sequence in a single R expression.
-+%>% - chicago mutate(year = as.POSIXlt(date)$year + 1900) %>% - group_by(year) %>% - summarize(pm25 = mean(pm25, na.rm = TRUE), - o3 = max(o3tmean2, na.rm = TRUE), - no2 = median(no2tmean2, na.rm = TRUE))
%>% + chicago mutate(year = as.POSIXlt(date)$year + 1900) %>% + group_by(year) %>% + summarize( + pm25 = mean(pm25, na.rm = TRUE), + o3 = max(o3tmean2, na.rm = TRUE), + no2 = median(no2tmean2, na.rm = TRUE) + )
# A tibble: 19 × 4 year pm25 o3 no2 @@ -1250,11 +1261,13 @@
%>%
Another example might be computing the average pollutant level by month. This could be useful to see if there are any seasonal trends in the data.
-@@ -1411,9 +1424,84 @@+mutate(chicago, month = as.POSIXlt(date)$mon + 1) %>% -group_by(month) %>% - summarize(pm25 = mean(pm25, na.rm = TRUE), - o3 = max(o3tmean2, na.rm = TRUE), - no2 = median(no2tmean2, na.rm = TRUE))
mutate(chicago, month = as.POSIXlt(date)$mon + 1) %>% +group_by(month) %>% + summarize( + pm25 = mean(pm25, na.rm = TRUE), + o3 = max(o3tmean2, na.rm = TRUE), + no2 = median(no2tmean2, na.rm = TRUE) + )
# A tibble: 12 × 4 month pm25 o3 no2 @@ -1299,16 +1312,16 @@
slice_*()
# A tibble: 10 × 11 city tmpd dewpoint date pm25 pm10tmean2 o3tmean2 no2tmean2 <chr> <dbl> <dbl> <date> <dbl> <dbl> <dbl> <dbl> - 1 chic 62 45.3 2001-05-08 7.3 51.5 26.5 27.6 - 2 chic 36 36.8 1991-11-28 NA 10 11.7 16.6 - 3 chic 29 19.6 2005-03-14 19.6 51 9.93 39.9 - 4 chic 20 11.2 2004-02-13 24.5 17.5 21.8 23.3 - 5 chic 32.5 20.4 1997-03-23 NA 14.2 25.4 19.0 - 6 chic 68.5 64.1 1996-07-27 NA 21 19.6 22.4 - 7 chic 28.5 18.2 1997-11-11 NA 24.5 3.94 28.1 - 8 chic 45.5 44.1 1991-04-13 NA 25 13.0 15.4 - 9 chic 67 49.3 2000-10-14 19.4 54.5 24.9 31.0 -10 chic 71 48 1994-09-21 NA 82 30.5 48.5 + 1 chic 49 40.2 2000-09-25 6.6 7 17.2 15.5 + 2 chic 35 24.1 1989-11-02 NA 25 8.83 17.3 + 3 chic 63.5 54.4 1996-04-18 NA 54 30.5 26.7 + 4 chic 70 65.9 1997-06-19 NA 60.5 32.4 39.9 + 5 chic 54 50.6 2005-11-05 27.2 32 11.5 18.2 + 6 chic 86.5 73.4 1990-07-04 NA 60.6 52.2 12.8 + 7 chic 74 74.6 1987-08-14 NA 49.5 24.2 18.6 + 8 chic 34.5 29.1 1995-11-27 NA 25 6.57 29.3 + 9 chic 73 61.2 1995-09-13 NA 46 25.3 26.5 +10 chic 79 64.6 2005-07-31 20.8 29.5 40.8 20.2 # ℹ 3 more variables: pm25detrend <dbl>, year <dbl>, pm25.quint <fct>
Additional Resources<
+ diff --git a/posts/09-tidy-data-and-the-tidyverse/index.html b/posts/09-tidy-data-and-the-tidyverse/index.html index 9ca8928..a68d32d 100644 --- a/posts/09-tidy-data-and-the-tidyverse/index.html +++ b/posts/09-tidy-data-and-the-tidyverse/index.html @@ -254,6 +254,7 @@R session information
++-+options(width = 120) +::session_info() sessioninfo
+++─ Session info ─────────────────────────────────────────────────────────────────────────────────────────────────────── + setting value + version R version 4.3.1 (2023-06-16) + os macOS Ventura 13.5 + system aarch64, darwin20 + ui X11 + language (EN) + collate en_US.UTF-8 + ctype en_US.UTF-8 + tz America/New_York + date 2023-08-17 + pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown) + +─ Packages ─────────────────────────────────────────────────────────────────────────────────────────────────────────── + package * version date (UTC) lib source + cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0) + colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd) + colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.3.0) + digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0) + dplyr * 1.1.2 2023-04-20 [1] CRAN (R 4.3.0) + evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0) + fansi 1.0.4 2023-01-22 [1] CRAN (R 4.3.0) + fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0) + forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.0) + generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0) + ggplot2 * 3.4.3 2023-08-14 [1] CRAN (R 4.3.1) + glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0) + gtable 0.3.3 2023-03-21 [1] CRAN (R 4.3.0) + here * 1.0.1 2020-12-13 [1] CRAN (R 4.3.0) + hms 1.1.3 2023-03-21 [1] CRAN (R 4.3.0) + htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0) + htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0) + jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0) + knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0) + lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0) + lubridate * 1.9.2 2023-02-10 [1] CRAN (R 4.3.0) + magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0) + munsell 0.5.0 2018-06-12 [1] CRAN (R 4.3.0) + pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0) + pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0) + purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.0) + R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0) + readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.0) + rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0) + rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1) + rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.3.0) 
+ rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0) + scales 1.2.1 2022-08-20 [1] CRAN (R 4.3.0) + sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0) + stringi 1.7.12 2023-01-11 [1] CRAN (R 4.3.0) + stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.3.0) + tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.0) + tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.3.0) + tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0) + tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.0) + timechange 0.2.0 2023-01-11 [1] CRAN (R 4.3.0) + tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.3.0) + utf8 1.2.3 2023-01-31 [1] CRAN (R 4.3.0) + vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.3.0) + withr 2.5.0 2022-03-03 [1] CRAN (R 4.3.0) + xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0) + yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0) + + [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library + +──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Table of contents
Final Questions Additional Resources +R session information @@ -267,6 +268,7 @@Table of contents
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
“Happy families are all alike; every unhappy family is unhappy in its own way.” —Leo Tolstoy
@@ -403,8 +405,8 @@Tidy data
+pivot_longer(-religion, names_to = "income", values_to = "respondents") %>% + mutate(religion = factor(religion), income = factor(income))library(tidyverse) %>% - relig_income pivot_longer(-religion, names_to = "income", values_to = "respondents") %>% - mutate(religion = factor(religion), income = factor(income))
# A tibble: 180 × 3 religion income respondents @@ -487,7 +489,7 @@
pivot_longer()
+pivot_longer(-religion, names_to = "income", values_to = "respondents")# Gather everything EXCEPT religion to tidy data %>% - relig_income pivot_longer(-religion, names_to = "income", values_to = "respondents")
# A tibble: 180 × 3 religion income respondents @@ -525,13 +527,15 @@
pivot_wider()
You use the summarize() function in dplyr
to summarize the total number of respondents per income category.+pivot_longer(-religion, names_to = "income", values_to = "respondents") %>% + mutate(religion = factor(religion), income = factor(income)) %>% + group_by(income) %>% + summarize(total_respondents = sum(respondents)) %>% + pivot_wider( + names_from = "income", + values_from = "total_respondents" + %>% + ) ::kable() knitr%>% - relig_income pivot_longer(-religion, names_to = "income", values_to = "respondents") %>% - mutate(religion = factor(religion), income = factor(income)) %>% - group_by(income) %>% - summarize(total_respondents = sum(respondents)) %>% - pivot_wider(names_from = "income", - values_from = "total_respondents") %>% - ::kable() knitr
@@ -645,34 +649,34 @@ Bonus: Calculate a mean revenue for each company AND each year (averaged across all 4 quarters).
pivot_wider()
<- tibble( - df "company" = rep(1:3, each=4), - "year" = rep(2006:2009, 3), - "Q1" = sample(x = 0:100, size = 12), - "Q2" = sample(x = 0:100, size = 12), - "Q3" = sample(x = 0:100, size = 12), - "Q4" = sample(x = 0:100, size = 12), + "company" = rep(1:3, each = 4), + "year" = rep(2006:2009, 3), + "Q1" = sample(x = 0:100, size = 12), + "Q2" = sample(x = 0:100, size = 12), + "Q3" = sample(x = 0:100, size = 12), + "Q4" = sample(x = 0:100, size = 12), ) df
+ 1 1 2006 99 6 54 47 + 2 1 2007 28 79 90 9 + 3 1 2008 7 72 69 24 + 4 1 2009 16 56 6 100 + 5 2 2006 42 58 75 25 + 6 2 2007 64 1 100 6 + 7 2 2008 43 88 37 77 + 8 2 2009 95 74 17 44 + 9 3 2006 34 47 77 38 +10 3 2007 73 31 31 54 +11 3 2008 4 49 93 0 +12 3 2009 57 4 45 96# A tibble: 12 × 6 company year Q1 Q2 Q3 Q4 <int> <int> <int> <int> <int> <int> - 1 1 2006 34 7 70 7 - 2 1 2007 72 26 96 64 - 3 1 2008 62 68 45 98 - 4 1 2009 45 48 42 92 - 5 2 2006 51 13 75 36 - 6 2 2007 49 71 34 93 - 7 2 2008 100 83 22 71 - 8 2 2009 91 67 28 80 - 9 3 2006 19 28 85 1 -10 3 2007 61 38 65 75 -11 3 2008 32 57 47 51 -12 3 2009 4 58 63 0
-@@ -686,10 +690,12 @@+# try it yourself
# try it yourself
separate()
First, we combine the first three columns into one new column using
unite()
.-+%>% - gapminder unite(col="country_continent_year", - :year, - countrysep="_")
%>% + gapminder unite( + col = "country_continent_year", + :year, + countrysep = "_" + )
# A tibble: 1,704 × 4 country_continent_year lifeExp pop gdpPercap @@ -709,13 +715,17 @@
separate()
Next, we show how to split the combined column back into three columns with separate(), using the col, into, and sep
arguments.-+ + ++%>% - gapminder unite(col="country_continent_year", - :year, - countrysep="_") %>% - separate(col="country_continent_year", - into=c("country", "continent", "year"), - sep="_")
%>% + gapminder unite( + col = "country_continent_year", + :year, + countrysep = "_" + %>% + ) separate( + col = "country_continent_year", + into = c("country", "continent", "year"), + sep = "_" + )
# A tibble: 1,704 × 6 country continent year lifeExp pop gdpPercap @@ -755,11 +765,11 @@
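One detail worth knowing about this round trip: unite() pastes values together as text, so after separate() the restored columns are character unless you ask for conversion. The sketch below uses a small made-up data frame in place of gapminder (the population values are invented) and the convert argument of separate():

```r
library(tidyr)

# Hypothetical stand-in for gapminder with made-up values
toy <- data.frame(
    country = c("Chile", "Kenya"),
    continent = c("Americas", "Africa"),
    year = c(2002L, 2007L),
    pop = c(15.5, 35.6)
)

united <- unite(toy, col = "country_continent_year", country:year, sep = "_")

# Without convert, year comes back as character
as_text <- separate(
    united,
    col = "country_continent_year",
    into = c("country", "continent", "year"),
    sep = "_"
)
class(as_text$year)

# With convert = TRUE, separate() re-guesses the column types
restored <- separate(
    united,
    col = "country_continent_year",
    into = c("country", "continent", "year"),
    sep = "_",
    convert = TRUE
)
class(restored$year)
```

Whether to convert depends on what you plan to do next; for plotting or arithmetic on year, the converted version is usually what you want.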
Final Questions
What do the extra and fill arguments do in
separate()
? Experiment with the various options for the following two toy datasets.-tibble(x = c("a,b,c", "d,e,f,g", "h,i,j")) %>% -separate(x, c("one", "two", "three")) +
+tibble(x = c("a,b,c", "d,e", "f,g,i")) %>% +separate(x, c("one", "two", "three"))tibble(x = c("a,b,c", "d,e,f,g", "h,i,j")) %>% +separate(x, c("one", "two", "three")) -tibble(x = c("a,b,c", "d,e", "f,g,i")) %>% -separate(x, c("one", "two", "three"))
- @@ -787,9 +797,83 @@
Both unite() and separate()
have a remove argument. What does it do? Why would you set it to FALSE?Additional Resources<
+ diff --git a/posts/10-joining-data-in-r/index.html b/posts/10-joining-data-in-r/index.html index b492a82..017229f 100644 --- a/posts/10-joining-data-in-r/index.html +++ b/posts/10-joining-data-in-r/index.html @@ -266,6 +266,7 @@R session information
++-+options(width = 120) +::session_info() sessioninfo
+++─ Session info ─────────────────────────────────────────────────────────────────────────────────────────────────────── + setting value + version R version 4.3.1 (2023-06-16) + os macOS Ventura 13.5 + system aarch64, darwin20 + ui X11 + language (EN) + collate en_US.UTF-8 + ctype en_US.UTF-8 + tz America/New_York + date 2023-08-17 + pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown) + +─ Packages ─────────────────────────────────────────────────────────────────────────────────────────────────────────── + package * version date (UTC) lib source + cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0) + colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd) + colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.3.0) + digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0) + dplyr * 1.1.2 2023-04-20 [1] CRAN (R 4.3.0) + evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0) + fansi 1.0.4 2023-01-22 [1] CRAN (R 4.3.0) + fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0) + forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.0) + gapminder * 1.0.0 2023-03-10 [1] CRAN (R 4.3.0) + generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0) + ggplot2 * 3.4.3 2023-08-14 [1] CRAN (R 4.3.1) + glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0) + gtable 0.3.3 2023-03-21 [1] CRAN (R 4.3.0) + hms 1.1.3 2023-03-21 [1] CRAN (R 4.3.0) + htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0) + htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0) + jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0) + knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0) + lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0) + lubridate * 1.9.2 2023-02-10 [1] CRAN (R 4.3.0) + magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0) + munsell 0.5.0 2018-06-12 [1] CRAN (R 4.3.0) + pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0) + pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0) + purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.0) + R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0) + readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.0) + rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0) + rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1) + rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 
4.3.0) + scales 1.2.1 2022-08-20 [1] CRAN (R 4.3.0) + sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0) + stringi 1.7.12 2023-01-11 [1] CRAN (R 4.3.0) + stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.3.0) + tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.0) + tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.3.0) + tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0) + tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.0) + timechange 0.2.0 2023-01-11 [1] CRAN (R 4.3.0) + tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.3.0) + utf8 1.2.3 2023-01-31 [1] CRAN (R 4.3.0) + vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.3.0) + withr 2.5.0 2022-03-03 [1] CRAN (R 4.3.0) + xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0) + yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0) + + [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library + +──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Table of contents
Final Questions Additional Resources +R session information @@ -279,6 +280,7 @@Table of contents
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
Pre-lecture materials
@@ -401,9 +403,9 @@The first table
@@ -411,15 +413,15 @@library(tidyverse) <- tibble( - outcomes id = rep(c("a", "b", "c"), each = 3), - visit = rep(0:2, 3), - outcome = rnorm(3 * 3, 3) + id = rep(c("a", "b", "c"), each = 3), + visit = rep(0:2, 3), + outcome = rnorm(3 * 3, 3) ) print(outcomes)
The first table
+1 a 0 3.07 +2 a 1 3.25 +3 a 2 3.93 +4 b 0 2.18 +5 b 1 2.91 +6 b 2 2.83 +7 c 0 1.49 +8 c 1 2.56 +9 c 2 1.46# A tibble: 9 × 3 id visit outcome <chr> <int> <dbl> -1 a 0 1.54 -2 a 1 3.39 -3 a 2 3.03 -4 b 0 0.309 -5 b 1 2.52 -6 b 2 3.03 -7 c 0 2.13 -8 c 1 3.12 -9 c 2 3.99
Note that subjects are labeled by a unique identifier in the
@@ -429,8 +431,8 @@id
column.A second table
Here is some code to create a second table (we will be joining the first and second tables shortly). This table contains some data about the hypothetical subjects’ housing situation by recording the type of house they live in.
@@ -517,15 +519,15 @@<- tibble( - subjects id = c("a", "b", "c"), - house = c("detached", "rowhouse", "rowhouse") + id = c("a", "b", "c"), + house = c("detached", "rowhouse", "rowhouse") ) print(subjects)
Left Join
+1 a 0 3.07 +2 a 1 3.25 +3 a 2 3.93 +4 b 0 2.18 +5 b 1 2.91 +6 b 2 2.83 +7 c 0 1.49 +8 c 1 2.56 +9 c 2 1.46# A tibble: 9 × 3 id visit outcome <chr> <int> <dbl> -1 a 0 1.54 -2 a 1 3.39 -3 a 2 3.03 -4 b 0 0.309 -5 b 1 2.52 -6 b 2 3.03 -7 c 0 2.13 -8 c 1 3.12 -9 c 2 3.99
subjects
@@ -545,15 +547,15 @@Left Join
+1 a 0 3.07 detached +2 a 1 3.25 detached +3 a 2 3.93 detached +4 b 0 2.18 rowhouse +5 b 1 2.91 rowhouse +6 b 2 2.83 rowhouse +7 c 0 1.49 rowhouse +8 c 1 2.56 rowhouse +9 c 2 1.46 rowhouse# A tibble: 9 × 4 id visit outcome house <chr> <int> <dbl> <chr> -1 a 0 1.54 detached -2 a 1 3.39 detached -3 a 2 3.03 detached -4 b 0 0.309 rowhouse -5 b 1 2.52 rowhouse -6 b 2 3.03 rowhouse -7 c 0 2.13 rowhouse -8 c 1 3.12 rowhouse -9 c 2 3.99 rowhouse
@@ -574,9 +576,9 @@Left Join w
In the previous examples, the subjects table didn’t have a visit column. But suppose it did? Maybe people move around during the study. We could imagine a table like this one.
@@ -596,15 +598,15 @@<- tibble( - subjects id = c("a", "b", "c"), - visit = c(0, 1, 0), - house = c("detached", "rowhouse", "rowhouse"), + id = c("a", "b", "c"), + visit = c(0, 1, 0), + house = c("detached", "rowhouse", "rowhouse"), ) print(subjects)
Left Join w
+1 a 0 3.07 detached +2 a 1 3.25 <NA> +3 a 2 3.93 <NA> +4 b 0 2.18 <NA> +5 b 1 2.91 rowhouse +6 b 2 2.83 <NA> +7 c 0 1.49 rowhouse +8 c 1 2.56 <NA> +9 c 2 1.46 <NA># A tibble: 9 × 4 id visit outcome house <chr> <dbl> <dbl> <chr> -1 a 0 1.54 detached -2 a 1 3.39 <NA> -3 a 2 3.03 <NA> -4 b 0 0.309 <NA> -5 b 1 2.52 rowhouse -6 b 2 3.03 <NA> -7 c 0 2.13 rowhouse -8 c 1 3.12 <NA> -9 c 2 3.99 <NA>
@@ -627,9 +629,9 @@Left Join w
We may even have a situation where we are missing housing data for a subject completely. The following table has no information about subject
a
.@@ -648,15 +650,15 @@<- tibble( - subjects id = c("b", "c"), - visit = c(1, 0), - house = c("rowhouse", "rowhouse"), + id = c("b", "c"), + visit = c(1, 0), + house = c("rowhouse", "rowhouse"), ) subjects
Left Join w
+1 a 0 3.07 <NA> +2 a 1 3.25 <NA> +3 a 2 3.93 <NA> +4 b 0 2.18 <NA> +5 b 1 2.91 rowhouse +6 b 2 2.83 <NA> +7 c 0 1.49 rowhouse +8 c 1 2.56 <NA> +9 c 2 1.46 <NA># A tibble: 9 × 4 id visit outcome house <chr> <dbl> <dbl> <chr> -1 a 0 1.54 <NA> -2 a 1 3.39 <NA> -3 a 2 3.03 <NA> -4 b 0 0.309 <NA> -5 b 1 2.52 rowhouse -6 b 2 3.03 <NA> -7 c 0 2.13 rowhouse -8 c 1 3.12 <NA> -9 c 2 3.99 <NA>
@@ -686,8 +688,8 @@@@ -700,8 +702,8 @@Inner Join
+1 b 1 2.91 rowhouse +2 c 0 1.49 rowhouse# A tibble: 2 × 4 id visit outcome house <chr> <dbl> <dbl> <chr> -1 b 1 2.52 rowhouse -2 c 0 2.13 rowhouse
Right Join
+1 b 1 2.91 rowhouse +2 c 0 1.49 rowhouse @@ -735,11 +737,15 @@# A tibble: 2 × 4 id visit outcome house <chr> <dbl> <dbl> <chr> -1 b 1 2.52 rowhouse -2 c 0 2.13 rowhouse
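To compare the three joins side by side, here is a sketch that rebuilds tables of the same shape as outcomes and subjects. The outcome values are replaced by the fixed numbers 1 to 9 (an assumption for illustration, since the lecture uses random values) so the result is the same every time:

```r
library(dplyr)

# Same layout as the lecture tables, with fixed outcome values
outcomes <- data.frame(
    id = rep(c("a", "b", "c"), each = 3),
    visit = rep(0:2, 3),
    outcome = 1:9
)
subjects <- data.frame(
    id = c("b", "c"),
    visit = c(1, 0),
    house = c("rowhouse", "rowhouse")
)

left <- left_join(outcomes, subjects, by = c("id", "visit"))
inner <- inner_join(outcomes, subjects, by = c("id", "visit"))
right <- right_join(outcomes, subjects, by = c("id", "visit"))

# left_join keeps every row of outcomes; unmatched rows get NA for house
nrow(left)
sum(is.na(left$house))

# inner_join and right_join keep only the two matching rows here
nrow(inner)
nrow(right)
```

In this particular case the inner and right joins agree because every row of subjects has a match in outcomes; that would no longer hold if subjects contained an id and visit pair that is absent from outcomes.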
Final Questions
+<- data.frame( + df1 ID = 1:3, + X1 = c("a1", "a2", "a3") + + )# Create second example data frame +<- data.frame( + df2 ID = 2:4, + X2 = c("b1", "b2", "b3") + )# Create first example data frame -<- data.frame(ID = 1:3, - df1 X1 = c("a1", "a2", "a3")) - # Create second example data frame -<- data.frame(ID = 2:4, - df2 X2 = c("b1", "b2", "b3"))
- Try changing the order from the above e.g.
@@ -766,9 +772,82 @@inner_join(df2, df1)
,semi_join(df2, df1)
andanti_join(df2, df1)
. What changed? What did not change?Additional Resources< + + +
+ diff --git a/posts/11-plotting-systems/index.html b/posts/11-plotting-systems/index.html index fb0d71d..ae0389a 100644 --- a/posts/11-plotting-systems/index.html +++ b/posts/11-plotting-systems/index.html @@ -247,6 +247,7 @@R session information
++-+options(width = 120) +::session_info() sessioninfo
+++─ Session info ─────────────────────────────────────────────────────────────────────────────────────────────────────── + setting value + version R version 4.3.1 (2023-06-16) + os macOS Ventura 13.5 + system aarch64, darwin20 + ui X11 + language (EN) + collate en_US.UTF-8 + ctype en_US.UTF-8 + tz America/New_York + date 2023-08-17 + pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown) + +─ Packages ─────────────────────────────────────────────────────────────────────────────────────────────────────────── + package * version date (UTC) lib source + cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0) + colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd) + colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.3.0) + digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0) + dplyr * 1.1.2 2023-04-20 [1] CRAN (R 4.3.0) + evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0) + fansi 1.0.4 2023-01-22 [1] CRAN (R 4.3.0) + fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0) + forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.0) + generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0) + ggplot2 * 3.4.3 2023-08-14 [1] CRAN (R 4.3.1) + glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0) + gtable 0.3.3 2023-03-21 [1] CRAN (R 4.3.0) + hms 1.1.3 2023-03-21 [1] CRAN (R 4.3.0) + htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0) + htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0) + jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0) + knitr * 1.43 2023-05-25 [1] CRAN (R 4.3.0) + lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0) + lubridate * 1.9.2 2023-02-10 [1] CRAN (R 4.3.0) + magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0) + munsell 0.5.0 2018-06-12 [1] CRAN (R 4.3.0) + pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0) + pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0) + purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.0) + R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0) + readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.0) + rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0) + rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1) + rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0) + scales 1.2.1 2022-08-20 [1] CRAN (R 
4.3.0) + sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0) + stringi 1.7.12 2023-01-11 [1] CRAN (R 4.3.0) + stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.3.0) + tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.0) + tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.3.0) + tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0) + tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.0) + timechange 0.2.0 2023-01-11 [1] CRAN (R 4.3.0) + tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.3.0) + utf8 1.2.3 2023-01-31 [1] CRAN (R 4.3.0) + vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.3.0) + withr 2.5.0 2022-03-03 [1] CRAN (R 4.3.0) + xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0) + yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0) + + [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library + +──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Table of contents
- The Lattice System
- The ggplot2 System
+- R session information
@@ -260,6 +261,7 @@Table of contents
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
@@ -357,8 +359,8 @@The data may not contain the answer. And, if you torture the data long enough, it will tell you anything. —John W. Tukey
The Base Plotting
data(airquality) with(airquality, { -plot(Temp, Ozone) - lines(loess.smooth(Temp, Ozone)) + plot(Temp, Ozone) + lines(loess.smooth(Temp, Ozone)) })
@@ -380,8 +382,8 @@The Base Plotting
 data(airquality)
 with(airquality, {
-  plot(Temp, Ozone, main = "my plot")
-  lines(loess.smooth(Temp, Ozone))
+    plot(Temp, Ozone, main = "my plot")
+    lines(loess.smooth(Temp, Ozone))
 })
@@ -559,8 +561,8 @@The ggplot2 System
 library(tidyverse)
 data(mpg)
 mpg %>%
-  ggplot(aes(displ, hwy)) +
-  geom_point()
+    ggplot(aes(displ, hwy)) +
+    geom_point()
There are additional functions in ggplot2 that allow you to make arbitrarily sophisticated plots. We will discuss this further in the next lecture.
+ + ++ diff --git a/posts/12-ggplot2-plotting-system-part-1/index.html b/posts/12-ggplot2-plotting-system-part-1/index.html index 9e0296d..c2cab99 100644 --- a/posts/12-ggplot2-plotting-system-part-1/index.html +++ b/posts/12-ggplot2-plotting-system-part-1/index.html @@ -258,6 +258,7 @@R session information
+options(width = 120)
+sessioninfo::session_info()
+++─ Session info ─────────────────────────────────────────────────────────────────────────────────────────────────────── + setting value + version R version 4.3.1 (2023-06-16) + os macOS Ventura 13.5 + system aarch64, darwin20 + ui X11 + language (EN) + collate en_US.UTF-8 + ctype en_US.UTF-8 + tz America/New_York + date 2023-08-17 + pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown) + +─ Packages ─────────────────────────────────────────────────────────────────────────────────────────────────────────── + package * version date (UTC) lib source + cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0) + colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd) + colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.3.0) + digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0) + dplyr * 1.1.2 2023-04-20 [1] CRAN (R 4.3.0) + evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0) + fansi 1.0.4 2023-01-22 [1] CRAN (R 4.3.0) + farver 2.1.1 2022-07-06 [1] CRAN (R 4.3.0) + fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0) + forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.0) + generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0) + ggplot2 * 3.4.3 2023-08-14 [1] CRAN (R 4.3.1) + glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0) + gtable 0.3.3 2023-03-21 [1] CRAN (R 4.3.0) + hms 1.1.3 2023-03-21 [1] CRAN (R 4.3.0) + htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0) + htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0) + jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0) + knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0) + labeling 0.4.2 2020-10-20 [1] CRAN (R 4.3.0) + lattice * 0.21-8 2023-04-05 [1] CRAN (R 4.3.1) + lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0) + lubridate * 1.9.2 2023-02-10 [1] CRAN (R 4.3.0) + magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0) + munsell 0.5.0 2018-06-12 [1] CRAN (R 4.3.0) + pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0) + pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0) + purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.0) + R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0) + readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.0) + rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0) 
+ rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1) + rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0) + scales 1.2.1 2022-08-20 [1] CRAN (R 4.3.0) + sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0) + stringi 1.7.12 2023-01-11 [1] CRAN (R 4.3.0) + stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.3.0) + tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.0) + tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.3.0) + tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0) + tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.0) + timechange 0.2.0 2023-01-11 [1] CRAN (R 4.3.0) + tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.3.0) + utf8 1.2.3 2023-01-31 [1] CRAN (R 4.3.0) + vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.3.0) + withr 2.5.0 2022-03-03 [1] CRAN (R 4.3.0) + xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0) + yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0) + + [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library + +──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Table of contents
- Final Questions
- Additional Resources
+- R session information
@@ -271,6 +272,7 @@Table of contents
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
@@ -368,9 +370,9 @@“The greatest value of a picture is when it forces us to notice what we never expected to see.” —John Tukey
The ggplot2 Plotting System
Consider the following plot made using base graphics previously.
 with(airquality, {
-  plot(Temp, Ozone)
-  lines(loess.smooth(Temp, Ozone))
+    plot(Temp, Ozone)
+    lines(loess.smooth(Temp, Ozone))
 })
@@ -394,11 +396,13 @@The ggplot2 Plotting System
 library(tidyverse)
 airquality %>%
-  ggplot(aes(Temp, Ozone)) +
-  geom_point() +
-  geom_smooth(method = "loess",
-              se = FALSE) +
-  theme_minimal()
+    ggplot(aes(Temp, Ozone)) +
+    geom_point() +
+    geom_smooth(
+        method = "loess",
+        se = FALSE
+    ) +
+    theme_minimal()
If we want to count the number of penguins for each of the three species, we can use the count() function in dplyr:
 penguins %>%
-  count(species)
+    count(species)
# A tibble: 3 × 2 species n @@ -901,8 +905,8 @@
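As a rough sketch of what count() is doing, the example below compares it with an explicit group_by() followed by summarise(). The `toy` data frame and its values are made up for illustration and are not the penguins data.

```r
library(dplyr)

# Hypothetical toy data standing in for the penguins data set
toy <- tibble(
    species = c("Adelie", "Gentoo", "Adelie", "Chinstrap", "Adelie", "Gentoo")
)

# count() tallies the number of rows per group
by_count <- toy %>%
    count(species)

# The same tally written out with group_by() and summarise()
by_summarise <- toy %>%
    group_by(species) %>%
    summarise(n = n())

by_count
```

Both results should hold one row per species with the tally stored in a column named `n`.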
Facets
What if you wanted to add a smoother to each one of those panels? Simply add the smoother as another geom.
 qplot(displ, hwy, data = mpg, facets = . ~ drv) +
-  geom_smooth(method = "lm")
+    geom_smooth(method = "lm")
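Since qplot() was deprecated in ggplot2 3.4.0, it may help to see how the same faceted plot could be written with ggplot(). The sketch below is one plausible equivalent, not the lecture's own code; `formula = y ~ x` is spelled out only to avoid the informational message from geom_smooth().

```r
library(ggplot2)

# Scatter plot of displ vs hwy, one panel per drive type,
# with a linear regression line added to every panel
p <- ggplot(mpg, aes(displ, hwy)) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x) +
    facet_grid(. ~ drv)

p
```

Because the smoother is just another layer, it is fitted separately within each facet without any extra work.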
This is slightly better but the substantial overlap makes it difficult to discern any trends in the data. For this we need to add a smoother of some sort. Here we add a linear regression line (a type of smoother) to each group to see if there’s any difference.
 qplot(log(pm25), log(eno), data = maacs, color = mopos) +
-  geom_smooth(method = "lm")
+    geom_smooth(method = "lm")
@@ -1111,8 +1115,8 @@`geom_smooth()` using formula = 'y ~ x'
Case Study: MAACS
Here we see quite clearly that the red group and the green group exhibit rather different relationships between PM2.5 and eNO. For the non-allergic individuals, there appears to be a slightly negative relationship between PM2.5 and eNO, while for the allergic individuals there is a positive relationship. This suggests a strong interaction between PM2.5 and allergic status, a hypothesis perhaps worth following up on in greater detail than this brief exploratory analysis.
Another, and perhaps clearer, way to visualize this interaction is to use separate panels for the non-allergic and allergic individuals using the facets argument to qplot().
 qplot(log(pm25), log(eno), data = maacs, facets = . ~ mopos) +
-  geom_smooth(method = "lm")
+    geom_smooth(method = "lm")
@@ -1169,9 +1173,95 @@`geom_smooth()` using formula = 'y ~ x'
Additional Resources
+ diff --git a/posts/13-ggplot2-plotting-system-part-2/index.html b/posts/13-ggplot2-plotting-system-part-2/index.html index bee9aaa..bda08da 100644 --- a/posts/13-ggplot2-plotting-system-part-2/index.html +++ b/posts/13-ggplot2-plotting-system-part-2/index.html @@ -274,6 +274,7 @@R session information
+options(width = 120)
+sessioninfo::session_info()
+++─ Session info ─────────────────────────────────────────────────────────────────────────────────────────────────────── + setting value + version R version 4.3.1 (2023-06-16) + os macOS Ventura 13.5 + system aarch64, darwin20 + ui X11 + language (EN) + collate en_US.UTF-8 + ctype en_US.UTF-8 + tz America/New_York + date 2023-08-17 + pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown) + +─ Packages ─────────────────────────────────────────────────────────────────────────────────────────────────────────── + package * version date (UTC) lib source + bit 4.0.5 2022-11-15 [1] CRAN (R 4.3.0) + bit64 4.0.5 2020-08-30 [1] CRAN (R 4.3.0) + cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0) + colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd) + colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.3.0) + crayon 1.5.2 2022-09-29 [1] CRAN (R 4.3.0) + digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0) + dplyr * 1.1.2 2023-04-20 [1] CRAN (R 4.3.0) + evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0) + fansi 1.0.4 2023-01-22 [1] CRAN (R 4.3.0) + farver 2.1.1 2022-07-06 [1] CRAN (R 4.3.0) + fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0) + forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.0) + generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0) + ggplot2 * 3.4.3 2023-08-14 [1] CRAN (R 4.3.1) + glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0) + gtable 0.3.3 2023-03-21 [1] CRAN (R 4.3.0) + here * 1.0.1 2020-12-13 [1] CRAN (R 4.3.0) + hms 1.1.3 2023-03-21 [1] CRAN (R 4.3.0) + htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0) + htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0) + jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0) + knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0) + labeling 0.4.2 2020-10-20 [1] CRAN (R 4.3.0) + lattice 0.21-8 2023-04-05 [1] CRAN (R 4.3.1) + lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0) + lubridate * 1.9.2 2023-02-10 [1] CRAN (R 4.3.0) + magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0) + Matrix 1.6-1 2023-08-14 [1] CRAN (R 4.3.0) + mgcv 1.9-0 2023-07-11 [1] CRAN (R 4.3.0) + munsell 0.5.0 2018-06-12 [1] CRAN (R 4.3.0) + nlme 
3.1-163 2023-08-09 [1] CRAN (R 4.3.0) + palmerpenguins * 0.1.1 2022-08-15 [1] CRAN (R 4.3.0) + pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0) + pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0) + purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.0) + R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0) + readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.0) + rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0) + rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1) + rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.3.0) + rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0) + scales 1.2.1 2022-08-20 [1] CRAN (R 4.3.0) + sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0) + stringi 1.7.12 2023-01-11 [1] CRAN (R 4.3.0) + stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.3.0) + tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.0) + tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.3.0) + tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0) + tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.0) + timechange 0.2.0 2023-01-11 [1] CRAN (R 4.3.0) + tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.3.0) + utf8 1.2.3 2023-01-31 [1] CRAN (R 4.3.0) + vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.3.0) + vroom 1.6.3 2023-04-28 [1] CRAN (R 4.3.0) + withr 2.5.0 2022-03-03 [1] CRAN (R 4.3.0) + xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0) + yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0) + + [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library + +──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Table of contents
- Final Questions
- Additional Resources
+- R session information
@@ -287,6 +288,7 @@Table of contents
+This lecture, as the rest of the course, is adapted from the version Stephanie C. Hicks designed and maintained in 2021 and 2022. Check the recent changes to this file through the GitHub history.
Pre-lecture materials
@@ -431,8 +433,9 @@Example: BMI, PM2
 library(tidyverse)
 library(here)
 maacs <- read_csv(here("data", "bmi_pm25_no2_sim.csv"),
-    col_types = "nnci")
+    col_types = "nnci"
+)
 maacs
# A tibble: 517 × 4 logpm25 logno2_new bmicat NocturnalSympt @@ -476,9 +479,11 @@
Building up in layers
Here, we will eventually be plotting the log of PM2.5 and the NocturnalSympt variable.
-g <- ggplot(maacs, aes(x = logpm25,
-                       y = NocturnalSympt))
+g <- ggplot(maacs, aes(
+    x = logpm25,
+    y = NocturnalSympt
+))
 summary(g)
data: logpm25, logno2_new, bmicat, NocturnalSympt [517x4] mapping: x = ~logpm25, y = ~NocturnalSympt @@ -510,7 +515,7 @@
Building up in layers
Normally, if you were to print() a ggplot object, a plot would appear on the plot device. However, our object g does not yet contain enough information to make a plot.
 g <- maacs %>%
-  ggplot(aes(logpm25, NocturnalSympt))
+    ggplot(aes(logpm25, NocturnalSympt))
 print(g)
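One way to see why nothing is drawn yet is to look at how many layers the object holds. The sketch below assumes the built-in `mpg` data as a stand-in for `maacs` (which is read from a course file), and the object names are illustrative only.

```r
library(ggplot2)

# Assumption: mpg stands in for the maacs data used in the lecture
g <- ggplot(mpg, aes(displ, hwy))

# Data and aesthetics are set, but no geom has been added yet
n_before <- length(g$layers)

# Adding a geom appends one layer to the plot object
g_points <- g + geom_point()
n_after <- length(g_points$layers)

c(before = n_before, after = n_after)
```

With zero layers, printing the object only draws the empty panel and axes; the points appear once a geom is added.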
@@ -527,7 +532,7 @@First plot wit
Here, we add the geom_point() function to create a traditional scatter plot.
 g <- maacs %>%
-  ggplot(aes(logpm25, NocturnalSympt))
+    ggplot(aes(logpm25, NocturnalSympt))
 g + geom_point()
@@ -546,9 +551,9 @@Adding more layers
smooth
Because the data appear rather noisy, it might be better if we added a smoother on top of the points to see if there is a trend in the data with PM2.5.
 g +
-  geom_point() +
-  geom_smooth()
+    geom_point() +
+    geom_smooth()
The default smoother (for data sets with fewer than 1,000 observations) is a loess smoother, which is flexible and nonparametric but might be too flexible for our purposes. Perhaps we’d prefer a simple linear regression line to highlight any first-order trends. We can do this by specifying method = "lm" to geom_smooth().
 g +
-  geom_point() +
-  geom_smooth(method = "lm")
+    geom_point() +
+    geom_smooth(method = "lm")
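To make the difference between the two smoothers concrete, the sketch below fits both to the built-in `mpg` data (an assumption, used here in place of `maacs`) and inspects the fitted curves with layer_data(). The helper `slope_range()` is hypothetical and only measures how much the slope varies along a fitted curve.

```r
library(ggplot2)

# Assumption: mpg stands in for the maacs data used in the lecture
base <- ggplot(mpg, aes(displ, hwy)) + geom_point()

p_loess <- base + geom_smooth(method = "loess", formula = y ~ x, se = FALSE)
p_lm <- base + geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# layer_data() returns the values computed for a layer; layer 2 is the smoother
fit_loess <- layer_data(p_loess, 2)
fit_lm <- layer_data(p_lm, 2)

# Hypothetical helper: spread of the slopes between consecutive fitted points
slope_range <- function(fit) diff(range(diff(fit$y) / diff(fit$x)))

slope_range(fit_lm) # essentially zero: a straight line has a single slope
slope_range(fit_loess) # clearly positive: the loess curve bends with the data
```

A constant slope is what makes the linear fit useful for highlighting first-order trends, while the loess fit follows local structure.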
# A tibble: 344 × 8 species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g @@ -626,10 +631,10 @@
facets
We want one row and two columns, one column for each weight category. So we specify bmicat on the right-hand side of the formula passed to facet_grid().
 g +
-  geom_point() +
-  geom_smooth(method = "lm") +
-  facet_grid(. ~ bmicat)
+    geom_point() +
+    geom_smooth(method = "lm") +
+    facet_grid(. ~ bmicat)
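The role of each side of the formula can be checked by building the plot and inspecting its panel layout. This is a sketch under assumptions: it uses the built-in `mpg` data with `drv` (three levels) instead of `bmicat`, and it reads the layout table from ggplot_build(), which is closer to ggplot2 internals than to everyday usage.

```r
library(ggplot2)

# Assumption: mpg and drv stand in for maacs and bmicat
base <- ggplot(mpg, aes(displ, hwy)) + geom_point()

# Variable on the right-hand side: one row, one column per level
p_cols <- base + facet_grid(. ~ drv)
# Variable on the left-hand side: one column, one row per level
p_rows <- base + facet_grid(drv ~ .)

# The layout table has one row per panel with its ROW and COL position
layout_cols <- ggplot_build(p_cols)$layout$layout
layout_rows <- ggplot_build(p_rows)$layout$layout

c(rows = max(layout_cols$ROW), cols = max(layout_cols$COL))
c(rows = max(layout_rows$ROW), cols = max(layout_rows$COL))
```

Swapping the variable from the right-hand side to the left-hand side of the formula turns the columns of panels into rows.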