switched to quarto

UtrechtUniversity · Aug 6, 2024 · 733dbd1
1 parent 790e89b
commit 733dbd1

Showing 19 changed files with 2,418 additions and 0 deletions.
diff --git a/book/.gitignore b/book/.gitignore
@@ -0,0 +1 @@
+/.quarto/
diff --git a/book/_quarto.yml b/book/_quarto.yml
@@ -0,0 +1,31 @@
+project:
+  type: book
+  output-dir: ../docs
+
+book:
+  title: "Dynamics of Youth"
+  subtitle: "DATA HANDBOOK"
+  author: "Neha Moopen"
+  date: last-modified
+  sidebar: 
+    style: docked
+  chapters:
+    - index.qmd
+    - data-management-plans.qmd
+    - naming-conventions.qmd
+    - data-pipelining.qmd
+    - codebooks.qmd
+    - references.qmd
+
+bibliography: references.bib
+
+format:
+  html:
+    theme: 
+      light: [cosmo, styles/theme-light.scss]
+      dark: [cosmo, styles/theme-dark.scss]
+  pdf:
+    documentclass: scrreprt
+  docx: default
+
+
diff --git a/book/codebooks.qmd b/book/codebooks.qmd
@@ -0,0 +1,50 @@
+# Codebooks {.unnumbered}
+
+A codebook is an example of data-level metadata.
+
+The purpose of a codebook or data dictionary is to explain what all the variable names and values in your spreadsheet really mean.
+
+A codebook typically includes:
+
+- Variable Names
+- Readable Variable Name
+- Measurement Units
+- Allowed Values
+- Definition Of The Variable
+- Synonyms For The Variable Name (Optional)
+- Description Of The Variable (Optional)
+- Other Resources
+
+See: <https://help.osf.io/article/217-how-to-make-a-data-dictionary>
+
+## codebook R package
+
+```
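+library(qualtRics)  # all_surveys(), fetch_survey(), survey_questions()
+library(readr)
+library(dplyr)      # select(), slice(), rename(), full_join()
+library(codebook)   # codebook_table()
+library(writexl)    # write_xlsx()
+
+surveys <- all_surveys() # lists the surveys available to your Qualtrics account
+
+survey_results <- fetch_survey(surveyID = surveys$id[2], # you can also replace surveys$id[2] with "<SURVEY-ID>"
+                               verbose = TRUE)
+
+survey_results <- select(survey_results, -c(1:17)) # drop the first 17 columns (presumably Qualtrics metadata rather than survey items)
+
+# survey_questions() retrieves a data frame containing questions and question IDs for a survey
+survey_questions <- survey_questions(surveyID = surveys$id[2])
+survey_questions <- select(survey_questions, -c(1, 4)) # drop columns 1 and 4
+survey_questions <- slice(survey_questions, -1)        # drop the first row
+
+# generate codebook
+
+codebook <- codebook_table(survey_results)
+
+codebook <- rename(codebook, qname = name) # match the key column name used in survey_questions
+
+codebook <- full_join(survey_questions, codebook, by = "qname") # add the question text to each variable
+
+write_xlsx(codebook, "documentation/codebook-demo.xlsx") # the documentation/ folder must already exist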
+```
diff --git a/book/cover.png b/book/cover.png
diff --git a/book/data-management-plans.qmd b/book/data-management-plans.qmd
@@ -0,0 +1,66 @@
+# Data Management Plans {.unnumbered}
+
+## What Is A Data Management Plan?
+
+A Data Management Plan (DMP) is a formal document that describes your data and outlines all aspects of managing your data - both during and after your project.
+
+Moreover, it is a _living_ document that you can revise and update as needed.
+
+![](images/data-management-plan.jpg)
+
+<figcaption><a href="https://the-turing-way.netlify.app/">The Turing Way</a> project illustration by Scriberia. Used under a CC-BY 4.0 licence. DOI: <a href="https://doi.org/10.5281/zenodo.3332807">10.5281/zenodo.3332807</a>.</figcaption>
+
+## Why Should You Write A DMP?
+
+Writing a DMP provides an opportunity to reflect on your data, particularly how you organize and manage it. It nudges you to make your research data management (RDM) more _concrete_ and _actionable_, which makes your work more efficient and your data more valuable.
+
+## When Should You Write A DMP?
+
+Working on a DMP at the start of your project ensures that you are better informed about best practices in RDM and prepared to implement them. That said, you can also write a DMP during the project or after it is completed.
+
+## DMPonline & DMP Templates
+
+DMPonline is a tool that helps you create and maintain DMPs. With DMPonline, you can:
+
+- register and sign in with your institutional credentials,
+- write and collaborate on (multiple) DMPs,
+- share DMPs or switch their visibility between private and public,
+- request feedback from RDM Support,
+- download DMPs in various formats.
+
+DMPonline offers DMP templates from various institutions and funders, including:
+
+- Utrecht University
+- UMC Utrecht
+- [NWO](https://dmponline.dcc.ac.uk/template_export/1753695087.pdf)
+- [ZonMw](https://dmponline.dcc.ac.uk/template_export/1461074155.pdf)
+- [ERC](https://dmponline.dcc.ac.uk/template_export/2088403152.pdf)
+- [Horizon 2020](https://dmponline.dcc.ac.uk/template_export/1612436782.pdf)
+- [Horizon Europe](https://dmponline.dcc.ac.uk/template_export/5992485.pdf)
+
+These templates also contain example answers and guidance.
+
+![](images/uu-dmp-template.JPG)
+
+## Tips
+
+::: {.callout-note title="Tips"}
+- Contact your DoY data manager! They can (co)write your DMP and/or review it.
+- If the DoY data manager is unavailable, you can still request feedback from RDM Support.
+:::
+
+## Resources
+
+- [Create your DMP online](https://www.uu.nl/en/research/research-data-management/tools-services/tool-to-create-your-dmp-online)
+- [Data management planning](https://www.uu.nl/en/research/research-data-management/guides/data-management-planning)
+- [Learn to write your DMP (online training)](https://www.uu.nl/en/research/research-data-management/training-workshops/online-training-learn-to-write-your-dmp) 
+
+## References
+
+1. [https://www.uu.nl/en/research/research-data-management/guides/data-management-planning](https://www.uu.nl/en/research/research-data-management/guides/data-management-planning)
+
+2. [https://www.kuleuven.be/rdm/en/faq/faq-dmp](https://www.kuleuven.be/rdm/en/faq/faq-dmp)
+
+3. [https://rdm.uva.nl/en/planning/data-management-plan/data-management-plan.html](https://rdm.uva.nl/en/planning/data-management-plan/data-management-plan.html)
+
+4. [https://www.uu.nl/en/research/research-data-management/tools-services/tool-to-create-your-dmp-online.html](https://www.uu.nl/en/research/research-data-management/tools-services/tool-to-create-your-dmp-online.html)
diff --git a/book/data-pipelining.qmd b/book/data-pipelining.qmd
@@ -0,0 +1,73 @@
+# Data Pipelining {.unnumbered}
+
+A data pipeline is a series of (automated) actions that ingests raw data from various sources and moves the data to a destination for storage and (eventual) analysis.
+
+Benefits of a data pipeline include:
+
+- Time saved by automating the boring stuff!
+- Reduced mistakes.
+- Tasks broken down into smaller steps.
+- Reproducibility!
+
+## When do I need a data pipeline?
+
+As a rule of thumb:
+
+If a task needs to be done three or more times, consider automating it.
+
+If automation is not possible, think about how you can make the task as efficient as possible.
+
+## How can I implement a data pipeline? Some examples for inspiration
+
+- If your data collection tools have APIs, they can be leveraged to extract data.
+
+- For example, Qualtrics has the qualtRics R package & pyQualtrics Python library which contain functions to automate exporting surveys.
+
+- If APIs are not available, you could use R or Python to drive a web browser with the RSelenium package or the Selenium library. For example, a script can open a specific website, log in, and click the download button for you.
+
+- You can schedule your scripts to run automatically, once or on a recurring basis, with Windows Task Scheduler or cron, or from within R with the taskscheduleR and cronR packages.
+
+- You can also send emails with R & Python! Consider if you've ever had to contact participants because you noticed something wrong with their incoming data. You could implement these data checks with a script and automatically draft and send emails (from a template) to those participants who were flagged as having issues with their data.
+
+## qualtRics R package
+
+```
+library(readr)      # write_csv()
+library(qualtRics)  # qualtrics_api_credentials(), all_surveys(), fetch_survey()
+
+qualtrics_api_credentials(api_key = "YOUR-QUALTRICS-API-KEY",
+                          base_url = "YOUR-QUALTRICS-BASE-URL",
+                          overwrite = TRUE,
+                          install = TRUE) # stores the credentials in ~/.Renviron
+
+readRenviron("~/.Renviron") # reload so the credentials are available in this session
+
+surveys <- all_surveys()
+
+survey_results <- fetch_survey(surveyID = surveys$id[2], # you can also replace surveys$id[2] with "<SURVEY-ID>"
+                               verbose = TRUE)
+
+write_csv(survey_results, paste0("path/to/folder/", format(Sys.time(), "%d-%m-%Y-%H.%M"), "_survey_results.csv")) # timestamped file name, so earlier exports are not overwritten
+```
+
+## taskscheduleR package
+
+```
+library(taskscheduleR) # Windows only: uses the Windows Task Scheduler
+
+scheduled_script <- "path/to/folder/myscript.R" # replace with the full path to your script
+
+## Run the script once, about 120 seconds from now
+
+taskscheduler_create(taskname = "extract-data-once", rscript = scheduled_script,
+                     schedule = "ONCE", starttime = format(Sys.time() + 120, "%H:%M"))
+
+## Run every 5 minutes, starting from 10:40
+
+taskscheduler_create(taskname = "extract-data-5min", rscript = scheduled_script,
+                     schedule = "MINUTE", starttime = "10:40", modifier = 5)
+
+## Delete a task
+
+taskscheduler_delete("extract-data-once")
+```
diff --git a/book/images/fair-1x4.png b/book/images/fair-1x4.png
diff --git a/book/images/fair-2x2.png b/book/images/fair-2x2.png
diff --git a/book/index.aux b/book/index.aux
@@ -0,0 +1,19 @@
+\relax 
+\providecommand*\new@tpo@label[2]{}
+\providecommand\hyper@newdestlabel[2]{}
+\providecommand\HyperFirstAtBeginDocument{\AtBeginDocument}
+\HyperFirstAtBeginDocument{\ifx\hyper@anchor\@undefined
+\global\let\oldnewlabel\newlabel
+\gdef\newlabel#1#2{\newlabelxx{#1}#2}
+\gdef\newlabelxx#1#2#3#4#5#6{\oldnewlabel{#1}{{#2}{#3}}}
+\AtEndDocument{\ifx\hyper@anchor\@undefined
+\let\newlabel\oldnewlabel
+\fi}
+\fi}
+\global\let\hyper@last\relax 
+\gdef\HyperFirstAtBeginDocument#1{#1}
+\providecommand*\HyPL@Entry[1]{}
+\HyPL@Entry{0<</S/D>>}
+\newlabel{welcome}{{}{3}{}{chapter*.2}{}}
+\@writefile{toc}{\contentsline {chapter}{Welcome!}{3}{chapter*.2}\protected@file@percent }
+\@writefile{lof}{\contentsline {figure}{\numberline {1}{\ignorespaces This illustration is created by Scriberia with The Turing Way community. Used under a CC-BY 4.0 licence. DOI: 10.5281/zenodo.3332807}}{3}{figure.caption.3}\protected@file@percent }