Commit

switched to quarto
nehamoopen committed Aug 6, 2024
1 parent 790e89b commit 733dbd1
Showing 19 changed files with 2,418 additions and 0 deletions.
1 change: 1 addition & 0 deletions book/.gitignore
@@ -0,0 +1 @@
/.quarto/
31 changes: 31 additions & 0 deletions book/_quarto.yml
@@ -0,0 +1,31 @@
project:
  type: book
  output-dir: ../docs

book:
  title: "Dynamics of Youth"
  subtitle: "DATA HANDBOOK"
  author: "Neha Moopen"
  date: last-modified
  sidebar:
    style: docked
  chapters:
    - index.qmd
    - data-management-plans.qmd
    - naming-conventions.qmd
    - data-pipelining.qmd
    - codebooks.qmd
    - references.qmd

bibliography: references.bib

format:
  html:
    theme:
      light: [cosmo, styles/theme-light.scss]
      dark: [cosmo, styles/theme-dark.scss]
  pdf:
    documentclass: scrreprt
  docx: default


50 changes: 50 additions & 0 deletions book/codebooks.qmd
@@ -0,0 +1,50 @@
# Codebooks {.unnumbered}

A codebook is an example of data-level metadata.

The purpose of a codebook or data dictionary is to explain what all the variable names and values in your spreadsheet really mean.

A codebook typically includes:

- Variable Names
- Readable Variable Name
- Measurement Units
- Allowed Values
- Definition Of The Variable
- Synonyms For The Variable Name (Optional)
- Description Of The Variable (Optional)
- Other Resources

See: <https://help.osf.io/article/217-how-to-make-a-data-dictionary>
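
As a rough illustration of the fields listed above, a codebook can be drafted by a script and then completed by hand. The sketch below uses only the Python standard library. The example data, the variable names (`age_years`, `sex`) and the `draft_codebook` helper are hypothetical and not part of this handbook's tooling.

```python
import csv
import io

# Hypothetical example data; in practice this would be read from your own file.
RAW_CSV = """participant_id,age_years,sex
p001,12,F
p002,14,M
p003,13,F
"""

# Codebook fields that a script cannot infer and a person must fill in.
MANUAL_FIELDS = ["readable_name", "measurement_units", "definition"]


def draft_codebook(csv_text):
    """Return one codebook row per variable, with observed values pre-filled."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    codebook = []
    for name in rows[0].keys():
        observed = sorted({row[name] for row in rows})
        entry = {"variable_name": name, "observed_values": "; ".join(observed)}
        # Leave the descriptive fields blank so they can be completed manually.
        entry.update({field: "" for field in MANUAL_FIELDS})
        codebook.append(entry)
    return codebook


codebook = draft_codebook(RAW_CSV)
for entry in codebook:
    print(entry["variable_name"], "->", entry["observed_values"])
```

Note that the values observed in a dataset are only a starting point: the allowed values still need to be confirmed against the study design before they go into the codebook.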

## codebook R package

```
library(qualtRics)
library(readr)
library(dplyr)
library(codebook)
library(writexl)

# list the surveys available to your Qualtrics account
surveys <- all_surveys()

# download the responses for one survey
survey_results <- fetch_survey(surveyID = surveys$id[2], # you can also replace surveys$id[2] with "<SURVEY-ID>"
                               verbose = TRUE)
# drop the first 17 columns
survey_results <- select(survey_results, -c(1:17))

# survey_questions() retrieves a data frame containing questions and question IDs for a survey
survey_questions <- survey_questions(surveyID = surveys$id[2])
# drop columns 1 and 4, then the first row
survey_questions <- select(survey_questions, -c(1, 4))
survey_questions <- slice(survey_questions, -1)

# generate codebook
codebook <- codebook_table(survey_results)
# rename `name` to `qname` so it matches the key column in survey_questions
codebook <- rename(codebook, qname = name)
codebook <- full_join(survey_questions, codebook, by = "qname")

# write the combined codebook to an Excel file
write_xlsx(codebook, "documentation/codebook-demo.xlsx")
```
Binary file added book/cover.png
66 changes: 66 additions & 0 deletions book/data-management-plans.qmd
@@ -0,0 +1,66 @@
# Data Management Plans {.unnumbered}

## What Is A Data Management Plan?

A Data Management Plan (DMP) is a formal document that describes your data and outlines all aspects of managing your data - both during and after your project.

Moreover, it is a _living_ document that you can revise and update as needed.

![](images/data-management-plan.jpg)

<figcaption><a href="https://the-turing-way.netlify.app/">The Turing Way</a> project illustration by Scriberia. Used under a CC-BY 4.0 licence. DOI: <a href="https://doi.org/10.5281/zenodo.3332807">10.5281/zenodo.3332807</a>.</figcaption>

## Why Should You Write A DMP?

Writing a DMP provides an opportunity to reflect on your data, particularly how you organize and manage it. It nudges you to think about how to make your research data management (RDM) more _concrete_ and _actionable_. This creates efficiency and more value for your data.

## When Should You Write A DMP?

Working on a DMP at the start of your project will ensure that you are better informed of best practices in RDM and prepared to implement them. That being said, you can also write a DMP during the project or after it is completed.

## DMPonline & DMP Templates

DMPonline is a tool that helps you create and maintain DMPs. With DMPonline, you can:

- register and sign in with your institutional credentials,
- write and collaborate on (multiple) DMPs,
- share DMPs or switch their visibility between private and public,
- request feedback from RDM Support,
- download DMPs in various formats.

DMPonline offers DMP templates from various institutions and funders, including:

- Utrecht University
- UMC Utrecht
- [NWO](https://dmponline.dcc.ac.uk/template_export/1753695087.pdf)
- [ZonMw](https://dmponline.dcc.ac.uk/template_export/1461074155.pdf)
- [ERC](https://dmponline.dcc.ac.uk/template_export/2088403152.pdf)
- [Horizon 2020](https://dmponline.dcc.ac.uk/template_export/1612436782.pdf)
- [Horizon Europe](https://dmponline.dcc.ac.uk/template_export/5992485.pdf)

These templates also contain example answers and guidance.

![](images/uu-dmp-template.JPG)

## Tips

!!! note "Tips"

- Contact your DoY data manager! They can (co)write your DMP and/or review it.
- If the DoY data manager is unavailable, you can still request feedback from RDM Support.

## Resources

- [Create your DMP online](https://www.uu.nl/en/research/research-data-management/tools-services/tool-to-create-your-dmp-online)
- [Data management planning](https://www.uu.nl/en/research/research-data-management/guides/data-management-planning)
- [Learn to write your DMP (online training)](https://www.uu.nl/en/research/research-data-management/training-workshops/online-training-learn-to-write-your-dmp)

## References

1. [https://www.uu.nl/en/research/research-data-management/guides/data-management-planning](https://www.uu.nl/en/research/research-data-management/guides/data-management-planning)

2. [https://www.kuleuven.be/rdm/en/faq/faq-dmp](https://www.kuleuven.be/rdm/en/faq/faq-dmp)

3. [https://rdm.uva.nl/en/planning/data-management-plan/data-management-plan.html](https://rdm.uva.nl/en/planning/data-management-plan/data-management-plan.html)

4. [https://www.uu.nl/en/research/research-data-management/tools-services/tool-to-create-your-dmp-online.html](https://www.uu.nl/en/research/research-data-management/tools-services/tool-to-create-your-dmp-online.html)
73 changes: 73 additions & 0 deletions book/data-pipelining.qmd
@@ -0,0 +1,73 @@
# Data Pipelining {.unnumbered}

A data pipeline is a series of (automated) actions that ingest raw data from various sources and move it to a destination for storage and (eventual) analysis.

Benefits of a data pipeline include:

- Time saved by automating the boring stuff!
- Reduced mistakes.
- Tasks broken down into smaller steps.
- Reproducibility!

## When do I need a data pipeline?

Here's a rule of thumb, just as an example:

If you have a task that needs to occur three or more times, you could think about automating it.

If automation is not possible, think about how you can make the task as efficient as possible.

## How can I implement a data pipeline? Some examples for inspiration

- If your data collection tools have APIs, they can be leveraged to extract data.

  - For example, Qualtrics has the qualtRics R package and the pyQualtrics Python library, which contain functions to automate exporting surveys.

- If APIs are not available, you could use R/Python to automate the use of an internet browser using the RSelenium package / Selenium library. Imagine automating the clicks and typing of going to a specific website, logging in, clicking the download button.

- You can use Windows Task Scheduler / cron / the taskscheduleR R package / cronR to schedule your scripts to run automatically, on a recurring basis as well (if needed).

- You can also send emails with R & Python! Consider if you've ever had to contact participants because you noticed something wrong with their incoming data. You could implement these data checks with a script and automatically draft and send emails (from a template) to those participants who were flagged as having issues with their data.
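
The last idea in the list above, checking incoming data and drafting emails from a template, can be sketched as follows. This is a minimal sketch using only the Python standard library. The records, the email addresses, the plausible age range and the helper names (`check_record`, `draft_emails`) are all hypothetical, and the messages are only built, not sent.

```python
from email.message import EmailMessage
from string import Template

# Hypothetical incoming records; None marks a missing response.
records = [
    {"email": "a@example.org", "name": "Alex", "age_years": 12},
    {"email": "b@example.org", "name": "Bo", "age_years": None},
    {"email": "c@example.org", "name": "Cam", "age_years": 140},
]

TEMPLATE = Template(
    "Dear $name,\n\nWe noticed a problem with your recent submission: $issue.\n"
    "Could you please check and resubmit?\n"
)


def check_record(record):
    """Return a description of the problem, or None if the record looks fine."""
    age = record["age_years"]
    if age is None:
        return "your age is missing"
    if not 0 <= age <= 120:  # assumed plausible range, adjust per study
        return "your age is outside the expected range"
    return None


def draft_emails(records, sender="datamanager@example.org"):
    """Build (but do not send) one email per flagged record."""
    drafts = []
    for record in records:
        issue = check_record(record)
        if issue is None:
            continue
        message = EmailMessage()
        message["From"] = sender
        message["To"] = record["email"]
        message["Subject"] = "Question about your submitted data"
        message.set_content(TEMPLATE.substitute(name=record["name"], issue=issue))
        drafts.append(message)
    return drafts


drafts = draft_emails(records)
for message in drafts:
    print(message["To"])
```

Keeping the check and the drafting step separate makes it possible to review the flagged records before anything is sent. Actually sending the drafts would additionally require a mail server, for example via `smtplib` with the settings of your institution.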

## qualtRics R package

```
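library(readr)
library(qualtRics)

# store your API credentials in ~/.Renviron (only needed once)
qualtrics_api_credentials(api_key = "YOUR-QUALTRICS-API-KEY",
                          base_url = "YOUR-QUALTRICS-BASE-URL",
                          overwrite = TRUE,
                          install = TRUE)
# reload the environment file so the credentials are available in this session
readRenviron("~/.Renviron")

# list the surveys available to your account and download one of them
surveys <- all_surveys()
survey_results <- fetch_survey(surveyID = surveys$id[2], # you can also replace surveys$id[2] with "<SURVEY-ID>"
                               verbose = TRUE)

# save the results with a timestamp in the file name
write_csv(survey_results, paste0("path/to/folder/", format(Sys.time(), "%d-%m-%Y-%H.%M"), "_survey_results.csv"))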
```
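
For readers working in Python, the timestamped-export step at the end of the script above could look roughly like this. It is a sketch using only the standard library: the rows stand in for downloaded survey results, a temporary directory stands in for `path/to/folder/`, and the helper names `timestamped_path` and `write_rows` are hypothetical.

```python
import csv
import tempfile
from datetime import datetime
from pathlib import Path


def timestamped_path(folder, stem, now=None):
    """Build a file path of the form '<folder>/<dd-mm-YYYY-HH.MM>_<stem>.csv'."""
    now = now or datetime.now()
    return Path(folder) / f"{now.strftime('%d-%m-%Y-%H.%M')}_{stem}.csv"


def write_rows(path, rows):
    """Write a list of dictionaries to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


# Hypothetical rows standing in for downloaded survey results.
rows = [{"response_id": "R_1", "q1": "yes"}, {"response_id": "R_2", "q1": "no"}]

out_dir = tempfile.mkdtemp()  # stand-in for "path/to/folder/"
out_path = timestamped_path(out_dir, "survey_results", datetime(2024, 8, 6, 10, 40))
write_rows(out_path, rows)
print(out_path.name)
```

Because each export gets its own file name, earlier downloads are not overwritten. If you want the files to sort chronologically by name, a year-first format such as `%Y-%m-%d-%H.%M` is one option.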

## taskscheduleR package

```
library(taskscheduleR)

# path to the script you want to schedule
scheduled_script <- "path/to/folder/myscript.R"

## run script once within 120 seconds
taskscheduler_create(taskname = "extract-data-once", rscript = scheduled_script,
                     schedule = "ONCE", starttime = format(Sys.time() + 120, "%H:%M"))

## run every 5 minutes, starting from 10:40
taskscheduler_create(taskname = "extract-data-5min", rscript = scheduled_script,
                     schedule = "MINUTE", starttime = "10:40", modifier = 5)

## delete a task when it is no longer needed
taskscheduler_delete("extract-data-once")
```
Binary file added book/images/fair-1x4.png
Binary file added book/images/fair-2x2.png
19 changes: 19 additions & 0 deletions book/index.aux
@@ -0,0 +1,19 @@
\relax
\providecommand*\new@tpo@label[2]{}
\providecommand\hyper@newdestlabel[2]{}
\providecommand\HyperFirstAtBeginDocument{\AtBeginDocument}
\HyperFirstAtBeginDocument{\ifx\hyper@anchor\@undefined
\global\let\oldnewlabel\newlabel
\gdef\newlabel#1#2{\newlabelxx{#1}#2}
\gdef\newlabelxx#1#2#3#4#5#6{\oldnewlabel{#1}{{#2}{#3}}}
\AtEndDocument{\ifx\hyper@anchor\@undefined
\let\newlabel\oldnewlabel
\fi}
\fi}
\global\let\hyper@last\relax
\gdef\HyperFirstAtBeginDocument#1{#1}
\providecommand*\HyPL@Entry[1]{}
\HyPL@Entry{0<</S/D>>}
\newlabel{welcome}{{}{3}{}{chapter*.2}{}}
\@writefile{toc}{\contentsline {chapter}{Welcome!}{3}{chapter*.2}\protected@file@percent }
\@writefile{lof}{\contentsline {figure}{\numberline {1}{\ignorespaces This illustration is created by Scriberia with The Turing Way community. Used under a CC-BY 4.0 licence. DOI: 10.5281/zenodo.3332807}}{3}{figure.caption.3}\protected@file@percent }