generated from carpentries/workbench-template-rmd
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Auto-generated via {sandpaper} Source : a28e750 Branch : main Author : Sarah Kaspar <[email protected]> Time : 2022-12-21 14:54:31 +0000 Message : added first episodes
1 parent
c5e6485
commit d92ef89
Showing
9 changed files
with
306 additions
and
122 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
--- | ||
title: "What is sampling?" | ||
teaching: 10 | ||
exercises: 2 | ||
--- | ||
|
||
:::::::::::::::::::::::::::::::::::::: questions | ||
|
||
- What is a sample, and how do we draw one randomly and independently? | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
::::::::::::::::::::::::::::::::::::: objectives | ||
|
||
- Define the terms sample, population of interest, sample size, and distribution | ||
- Explain why the observations in a sample should be drawn randomly and independently | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
## | ||
|
||
|
||
<p align="center"> | ||
<img src="/fig/sampling-frogs.png" width="500"/> | ||
</p> | ||
|
||
Let's start with an example and use it to define some terminology. We have a lake with frogs in it, some light green and some dark green. There’s a sunny side of the lake, and a shadowy area by the trees. Now imagine you want to estimate the fraction of light green frogs in the lake. There are too many frogs to count them all, so you catch a few and count how many of them are light coloured. This is a sample: a set of observations drawn randomly and independently from a population of interest. The population of interest, in this case, is all the frogs in that lake. | ||
|
||
How can we draw randomly and independently? One obvious thing to randomize in this experiment is the location at which you catch the frogs, because the picture above suggests that light-coloured frogs gather more in the shadows, while the dark green frogs like the sun. If we caught all the frogs in the same area, as in sample 1, we would probably overrepresent light frogs, and the sample would not represent the population well. When the locations are randomized, this is less likely to happen (see, for example, sample 2). | ||
|
||
You get similar problems if the observations are not independent. One example of dependent observations would be to start with one frog, then catch the one right next to it, and so on. This is also likely to overrepresent one colour of frogs, which is why observations shouldn’t depend on each other. | ||
|
||
The sample size is the number of frogs in one sample. The distribution is the set of rules that the random frog catches follow. | ||
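The effect of not randomizing the catch location can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the lake size (1000 frogs), the colour preferences per area (`p_light`), and the sample size of 20 are hypothetical values chosen for illustration, not numbers from the lesson.

```r
set.seed(1)  # arbitrary seed, only so that the sketch is reproducible

# Hypothetical lake: 1000 frogs, half on the sunny side, half in the shade.
n_frogs <- 1000
area <- rep(c("sun", "shade"), each = n_frogs / 2)

# Assumed colour preferences: light frogs are more common in the shade.
p_light <- ifelse(area == "shade", 0.5, 0.1)
colour <- ifelse(runif(n_frogs) < p_light, "light", "dark")

# Fraction of light frogs in the whole population
true_fraction <- mean(colour == "light")

# Sample 1: catch 20 frogs, all in the shady area (location not randomized)
shade_only <- sample(which(area == "shade"), size = 20)
fraction_shade <- mean(colour[shade_only] == "light")

# Sample 2: catch 20 frogs at random locations across the whole lake
whole_lake <- sample(seq_len(n_frogs), size = 20)
fraction_random <- mean(colour[whole_lake] == "light")

c(truth = true_fraction, shade_only = fraction_shade, random = fraction_random)
```

Re-running the sketch with different seeds should show that the shade-only estimate tends to lie above the true fraction, while the randomized estimate scatters around it.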
|
||
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: instructor | ||
|
||
Inline instructor notes can help inform instructors of timing challenges | ||
associated with the lessons. They appear in the "Instructor View" | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,68 @@ | ||
--- | ||
title: "What is a probability distribution?" | ||
teaching: 10 | ||
exercises: 2 | ||
--- | ||
|
||
:::::::::::::::::::::::::::::::::::::: questions | ||
|
||
- What is a probability distribution? | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
::::::::::::::::::::::::::::::::::::: objectives | ||
|
||
- Describe how a probability distribution maps outcomes to probabilities | ||
- Explain the difference between discrete and continuous distributions | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
## Overview of probability distributions | ||
|
||
CONTENT STILL TO COME FROM VIDEO | ||
|
||
![](https://vimeo.com/647705308) | ||
|
||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: instructor | ||
|
||
Inline instructor notes can help inform instructors of timing challenges | ||
associated with the lessons. They appear in the "Instructor View" | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
::::::::::::::::::::::::::::::::::::: challenge | ||
|
||
## Challenge 1: Which of the following statements are true? | ||
|
||
|
||
|
||
1. A probability distribution assigns probabilities to possible outcomes of an experiment. | ||
2. The probabilities in a probability distribution sum (or integrate) to 1. | ||
3. If the experiments are not randomized, the results don't follow a probability distribution. | ||
|
||
|
||
:::::::::::::::::::::::: solution | ||
|
||
## Solution | ||
|
||
Statements 1 and 2 are true. Statement 3 is false: if experiments are not randomized, the results still follow some distribution, but they are unlikely to represent the population well. | ||
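Statement 2 can be checked numerically. The sketch below uses two illustrative distributions that are assumptions for this example only: a fair six-sided die as the discrete case, and the standard normal distribution (R's `dnorm`) as the continuous case.

```r
# Discrete case: a fair six-sided die (hypothetical example).
# Each of the six outcomes has probability 1/6.
die_probs <- rep(1 / 6, times = 6)
sum(die_probs)  # the probabilities of all outcomes, added up

# Continuous case: the standard normal density, integrated numerically
# over all real numbers.
area <- integrate(dnorm, lower = -Inf, upper = Inf)
area$value
```

Both results should equal 1, up to floating-point and numerical integration error.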
|
||
::::::::::::::::::::::::::::::::: | ||
|
||
|
||
## Challenge 2: Discrete distributions | ||
|
||
What is the probability of an outcome of X=1.5 in a discrete distribution whose possible outcomes are whole-number counts? | ||
|
||
- 0 | ||
- 0.5 | ||
- 0.15 | ||
|
||
:::::::::::::::::::::::: solution | ||
|
||
The value $1.5$ is not a whole number, so it cannot occur in a discrete distribution whose outcomes are counts. Its probability is zero. | ||
|
||
::::::::::::::::::::::::::::::::: | ||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,68 @@ | ||
--- | ||
title: "The binomial distribution" | ||
teaching: 10 | ||
exercises: 2 | ||
--- | ||
|
||
:::::::::::::::::::::::::::::::::::::: questions | ||
|
||
- What is the binomial distribution? | ||
- What kind of data is it used for? | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
::::::::::::::::::::::::::::::::::::: objectives | ||
|
||
|
||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
## Overview of the binomial distribution | ||
|
||
The binomial distribution is what we have just seen in the example. We use it when we have a fixed sample size and count the number of "successes" in that sample, for example the number of mutations in a genome, or the number of cells within a sample that show a certain phenotype. | ||
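To make "counting successes in a fixed sample" concrete, the probability of seeing exactly `k` successes in `n` independent trials with success probability `p` is `choose(n, k) * p^k * (1 - p)^(n - k)`. The sketch below compares this formula with R's built-in `dbinom`. The numbers (20 cells, a phenotype probability of 0.1, and 3 successes) are hypothetical values chosen for illustration.

```r
n_cells <- 20    # fixed sample size (hypothetical)
p_pheno <- 0.1   # assumed probability that a cell shows the phenotype
k <- 3           # number of "successes" we ask about

# Binomial probability written out by hand:
# (number of ways to pick k cells) * P(k successes) * P(n - k failures)
by_hand <- choose(n_cells, k) * p_pheno^k * (1 - p_pheno)^(n_cells - k)

# The same probability from R's built-in binomial function
built_in <- dbinom(x = k, size = n_cells, prob = p_pheno)

all.equal(by_hand, built_in)
```

The two values should agree, which shows that the sample size and the success probability are the only parameters the distribution needs.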
|
||
TRANSLATE VIDEO | ||
|
||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: instructor | ||
|
||
Inline instructor notes can help inform instructors of timing challenges | ||
associated with the lessons. They appear in the "Instructor View" | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
::::::::::::::::::::::::::::::::::::: challenge | ||
|
||
## Challenge 1: Which experiments are binomial? | ||
|
||
We are in a diagnostic laboratory that gets blood samples from incoming hospital patients and tests them for some disease. Which of these experiments can be modeled with a binomial distribution? | ||
|
||
1. Counting the total number of samples that get tested over one day. | ||
2. Counting the number of positive samples out of 50 samples that get tested successively. | ||
3. Measuring the volume (in mL) of each blood sample. | ||
|
||
|
||
:::::::::::::::::::::::: solution | ||
|
||
## Solution | ||
|
||
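Only experiment 2, counting the number of positive samples out of 50 samples that get tested successively: the sample size is fixed, and each sample is either positive or negative. In experiment 1 there is no fixed sample size, and in experiment 3 the measured volume is a continuous quantity rather than a count of successes. | ||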
|
||
::::::::::::::::::::::::::::::::: | ||
|
||
|
||
## Challenge 2: Discrete distributions | ||
|
||
What is the probability of an outcome of X=1.5 in a discrete distribution whose possible outcomes are whole-number counts? | ||
|
||
- 0 | ||
- 0.5 | ||
- 0.15 | ||
|
||
:::::::::::::::::::::::: solution | ||
|
||
The value $1.5$ is not a whole number, so it cannot occur in a discrete distribution whose outcomes are counts. Its probability is zero. | ||
|
||
::::::::::::::::::::::::::::::::: | ||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,124 @@ | ||
--- | ||
title: "Probability distributions in R" | ||
teaching: 10 | ||
exercises: 2 | ||
--- | ||
|
||
:::::::::::::::::::::::::::::::::::::: questions | ||
|
||
- Which functions does R provide for working with probability distributions? | ||
- How do I calculate binomial probabilities in R? | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
::::::::::::::::::::::::::::::::::::: objectives | ||
|
||
|
||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
Before we look at more distributions, let's get some hands-on experience in R! | ||
R knows a whole range of distributions: see [this list in the R manual](https://stat.ethz.ch/R-manual/R-devel/library/stats/html/Distributions.html). | ||
|
||
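For each distribution, R provides four functions, whose names combine a one-letter prefix with an abbreviation of the distribution. | ||
For the binomial distribution, they all end with `binom`: | ||
- `dbinom`: probability mass function (the probability of observing exactly `x` successes; for continuous distributions, the `d` function gives the density) | ||
- `pbinom`: cumulative distribution function (the probability of observing a value less than or equal to `q`) | ||
- `qbinom`: quantile function (the inverse of the cumulative distribution function) | ||
- `rbinom`: generates random numbers from the distribution | ||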
|
||
The first letter specifies whether we want the density or probability mass function (`d`), the cumulative distribution function (`p`), the quantile function (`q`), or random numbers (`r`). The rest of the name specifies the distribution. | ||
|
||
The arguments depend on the distribution we are looking at, but always include the parameters of that distribution. | ||
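As a quick orientation, here is a minimal sketch that calls all four binomial functions once. The parameter values (10 frogs, a light-frog fraction of 0.3) match the frog example in this episode; the variable names and the seed are arbitrary choices for illustration.

```r
n_frogs <- 10   # size: number of frogs in one sample
p_light <- 0.3  # prob: assumed true fraction of light frogs

set.seed(42)  # arbitrary seed so the random draws are reproducible

# d: probability of exactly 3 light frogs
prob_exactly_3 <- dbinom(x = 3, size = n_frogs, prob = p_light)

# p: probability of at most 3 light frogs
prob_at_most_3 <- pbinom(q = 3, size = n_frogs, prob = p_light)

# q: smallest count whose cumulative probability reaches 0.5 (the median)
median_count <- qbinom(p = 0.5, size = n_frogs, prob = p_light)

# r: five simulated samples of 10 frogs each, reported as light-frog counts
random_counts <- rbinom(n = 5, size = n_frogs, prob = p_light)
```

Note that in `rbinom` the argument `n` is the number of random values to generate, while `size` is the sample size of the binomial distribution.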
|
||
**Calculating probabilities:** Let's use the example where we catch 10 frogs and count how many of them are light-colored. | ||
|
||
![](../images/binomial_frogs.png) | ||
|
||
For known parameters, we can calculate the probability of counting exactly 5 light-colored frogs: | ||
|
||
```r | ||
n <- 10 # number of frogs we catch | ||
p <- 0.3 # true fraction of light frogs | ||
dbinom(x=5, size=n, prob=p) | ||
``` | ||
|
||
```{.output} | ||
[1] 0.1029193 | ||
``` | ||
|
||
We can also ask for the probability of catching at most 5 light frogs, or more than 5. In this case, we need the cumulative distribution function, whose name starts with `p`: | ||
|
||
|
||
```r | ||
pbinom(q=5, size=n, prob=p) # at most 5, i.e. P(X <= 5) | ||
``` | ||
|
||
```{.output} | ||
[1] 0.952651 | ||
``` | ||
|
||
```r | ||
pbinom(q=5, size=n, prob=p, lower.tail=FALSE) # more than 5, i.e. P(X > 5) | ||
``` | ||
|
||
```{.output} | ||
[1] 0.04734899 | ||
``` | ||
|
||
Catching more than 5 light frogs is a rare event. | ||
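Because the binomial distribution is discrete, "more than 5" and "at least 5" are different events. With `lower.tail=FALSE`, `pbinom` returns the probability of a value strictly greater than `q`, so "at least 5" needs `q=4`. The sketch below reuses the frog parameters from above and cross-checks the result by summing the individual probabilities; the variable names are chosen for illustration.

```r
n <- 10   # number of frogs we catch
p <- 0.3  # true fraction of light frogs

# P(X > 5): strictly more than 5 light frogs
more_than_5 <- pbinom(q = 5, size = n, prob = p, lower.tail = FALSE)

# P(X >= 5): at least 5 light frogs, i.e. strictly more than 4
at_least_5 <- pbinom(q = 4, size = n, prob = p, lower.tail = FALSE)

# Cross-check: add up the probabilities of exactly 5, 6, ..., 10 light frogs
at_least_5_sum <- sum(dbinom(x = 5:10, size = n, prob = p))

all.equal(at_least_5, at_least_5_sum)
```

The difference between the two tail probabilities is exactly the probability of catching 5 light frogs, `dbinom(x=5, size=n, prob=p)`.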
|
||
|
||
::::::::::::::::::::::::::::::::::::: challenge | ||
|
||
## Challenge 1: Disease prevalence | ||
|
||
There is a disease with a known prevalence of 4%. You have a group of 100 randomly selected persons. Use the above functions to calculate | ||
|
||
1. the probability of seeing exactly 7 persons with the disease. | ||
2. the probability of seeing *at least* 7 persons with the disease. | ||
|
||
|
||
:::::::::::::::::::::::: solution | ||
|
||
## Solution | ||
|
||
1. Exactly 7 persons: | ||
|
||
```r | ||
dbinom(x=7, size=100, prob=0.04) | ||
``` | ||
|
||
```{.output} | ||
[1] 0.05888027 | ||
``` | ||
|
||
2. At least 7 persons, which is the same as more than 6 persons: | ||
|
||
```r | ||
pbinom(q=6, size=100, prob=0.04, lower.tail=FALSE) | ||
``` | ||
|
||
```{.output} | ||
[1] 0.1063923 | ||
``` | ||
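As a sanity check, both answers can be approximated by simulation with `rbinom`. This is only a sketch: the number of simulated groups (100000) and the seed are arbitrary choices, and the simulated fractions will differ slightly from the exact probabilities.

```r
set.seed(123)  # arbitrary seed, only for reproducibility

# Simulate 100000 groups of 100 persons each, with a prevalence of 4%.
# Each entry is the number of persons with the disease in one group.
n_sim <- 100000
cases <- rbinom(n = n_sim, size = 100, prob = 0.04)

# Fraction of simulated groups with exactly 7 and with at least 7 cases
sim_exactly_7 <- mean(cases == 7)
sim_at_least_7 <- mean(cases >= 7)

# Exact values for comparison
exact_exactly_7 <- dbinom(x = 7, size = 100, prob = 0.04)
exact_at_least_7 <- pbinom(q = 6, size = 100, prob = 0.04, lower.tail = FALSE)

c(simulated = sim_exactly_7, exact = exact_exactly_7)
c(simulated = sim_at_least_7, exact = exact_at_least_7)
```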
|
||
|
||
::::::::::::::::::::::::::::::::: | ||
|
||
|
||
## Challenge 2: Discrete distributions | ||
|
||
What is the probability of an outcome of X=1.5 in a discrete distribution whose possible outcomes are whole-number counts? | ||
|
||
- 0 | ||
- 0.5 | ||
- 0.15 | ||
|
||
:::::::::::::::::::::::: solution | ||
|
||
The value $1.5$ is not a whole number, so it cannot occur in a discrete distribution whose outcomes are counts. Its probability is zero. | ||
|
||
::::::::::::::::::::::::::::::::: | ||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
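The answer to Challenge 2 can be tried out in R, using the binomial distribution as one example of a discrete distribution over counts. The parameters (10 frogs, light fraction 0.3) are reused from the frog example. `dbinom` warns about a non-integer `x` and returns 0, so the sketch wraps the call in `suppressWarnings`.

```r
# Probability of "1.5 light frogs" in a sample of 10:
# not a possible count, so the probability is 0.
# dbinom() also emits a warning about the non-integer x, silenced here.
p_noninteger <- suppressWarnings(dbinom(x = 1.5, size = 10, prob = 0.3))
p_noninteger

# All of the probability sits on the whole-number counts 0, 1, ..., 10.
support_total <- sum(dbinom(x = 0:10, size = 10, prob = 0.3))
support_total
```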
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -59,7 +59,10 @@ contact: '[email protected]' | |
|
||
# Order of episodes in your lesson | ||
episodes: | ||
- introduction.Rmd | ||
- 01-sampling.Rmd | ||
- 02-distributions.Rmd | ||
- 03-binomial.Rmd | ||
- 04-distributions-R.Rmd | ||
|
||
# Information for Learners | ||
learners: | ||
|
Binary file not shown.