Is it reasonable to use marker freq as the population frequency of the disease allele? #38

Open
changebio opened this issue Mar 1, 2022 · 12 comments

Comments

@changebio
Contributor

In paramlink2, we need to set dfreq when calling diseaseModel. For example:
dm = diseaseModel(chrom = "AD", penetrances = c(0,1,1), dfreq = 1e-5)

@gaow
Owner

gaow commented Mar 1, 2022

dfreq = population frequency of the disease allele? Then yes, the marker freq should be the population frequency of the markers.

@changebio
Contributor Author

In common-variant linkage analysis, how about variants with high frequency, such as 0.4113 or 0.3247?
[Image: screenshot]

@gaow
Owner

gaow commented Mar 3, 2022

Common-variant analysis is a different context, where your marker allele is not necessarily the (unobserved) disease allele ... It seems safe to set it to a low number. Take a look at Figure 1 of this paper, and at its discussion section about setting it to 0.01 for a dominant gene and 0.1 for a recessive one.

@gaow
Owner

gaow commented Mar 3, 2022

Also this review paper has a formula to estimate disease allele frequency using penetrance for disease, for phenocopies and prevalence.
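
As a rough illustration of that kind of calculation (not necessarily the exact formula in the review paper), a common textbook relation under Hardy-Weinberg equilibrium is K = f0(1-q)^2 + 2 f1 q(1-q) + f2 q^2, where K is the prevalence, q the disease allele frequency, f0 the phenocopy rate, and f1 and f2 the penetrances for one and two copies of the disease allele. The Python sketch below inverts it numerically. The function names are made up for this example, and it is plain arithmetic that does not depend on paramlink2.

```python
def prevalence(q, f0, f1, f2):
    """Population prevalence under Hardy-Weinberg for disease allele frequency q."""
    p = 1.0 - q
    return f0 * p * p + f1 * 2.0 * p * q + f2 * q * q


def disease_allele_freq(K, f0, f1, f2, iterations=200):
    """Solve prevalence(q) = K for q by bisection on [0, 1].

    Assumes f0 <= f1 <= f2 and f0 < f2, so prevalence is non-decreasing in q.
    """
    if not (f0 <= f1 <= f2 and f0 < f2):
        raise ValueError("expected penetrances with f0 <= f1 <= f2 and f0 < f2")
    if not (f0 <= K <= f2):
        raise ValueError("prevalence must lie between f0 and f2")
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if prevalence(mid, f0, f1, f2) < K:
            lo = mid  # prevalence too low: the allele must be more frequent
        else:
            hi = mid
    return (lo + hi) / 2.0


# Fully penetrant dominant model, no phenocopies: K = 1 - (1 - q)^2
q_dom = disease_allele_freq(K=0.0199, f0=0.0, f1=1.0, f2=1.0)  # close to 0.01
# Fully penetrant recessive model, no phenocopies: K = q^2
q_rec = disease_allele_freq(K=0.01, f0=0.0, f1=0.0, f2=1.0)    # close to 0.1
```

Under these idealized assumptions, the 0.01 (dominant) and 0.1 (recessive) settings mentioned earlier correspond to prevalences of about 2% and 1%, respectively.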

@changebio
Contributor Author

> Common-variant analysis is a different context, where your marker allele is not necessarily the (unobserved) disease allele ... It seems safe to set it to a low number. Take a look at Figure 1 of this paper, and at its discussion section about setting it to 0.01 for a dominant gene and 0.1 for a recessive one.

It looks like 0.01 is a common setting for the dominant model.

@changebio
Contributor Author

> Also this review paper has a formula to estimate disease allele frequency using penetrance for disease, for phenocopies and prevalence.

In Figure 1 of the review article, why are common variants (0.05) removed for family-based whole-genome sequencing analysis?

@changebio
Contributor Author

changebio commented Mar 7, 2022

For the APOE gene, I compared the models by using different frequencies. The conclusion is that the lower the frequency, the higher LOD score. The solid line without circle markers is the result from the actual frequency. (The x-axis runs from 0 to 0.45.)
[Image: plot of LOD scores for different frequencies]

@gaow
Owner

gaow commented Mar 7, 2022

> In Figure 1 of the review article, why are common variants (0.05) removed for family-based whole-genome sequencing analysis?

This is a typical protocol for filter based variant discovery for Mendelian diseases. The variants to be identified usually have large penetrance. If penetrance is high and disease variants are common, the disease will no longer be rare Mendelian. That's why these variants are removed in the beginning.
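
To make that argument concrete, here is a small back-of-the-envelope check in Python. It assumes Hardy-Weinberg proportions, no phenocopies, and a dominant allele; the function name is made up for illustration, and 0.05 is the filtering threshold mentioned in the question.

```python
def dominant_prevalence(q, penetrance=1.0):
    """Fraction of the population affected when carriers (one or two copies)
    are affected with the given penetrance and non-carriers never are."""
    carriers = 1.0 - (1.0 - q) ** 2  # Hardy-Weinberg carrier frequency
    return penetrance * carriers


rare = dominant_prevalence(1e-5)    # about 2 in 100,000
common = dominant_prevalence(0.05)  # about 0.0975, nearly 1 in 10
```

With a fully penetrant allele at frequency 0.05, close to 10% of people would be affected, which is not what a rare Mendelian disease looks like.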

@gaow
Owner

gaow commented Mar 7, 2022

> I compared the models by using different frequencies. The conclusion is that the lower the frequency, the higher LOD score.

Could you clarify what you meant by "frequency"? The "disease frequency" we discussed? What's on the Y-axis? Are they LOD scores?

@changebio
Contributor Author

> > I compared the models by using different frequencies. The conclusion is that the lower the frequency, the higher LOD score.
>
> Could you clarify what you meant by "frequency"? The "disease frequency" we discussed? What's on the Y-axis? Are they LOD scores?

The frequency means the disease frequency (dfreq). The Y-axis is the sum of LOD scores across the different families.
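
For readers unfamiliar with the summation: assuming the families are independent and share the same disease model, their likelihoods multiply, so their LOD scores add marker by marker. A minimal Python sketch with made-up family labels and LOD values (not results from this analysis):

```python
def sum_lod_by_marker(lod_by_family):
    """Add per-family LOD scores position by position."""
    lengths = {len(scores) for scores in lod_by_family.values()}
    if len(lengths) != 1:
        raise ValueError("every family needs a LOD score at every marker")
    return [sum(column) for column in zip(*lod_by_family.values())]


# Hypothetical LOD scores for three families at two markers.
toy_lods = {
    "family_A": [0.5, 1.25],
    "family_B": [1.0, -0.25],
    "family_C": [0.25, 0.5],
}
total = sum_lod_by_marker(toy_lods)  # [1.75, 1.5]
```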

@changebio
Contributor Author

> > In Figure 1 of the review article, why are common variants (0.05) removed for family-based whole-genome sequencing analysis?
>
> This is a typical protocol for filter based variant discovery for Mendelian diseases. The variants to be identified usually have large penetrance. If penetrance is high and disease variants are common, the disease will no longer be rare Mendelian. That's why these variants are removed in the beginning.

So is it unnecessary to do common-variant linkage analysis for family data? Should the analysis focus on rare variants?

@changebio
Contributor Author

The figure I showed is based on common variants (MAF > 0.05).
