-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME
19 lines (8 loc) · 852 Bytes
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
High level overview of functions for MatrixeQTL/glmnet
Prioritize correctness,modularity,and readability (in that order)
Processing samples
Matrix eQTL needs to be run on subsampled SNP and expression matrices. This is complicated due to the sampling with replacement, as MatrixEQTL returns an error when a matrix has duplicated columns. Possible solutions are to sample without replacement, or to create new names for the columns, and then rework which samples are which
Instead, we're simply going to sample without replacement.
To perform k-fold cross validation, we must first split the samples up into k groups.
This should be done by subprocesses to reduce memory load.
MEQTL Child processes will be passed the entire dataset, along with criteria for removing test data, and the rest of the parameters for that run of MatrixEQTL