Home
Welcome to the bcell_pipeline Wiki!
Some notes on running the pipeline.
pRESTO, and thus the pipeline as a whole, takes two read files and their corresponding primer files as input. Each primer file must match its read file; otherwise pRESTO will not work. Each run also needs a YAML file giving its metadata, and a unique YAML file is created and used for every run.
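The input requirements above can be checked before a run starts. The following is only a minimal sketch in Python, not part of the pipeline itself: the function name `check_run_inputs` is hypothetical, and it assumes one primer file per read file. It checks only that the expected files are present, not that the primers actually match the reads.

```python
from pathlib import Path


def check_run_inputs(reads, primers, metadata):
    """Return a list of problems found with the inputs for one run.

    Assumption (not stated in the wiki): one primer file per read file.
    An empty list means the basic file checks passed.
    """
    problems = []
    if len(reads) != 2:
        problems.append(f"expected 2 read files, got {len(reads)}")
    if len(primers) != len(reads):
        problems.append(
            f"expected {len(reads)} primer files, got {len(primers)}"
        )
    # Every input, including the per-run YAML metadata file, must exist.
    for path in [*reads, *primers, metadata]:
        if not Path(path).is_file():
            problems.append(f"missing file: {path}")
    return problems
```

An empty return value means only that the files exist in the expected numbers; a primer file that does not match its read file will still cause pRESTO to fail.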
pRESTO uses the default parameters for BuildConsensus. For MaskPrimers, we use MP_UIDLEN=8, MP_R1_MAXERR=0.3, MP_R2_MAXERR=0.3, CREGION_MAXLEN=106 and CREGION_MAXERR=0.25. These parameters must be adjusted to the size of the read files. Mismatched parameters will result in a failed run.
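One way to keep these settings in a single place is to hold them in a dictionary and render them as `KEY=VALUE` pairs. This is a sketch only: the helper names are hypothetical, and the checks assume that the `*_MAXERR` values are error rates between 0 and 1 and that the remaining values are positive integer lengths.

```python
# MaskPrimers settings quoted on this page.
MASK_PRIMERS_PARAMS = {
    "MP_UIDLEN": 8,
    "MP_R1_MAXERR": 0.3,
    "MP_R2_MAXERR": 0.3,
    "CREGION_MAXLEN": 106,
    "CREGION_MAXERR": 0.25,
}


def validate_params(params):
    """Raise ValueError if a setting looks implausible.

    Assumption: *_MAXERR values are fractions in [0, 1]; every other
    value is a positive integer length.
    """
    for name, value in params.items():
        if name.endswith("_MAXERR"):
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0 and 1, got {value}")
        elif not (isinstance(value, int) and value > 0):
            raise ValueError(f"{name} must be a positive integer, got {value}")


def render_params(params):
    """Validate the settings and join them as space-separated KEY=VALUE pairs."""
    validate_params(params)
    return " ".join(f"{name}={value}" for name, value in params.items())
```

Calling `render_params(MASK_PRIMERS_PARAMS)` returns the same parameter string that is listed above, so a changed value only has to be edited in one place.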
Change-O IgBLAST is next. This script does not have variable parameters that need modifying.
TIgGER and SHazaM are run next: TIgGER identifies novel V-gene alleles, and SHazaM outputs a tuned distance threshold for use by Change-O Clone. These scripts have no variable parameters besides the input file. Change-O Clone is run after them. Its only variable parameter is the distance threshold from SHazaM.
Note that a secondary pipeline is available that uses a default distance threshold (in this case 0.12) so that datasets can be compared. This pipeline is in bcellss1.rf, rather than the primary bcell.rf. For this pipeline you need a fastq file of your target sequences, and you must update the link to this file (line 17 in prestosshc.rf). The secondary pipeline also differs in that it processes only heavy chain sequences after the pRESTO step, and in that it examines all unique heavy chain sequences from the pRESTO output, not just those with 2 or more UMIs.
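Two of the differences between the primary and secondary pipelines, the distance threshold and the UMI filter, can be sketched in Python as follows. This illustrates the logic only and is not code taken from bcell.rf or bcellss1.rf. The function names are hypothetical, the header layout `ID|KEY=VALUE|...` follows the usual pRESTO annotation style, and the field that holds the UMI count (`DUPCOUNT` here) is an assumption that should be checked against the actual pRESTO output. The heavy-chain restriction is not covered.

```python
from typing import Optional

DEFAULT_THRESHOLD = 0.12  # fixed threshold used by the secondary pipeline


def parse_annotations(header: str) -> dict:
    """Split a pRESTO-style header 'ID|KEY=VALUE|...' into a dict."""
    fields = header.lstrip(">@").strip().split("|")
    annotations = {"ID": fields[0]}
    for field in fields[1:]:
        key, _, value = field.partition("=")
        annotations[key] = value
    return annotations


def choose_threshold(secondary: bool, tuned: Optional[float]) -> float:
    """Pick the distance threshold to pass to the clone step."""
    if secondary:
        return DEFAULT_THRESHOLD
    if tuned is None:
        raise ValueError("the primary pipeline needs the SHazaM threshold")
    return tuned


def keep_sequence(header: str, secondary: bool,
                  count_field: str = "DUPCOUNT", min_count: int = 2) -> bool:
    """Decide whether a unique sequence is carried forward.

    The secondary pipeline keeps every unique sequence; the primary one
    keeps only those whose count field (assumed name) is >= min_count.
    """
    if secondary:
        return True
    count = int(parse_annotations(header).get(count_field, 0))
    return count >= min_count
```

For example, `keep_sequence("SEQ1|CONSCOUNT=5|DUPCOUNT=1", secondary=False)` rejects the sequence, while the same header is kept when `secondary=True`.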
The germline reconstruction output by Change-O Clone can then be used for multiple R analyses. The R scripts in this pipeline, based on Alakazam and SHazaM scripts, are all included in the Alakazam step.