Please note: All sequences are written in the 5' to 3' direction.
Batch file (BEV input) columns:
name | identifier | amplicon_seq | guide_seq | w | wc | exclude_bp_from_left | exclude_bp_from_right | plot_window_size |
---|---|---|---|---|---|---|---|---|
This document walks through the amplicon_seq and guide_seq columns. For all other parameter descriptions, please see the Base Editor Validation Pipeline documentation on GitHub.
Since CRISPResso only uses the guide_seq input for naming files and does not use the guide_seq for alignment, for simplicity and consistency,
Example:
The reference sequence that CRISPResso outputs spans a set number of nucleotides (determined by the quantification window parameter) upstream and downstream of the input guide sequence. Because it is built around the guide, it is written in the same orientation (sense) as the guide.
Example:
The sequence that must be translated to obtain the amino acid sequence of a particular allele is not necessarily the same as the CRISPResso reference sequence. Enter the sequence to be translated in the translation_ref_seq column of the metadata input file for the BEV_allele_frequencies validation notebook, with any untranslated regions (if applicable) in lowercase.
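To show why the lowercase convention is useful, here is a minimal sketch that translates only the uppercase bases of a mixed-case sequence. It is not the notebook's implementation: `translate_uppercase` is a hypothetical helper, the input is a toy sequence, and the sketch assumes the reading frame starts at the first uppercase base (it ignores the frame, first_codon, and last_codon columns).

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_uppercase(ref_seq: str) -> str:
    """Translate the uppercase bases of ref_seq; lowercase bases are skipped."""
    coding = "".join(base for base in ref_seq if base.isupper())
    # Drop any trailing bases that do not complete a codon
    usable = len(coding) - len(coding) % 3
    codons = [coding[i:i + 3] for i in range(0, usable, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


# Toy input (not pipeline data): the lowercase prefix is untranslated
protein = translate_uppercase("ttccATGGAGGAGCCGCAG")
print(protein)  # MEEPQ
```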
In this case, the forward DNA strand is being translated, so the reference sequence for translation is the reverse complement of the reference sequence that CRISPResso outputs. Therefore, in this case the reference sequence for the notebook metadata file is TTCCTCTTGCAGCAGCCAGACTGCCTTCCGGGTCACTGCCATGGAGGAGCCGCAGTCAGATCCTAGCGTCGAGCCCCCTC.
The rev_com parameter in the notebook input file determines which strand will be translated. The parameter is defined by the following:
In this case, since the guide sequence and CRISPResso reference sequence are on the reverse DNA strand, while the strand being translated is the forward DNA strand, rev_com is True.
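The strand reasoning for this example can be checked with a short sketch. It assumes rev_com is True exactly when the guide lies on the strand opposite to the one being translated, which matches the example here but is an assumption rather than the notebook's stated definition. `infer_rev_com` is a hypothetical helper, not part of the pipeline.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T, either case)."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def infer_rev_com(guide: str, translation_ref_seq: str) -> bool:
    """Guess rev_com from which strand of the translation reference holds the guide."""
    ref = translation_ref_seq.upper()
    if guide in ref:
        return False  # guide and translated strand have the same orientation
    if reverse_complement(guide) in ref:
        return True  # guide lies on the opposite strand
    raise ValueError("guide not found on either strand of the reference")


# Sequences taken from the example in this document
guide = "GCTCCTCCATGGCAGTGACC"
translation_ref_seq = (
    "TTCCTCTTGCAGCAGCCAGACTGCCTTCCGGGTCACTGCC"
    "ATGGAGGAGCCGCAGTCAGATCCTAGCGTCGAGCCCCCTC"
)

# The CRISPResso reference is the reverse complement of the translation reference
crispresso_ref = reverse_complement(translation_ref_seq)
rev_com = infer_rev_com(guide, translation_ref_seq)
print(guide in crispresso_ref, rev_com)  # True True
```

The guide appears in the CRISPResso-oriented sequence but only as its reverse complement in the translation reference, which is why rev_com is True here.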
Here is what the metadata file for the BEV_allele_frequencies notebook would look like for this example. Explanations for the rest of the columns can be found in the BEV_allele_frequencies notebook:
sg | sgRNA_sequence | translation_ref_seq | BEV_start | BEV_end | primer | frame | first_codon | last_codon | rev_com | BEV_ref | BEV_test |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | GCTCCTCCATGGCAGTGACC | [TTCCTCTTGCAGCAGCCAGACTGCCTTCCGGGTCACTGCC]ATGGAGGAGCCGCAGTCAGATCCTAGCGTCGAGCCCCCTC | 417 | 426 | F3_R2 | 1 | ATG | CTG | True | 417;418 | 425;426 |