@DocumentedFeature public class MarkDuplicates extends AbstractMarkDuplicatesCommandLineProgram
Modifier and Type | Class and Description |
---|---|
static class | MarkDuplicates.DuplicateTaggingPolicy - Enum used to control how duplicates are flagged in the DT optional tag on each read. |
static class | MarkDuplicates.DuplicateType - Enum for the possible values that a duplicate read can be tagged with in the DT attribute. |
AbstractMarkDuplicatesCommandLineProgram.SamHeaderAndIterator
Modifier and Type | Field and Description |
---|---|
java.lang.String | BARCODE_TAG |
static java.lang.String | DUPLICATE_SET_INDEX_TAG - The attribute in the SAM/BAM file used to store which read was selected as representative out of a duplicate set. |
static java.lang.String | DUPLICATE_SET_SIZE_TAG - The attribute in the SAM/BAM file used to store the size of a duplicate set. |
static java.lang.String | DUPLICATE_TYPE_LIBRARY - The duplicate type tag value for duplicate type: library. |
static java.lang.String | DUPLICATE_TYPE_SEQUENCING - The duplicate type tag value for duplicate type: sequencing (optical & pad-hopping, or "co-localized"). |
static java.lang.String | DUPLICATE_TYPE_TAG - The optional attribute in SAM/BAM files used to store the duplicate type. |
protected LibraryIdGenerator | libraryIdGenerator |
int | MAX_FILE_HANDLES_FOR_READ_ENDS_MAP |
int | MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP - If the SAM file contains more than this many sequences, don't spill to disk, because there will not be enough file handles. |
java.lang.String | READ_ONE_BARCODE_TAG |
java.lang.String | READ_TWO_BARCODE_TAG |
boolean | REMOVE_SEQUENCING_DUPLICATES |
double | SORTING_COLLECTION_SIZE_RATIO |
boolean | TAG_DUPLICATE_SET_MEMBERS |
MarkDuplicates.DuplicateTaggingPolicy | TAGGING_POLICY |
ASSUME_SORT_ORDER, ASSUME_SORTED, COMMENT, DUPLICATE_SCORING_STRATEGY, INPUT, METRICS_FILE, OUTPUT, pgIdsSeen, PROGRAM_GROUP_COMMAND_LINE, PROGRAM_GROUP_NAME, PROGRAM_GROUP_VERSION, PROGRAM_RECORD_ID, REMOVE_DUPLICATES
LOG, MAX_OPTICAL_DUPLICATE_SET_SIZE, OPTICAL_DUPLICATE_PIXEL_DISTANCE, opticalDuplicateFinder, READ_NAME_REGEX
COMPRESSION_LEVEL, CREATE_INDEX, CREATE_MD5_FILE, GA4GH_CLIENT_SECRETS, MAX_RECORDS_IN_RAM, QUIET, REFERENCE_SEQUENCE, referenceSequence, specialArgumentsCollection, TMP_DIR, USE_JDK_DEFLATER, USE_JDK_INFLATER, VALIDATION_STRINGENCY, VERBOSITY
Constructor and Description |
---|
MarkDuplicates() |
Modifier and Type | Method and Description |
---|---|
protected int | doWork() - Main work method. |
static void | main(java.lang.String[] args) - Stock main method. |
finalizeAndWriteMetrics, getChainedPgIds, openInputs, trackOpticalDuplicates
customCommandLineValidation, setupOpticalDuplicateFinder
getCommandLine, getCommandLineParser, getDefaultHeaders, getFaqLink, getMetricsFile, getStandardUsagePreamble, getStandardUsagePreamble, getVersion, hasWebDocumentation, instanceMain, instanceMainWithExit, makeReferenceArgumentCollection, parseArgs, requiresReference, setDefaultHeaders, useLegacyParser
public static final java.lang.String DUPLICATE_TYPE_TAG
public static final java.lang.String DUPLICATE_TYPE_LIBRARY
public static final java.lang.String DUPLICATE_TYPE_SEQUENCING
public static final java.lang.String DUPLICATE_SET_INDEX_TAG
public static final java.lang.String DUPLICATE_SET_SIZE_TAG
@Argument(shortName="MAX_SEQS", doc="This option is obsolete. ReadEnds will always be spilled to disk.") public int MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP
@Argument(shortName="MAX_FILE_HANDLES", doc="Maximum number of file handles to keep open when spilling read ends to disk. Set this number a little lower than the per-process maximum number of files that may be open. This number can be found by executing the 'ulimit -n' command on a Unix system.") public int MAX_FILE_HANDLES_FOR_READ_ENDS_MAP
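The advice to stay "a little lower" than the per-process limit can be sketched as a small helper. This is only an illustration: the class name `FileHandleBudget`, the method names, and the reserve of 100 handles are assumptions made for this example and are not part of MarkDuplicates. The `KEY=VALUE` argument form is also an assumption about how the tool is invoked.

```java
final class FileHandleBudget {
    // Hypothetical safety margin: handles left free for the JVM, the input
    // and output files, and anything else the process opens.
    static final int RESERVED_HANDLES = 100;

    /** Returns a handle budget slightly below the per-process limit ('ulimit -n'). */
    static int budgetFor(int perProcessLimit) {
        if (perProcessLimit <= RESERVED_HANDLES) {
            throw new IllegalArgumentException(
                "Limit " + perProcessLimit + " leaves no room below the reserve of " + RESERVED_HANDLES);
        }
        return perProcessLimit - RESERVED_HANDLES;
    }

    /** Formats the budget as a KEY=VALUE argument (assumed invocation style). */
    static String toArgument(int perProcessLimit) {
        return "MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=" + budgetFor(perProcessLimit);
    }

    public static void main(String[] args) {
        // Pass the output of 'ulimit -n' as the first argument; 1024 is a placeholder.
        int limit = args.length > 0 ? Integer.parseInt(args[0]) : 1024;
        System.out.println(toArgument(limit));
    }
}
```

The right margin depends on what else the process keeps open, so the reserve should be treated as a tunable value, not a recommendation.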
@Argument(doc="This number, plus the maximum RAM available to the JVM, determines the memory footprint used by some of the sorting collections. If you are running out of memory, try reducing this number.") public double SORTING_COLLECTION_SIZE_RATIO
@Argument(doc="Barcode SAM tag (ex. BC for 10X Genomics)", optional=true) public java.lang.String BARCODE_TAG
@Argument(doc="Read one barcode SAM tag (ex. BX for 10X Genomics)", optional=true) public java.lang.String READ_ONE_BARCODE_TAG
@Argument(doc="Read two barcode SAM tag (ex. BX for 10X Genomics)", optional=true) public java.lang.String READ_TWO_BARCODE_TAG
@Argument(doc="If a read appears in a duplicate set, add two tags. The first tag, DUPLICATE_SET_SIZE_TAG (DS), indicates the size of the duplicate set. The smallest possible DS value is 2 which occurs when two reads map to the same portion of the reference only one of which is marked as duplicate. The second tag, DUPLICATE_SET_INDEX_TAG (DI), represents a unique identifier for the duplicate set to which the record belongs. This identifier is the index-in-file of the representative read that was selected out of the duplicate set.", optional=true) public boolean TAG_DUPLICATE_SET_MEMBERS
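A minimal sketch of how the two tags relate to one duplicate set, following the description above: every member of the set carries the set size (DS) and the index-in-file of the representative read (DI). The class `DuplicateSetTags`, its nested `Tags` type, and the method `tag` are hypothetical names for this example. They model the documented semantics only and do not reproduce the MarkDuplicates implementation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DuplicateSetTags {
    /** The two tag values attached to one record (hypothetical holder type). */
    static final class Tags {
        final int ds; // size of the duplicate set
        final int di; // index-in-file of the representative read
        Tags(int ds, int di) { this.ds = ds; this.di = di; }
    }

    /**
     * Maps each member's index-in-file to its DS/DI values.
     * A duplicate set has at least two members, so DS is never below 2.
     */
    static Map<Integer, Tags> tag(List<Integer> memberIndices, int representativeIndex) {
        if (memberIndices.size() < 2) {
            throw new IllegalArgumentException("A duplicate set needs at least two reads");
        }
        if (!memberIndices.contains(representativeIndex)) {
            throw new IllegalArgumentException("Representative must be a member of the set");
        }
        Map<Integer, Tags> tagged = new LinkedHashMap<>();
        for (int index : memberIndices) {
            tagged.put(index, new Tags(memberIndices.size(), representativeIndex));
        }
        return tagged;
    }

    public static void main(String[] args) {
        // Two reads at file positions 7 and 12; the read at 7 is the representative.
        Map<Integer, Tags> tagged = tag(List.of(7, 12), 7);
        tagged.forEach((index, t) ->
            System.out.println("record " + index + ": DS=" + t.ds + " DI=" + t.di));
    }
}
```

In this model the representative read carries the same DS and DI values as the reads marked as duplicates, which is how a set of two reads with a single marked duplicate still reports DS = 2.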
@Argument(doc="If true, remove 'optical' duplicates and other duplicates that appear to have arisen from the sequencing process instead of the library preparation process, even if REMOVE_DUPLICATES is false. If REMOVE_DUPLICATES is true, all duplicates are removed and this option is ignored.") public boolean REMOVE_SEQUENCING_DUPLICATES
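The interaction between REMOVE_DUPLICATES and REMOVE_SEQUENCING_DUPLICATES described above reduces to a small decision rule. The sketch below encodes only what the option text states. The class `DuplicateRemovalRule` and the method `shouldDrop` are hypothetical names, and the two per-read booleans are assumed inputs, not fields of any real record type.

```java
final class DuplicateRemovalRule {
    /**
     * Decides whether a read is dropped from the output.
     *
     * @param isDuplicate                the read was marked as a duplicate
     * @param isSequencingDuplicate      the duplicate arose from the sequencing process
     * @param removeDuplicates           value of REMOVE_DUPLICATES
     * @param removeSequencingDuplicates value of REMOVE_SEQUENCING_DUPLICATES
     */
    static boolean shouldDrop(boolean isDuplicate,
                              boolean isSequencingDuplicate,
                              boolean removeDuplicates,
                              boolean removeSequencingDuplicates) {
        if (!isDuplicate) {
            return false; // non-duplicates are always kept
        }
        if (removeDuplicates) {
            return true; // all duplicates go; the sequencing option is ignored
        }
        // Otherwise only sequencing duplicates are removed, and only on request.
        return removeSequencingDuplicates && isSequencingDuplicate;
    }

    public static void main(String[] args) {
        // A library duplicate survives when only sequencing duplicates are removed.
        System.out.println(shouldDrop(true, false, false, true));
        // A sequencing duplicate is dropped under the same settings.
        System.out.println(shouldDrop(true, true, false, true));
    }
}
```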
@Argument(doc="Determines how duplicate types are recorded in the DT optional attribute.") public MarkDuplicates.DuplicateTaggingPolicy TAGGING_POLICY
protected LibraryIdGenerator libraryIdGenerator
public static void main(java.lang.String[] args)
protected int doWork()
Specified by: doWork in class CommandLineProgram