An even better duplicate-marking algorithm that handles all cases, including clipped
and gapped alignments.
This tool differs from MarkDuplicates in that it may break ties differently. Furthermore,
because it is a one-pass algorithm, it cannot know in advance which program records in the
file should be chained. It can therefore only examine the header to infer the program group
records that have no associated previous program group record. If a read is encountered
without a program record, or with one that is not among those inferred from the header, it
will not be updated.
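
To illustrate the header inspection described above, here is a minimal sketch, not the
tool's actual implementation. It assumes the header is available as raw SAM text lines and
reads "no associated previous program group record" literally, as an @PG line that carries
no PP tag. The function name pg_records_without_previous and the sample header are
hypothetical.

```python
def pg_records_without_previous(header_lines):
    """Return the IDs of @PG records that carry no PP (previous program) tag."""
    ids = []
    for line in header_lines:
        if not line.startswith("@PG"):
            continue
        # Fields after the record type are tab-separated TAG:VALUE pairs.
        fields = dict(f.split(":", 1) for f in line.rstrip("\n").split("\t")[1:])
        if "ID" in fields and "PP" not in fields:
            ids.append(fields["ID"])
    return ids


# Hypothetical header: "sorter" chains back to "bwa" via PP, "bwa" has no PP.
header = [
    "@HD\tVN:1.6\tSO:coordinate",
    "@PG\tID:bwa\tPN:bwa",
    "@PG\tID:sorter\tPN:samtools\tPP:bwa",
]
print(pg_records_without_previous(header))
```

Because only the header is consulted, any program record that a read refers to but that
does not appear there cannot be taken into account, which is the limitation noted above.
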
This tool will also not work with alignments that have large gaps or skips, such as those
from RNA-seq data. Duplicate marking requires buffering small genomic windows to ensure
correctness, and large skips (e.g. skipped introns) in the alignment records would force
that window to become very large, exhausting memory.
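
The effect of skips on the window size can be seen by computing how much of the reference
a single alignment spans. The sketch below is illustrative only and is not taken from the
tool. It relies on the SAM convention that the CIGAR operators M, D, N, = and X consume
reference bases; the function name reference_span and the example CIGAR strings are
hypothetical.

```python
import re

# CIGAR operators that advance the position on the reference (per the SAM format).
REF_CONSUMING = set("MDN=X")
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_span(cigar):
    """Return the number of reference bases covered by an alignment's CIGAR."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)


# A 100-base read aligned without skips covers 100 reference bases.
print(reference_span("100M"))
# The same read split across a 100,000-base intron covers 100,100 reference bases.
print(reference_span("50M100000N50M"))
```

A buffer that must hold every record overlapping such an alignment grows with the span
rather than with the read length, which is why spliced data is problematic here.
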