pub struct LdBG {
pub name: String,
pub kmer_size: usize,
pub kmers: HashMap<Vec<u8>, Record>,
pub scores: HashMap<Vec<u8>, f32>,
pub links: HashMap<Vec<u8>, HashMap<Link, u16>>,
pub sources: HashMap<Vec<u8>, Vec<usize>>,
pub noise: HashSet<Vec<u8>>,
pub verbose: bool,
}
Represents a linked de Bruijn graph with a k-mer size specified at construction time.
Fields§
§name: String
§kmer_size: usize
§kmers: HashMap<Vec<u8>, Record>
§scores: HashMap<Vec<u8>, f32>
§links: HashMap<Vec<u8>, HashMap<Link, u16>>
§sources: HashMap<Vec<u8>, Vec<usize>>
§noise: HashSet<Vec<u8>>
§verbose: bool
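The maps above are keyed by k-mer bytes. As a rough illustration of how such a table might be filled, the sketch below counts canonical k-mers with the standard library only. The helper names (`reverse_complement`, `canonical`, `count_kmers`) are hypothetical, and the "lexicographically smaller strand" convention is an assumption; it may differ from what crate::utils::canonicalize_kmer actually does.

```rust
use std::collections::HashMap;

/// Reverse-complement a DNA sequence (A<->T, C<->G); other bytes are kept as-is.
fn reverse_complement(seq: &[u8]) -> Vec<u8> {
    seq.iter()
        .rev()
        .map(|&b| match b {
            b'A' => b'T',
            b'C' => b'G',
            b'G' => b'C',
            b'T' => b'A',
            other => other,
        })
        .collect()
}

/// Hypothetical canonical form: the lexicographically smaller of a k-mer
/// and its reverse complement (an assumed convention).
fn canonical(kmer: &[u8]) -> Vec<u8> {
    let rc = reverse_complement(kmer);
    if rc.as_slice() < kmer {
        rc
    } else {
        kmer.to_vec()
    }
}

/// Count canonical k-mers across a set of sequences.
fn count_kmers(seqs: &[&[u8]], k: usize) -> HashMap<Vec<u8>, u16> {
    let mut counts = HashMap::new();
    if k == 0 {
        return counts; // `windows(0)` would panic, so bail out early
    }
    for seq in seqs {
        for kmer in seq.windows(k) {
            let entry = counts.entry(canonical(kmer)).or_insert(0u16);
            *entry = entry.saturating_add(1);
        }
    }
    counts
}
```

With this convention, the two 3-mers of ACGT (ACG and CGT) are reverse complements of each other and collapse into a single key.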
Implementations§
impl LdBG
pub fn new(name: String, kmer_size: usize) -> Self
pub fn verbose(self, verbose: bool) -> Self
pub fn remove(&mut self, kmer: &[u8]) -> Option<Record>
pub fn score_kmers(self, model_path: &PathBuf) -> Self
pub fn infer_edges(&mut self)
pub fn correct_seqs(&self, seqs: &Vec<Vec<u8>>) -> Vec<Vec<u8>>
pub fn correct_seq(&self, g: &DiGraph<String, f32>, seq: &[u8]) -> Vec<u8>
pub fn correct_seq_old(&self, seq: &[u8]) -> Vec<Vec<u8>>
pub fn assemble_all(&self) -> Vec<Vec<u8>>
pub fn assemble_at_bubbles(&self) -> Vec<Vec<u8>>
Assemble contigs at superbubbles in the graph.
This function traverses all k-mers in the graph, identifies superbubbles, and assembles contigs from unique nodes within these superbubbles.
§Returns
A vector of contigs, where each contig is represented as a vector of bytes.
§Panics
This function will panic if the node weight for a unique node cannot be retrieved, that is, if g.node_weight(*unique_node) returns None and the result is unwrapped.
pub fn clean(self, threshold: f32) -> Self
pub fn clean_color_specific_paths(self, color: usize, min_score: f32) -> Self
Clean color-specific paths from the graph based on a minimum score threshold.
This function removes paths that are specific to a given color and have a score below the specified threshold.
§Arguments
- color - The color index to filter paths by.
- min_score - The minimum score threshold for paths to be retained.
§Returns
A new instance of the graph with the specified paths removed.
§Panics
This function will panic if the assemble_forward or assemble_backward methods fail to assemble a contig.
pub fn clean_branches(self, min_score: f32) -> Self
This function removes tips from the graph. A tip is defined as a k-mer with an in-degree or out-degree of 0.
§Arguments
- max_tip_length - The maximum length of a tip to remove.
- min_score - The minimum score threshold for k-mers to be considered part of a tip.
§Returns
The modified de Bruijn graph with tips removed.
§Panics
- This function could panic if self.kmers.get(cn_kmer) returns None, meaning the k-mer is not present in the graph, because the result is unwrapped before in_degree() is called.
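To make the degree-based definition of a tip concrete, here is a minimal sketch that lists nodes with an in-degree or out-degree of 0 in a plain edge list. The function name `find_tip_ends` and the edge-list representation are assumptions for illustration; the method itself works on the graph's own k-mer records.

```rust
use std::collections::BTreeMap;

/// Return the nodes whose in-degree or out-degree is 0, in sorted order.
fn find_tip_ends<'a>(edges: &[(&'a str, &'a str)]) -> Vec<&'a str> {
    // node -> (in_degree, out_degree)
    let mut degrees: BTreeMap<&str, (usize, usize)> = BTreeMap::new();
    for &(from, to) in edges {
        degrees.entry(from).or_insert((0, 0)).1 += 1;
        degrees.entry(to).or_insert((0, 0)).0 += 1;
    }
    degrees
        .into_iter()
        .filter(|&(_, (in_deg, out_deg))| in_deg == 0 || out_deg == 0)
        .map(|(node, _)| node)
        .collect()
}
```

A `BTreeMap` is used only to keep the output order deterministic.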
pub fn clean_tangles(self, color: usize, limit: usize, min_score: f32) -> Self
Clean tangles from the de Bruijn graph.
This function identifies and removes tangles from the de Bruijn graph based on a specified color, traversal limit, and minimum score threshold. A tangle is defined as a region in the graph where the in-degree and out-degree of a k-mer sum to 4 or more, indicating a complex branching structure.
§Arguments
- color - The color to filter k-mers by.
- limit - The maximum number of nodes to traverse before giving up.
- min_score - The minimum score threshold for k-mers to be considered part of a tangle.
§Returns
The modified de Bruijn graph with tangles removed.
§Panics
This function could panic if:
- g.node_weight(node) returns None, meaning the node does not have an associated weight.
- crate::utils::canonicalize_kmer(current_kmer) encounters an unexpected input that it cannot process.
- crate::utils::canonicalize_kmer(kmer) encounters an unexpected input that it cannot process.
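The tangle criterion (in-degree plus out-degree of 4 or more) can be sketched on a plain edge list as follows. This only illustrates the degree test; the name `find_tangle_nodes` is hypothetical, and the color filter, traversal limit, and score threshold of the real method are not modeled.

```rust
use std::collections::BTreeMap;

/// Return nodes whose in-degree plus out-degree is at least 4, in sorted order.
fn find_tangle_nodes<'a>(edges: &[(&'a str, &'a str)]) -> Vec<&'a str> {
    let mut total_degree: BTreeMap<&str, usize> = BTreeMap::new();
    for &(from, to) in edges {
        // Each edge adds one to the out-degree of `from` and the in-degree of `to`.
        *total_degree.entry(from).or_insert(0) += 1;
        *total_degree.entry(to).or_insert(0) += 1;
    }
    total_degree
        .into_iter()
        .filter(|&(_, degree)| degree >= 4)
        .map(|(node, _)| node)
        .collect()
}
```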
pub fn clean_hairballs(self) -> Self
pub fn clean_tips(self, limit: usize, min_score: f32) -> Self
This method will remove tips that have a score below the specified minimum score. A tip is defined as a region of the graph where there is only one path from the source to the sink.
§Arguments
- limit - The maximum number of nodes to traverse before giving up.
- min_score - The minimum score for a tip to be kept.
§Returns
A new LdBG with the specified tips removed.
§Panics
If self.kmers.get(cn_kmer) returns None, the call to unwrap() will cause a panic.
pub fn clean_superbubbles(self, color: usize, min_score: f32) -> Self
This method will remove bubbles that have a score below the specified minimum score. A bubble is defined as a region of the graph where there are two paths from the same source to the same sink.
§Arguments
- min_score - The minimum score for a bubble to be kept.
§Returns
A new LdBG with the specified bubbles removed.
§Panics
- The result of g.node_weight(*node) is unwrapped. If the node does not exist in the graph, this will cause a panic.
- The result of self.scores.get(&cn_kmer) is resolved with unwrap_or. If the canonical k-mer is not found in the scores, the default value 1.0 is used instead of panicking.
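The non-panicking lookup with a default of 1.0 can be sketched as below, together with one possible way to compare the two branches of a bubble. Only the 1.0 default comes from the text above; the mean-score rule and the names `mean_path_score` and `branch_to_remove` are assumptions, not the method's confirmed logic.

```rust
use std::collections::HashMap;

/// Mean score of a path; k-mers missing from `scores` default to 1.0.
fn mean_path_score(path: &[&[u8]], scores: &HashMap<Vec<u8>, f32>) -> f32 {
    if path.is_empty() {
        return 1.0;
    }
    let total: f32 = path
        .iter()
        .map(|kmer| scores.get(*kmer).copied().unwrap_or(1.0))
        .sum();
    total / path.len() as f32
}

/// Index (0 or 1) of the weaker branch if its mean score is below `min_score`.
/// The comparison rule is an assumed one, chosen for illustration.
fn branch_to_remove(
    a: &[&[u8]],
    b: &[&[u8]],
    scores: &HashMap<Vec<u8>, f32>,
    min_score: f32,
) -> Option<usize> {
    let (score_a, score_b) = (mean_path_score(a, scores), mean_path_score(b, scores));
    let (index, weakest) = if score_a <= score_b { (0, score_a) } else { (1, score_b) };
    if weakest < min_score {
        Some(index)
    } else {
        None
    }
}
```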
pub fn clean_contigs(self, min_contig_length: usize) -> Self
This method will remove contigs that are shorter than the specified minimum length and that are not connected to the rest of the graph.
§Arguments
- min_contig_length - The minimum length of a contig to keep.
§Returns
A new LdBG with the specified contigs removed.
§Panics
This method will panic if the k-mer size is not set.
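The removal rule described above (drop a contig only when it is both too short and disconnected) can be written as a simple filter. This is a sketch under assumptions: `ContigInfo`, `contig_len`, and `retain_contigs` are hypothetical names, and connectivity is reduced to a neighbor count.

```rust
/// Hypothetical summary of one contig: its bases and how many other contigs it touches.
struct ContigInfo {
    seq: Vec<u8>,
    neighbors: usize,
}

/// Length in bases of a contig spelled by `num_kmers` consecutive k-mers of size `k`.
fn contig_len(num_kmers: usize, k: usize) -> usize {
    if num_kmers == 0 {
        0
    } else {
        num_kmers + k - 1
    }
}

/// Keep contigs that are long enough or that are connected to something else.
fn retain_contigs(contigs: Vec<ContigInfo>, min_contig_length: usize) -> Vec<ContigInfo> {
    contigs
        .into_iter()
        .filter(|c| c.seq.len() >= min_contig_length || c.neighbors > 0)
        .collect()
}
```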
pub fn traverse_kmers_until_condition<F>(&self, start_kmer: &[u8], color: usize, limit: usize, stopping_condition: F) -> DiGraph<String, f32>
pub fn traverse_kmers(&self, start_kmer: &[u8]) -> DiGraph<String, f32>
Traverse kmers starting from a given kmer and build a graph.
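As an illustration of this kind of traversal, the sketch below walks a set of k-mers breadth-first from a start k-mer, following (k-1)-base overlaps in both directions and recording each directed edge once. It is a simplified, single-stranded model using only the standard library; the names `successors`, `predecessors`, and `traverse` are hypothetical, and the real method returns a DiGraph<String, f32> rather than an edge set.

```rust
use std::collections::{BTreeSet, HashSet, VecDeque};

const BASES: [u8; 4] = *b"ACGT";

/// K-mers in `kmers` that extend `kmer` by one base on the right.
fn successors(kmers: &HashSet<Vec<u8>>, kmer: &[u8]) -> Vec<Vec<u8>> {
    if kmer.is_empty() {
        return Vec::new();
    }
    BASES
        .iter()
        .filter_map(|&base| {
            let mut next = kmer[1..].to_vec();
            next.push(base);
            if kmers.contains(&next) { Some(next) } else { None }
        })
        .collect()
}

/// K-mers in `kmers` that extend `kmer` by one base on the left.
fn predecessors(kmers: &HashSet<Vec<u8>>, kmer: &[u8]) -> Vec<Vec<u8>> {
    if kmer.is_empty() {
        return Vec::new();
    }
    BASES
        .iter()
        .filter_map(|&base| {
            let mut prev = vec![base];
            prev.extend_from_slice(&kmer[..kmer.len() - 1]);
            if kmers.contains(&prev) { Some(prev) } else { None }
        })
        .collect()
}

/// Breadth-first walk from `start`, collecting every directed edge reached.
fn traverse(kmers: &HashSet<Vec<u8>>, start: &[u8]) -> BTreeSet<(Vec<u8>, Vec<u8>)> {
    let mut edges = BTreeSet::new();
    let mut visited = HashSet::new();
    let mut queue = VecDeque::new();
    if kmers.contains(start) {
        visited.insert(start.to_vec());
        queue.push_back(start.to_vec());
    }
    while let Some(current) = queue.pop_front() {
        for next in successors(kmers, &current) {
            edges.insert((current.clone(), next.clone()));
            // `insert` returns true only the first time a k-mer is seen.
            if visited.insert(next.clone()) {
                queue.push_back(next);
            }
        }
        for prev in predecessors(kmers, &current) {
            edges.insert((prev.clone(), current.clone()));
            if visited.insert(prev.clone()) {
                queue.push_back(prev);
            }
        }
    }
    edges
}
```

The visited set ensures each k-mer is expanded once, so the walk terminates even when the graph contains cycles.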
pub fn traverse_contigs(&self, start_kmer: &[u8]) -> DiGraph<String, f32>
The traverse_contigs function traverses kmers starting from a given kmer and builds a directed graph of contigs. It marks all kmers in the start contig as visited, then traverses forward and backward to build the graph, ensuring that each kmer is only visited once. The function returns the constructed graph.
§Arguments
- start_kmer - A byte slice representing the start kmer.
§Returns
A directed graph of contigs.
§Panics
Unwrapping Option values:
- If graph.node_weight(node) returns None, the call to unwrap() will panic.
- If self.last_kmer(this_contig) or self.first_kmer(this_contig) returns None, the call to unwrap() will panic.
- If visited.get(&canonical_kmer) returns None, the call to unwrap() will panic.
Indexing operations:
- If contig.windows(self.kmer_size) is called with a kmer_size of 0, it will panic. A kmer_size larger than the length of contig does not panic; the iterator simply yields no windows.
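In the standard library, slice::windows panics only when the window size is 0; a window size larger than the slice yields an empty iterator. The sketch below shows a guarded way to enumerate the k-mers of a contig. The helper name `kmers_of` is hypothetical and not part of this crate.

```rust
/// Collect the k-mers of `contig`, returning an empty list instead of
/// panicking when `k` is 0.
fn kmers_of(contig: &[u8], k: usize) -> Vec<&[u8]> {
    if k == 0 {
        return Vec::new(); // `windows(0)` would panic
    }
    // If `k > contig.len()`, `windows` yields nothing and the result is empty.
    contig.windows(k).collect()
}
```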
pub fn traverse_all_kmers(&self) -> DiGraph<String, f32>
Traverse all kmers in the graph and return a new graph with all kmers merged. This function is useful for collapsing kmers that are separated by bubbles. The new graph will contain all kmers as nodes and edges between kmers as weights.
§Returns
A new graph with all kmers merged.
§Panics
Panics if the graph is not a directed graph.
pub fn traverse_all_contigs(&self) -> DiGraph<String, f32>
Traverse all contigs in the graph and return a new graph with all contigs merged. This function is useful for collapsing contigs that are separated by bubbles. The new graph will contain all contigs as nodes and edges between contigs as weights.
§Returns
A new graph with all contigs merged.
§Panics
Panics if the graph is not a directed graph.
Trait Implementations§
Auto Trait Implementations§
impl Freeze for LdBG
impl RefUnwindSafe for LdBG
impl Send for LdBG
impl Sync for LdBG
impl Unpin for LdBG
impl UnwindSafe for LdBG
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<SS, SP> SupersetOf<SS> for SP where SS: SubsetOf<SP>
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.