Module bio_types::annot

Data types for positions and regions on named sequences (e.g., chromosomes), useful for annotating features in a genome. For example, these data types can represent that TMA22 lies on chromosome X at positions 461,829-462,426 on the forward strand. They also support coordinate math on these annotations, e.g., that position chrX:461,839 is at +10 within TMA22 and, conversely, that +10 within TMA22 is chrX:461,839.

This module provides three concrete data types to represent a single position (Pos), a contiguous region (Contig), or a “spliced” region (Spliced) consisting of one or more exons separated by introns. All three data types implement a location trait Loc.

These data types are generic over the data type used to “name” the annotated reference sequence (e.g., the chromosome name). It’s possible to use an owned String, an interned Rc<String>, or an integer sequence identifier like the “target id” field in a BAM file.

These data types are also generic over the kind of strand information in the annotation. This allows annotations with required strand annotation (ReqStrand), optional strand annotation (Strand), or no strand annotation (NoStrand).
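To make the two kinds of genericity concrete, here is a minimal, self-contained sketch. The `Region` struct and the `Required`, `Optional`, and `Unstranded` types are hypothetical stand-ins invented for illustration; they are not the crate's own definitions, and the integer identifier 23 is an arbitrary placeholder for a BAM-style target id.

```rust
use std::rc::Rc;

// Hypothetical strand types, named differently from the crate's on purpose.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Required {
    Forward,
    Reverse,
}

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Optional {
    Forward,
    Reverse,
    Unknown,
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Unstranded;

// A region generic over the sequence-name type `R` and the strand type `S`.
#[derive(Debug, Clone, PartialEq)]
struct Region<R, S> {
    refid: R,
    start: u64,
    end: u64,
    strand: S,
}

impl<R, S> Region<R, S> {
    fn new(refid: R, start: u64, end: u64, strand: S) -> Self {
        Region { refid, start, end, strand }
    }
}

fn main() {
    // Owned String name with a required strand.
    let a = Region::new("chrX".to_string(), 461_829, 462_426, Required::Forward);

    // Shared name: cloning the Rc bumps a counter instead of copying the string.
    let name = Rc::new("chrX".to_string());
    let b = Region::new(Rc::clone(&name), 461_829, 462_426, Optional::Unknown);

    // Integer sequence identifier with no strand at all.
    let c = Region::new(23u32, 461_829, 462_426, Unstranded);

    println!("{:?}\n{:?}\n{:?}", a, b, c);
}
```

The same struct definition serves all three cases; only the type parameters change, which is the design the paragraphs above describe.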

The example below shows how to create the TMA22 annotation and find where chrX:461,839 falls within this gene.

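use std::str::FromStr; // brings `from_str` into scope
use bio_types::annot::contig::Contig;
use bio_types::annot::loc::Loc;
use bio_types::annot::pos::Pos;
use bio_types::strand::{NoStrand, ReqStrand};

// `?` assumes an enclosing function that returns a compatible `Result`.
// TMA22: chrX 461,829-462,426 on the forward strand.
let tma22: Contig<String, ReqStrand> = Contig::from_str("chrX:461829-462426(+)")?;
// The query position carries no strand information.
let p0: Pos<String, NoStrand> = Pos::from_str("chrX:461839")?;
// Map the chromosome position into TMA22's own coordinates.
let p0_into = tma22.pos_into(&p0).expect("p0 not within TMA22");
assert_eq!(p0_into.pos(), 10);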
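The arithmetic behind this result can be sketched without the crate. The functions `offset_into` and `position_of` and the `Strand` enum below are hypothetical names introduced for illustration. The sketch assumes a region described by a start and a length (597 here, which assumes the end coordinate is exclusive), and it returns `None` for positions outside the region, which may differ from what the crate's `pos_into` does.

```rust
/// Strand of the enclosing region (a stand-in type, not the crate's).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Strand {
    Forward,
    Reverse,
}

/// Convert a chromosome position into an offset from the region's 5' end.
/// Returns `None` when the position lies outside the region.
fn offset_into(start: i64, length: i64, strand: Strand, pos: i64) -> Option<i64> {
    if pos < start || pos >= start + length {
        return None;
    }
    Some(match strand {
        // Forward strand: count up from the lowest coordinate.
        Strand::Forward => pos - start,
        // Reverse strand: count down from the highest coordinate.
        Strand::Reverse => start + length - 1 - pos,
    })
}

/// Inverse mapping: convert an offset within the region back to a
/// chromosome position. Returns `None` for offsets outside the region.
fn position_of(start: i64, length: i64, strand: Strand, offset: i64) -> Option<i64> {
    if offset < 0 || offset >= length {
        return None;
    }
    Some(match strand {
        Strand::Forward => start + offset,
        Strand::Reverse => start + length - 1 - offset,
    })
}

fn main() {
    // Assumed geometry for TMA22: start 461,829, length 597.
    let (start, length) = (461_829, 597);
    let into = offset_into(start, length, Strand::Forward, 461_839);
    let back = position_of(start, length, Strand::Forward, 10);
    println!("into = {:?}, back = {:?}", into, back);
}
```

On the forward strand the two functions are simple additions and subtractions; on the reverse strand the offset is measured from the high end of the region, so the same chromosome position maps to a different offset.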

Modules

  • contig: Contiguous region on a named sequence, e.g., chromosome XI 334,412-334,915.
  • loc: Trait shared across sequence locations – spliced, contiguous, or single-position.
  • pos: Positions on a named sequence, e.g., 683,946 on chromosome IV.
  • refids: Intern reference sequence (e.g., chromosome) names.
  • spliced: Spliced region on a named sequence, e.g., the reverse strand of chromosome V, exon #1 at 166,885 through 166,875 and exon #2 at 166,771 through 166,237.
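
As a rough illustration of the spliced example in the list above, the sketch below computes exon and intron lengths for the two chromosome V exons. The `Exon` struct and both helper functions are hypothetical, and the sketch assumes the coordinates given with "through" are inclusive on both ends; it does not use or mirror the crate's own representation.

```rust
/// One exon as an inclusive coordinate pair (an assumption of this sketch).
#[derive(Debug, Clone, Copy)]
struct Exon {
    low: u64,
    high: u64,
}

impl Exon {
    /// Number of bases covered, counting both endpoints.
    fn len(&self) -> u64 {
        self.high - self.low + 1
    }
}

/// Total number of exonic bases.
fn exonic_length(exons: &[Exon]) -> u64 {
    exons.iter().map(Exon::len).sum()
}

/// Lengths of the introns between consecutive exons.
/// Expects exons sorted by ascending coordinate and non-overlapping.
fn intron_lengths(exons: &[Exon]) -> Vec<u64> {
    exons
        .windows(2)
        .map(|pair| pair[1].low - pair[0].high - 1)
        .collect()
}

fn main() {
    // Stored in ascending chromosome order; on the reverse strand the
    // transcript reads them in the opposite order (exon #1 is the higher one).
    let exons = [
        Exon { low: 166_237, high: 166_771 }, // exon #2
        Exon { low: 166_875, high: 166_885 }, // exon #1
    ];
    println!("exonic bases: {}", exonic_length(&exons));
    println!("intron lengths: {:?}", intron_lengths(&exons));
}
```

Keeping exons sorted by chromosome coordinate and applying the strand only when reading them out is one common design choice; other layouts are equally possible.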

Enums