Crate rust_htslib

Source
Expand description

Rust-Htslib provides a high level API to working with the common HTS file formats.

Htslib itself is the de facto standard implementation for reading and writing files for HTS alignments (SAM and BAM) as well as variant calls in VCF and BCF format.

For example, let’s say that we use samtools to view the header of a test file:

samtools view -H test/test.bam
@SQ    SN:CHROMOSOME_I    LN:15072423
@SQ    SN:CHROMOSOME_II    LN:15279345
@SQ    SN:CHROMOSOME_III    LN:13783700
@SQ    SN:CHROMOSOME_IV    LN:17493793
@SQ    SN:CHROMOSOME_V    LN:20924149

We can reproduce that with Rust-Htslib. Reading BAM files and printing the header to the the screen is as easy as

use rust_htslib::{bam, bam::Read};


let bam = bam::Reader::from_path(&"test/test.bam").unwrap();
let header = bam::Header::from_template(bam.header());

// print header records to the terminal, akin to samtool
for (key, records) in header.to_hashmap() {
    for record in records {
         println!("@{}\tSN:{}\tLN:{}", key, record["SN"], record["LN"]);
    }
}

which results in the following output, equivalent to samtools.

@SQ    SN:CHROMOSOME_I    LN:15072423
@SQ    SN:CHROMOSOME_II    LN:15279345
@SQ    SN:CHROMOSOME_III    LN:13783700
@SQ    SN:CHROMOSOME_IV    LN:17493793
@SQ    SN:CHROMOSOME_V    LN:20924149

We can also read directly from the BAM file and write to an output file

use rust_htslib::{bam, bam::Read};

let mut bam = bam::Reader::from_path(&"test/test.bam").unwrap();
let header = bam::Header::from_template(bam.header());
let mut out = bam::Writer::from_path(&"test/out.bam", &header, bam::Format::Bam).unwrap();

// copy reverse reads to new BAM file
for r in bam.records() {
    let record = r.unwrap();
    if record.is_reverse() {
        out.write(&record).unwrap();
    }
}

Pileups can be performed with

use rust_htslib::{bam, bam::Read};

let mut bam = bam::Reader::from_path(&"test/test.bam").unwrap();

// pileup over all covered sites
for p in bam.pileup() {
    let pileup = p.unwrap();
    println!("{}:{} depth {}", pileup.tid(), pileup.pos(), pileup.depth());

    for alignment in pileup.alignments() {
        if !alignment.is_del() && !alignment.is_refskip() {
            println!("Base {}", alignment.record().seq()[alignment.qpos().unwrap()]);
        }
        // mark indel start
        match alignment.indel() {
            bam::pileup::Indel::Ins(len) => println!("Insertion of length {} between this and next position.", len),
            bam::pileup::Indel::Del(len) => println!("Deletion of length {} between this and next position.", len),
            bam::pileup::Indel::None => ()
        }
    }
}

In both cases, indexed BAM files can be seeked for specific regions using fetch, constraining either the record iterator or the pileups:

use rust_htslib::{bam, bam::Read};

let mut bam = bam::IndexedReader::from_path(&"test/test.bam").unwrap();

bam.fetch(("CHROMOSOME_I", 0, 20)).unwrap();
// afterwards, read or pileup in this region

See

Modules§

  • Module for working with SAM, BAM, and CRAM files.
  • Module for working with VCF and BCF files.
  • Module for working with bgzipped file.
  • Module for working with faidx-indexed FASTA files.
  • Re-export hts-sys htslib bindings
  • Module for working with tabix-indexed text files.
  • Module with utility code.