rust_htslib

Module bcf

Source
Expand description

Module for working with VCF and BCF files.

§Performance Remarks

Note that BCF corresponds to the in-memory representation of BCF/VCF records in Htslib itself. Thus, it comes without a runtime penalty for parsing, in contrast to reading VCF files.

§Example (reading)

  • Obtaining 0-based locus index of the VCF record.
  • Obtaining alleles of the VCF record.
  • calculate alt-allele dosage in a mutli-sample VCF / BCF
use crate::rust_htslib::bcf::{Reader, Read};
use std::convert::TryFrom;

let path = &"test/test_string.vcf";
let mut bcf = Reader::from_path(path).expect("Error opening file.");
// iterate through each row of the vcf body.
for (i, record_result) in bcf.records().enumerate() {
    let mut record = record_result.expect("Fail to read record");
    let mut s = String::new();
     for allele in record.alleles() {
         for c in allele {
             s.push(char::from(*c))
         }
         s.push(' ')
     }
    // 0-based position and the list of alleles
    println!("Locus: {}, Alleles: {}", record.pos(), s);
    // number of sample in the vcf
    let sample_count = usize::try_from(record.sample_count()).unwrap();

    // Counting ref, alt and missing alleles for each sample
    let mut n_ref = vec![0; sample_count];
    let mut n_alt = vec![0; sample_count];
    let mut n_missing = vec![0; sample_count];
    let gts = record.genotypes().expect("Error reading genotypes");
    for sample_index in 0..sample_count {
        // for each sample
        for gta in gts.get(sample_index).iter() {
            // for each allele
            match gta.index() {
                Some(0) => n_ref[sample_index] += 1,  // reference allele
                Some(_) => n_alt[sample_index] += 1,  // alt allele
                None => n_missing[sample_index] += 1, // missing allele
            }
        }
    }
}

§Example (writing)

  • Setting up a VCF writer from scratch (including a simple header)
  • Creating a VCF record and writing it to the VCF file
use rust_htslib::bcf::{Format, Writer};
use rust_htslib::bcf::header::Header;
use rust_htslib::bcf::record::GenotypeAllele;

// Create minimal VCF header with a single contig and a single sample
let mut header = Header::new();
let header_contig_line = r#"##contig=<ID=1,length=10>"#;
header.push_record(header_contig_line.as_bytes());
let header_gt_line = r#"##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">"#;
header.push_record(header_gt_line.as_bytes());
header.push_sample("test_sample".as_bytes());

// Write uncompressed VCF to stdout with above header and get an empty record
let mut vcf = Writer::from_stdout(&header, true, Format::Vcf).unwrap();
let mut record = vcf.empty_record();

// Set chrom and pos to 1 and 7, respectively - note the 0-based positions
let rid = vcf.header().name2rid(b"1").unwrap();
record.set_rid(Some(rid));
record.set_pos(6);

// Set record genotype to 0|1 - note first allele is always unphased
let alleles = &[GenotypeAllele::Unphased(0), GenotypeAllele::Phased(1)];
record.push_genotypes(alleles).unwrap();

// Write record
vcf.write(&record).unwrap()

This will print the following VCF to stdout:

##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##contig=<ID=1,length=10>
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  test_sample
1       7       .       .       .       0       .       .       GT      0|1

Re-exports§

Modules§

Structs§

Enums§

Traits§

  • A trait for a BCF reader with a read method.

Functions§