ppvcf  1.0.0
Library for the parallel parsing of VCF files
VCF Class Reference

Class for a VCF file.

#include <ppvcf.hpp>

Public Member Functions

 VCF (char *vcf_path, const int n_threads_, const int block_size_)
 
 ~VCF ()
 
bool parse ()
 
vector< Variant > get_variants () const
 

Private Member Functions

char * get_samples (char *const fmt_line, const size_t len)
 
tuple< uint8_t, uint8_t, bool > extract_genotype (char *start)
 
void fill_variant (const uint32_t var_idx)
 
void fill_genotypes ()
 

Private Attributes

htsFile * vcf
 htslib file
 
bcf_hdr_t * vcf_header
 htslib header
 
bcf1_t * vcf_record
 htslib record
 
int n_samples
 Total number of samples per variant.
 
vector< Variant > variants
 Variants read.
 
vector< str_w_l > fmt_lines
 Genotype fields as cstring (from "GT" to the end of each line)
 
size_t block_size
 Number of variants to read at each iteration.
 
size_t to_parse
 Number of variants read in the last iteration.
 
int n_threads
 Number of threads to use.
 

Detailed Description

Class for a VCF file.

The file is read in blocks of block_size variants, and each block is parsed using n_threads threads.

Constructor & Destructor Documentation

◆ VCF()

VCF::VCF ( char *  vcf_path,
const int  n_threads_,
const int  block_size_ 
)
inline

Constructor

Parameters
vcf_path  is the path to the input VCF file
n_threads_  is the number of threads to use
block_size_  is the number of variants to read at each iteration

◆ ~VCF()

VCF::~VCF ( )
inline

Destructor

Member Function Documentation

◆ extract_genotype()

tuple<uint8_t, uint8_t, bool> VCF::extract_genotype ( char *  start)
inline private

Extracts the alleles and the phasing information from a cstring representing a genotype

Parameters
start  is a cstring
Returns
a tuple containing the first allele, the second allele, and the phased flag
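
To illustrate the return shape described above, here is a minimal sketch of a genotype parser. It is not the ppvcf implementation: the name parse_genotype is hypothetical, and the sketch assumes a well-formed diploid genotype (for example "0|1" or "1/0") with allele indices below 256 and no missing alleles.

```cpp
#include <cstdint>
#include <tuple>

// Hypothetical helper, not part of ppvcf: parses a diploid genotype such as
// "0|1" or "1/0" beginning at `start`.
// Returns (first allele, second allele, phased flag).
// Assumes well-formed input: allele indices < 256, no missing allele (".").
std::tuple<uint8_t, uint8_t, bool> parse_genotype(const char *start) {
  const char *p = start;

  // Read the decimal digits of the first allele.
  uint8_t first = 0;
  while (*p >= '0' && *p <= '9')
    first = static_cast<uint8_t>(first * 10 + (*p++ - '0'));

  // '|' marks a phased genotype, '/' an unphased one.
  const bool phased = (*p == '|');
  if (*p == '|' || *p == '/')
    ++p;

  // Read the decimal digits of the second allele; parsing stops at the first
  // non-digit character (':' , tab, or end of string).
  uint8_t second = 0;
  while (*p >= '0' && *p <= '9')
    second = static_cast<uint8_t>(second * 10 + (*p++ - '0'));

  return std::make_tuple(first, second, phased);
}
```

Parsing stops at the first character that is not part of the genotype, so the same function can be pointed at the start of a sample column that carries additional colon-separated fields.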

◆ fill_genotypes()

void VCF::fill_genotypes ( )
inline private

Extracts and stores the genotypes of all the variants read in the last iteration

◆ fill_variant()

void VCF::fill_variant ( const uint32_t  var_idx)
inline private

Extracts and stores the genotype fields associated with the var_idx-th variant read in the last iteration

Parameters
var_idx  is the index of a variant within the last block read

◆ get_samples()

char* VCF::get_samples ( char *const  fmt_line,
const size_t  len 
)
inline private

Extracts the genotype information from a VCF line

Parameters
fmt_line  is a line of the input VCF file
len  is the length of fmt_line
Returns
a cstring representing the genotype information
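
As an illustration of what "the genotype information" of a line can mean, the sketch below locates the FORMAT column of a tab-separated VCF data line and returns everything from there to the end. It is an assumption-laden example, not the library's code: the name format_and_samples is hypothetical, and it relies on the standard VCF layout in which FORMAT is the ninth column (after CHROM, POS, ID, REF, ALT, QUAL, FILTER and INFO).

```cpp
#include <cstddef>

// Hypothetical helper, not part of ppvcf: returns a pointer to the FORMAT
// column of a tab-separated VCF data line, i.e. the text following the 8th
// tab. Returns nullptr if the line has fewer than 9 columns.
const char *format_and_samples(const char *line, std::size_t len) {
  int tabs = 0;
  for (std::size_t i = 0; i < len; ++i) {
    // The 8th tab ends the INFO column; FORMAT starts right after it.
    if (line[i] == '\t' && ++tabs == 8)
      return i + 1 < len ? line + i + 1 : nullptr;
  }
  return nullptr;
}
```

The returned pointer aliases the input buffer, so no copy is made; the caller must keep the line alive while the result is in use.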

◆ get_variants()

vector<Variant> VCF::get_variants ( ) const
inline
Returns
the variants read in the last iteration

◆ parse()

bool VCF::parse ( )
inline

Parses up to block_size lines from the input VCF file

Returns
false if fewer than block_size lines have been read (due to EOF), true otherwise.
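
The return contract above has a consequence for callers: the final call can return false while still having read some lines, so a loop of the form "while (parse())" alone would skip the last, shorter block. The sketch below shows one way to consume every block. BlockReader is a hypothetical in-memory stand-in that mimics only the documented behaviour of parse() and get_variants(), with int in place of Variant; whether the real class keeps the partial block available after returning false is inferred from the description of get_variants(), not confirmed.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for VCF: reproduces only the documented contract of
// parse() and get_variants(), reading from memory instead of a file.
class BlockReader {
public:
  BlockReader(std::vector<int> records, std::size_t block_size)
      : records_(std::move(records)), block_size_(block_size) {}

  // Reads up to block_size records; returns false if fewer were available.
  bool parse() {
    const std::size_t n = std::min(block_size_, records_.size() - pos_);
    block_.assign(records_.data() + pos_, records_.data() + pos_ + n);
    pos_ += n;
    return n == block_size_;
  }

  // Records read by the most recent call to parse().
  std::vector<int> get_variants() const { return block_; }

private:
  std::vector<int> records_;
  std::size_t block_size_;
  std::size_t pos_ = 0;
  std::vector<int> block_;
};

// Consumes every block, including the final short (possibly empty) one.
std::vector<int> read_all(BlockReader &reader) {
  std::vector<int> all;
  bool more = true;
  while (more) {
    more = reader.parse();
    const std::vector<int> block = reader.get_variants();
    all.insert(all.end(), block.begin(), block.end());
  }
  return all;
}
```

When the number of records is an exact multiple of the block size, the last iteration simply sees an empty block, so no record is processed twice.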

The documentation for this class was generated from the following file: