Computational Metagenomics - Dr. Alexander Sczyrba

Over 99% of the microbial species observed in nature cannot be grown in pure culture, making them inaccessible to classical genomic studies. Metagenomics and single cell genomics are two approaches to studying this “microbial dark matter”.

Metagenomics, the direct analysis of DNA from a whole environmental community, is a strategy for revealing the diversity of the microbial world. Current sequencing technologies can generate more than one terabyte of sequence data in a single experiment, allowing sequence-based discovery of complete genes or even genomes from environmental samples. The Computational Metagenomics Group, led by Dr. Alexander Sczyrba, develops bioinformatics tools and pipelines for the analysis of metagenomic data sets. Because data sets are growing rapidly, a special focus of our research is the application of cloud computing technologies to scale analyses dynamically across large compute resources.

A complementary approach to sequencing the DNA of a whole microbial community is single cell genomics. Sequencing DNA from single amplified genomes of individual cells makes it possible to study the genomes of uncultured species. However, the tremendous bias introduced by the amplification technique and possible sample contamination pose challenges for the bioinformatics analysis. Research in the group currently focuses on bioinformatics approaches to pre-process sequence data and automatically detect possible sample contamination, a difficult task if the target genome is not closely related to any previously sequenced genome.
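
To make the contamination problem concrete, the following is a minimal sketch of one simple reference-free heuristic: flagging assembled contigs whose GC content deviates strongly from the rest of the assembly. It is an illustrative assumption, not the group's method. The function names `gc_fraction` and `flag_gc_outliers`, the input format (a dictionary mapping contig names to sequences), and the cutoff of 3.5 on a median-based modified z-score are all hypothetical choices made for this example.

```python
from statistics import median


def gc_fraction(seq):
    """Return the fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0  # sequence contains only ambiguous bases such as N
    return (seq.count("G") + seq.count("C")) / acgt


def flag_gc_outliers(contigs, threshold=3.5):
    """Return names of contigs whose GC content is an outlier.

    Uses a modified z-score based on the median and the median absolute
    deviation (MAD), which is less sensitive to the outliers themselves
    than a mean/standard-deviation score. The threshold is an assumed,
    illustrative default.
    """
    gc = {name: gc_fraction(seq) for name, seq in contigs.items()}
    if not gc:
        return []
    med = median(gc.values())
    mad = median(abs(value - med) for value in gc.values())
    if mad == 0:
        return []  # no spread estimate available; flag nothing
    return [
        name
        for name, value in gc.items()
        if 0.6745 * abs(value - med) / mad > threshold
    ]
```

A check of this kind can only hint at contamination: organisms with similar GC content are not separated, and amplification bias or horizontally transferred regions can also shift composition, so flagged contigs would still need further inspection.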

A promising approach for future metagenomic studies is the combination of high-throughput metagenome sequencing and large-scale single cell genomics. Both data sources can be integrated in a joint bioinformatics analysis to gain a better understanding of the phylogenetic diversity of microbial communities, their population structure, and their functionality.