PyroTRF-ID: a novel bioinformatics approach for the identification of terminal-restriction fragments using microbiome pyrosequencing data
Aim: In molecular microbial ecology, pyrosequencing is gradually supplanting classical fingerprinting techniques, such as terminal-restriction fragment length polymorphism (T-RFLP) combined with cloning-sequencing, for enhanced characterization of microbiomes. Here, the PyroTRF-ID software was developed as a high-throughput approach that combines pyrosequencing and T-RFLP to describe microbial communities with optimized laboratory and computational effort. In contrast to existing bioinformatics methods for the phylogenetic affiliation of T-RFs, the proposed strategy aims to conserve the entire microbial information contained in the samples taken from the investigated environments.
Methods: PyroTRF-ID was implemented on the Vital-IT high-performance computing platform of the Swiss Institute of Bioinformatics for mapping and massive digital T-RFLP profiling of full pyrosequencing datasets, for comparing digital and experimental fingerprints obtained from the same DNA extracts, and for identifying the contributions of phylotypes to T-RFs. The method was used to characterize bacterial communities in groundwater samples from aquifers contaminated by chloroethenes and in aerobic granular sludge biofilm from wastewater treatment systems. Each DNA extract was subjected to amplification of 500-bp fragments of 16S rRNA gene pools, T-RFLP with the HaeIII endonuclease, 454 tag-encoded FLX amplicon pyrosequencing, and PyroTRF-ID analysis. Greengenes was selected as the mapping database.
Results: PyroTRF-ID was efficient for high-throughput mapping and digital T-RFLP profiling of pyrosequencing datasets. After denoising, 20 min were required to reprocess at least 15 datasets of 6,000 to 35,000 reads of 500 bp. Digital and experimental profiles were aligned with maximum cross-correlation coefficients of 0.71 for the high-complexity groundwater environment and 0.92 for the low-complexity synthetic wastewater environment.
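The digital T-RFLP profiling and profile alignment described above can be illustrated with a minimal Python sketch. It is not the PyroTRF-ID implementation: the function names, the `max_shift` parameter, and the normalized cross-correlation formula are assumptions made for illustration. The only enzyme-specific fact used is that HaeIII recognizes GGCC and cuts between GG and CC; denoising, mapping, and peak processing are omitted.

```python
import math
from collections import Counter

HAEIII_SITE = "GGCC"   # HaeIII recognition sequence
HAEIII_CUT_OFFSET = 2  # blunt cut between GG and CC


def terminal_fragment_length(read, site=HAEIII_SITE, cut_offset=HAEIII_CUT_OFFSET):
    """Length of the 5' terminal fragment, or None if the read has no site."""
    pos = read.upper().find(site)
    if pos == -1:
        return None
    return pos + cut_offset


def digital_profile(reads):
    """Relative abundance of each T-RF length among reads containing a site."""
    lengths = (terminal_fragment_length(r) for r in reads)
    counts = Counter(length for length in lengths if length is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {length: n / total for length, n in sorted(counts.items())}


def cross_correlation(digital, experimental, shift):
    """Normalized cross-correlation of two profiles at a given shift (in bp)."""
    num = sum(a * experimental.get(length + shift, 0.0)
              for length, a in digital.items())
    norm = math.sqrt(sum(a * a for a in digital.values())
                     * sum(b * b for b in experimental.values()))
    return num / norm if norm else 0.0


def best_shift(digital, experimental, max_shift=10):
    """Return (coefficient, shift) maximizing the cross-correlation."""
    return max((cross_correlation(digital, experimental, s), s)
               for s in range(-max_shift, max_shift + 1))


# Toy reads (hypothetical sequences, not data from the study).
reads = ["AAGGCCTT", "AAGGCCAA", "TTTTGGCCA", "ACGT"]
profile = digital_profile(reads)
print(profile)
```

Both profiles are represented here as dictionaries mapping T-RF length (bp) to relative abundance, so an experimental profile exported from a fragment analyzer would first need to be rounded to integer lengths and normalized in the same way.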
In the groundwater and wastewater samples, 63±23% and 61±12% of all experimental T-RFs (73±16 and 47±14 peaks per sample), respectively, were affiliated to phylotypes. Bacterial dynamics were then optimally elucidated by T-RFLP.
Conclusions: PyroTRF-ID enables high-throughput matching of pyrosequencing and T-RFLP datasets, and affiliation of T-RFs to precise phylotypes. This methodology is efficient for optimizing laboratory and computational efforts for high-resolution description of microbial community dynamics in various systems, such as those investigated in environmental and medical sciences.
Acknowledgements: This research was financed by the Swiss National Science Foundation, Grants No. 120536, 138148 and 120627. We are grateful to Ioannis Xenarios for support at the Vital-IT High Performance Computing Center of the Swiss Institute of Bioinformatics.
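The affiliation of T-RFs to phylotypes reported in the Results can be pictured as a simple tally. The sketch below assumes each mapped read has already been reduced to a pair of digital T-RF length and assigned phylotype; the function name and the placeholder phylotype labels are hypothetical and do not come from the PyroTRF-ID software.

```python
from collections import Counter, defaultdict


def phylotype_contributions(mapped_reads):
    """Relative contribution of each phylotype to each T-RF length.

    mapped_reads: iterable of (trf_length, phylotype) pairs (assumed input format).
    Returns {trf_length: [(phylotype, fraction), ...]}, largest contributor first.
    """
    per_trf = defaultdict(Counter)
    for length, phylotype in mapped_reads:
        per_trf[length][phylotype] += 1

    result = {}
    for length, counts in per_trf.items():
        total = sum(counts.values())
        result[length] = [(p, n / total) for p, n in counts.most_common()]
    return result


# Placeholder labels, not phylotypes reported in the study.
mapped = [(216, "phylotype_A"), (216, "phylotype_A"),
          (216, "phylotype_B"), (300, "phylotype_C")]
contributions = phylotype_contributions(mapped)
print(contributions)
```

A T-RF with a single dominant contributor can be labeled with that phylotype, whereas a T-RF shared by several phylotypes is better reported with all of its contributors.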
Record created on 2012-07-27, modified on 2016-08-09