Tools developed & maintained by members of the lab

curatedMetagenomicData is a Bioconductor package providing uniformly processed and manually annotated human microbiome profiles for thousands of people. Microbial taxonomy (from MetaPhlAn2) and metabolic functional potential (from HUMAnN2) can be analyzed with respect to numerous participant characteristics and health outcomes, simply and reproducibly on a normal laptop.

MetaMLST is a software tool that performs in-silico Multi Locus Sequence Typing (MLST) analysis on metagenomic samples. MetaMLST achieves cultivation- and assembly-free strain-level tracking. It can detect and trace all species to which the standard MLST protocol is applicable.

StrainPhlAn is a computational tool for performing strain-level population genomics on large metagenomic datasets. It profiles microbes from known species with strain-level resolution and provides comparative and phylogenetic analyses of strains retrieved from metagenomic samples.

MetAML is a computational tool for metagenomics-based prediction tasks and for quantitative assessment of the strength of potential microbiome-phenotype associations. It also provides species-level taxonomic profiles, marker presence data, and metadata for 3000+ publicly available metagenomes.

PanPhlAn is a strain-level metagenomic profiling tool for identifying the gene composition and in-vivo transcriptional activity of individual strains in metagenomic samples. PanPhlAn's ability for strain-tracking and functional analysis of unknown pathogens makes it an efficient tool for culture-free infectious outbreak epidemiology and microbial population studies.

MetaPhlAn2 is the greatly updated version of our computational tool for profiling the composition of microbial communities (Bacteria, Archaea, Eukaryotes and Viruses) from metagenomic shotgun sequencing data with species-level resolution. MetaPhlAn2 is also able to identify specific strains (in the not-so-frequent cases in which the sample contains a previously sequenced strain) and to track strains across samples for all species.

GraPhlAn is a software tool for producing high-quality circular representations of taxonomic and phylogenetic trees. It focuses on concise, integrative, informative, and publication-ready representations of phylogenetically- and taxonomically-driven investigation.

MetaRef is an online resource to comprehensively catalog and characterize clade-specific microbial genes. We identify and provide all core genes associated with all microbial species and genera with available reference genomes (final or draft). A subset of these gene families is consistently present in one or more taxonomic clades, which allows us to further designate them as marker genes.

PhyloPhlAn is a software tool for accurately determining the taxonomic identities and evolutionary relationships of new microbial genomes.

The first version of MetaPhlAn focused on species-level profiling of bacteria and archaea and was initially developed to efficiently analyze the large amount of shotgun metagenomics data produced by the Human Microbiome Project.

LDA Effect Size (LEfSe) is an algorithm for high-dimensional biomarker discovery and explanation that identifies genomic features (genes, pathways, or taxa) characterizing the differences between two or more biological conditions (or classes). It emphasizes both statistical significance and biological relevance, allowing researchers to identify differentially abundant features that are also consistent with biologically meaningful categories (subclasses).

Other tools with contributions by members of the lab

ShortBRED is a pipeline that takes a set of protein sequences, groups them into families, extracts a set of distinctive strings ("markers"), and then searches for these markers in metagenomic data to determine the presence and abundance of the protein families of interest.

microPITA is a computational tool enabling sample selection in two-stage (tiered) metagenomic studies.

HUMAnN is a pipeline for efficiently and accurately determining the presence/absence and abundance of microbial pathways and functional modules in a community from metagenomic data. Sequencing a metagenome typically produces millions of short DNA/RNA reads. HUMAnN takes these reads as input and produces gene and pathway summaries as output.