PanPhlAn - strain detection and characterization

Pangenome-based Phylogenomic Analysis (PanPhlAn) is a strain-level metagenomic profiling tool for identifying the gene composition and in-vivo transcriptional activity of individual strains in metagenomic samples. Its capacity for strain tracking and functional analysis of unknown pathogens makes it an efficient tool for culture-free epidemiology of infectious outbreaks and for microbial population studies.

Software repository and supporting material

Software repository of PanPhlAn:

Available species pangenome databases:

The PanPhlAn tutorial:

User support (email-based group and discussion forum): panphlan-users

For comments and questions, please write to our user support group or contact the Segata lab directly.


If you find this tool useful in your research, please cite our paper:

Matthias Scholz*, Doyle V. Ward*, Edoardo Pasolli*, Thomas Tolio, Moreno Zolfo, Francesco Asnicar, Duy Tin Truong, Adrian Tett, Ardythe L. Morrow, and Nicola Segata (* Equal contribution)
Strain-level microbial epidemiology and population genomics from shotgun metagenomics.
Nature Methods, 13, 435–438, 2016.

Example of E. coli strain profiling

Characterization of the German 2011 E. coli outbreak strain

PanPhlAn profiling of the German outbreak metagenomes using a reference database in which the target outbreak genome is missing. (a) Hierarchical clustering. The heatmap displays presence/absence gene-family profiles of 110 reference strains (bright-colored columns) and of 12 metagenomically detected strains (darker columns). Most outbreak samples cluster together because their profiles are almost identical (right), while four samples (left) show different profiles because additional dominant E. coli strains overlie the target outbreak strain. (b) Functional analysis of outbreak-specific gene families (Fisher's exact test) confirmed that the outbreak strain is a combination of an EAEC pathogen (pAA plasmid) with acquired Shiga toxin and antibiotic resistance genes, complemented with a set of enriched virulence-related functions and pathway modules.
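The idea behind testing gene families for outbreak specificity can be illustrated with a small, self-contained sketch. It is not PanPhlAn code: the function names (`fisher_enrichment_p`, `enrichment_table`), the toy presence/absence profiles, and the choice of a one-sided test are all assumptions made for illustration. For each gene family, a 2x2 table (outbreak vs. other strains, present vs. absent) is built and a one-sided Fisher's exact test p-value is computed from the hypergeometric distribution.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact test p-value (hypothetical helper).

    2x2 table:
        a = outbreak strains carrying the gene family
        b = outbreak strains lacking it
        c = other strains carrying it
        d = other strains lacking it
    Returns P(X >= a) under the hypergeometric null, i.e. the chance of
    seeing at least `a` carriers among the outbreak strains.
    """
    n = a + b + c + d
    n_outbreak = a + b
    n_carriers = a + c
    denom = comb(n, n_outbreak)
    p = 0.0
    for k in range(a, min(n_outbreak, n_carriers) + 1):
        p += comb(n_carriers, k) * comb(n - n_carriers, n_outbreak - k) / denom
    return p


def enrichment_table(profiles, is_outbreak):
    """Map each gene family to its enrichment p-value (hypothetical helper)."""
    results = {}
    for family, presence in profiles.items():
        pairs = list(zip(presence, is_outbreak))
        a = sum(1 for p, o in pairs if p and o)
        b = sum(1 for p, o in pairs if not p and o)
        c = sum(1 for p, o in pairs if p and not o)
        d = sum(1 for p, o in pairs if not p and not o)
        results[family] = fisher_enrichment_p(a, b, c, d)
    return results


# Toy presence/absence profiles, invented for illustration only.
# First 4 columns are outbreak strains, last 6 are other strains.
profiles = {
    "family_A": [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],  # outbreak-specific
    "family_B": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],  # present in every strain
    "family_C": [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],  # unrelated to outbreak
}
is_outbreak = [True] * 4 + [False] * 6

p_values = enrichment_table(profiles, is_outbreak)
for family, p in sorted(p_values.items(), key=lambda kv: kv[1]):
    print(f"{family}\t{p:.4g}")
```

Gene families with the smallest p-values are the candidates for outbreak-specific functions. A real analysis over thousands of gene families would also need a multiple-testing correction, which this sketch omits.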