Proteus mirabilis is a Gram-negative bacterium recognized for its unique swarming motility and urease activity. A previous proteomic report on four strains hypothesized that, unlike other Gram-negative bacteria, P. mirabilis may not exhibit significant intraspecies variation in gene content. However, there has not been a comprehensive analysis of large numbers of P. mirabilis genomes from various sources to support or refute this hypothesis. We performed comparative genomic analysis on 2,060 Proteus genomes. We sequenced the genomes of 893 isolates recovered from clinical specimens from three large US academic medical centers, combined with 1,006 genomes from NCBI Assembly and 161 genomes assembled from Illumina reads in the public domain. We used average nucleotide identity (ANI) to delineate species and subspecies, core genome phylogenetic analysis to identify clusters of highly related P. mirabilis genomes, and pan-genome annotation to identify genes of interest not present in the model P. mirabilis strain HI4320. Within our cohort, Proteus is composed of 10 named species and 5 uncharacterized genomospecies. P. mirabilis can be subdivided into three subspecies; subspecies 1 represented 96.7% (1,822/1,883) of all genomes. The P. mirabilis pan-genome includes 15,399 genes outside of HI4320, and 34.3% (5,282/15,399) of these genes have no putative assigned function. Subspecies 1 is composed of several highly related clonal groups. Prophages and gene clusters encoding putatively extracellular-facing proteins are associated with clonal groups. Uncharacterized genes not present in the model strain P. mirabilis HI4320 but with homology to known virulence-associated operons can be identified within the pan-genome.

IMPORTANCE Gram-negative bacteria use a variety of extracellular-facing factors to interact with eukaryotic hosts. Due to intraspecies genetic variability, these factors may not be present in the model strain for a given organism, potentially providing incomplete understanding of host-microbial interactions. In contrast to previous reports on P. mirabilis, but similar to other Gram-negative bacteria, P. mirabilis has a mosaic genome with a linkage between phylogenetic position and accessory genome content. P. mirabilis encodes a variety of genes that may impact host-microbe dynamics beyond what is represented in the model strain HI4320. The diverse, whole-genome characterized strain bank from this work can be used in conjunction with reverse genetic and infection models to better understand the impact of accessory genome content on bacterial physiology and pathogenesis of infection.

Original language: English
Issue number: 4
State: Published - Jul 2023


  • Proteus mirabilis
  • microbial genomics
  • population structure


Title: Uncharacterized and lineage-specific accessory genes within the Proteus mirabilis pan-genome landscape
