Ribosomal Multilocus Sequence Typing

Genomic data is used by scientists and doctors around the world to respond to the threat of infectious diseases. In this series of blog posts, we’ll highlight research carried out by our group that uses genomic data to study disease-causing bacteria. The discoveries you are making through the Genome Detectives project will help us carry out more studies like these in the future! 

Image: Bacterial ribosome structure

 

This post discusses Ribosomal Multilocus Sequence Typing (rMLST), a genotyping scheme developed for bacterial taxonomic analyses and strain typing. You can read the full scientific paper describing how rMLST was developed here: “Ribosomal Multilocus Sequence Typing: Universal Characterisation of Bacteria from Domain to Strain” (Jolley, et al., 2012).

 

Taxonomy and strain typing

Taxonomy is the science of naming and classifying groups of organisms based on genetic ancestry and shared characteristics. There are eight major taxonomic ranks (domain, kingdom, phylum, class, order, family, genus, and species), forming a hierarchy of increasingly large groupings. The most specific rank is the species, and the most general is the domain. All life belongs to one of three domains: Eukaryotes, Archaea, and Bacteria.

 

Image: Taxonomic rankings

 

Microbiologists often classify bacterial species further, into strains, particularly when studying species that cause disease (pathogens). A strain is a subtype or variant of a given species. Systems used to classify bacteria of a given species into different strains are known as strain typing methods.

Precise and reproducible identification and classification of bacteria is important for epidemiological research (study into the distribution, patterns, and determinants of disease) and diagnosis of diseases.

 

The bacterial ribosome and rps genes

Ribosomes are a type of organelle, a specialist structure found within a living cell that performs a specific function. The function of ribosomes is to create the proteins that are encoded by the cell’s genes. In bacteria, ribosomes consist of two major subunits (the large subunit and small subunit), each of which is made from ribosomal RNA (rRNA – a complex organic molecule) and various ribosomal proteins (RPs). The genes encoding RPs are known as rps genes.

The rps genes are ideal targets for a bacterial classification scheme because they are present in all bacteria: every cell needs ribosomes to make its proteins. The proteins they encode are under natural selection to conserve their function over evolutionary time. This selection constrains how much the genes can change (because large changes could disrupt the function of the encoded protein), so the genes are present in, and perform the same function in, all bacterial species. However, small variations in these genes do occur, and these allow them to be used for classification.

 

Ribosomal Multilocus Sequence Typing

Ribosomal Multilocus Sequence Typing (rMLST) is an approach that uses the variation within 53 genes encoding the bacterial ribosome protein subunits (the rps genes) to classify bacteria into different taxonomic ranks and strains.

rMLST was developed by Jolley and colleagues in 2012. Prior to this, there was no single approach to taxonomy or strain typing that could encompass all levels of classification, from domain to strain. A variety of methods had previously been used, including:

  • Early methods that grouped bacteria by observable characteristics
  • Multilocus Sequence Typing (MLST) – characterisation using DNA sequence variation in 6 to 8 ‘housekeeping’ genes involved in basic cell functions
  • 16S rRNA gene – characterisation using the gene encoding the ribosome small subunit rRNA molecule.

The MLST and 16S rRNA schemes are effective at resolving bacterial taxonomies, but they do not always provide the resolution needed to differentiate between two strains of the same species. There are two reasons for this: firstly, different bacteria with distinct characteristics can have identical or very similar DNA sequences in the small number of genes considered by these schemes; secondly, in the case of MLST, two bacterial species or strains may differ in the core metabolic genes that are present. The 53 genes used in rMLST provide resolution to strain level, and so can be used for both taxonomy and strain typing.
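The first of these reasons can be illustrated with a toy example. The short Python sketch below is purely illustrative: the isolate names, gene names, and six-letter "sequences" are invented, and the helper `count_distinct` is a hypothetical function written for this post. It simply counts how many different types can be told apart when only a few genes are compared, versus when more genes are included.

```python
def count_distinct(samples, genes):
    """Count the distinct combinations of sequences seen across `genes`."""
    return len({tuple(seqs[g] for g in genes) for seqs in samples.values()})


# Invented toy data: three isolates, four genes, very short made-up sequences.
samples = {
    "isolate_1": {"gene_1": "ATGGCT", "gene_2": "ATGAAA",
                  "gene_3": "ATGCCC", "gene_4": "ATGTTT"},
    "isolate_2": {"gene_1": "ATGGCT", "gene_2": "ATGAAA",
                  "gene_3": "ATGCCA", "gene_4": "ATGTTT"},
    "isolate_3": {"gene_1": "ATGGCT", "gene_2": "ATGAAA",
                  "gene_3": "ATGCCA", "gene_4": "ATGTTC"},
}

# Looking at only two genes, all three isolates appear identical.
few_genes = count_distinct(samples, ["gene_1", "gene_2"])

# Looking at all four genes, each isolate is distinguishable.
more_genes = count_distinct(samples, ["gene_1", "gene_2", "gene_3", "gene_4"])

print(few_genes, more_genes)
```

In this made-up dataset, the two-gene comparison sees a single type while the four-gene comparison separates all three isolates. Real schemes work with far longer sequences, but the principle is the same: the more genes that are compared, the more opportunities there are to spot a difference.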

 

Developing and testing the scheme

To develop the rMLST scheme, Jolley and colleagues used genome sequence data from almost 2000 samples from across the entire bacterial domain. They identified the rps genes in each genome, and gave each unique allele (an alternative form of a gene) an arbitrary allele number. The combination of allele numbers across the 53 genes defines the rMLST profile of a sample, and this profile is used to determine the strain or taxonomic classification of that sample.
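To make the idea of allele numbers and profiles more concrete, here is a minimal Python sketch. It is an illustration only, not the code used by Jolley and colleagues: the function name `assign_alleles`, the sample and locus names, and the short toy sequences are all invented for this post, and a real rMLST profile covers 53 genes with full-length sequences.

```python
from collections import defaultdict


def assign_alleles(genomes):
    """Give each distinct sequence at each locus an arbitrary allele number.

    `genomes` maps sample name -> {locus name -> gene sequence}.
    Returns (profiles, allele_tables), where profiles maps
    sample name -> {locus name -> allele number}.
    """
    allele_tables = defaultdict(dict)  # locus -> {sequence: allele number}
    profiles = {}
    for sample, loci in genomes.items():
        profile = {}
        for locus, sequence in loci.items():
            table = allele_tables[locus]
            if sequence not in table:
                # First time this sequence is seen: hand out the next number.
                table[sequence] = len(table) + 1
            profile[locus] = table[sequence]
        profiles[sample] = profile
    return profiles, dict(allele_tables)


# Invented toy data: three samples, three loci.
genomes = {
    "sample_A": {"locus_1": "ATGGCT", "locus_2": "ATGAAA", "locus_3": "ATGCCC"},
    "sample_B": {"locus_1": "ATGGCT", "locus_2": "ATGAAG", "locus_3": "ATGCCC"},
    "sample_C": {"locus_1": "ATGGCA", "locus_2": "ATGAAG", "locus_3": "ATGCCC"},
}

profiles, allele_tables = assign_alleles(genomes)
for sample, profile in profiles.items():
    print(sample, profile)
```

The numbers themselves carry no biological meaning; they are just labels. Two samples with the same number at a locus have an identical sequence for that gene, and two samples with the same numbers at every locus share a profile.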

To test the scheme, they generated a phylogenetic tree (a branching diagram that shows evolutionary relationships) using rMLST profiles. The resulting tree (shown in the diagram below) is consistent with other taxonomic classifications, and also shows a higher level of resolution at the tips of the branches, down to the sub-species level.
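One simple way to compare profiles, sketched below in Python, is to count the fraction of genes at which two samples carry different alleles. This is an assumption made for illustration, not a description of how the published tree was built (that analysis may well have worked from the gene sequences themselves). The profiles are invented toy data, and `allelic_distance` and `distance_matrix` are hypothetical helper names.

```python
from itertools import combinations


def allelic_distance(profile_a, profile_b):
    """Fraction of shared loci at which two profiles have different alleles."""
    shared = profile_a.keys() & profile_b.keys()
    if not shared:
        raise ValueError("profiles have no loci in common")
    differences = sum(profile_a[locus] != profile_b[locus] for locus in shared)
    return differences / len(shared)


def distance_matrix(profiles):
    """Pairwise allelic distances, keyed by (sample, sample) tuples."""
    names = sorted(profiles)
    distances = {(name, name): 0.0 for name in names}
    for a, b in combinations(names, 2):
        d = allelic_distance(profiles[a], profiles[b])
        distances[(a, b)] = d
        distances[(b, a)] = d
    return distances


# Invented toy profiles: allele numbers at three loci (a real profile has 53).
profiles = {
    "sample_A": {"locus_1": 1, "locus_2": 1, "locus_3": 1},
    "sample_B": {"locus_1": 1, "locus_2": 2, "locus_3": 1},
    "sample_C": {"locus_1": 2, "locus_2": 2, "locus_3": 1},
}

distances = distance_matrix(profiles)
for (a, b), d in sorted(distances.items()):
    if a < b:
        print(f"{a} vs {b}: {d:.2f}")
```

A table of distances like this is the kind of input that tree-building methods (not shown here) can turn into a branching diagram, with closely related samples ending up on neighbouring branches.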

 

Image: rMLST phylogenetic tree created by Jolley and colleagues

 

rMLST on the PubMLST database

PubMLST is an online, open-access database of bacterial genome data, which also offers many built-in tools for analysing these data. The rMLST scheme is available as part of PubMLST, where it is used to identify the taxonomy and strain of samples uploaded to the database. You can explore the rMLST tool on PubMLST here.

 

Summary

High resolution characterisation of bacteria (that is, determination of strains within a species) is important for public health and clinical situations. The rMLST scheme is a combined taxonomic and strain typing method for bacteria, which uses the allele variation at 53 rps genes to classify a given sample. The scheme can generate a classification at all resolutions from the domain level to the sub-species strain level.

 

If you are keen to discover more about the rMLST scheme, you can read the paper by Jolley, et al., here.

 

 


Images (in order): Schuwirth, et al., 2005 via NCBI; Swain, 2023; Jolley, et al., 2012.
