Core Genome Multilocus Sequence Typing in the Gonococcus
Genomic data is used by scientists and doctors around the world to respond to the threat of infectious diseases. In this series of blog posts, we’ll highlight research carried out by our group that uses genomic data to study disease-causing bacteria. The discoveries you are making through the Genome Detectives project will help us carry out more studies like these in the future!
This post discusses research into the core genome of Neisseria gonorrhoeae, the bacterial species that causes gonorrhoea, and how the core genome can be used to categorise the bacteria into lineages. You can read the full scientific paper, “Neisseria gonorrhoeae Population Genomics: Use of the Gonococcal Core Genome to Improve Surveillance of Antimicrobial Resistance” (Harrison, et al., 2020), here.
Gonorrhoea and antimicrobial resistance
The bacterial species Neisseria gonorrhoeae (the gonococcus) causes the sexually transmitted infection gonorrhoea. Gonorrhoea is globally prevalent, with as many as 130 million cases estimated to occur worldwide every year [1]. It causes unpleasant symptoms such as discharge from the vagina or penis and abdominal pain, and if left untreated it can cause serious complications such as infertility.
Neisseria gonorrhoeae infections are treated with antibiotics (also called antimicrobials). However, antimicrobial resistance (AMR) in the gonococcus is increasing rapidly, limiting the options for successful treatment in many cases of gonorrhoea. It is important to increase our understanding of how AMR in the gonococcus evolves and spreads, in order to inform healthcare policies and research into new treatments for gonorrhoea.
Gonococcal population structure
In bacteria, population structure refers to the evolutionary relationships between different lineages (subtypes or variants), how these lineages interact, and how genes are shared between them. Understanding of population structure can help to predict how characteristics such as AMR develop and spread between lineages, and so help to inform strategies for limiting their transmission across the global bacterial population.
Bacterial populations can be clonal or non-clonal: clonal populations form distinct lineages which can be distinguished from one another genetically, while non-clonal populations do not form distinct lineages due to frequent horizontal gene transfer (HGT, the sharing of genes between different lineages). Neisseria gonorrhoeae has a non-clonal population structure.
This figure shows an example of a phylogenetic tree from a clonal (left) and non-clonal (right) bacterial population. Phylogenetic trees are diagrams showing the evolutionary relationships between different lineages: the dots represent lineages, and the lines between them show how they are related. The clonal population has distinct lineages, while in the non-clonal population the lineages are highly inter-related due to the sharing of genes by HGT. Image adapted from Bacigalupe, 2017.
Molecular typing schemes
Molecular typing schemes are used to divide a bacterial population into lineages, based on the alleles (different forms of a given gene) found at certain genes. Prior to Harrison and colleagues’ study, molecular typing schemes used in the gonococcus were:
- Multilocus sequence typing (MLST) – this scheme indexes the diversity at seven ‘housekeeping’ genes (genes required for basic cell functions).
- Neisseria gonorrhoeae multiantigen sequence typing (NG-MAST) – this scheme uses two genes which encode proteins found on the outer membrane of the bacterial cells.
- Neisseria gonorrhoeae sequence typing for antimicrobial resistance (NG-STAR) – this scheme uses seven genes which have been associated with AMR in the gonococcus.
Because these schemes only use a few genes, some gonococci may be assigned to the same lineage even though they actually have different ancestry. This is due to the non-clonal population structure and frequent HGT between different lineages. This means that these typing schemes lack resolution in determining the population structure.
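The loss of resolution can be illustrated with a small Python sketch. The gene names and allele numbers below are invented for illustration and are not taken from any real typing scheme. Two isolates that match at the few genes a scheme examines can still differ elsewhere in the genome:

```python
# Hypothetical allele profiles: each gene name maps to an allele number.
# Gene names and allele numbers are invented for illustration only.
isolate_a = {"gene1": 3, "gene2": 7, "gene3": 1, "gene4": 12, "gene5": 4, "gene6": 2}
isolate_b = {"gene1": 3, "gene2": 7, "gene3": 1, "gene4": 9, "gene5": 5, "gene6": 2}


def same_type(first, second, genes):
    """Return True if two isolates carry the same allele at every listed gene."""
    return all(first[gene] == second[gene] for gene in genes)


few_genes = ["gene1", "gene2", "gene3"]  # a scheme that examines only a few genes
many_genes = sorted(isolate_a)           # a scheme that examines every gene listed

print(same_type(isolate_a, isolate_b, few_genes))   # True: the isolates look identical
print(same_type(isolate_a, isolate_b, many_genes))  # False: the differences become visible
```

With only three genes the two isolates would be placed in the same lineage. Comparing all six genes shows that they are different.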
The core genome and cgMLST scheme
In this study, Harrison and colleagues developed a new, high-resolution typing scheme: the core genome multilocus sequence typing scheme (N. gonorrhoeae cgMLST v1.0).
The core genome of a bacterial species is the set of genes found in all, or in practice nearly all, of the individual cells of that species. Some cells may have additional genes, known as accessory genes.
To identify the core genome, the researchers used the genome sequences of 3750 Neisseria gonorrhoeae samples, taken from across the globe and across five decades, from 1970 to 2018. They used computer programs to define all the genes present in each genome. Genes were defined as part of the core genome if they were found in over 95% of all the samples, which gave a total of 1668 genes. Each sample was then further analysed to determine which allele was present at each of these genes, and this information was used to give a resulting core genome sequence type: the lineage according to the cgMLST scheme.
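The two steps described above can be sketched in Python. This is a simplified illustration rather than the researchers' actual pipeline: the function names, gene names, sample names and allele numbers are all hypothetical, and the labels assigned to profiles are arbitrary integers.

```python
from collections import Counter


def core_genes(gene_sets, threshold=0.95):
    """Return the genes found in more than `threshold` of the genomes."""
    counts = Counter(gene for genes in gene_sets for gene in set(genes))
    total = len(gene_sets)
    return sorted(gene for gene, count in counts.items() if count / total > threshold)


def assign_sequence_types(profiles):
    """Give every distinct allele profile its own arbitrary integer label."""
    labels = {}
    assigned = {}
    for sample, profile in profiles.items():
        key = tuple(sorted(profile.items()))
        if key not in labels:
            labels[key] = len(labels) + 1
        assigned[sample] = labels[key]
    return assigned


# Toy data set of 40 invented genomes:
# geneA is in all 40, geneB in 39 (97.5%), geneC in 30 (75%).
genomes = []
for index in range(40):
    genes = {"geneA"}
    if index != 0:
        genes.add("geneB")
    if index < 30:
        genes.add("geneC")
    genomes.append(genes)

print(core_genes(genomes))  # ['geneA', 'geneB']

# Invented allele numbers at the two core genes for three samples.
profiles = {
    "sample1": {"geneA": 1, "geneB": 4},
    "sample2": {"geneA": 1, "geneB": 4},
    "sample3": {"geneA": 2, "geneB": 4},
}
print(assign_sequence_types(profiles))  # {'sample1': 1, 'sample2': 1, 'sample3': 2}
```

Here geneC is treated as an accessory gene because it falls below the 95% cut-off, and samples 1 and 2 receive the same label because they carry identical alleles at every core gene.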
Using the core genome to categorise Neisseria gonorrhoeae into lineages in this way massively improves the resolution of the population structure compared to previously used molecular typing methods.
The cgMLST scheme is included as part of the bacterial genomes database PubMLST, meaning any additional Neisseria gonorrhoeae samples uploaded to the database are automatically assigned to a lineage.
Using the cgMLST scheme to track AMR
Implementing a core genome typing scheme allows the global transmission of different lineages to be analysed using genome data, and this can be used to investigate important characteristics of the gonococcus such as antimicrobial resistance.
Some cgMLST lineages defined in this study are resistant to key antibiotics. For example, the lineage ‘Ng_cgc400_3’ shows high levels of resistance to a group of antibiotics called cephalosporins – this group includes ceftriaxone, which is currently the recommended treatment for gonorrhoea. This lineage is therefore a key priority for monitoring, in order to understand how it is spreading.
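As a rough sketch of how lineage assignments could support this kind of monitoring, the Python example below works out the proportion of resistant samples in each lineage. The records are invented for illustration (only the lineage name 'Ng_cgc400_3' comes from the study; 'lineage_X' is a made-up name), and the function is a hypothetical helper rather than part of any existing tool.

```python
from collections import defaultdict


def resistance_by_lineage(records):
    """Return the fraction of resistant samples for each lineage.

    `records` is a list of (lineage, is_resistant) pairs.
    """
    totals = defaultdict(int)
    resistant = defaultdict(int)
    for lineage, is_resistant in records:
        totals[lineage] += 1
        if is_resistant:
            resistant[lineage] += 1
    return {lineage: resistant[lineage] / totals[lineage] for lineage in totals}


# Invented example records: (lineage, resistant to the antibiotic of interest?)
records = [
    ("Ng_cgc400_3", True),
    ("Ng_cgc400_3", True),
    ("Ng_cgc400_3", True),
    ("Ng_cgc400_3", False),
    ("lineage_X", False),
    ("lineage_X", False),
]

for lineage, fraction in resistance_by_lineage(records).items():
    print(f"{lineage}: {fraction:.0%} of samples resistant")
```

A lineage with a high proportion of resistant samples would be a natural candidate for closer monitoring.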
Summary
The use of the core genome for classification of Neisseria gonorrhoeae lineages has improved the resolution at which the structure of the global gonococcal population can be analysed. Identification and comparison of lineages using the cgMLST scheme will improve our ability to monitor the emergence, persistence, and spread of clinically important characteristics such as antimicrobial resistance.
If you are keen to read about the development of the core genome typing scheme in more detail, you can find the paper by Harrison, et al., here.
1. World Health Organization, 2021.
Images (in order): Infection Update, 2021; Bacigalupe, 2017.