Table 4 Challenges for the integration of metagenomics into public health

From: Metagenomics for pathogen detection in public health

| Challenge | Description | Relevance | Solution | Reference(s) |
|---|---|---|---|---|
| Multiple technologies | Next-generation sequencing can be performed on multiple platforms, each with different characteristics and each under constant improvement | Difficulty comparing results from different platforms and with those from older techniques | Pipelines must be constantly updated to account for new techniques | [74, 76, 92] |
| | | Universal approach not yet possible | Different platforms should be utilized depending on the question asked | |
| | | | Continuously evolving technology requires a skilled workforce rather than established pipelines | |
| Computational resources | Our ability to generate DNA sequence data has rapidly surpassed our computational ability to analyze the data | Significant requirements for storage of DNA sequence | Perform analysis using a staged approach | [69, 93] |
| | | Assembling and identifying short reads from next-generation sequencing is computationally intensive | Cloud computing | |
| Suitable reference databases | Multiple reference databases are available, which may generate different results depending on the database used | Certain features of a metagenomic sample might be missed if the wrong database is used | The Human Microbiome Project (HMP) aims to sequence multiple reference genomes associated with the human body | [94] |
| | | Limited by the diversity represented in each database | The HMP has currently generated a total of 6,500 reference sequences | |
| Short read lengths | Read lengths depend on the sequencing platform used | Makes de novo assembly more complicated | Read lengths are continually increasing | [92, 95] |
| | | More difficult to identify large-scale genomic variations and repetitive regions | Third-generation sequencing platforms promise much longer read lengths | |
| Causation | Finding a pathogen in a disease sample does not imply causation | Important to determine causation before changing public health management | Follow-up studies are required, for example using animal models or serological or epidemiological methods | [11, 75, 96] |
| | | False association can lead to costly, useless or even potentially harmful therapies | Results must be independently validated | |
| Contamination | Metagenomics can detect contaminants from cell cultures, reagents and laboratory equipment | Contaminants may be incorrectly associated with the disease of interest | Negative controls must be used | [97] |
| | | | Researchers must consider the plausibility of the findings | |
| | | | Results must be independently validated | |
| Privacy | Host nucleic acids are almost always sequenced in metagenomics studies | Host genetic sequences are confidential | Host DNA to be available only to researchers in the HMP | [92, 98] |
| | | Human subjects might be traceable from their DNA sequences | Only microbiome data are released to the public | |
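
The "Negative controls must be used" entry in the Contamination row can be illustrated with a minimal sketch. It assumes per-taxon read counts are already available for both a sample and its negative control. The function name `flag_contaminants`, the `min_fold` threshold and the taxon labels are hypothetical and do not come from the source article; real studies may use different criteria, and flagged or retained taxa would still need the plausibility checks and independent validation listed in the table.

```python
def flag_contaminants(sample_counts, control_counts, min_fold=10.0):
    """Split taxa into (retained, flagged) using a negative control.

    A taxon is flagged as a possible contaminant when it appears in the
    negative control and its relative abundance in the sample is less
    than `min_fold` times its relative abundance in the control.
    The threshold is an illustrative assumption, not a published standard.
    """
    sample_total = sum(sample_counts.values())
    control_total = sum(control_counts.values())
    retained, flagged = [], []
    for taxon, count in sorted(sample_counts.items()):
        sample_frac = count / sample_total if sample_total else 0.0
        control_frac = (
            control_counts.get(taxon, 0) / control_total if control_total else 0.0
        )
        if control_frac > 0 and sample_frac < min_fold * control_frac:
            flagged.append(taxon)
        else:
            retained.append(taxon)
    return retained, flagged


# Hypothetical read counts per taxon (placeholder labels, not real data).
sample = {"taxon_A": 900, "taxon_B": 100}
negative_control = {"taxon_B": 40, "taxon_C": 60}

retained, flagged = flag_contaminants(sample, negative_control)
print(retained)  # ['taxon_A']
print(flagged)   # ['taxon_B']
```

Comparing relative rather than absolute counts keeps the check independent of sequencing depth, which typically differs between a sample and its control.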
