Most emerging infectious diseases that affect humans, such as COVID-19, are zoonotic, that is, caused by viruses that originate in other animal species. Early identification of high-risk viruses can therefore help with prevention and epidemiological surveillance. This is a major challenge, because only a minority of the estimated 1.67 million animal viruses are able to infect humans.
A new study just published in the journal PLOS Biology suggests that machine learning, a type of artificial intelligence, can predict the probability of these zoonoses from the genomic information of viruses. To develop their model, the researchers gathered a data set of 861 virus species from 36 families. Machine learning models then assigned each virus a probability of human infection based on patterns observed in its genome. Finally, the authors applied the best-performing model to estimate the zoonotic potential of other viruses sampled from a wide variety of animal species.
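To make the idea of assigning a probability from genome patterns concrete, here is a minimal sketch in Python. It is not the authors' model: it swaps in a simple logistic regression trained on dinucleotide frequencies, and the sequences, labels, function names, and training settings are all invented for illustration.

```python
import math
from itertools import product

# All 16 dinucleotides in a fixed order, used as feature names.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_frequencies(genome):
    """Return the relative frequency of each dinucleotide in a sequence."""
    counts = {d: 0 for d in DINUCLEOTIDES}
    for i in range(len(genome) - 1):
        pair = genome[i:i + 2]
        if pair in counts:  # skip pairs containing ambiguous bases such as N
            counts[pair] += 1
    total = sum(counts.values()) or 1  # avoid dividing by zero on tiny inputs
    return [counts[d] / total for d in DINUCLEOTIDES]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, epochs=2000, lr=0.5):
    """Fit logistic regression weights with plain batch gradient descent."""
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - lr * g / len(features) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(features)
    return weights, bias


def predict_probability(genome, weights, bias):
    """Return the model's probability that a genome belongs to class 1."""
    x = dinucleotide_frequencies(genome)
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


# Made-up toy sequences and labels (1 = infects humans, 0 = does not).
# Real genomes are thousands of bases long; these are assumptions for the demo.
TRAINING = [
    ("ACGCGCGTACGCGCGTACGCGA", 1),
    ("GCGCGCATGCGCGCGTACGCGC", 1),
    ("ATATTTAATATTAAATATTATA", 0),
    ("TTATATAAATTTATATAATTAT", 0),
]

X = [dinucleotide_frequencies(seq) for seq, _ in TRAINING]
y = [label for _, label in TRAINING]
weights, bias = train_logistic(X, y)

# Score two unseen toy sequences, one resembling each training class.
p_cg_rich = predict_probability("GCGCGTACGCGCGCGA", weights, bias)
p_at_rich = predict_probability("TATATAATTTATATAA", weights, bias)
print(p_cg_rich, p_at_rich)
```

The toy model scores the sequence resembling the positive examples above 0.5 and the other below 0.5. A real analysis would need far more data, richer genome features, and held-out validation.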
The researchers found that viral genomes may have generalizable characteristics that are independent of the taxonomic relationships of the viruses and that may pre-adapt viruses to infect humans. In addition, they managed to develop machine learning models capable of identifying possible zoonoses using viral genomes.
Computer models are only a preliminary step in identifying zoonotic viruses with the potential to infect humans. They act as a first screening: the viruses flagged by the models will require confirmatory laboratory testing before any decisions are made about their potential risk. Furthermore, while these models predict whether viruses could infect humans, the ability to infect is only one part of the broader zoonotic risk, which is also influenced by the virulence of the virus in humans, its ability to transmit from human to human, and the ecological conditions at the time of human exposure.
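The screening role described here can be pictured as a simple triage step. The sketch below assumes a set of predicted probabilities is already available. The virus names, scores, function name, and 0.5 cutoff are hypothetical and are not taken from the study.

```python
def prioritize_for_lab_testing(predictions, threshold=0.5):
    """Return (name, probability) pairs at or above the threshold, highest first."""
    flagged = [(name, p) for name, p in predictions.items() if p >= threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical model outputs; the names and scores are invented for illustration.
predictions = {"virus_A": 0.91, "virus_B": 0.12, "virus_C": 0.64, "virus_D": 0.48}

shortlist = prioritize_for_lab_testing(predictions)
for name, p in shortlist:
    print(f"{name}: {p:.2f}")
```

A shortlist like this only decides which viruses go to the laboratory first. It says nothing about virulence, transmissibility, or exposure conditions, all of which still have to be assessed separately.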
“Our findings show that the zoonotic potential of viruses can be inferred to a surprisingly large degree from the sequence of their genome. By highlighting the viruses with the greatest potential to become zoonotic, genome-based classification allows further ecological and virological characterization to be targeted more effectively,” the authors note.
“These findings add a crucial piece to the already surprising amount of information that we can extract from the genetic sequence of viruses using AI techniques,” says Simon Babayan, a researcher at the University of Glasgow, UK. “A genomic sequence is typically the first and often the only information we have about newly discovered viruses, and the more information we can extract from it, the sooner we can identify the origins of the virus and the zoonotic risk it may pose. As more viruses are characterized, our machine learning models will become more effective at identifying the rare viruses that must be closely monitored and prioritized for preventive vaccine development.”
Reference: Mollentze N, Babayan SA, Streicker DG (2021) Identifying and prioritizing potential human-infecting viruses from their genome sequences. PLoS Biol 19(9): e3001390.