
New AI Platform Elucidates Regulatory Activity in the Genome

Nov 20th, 2023

Researchers investigating the complex gene regulatory mechanisms involved in healthy and disordered biological processes now have a new tool in their kit. A team at the University of California, San Diego (UCSD), and elsewhere has developed deep learning software that they say can be adapted to a range of genomics projects. Details of the software, dubbed Elucidating the Utility of Genomic Elements with Neural Nets, or EUGENe, are provided in Nature Computational Science in a paper titled “Predictive analysis of regulatory sequences with EUGENe.”

According to the paper, EUGENe comprises various modules and subpackages for extracting and transforming sequence data, instantiating and training computational models, and evaluating and interpreting how the models behave after training. “The major goal of EUGENe is to streamline the end-to-end execution of these three stages to promote the effective design, implementation, validation, and interpretation of deep-learning solutions in regulatory genomics,” the scientists wrote.
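To make the first of those stages concrete, here is a minimal sketch of one common way to transform sequence data for a neural network: one-hot encoding. It uses only the Python standard library and is not EUGENe's own API; the function name and the choice to encode unknown bases as all zeros are assumptions made for illustration.

```python
# Illustrative sketch only: this is NOT EUGENe's API.
# The function name and the handling of unknown bases are assumptions.
BASES = "ACGT"

def one_hot_encode(seq):
    """Return one row per base, each a 4-element list ordered A, C, G, T.

    Bases outside A/C/G/T (for example N) become a row of zeros.
    """
    rows = []
    for base in seq.upper():
        rows.append([1 if base == b else 0 for b in BASES])
    return rows

encoded = one_hot_encode("ACGTN")
for base, row in zip("ACGTN", encoded):
    print(base, row)
```

Each row marks which of the four bases sits at that position, which gives a model a fixed-width numeric input regardless of the letters involved.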

Adam Klie, a PhD student at UCSD School of Medicine and the study’s first author, designed the software to mitigate challenges that he had experienced in his own work. “A lot of existing platforms require many hours of coding and data wrangling to use,” he said. EUGENe is much simpler to operate. “[Y]ou give an algorithm a sequence of DNA and ask it to make predictions about anything you’d expect that DNA could predict, such as whether a particular DNA sequence is functional or whether it regulates a gene in a certain biological context.” Scientists can use the software to explore the various properties of the sequence in question and what happens when the sequence is modified.
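The idea of asking what happens when a sequence is modified can be sketched as a single-base substitution scan: score the original sequence, then score every one-letter variant and record the change. The sketch below is hypothetical and does not use EUGENe; `toy_score` is a stand-in for a trained model (it simply counts occurrences of the motif TATA), and both function names are assumptions.

```python
# Illustrative sketch only: this is NOT EUGENe's API.
# toy_score is a hypothetical stand-in for a trained model's prediction.
BASES = "ACGT"

def toy_score(seq):
    """Count (possibly overlapping) occurrences of the motif TATA."""
    return sum(1 for i in range(len(seq) - 3) if seq[i:i + 4] == "TATA")

def single_base_effects(seq, score_fn):
    """Return the score change for every possible single-base substitution.

    Keys are (position, reference_base, alternate_base) tuples.
    """
    baseline = score_fn(seq)
    effects = {}
    for pos, ref in enumerate(seq):
        for alt in BASES:
            if alt == ref:
                continue
            mutant = seq[:pos] + alt + seq[pos + 1:]
            effects[(pos, ref, alt)] = score_fn(mutant) - baseline
    return effects

effects = single_base_effects("GGTATAGG", toy_score)
# Substitutions inside the motif lower the score; those outside leave it unchanged.
for key in [(0, "G", "A"), (2, "T", "A")]:
    print(key, effects[key])
```

With a real trained model in place of `toy_score`, positions whose substitutions shift the prediction most are the ones the model treats as important.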

The researchers put EUGENe through its paces by attempting to reproduce the results of three regulatory genomics studies that use different types of sequencing data. These datasets came from an assay of plant promoters, RNA-binding protein specificity data, and ChIP-sequencing data from the ENCODE project. Analyzing such different types of data would typically require mixing and matching multiple technology platforms. However, the scientists were able to adapt EUGENe to each data type and reproduce the findings of each study.

At the moment, the solution works with DNA and RNA data but “does not have dedicated functions for handling protein sequence or multimodal inputs,” the researchers wrote. They plan to expand it to include new data types such as single-cell sequencing. 

They will also make the solution available more broadly to the scientific community. “Deep learning can provide valuable insights into the biological machinery driving this variety, but it can be challenging to implement for researchers without extensive computer science expertise,” said Hannah Carter, a UCSD researcher and the study’s senior author. “We wanted to create a platform that can help genomics researchers streamline their deep learning data analysis to make predictions from raw data.”
