
IBM researchers have identified numerous DNA patterns shared by areas of the human genome that they believe scientists have wrongly dismissed as functionless.
The researchers believe that these regions, long assumed to contain mostly evolutionary leftovers, or "junk DNA", may actually hold significant clues that can add to scientists' understanding of cellular processes.
The breakthrough occurred after the researchers discovered that these regions contain numerous short DNA "motifs", or repeating sequence fragments, that are also present in the parts of the genome that give rise to proteins.
Ajay Royyuru, head of the Computational Biology Center at IBM Research, said: "Our goal is to apply advanced computational techniques to analyze the workings of processes and systems, in this case the function of the human genome."
The IBM team used a computational technique called pattern discovery, often employed to mine useful information from very large repositories of data in business and scientific applications. With it, the team sifted through the approximately six billion letters in the non-coding regions of the human genome, looking for repeating sequence fragments, or motifs.
Among the millions of discovered motifs, the team identified approximately 128,000 that also occur in the coding region of the genome and are significantly over-represented in genes involved in specific biological processes such as cell communication, regulation of transcription, transport and others.
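The two steps described above, finding repeated fragments in non-coding sequence and then checking which of them also appear in coding sequence, can be illustrated with a toy sketch. This is not the IBM team's algorithm: it assumes fixed-length fragments and a simple occurrence threshold, and the function names, the sequences and the parameters `length` and `min_count` are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def find_repeated_motifs(sequence, length, min_count):
    """Return every fragment of the given length that occurs at least
    min_count times in the sequence (overlapping occurrences count)."""
    counts = Counter(
        sequence[i:i + length] for i in range(len(sequence) - length + 1)
    )
    return {motif for motif, n in counts.items() if n >= min_count}


def motifs_shared_with_coding(noncoding, coding, length=4, min_count=2):
    """Keep only the repeated non-coding motifs that also occur
    somewhere in the coding sequence."""
    motifs = find_repeated_motifs(noncoding, length, min_count)
    return {motif for motif in motifs if motif in coding}


# Toy sequences, invented for illustration only (not real genome data).
noncoding = "ACGTTACGTTGGACGT"
coding = "TTACGTAA"

repeated = find_repeated_motifs(noncoding, 4, 2)
shared = motifs_shared_with_coding(noncoding, coding)
print(sorted(repeated))  # repeated fragments in the non-coding toy sequence
print(sorted(shared))    # the subset that also appears in the coding toy sequence
```

A real analysis over billions of letters would need variable-length patterns, statistical significance tests and far more memory-efficient data structures than this sketch uses.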