DNA Research Sheds Light on Unknown Parts of the Human Genome – University of Copenhagen


18 May 2017



New research conducted at the University of Copenhagen sheds light on the human genome’s so-called ‘dark matter’, which has so far been considered junk DNA. The results show that the newly discovered DNA elements hold great potential for better understanding a number of diseases.

Using advanced computer algorithms, researchers at the University of Copenhagen, Herlev University Hospital, the Technical University of Denmark and the University of Washington, Seattle, have studied unknown parts of the genomes of humans and other vertebrates. The study is about to be published in the internationally acclaimed scientific journal Genome Research.

Traditionally, research within the health and medical sciences has focussed on the approx. one per cent of the genome's three billion building blocks that code for proteins, while the remaining 99 per cent has been considered so-called junk DNA or ‘dark matter’. This is because it has previously been difficult to examine genes other than the protein-coding genes, which are well conserved through evolution and are thus easier to decode and analyse.

"Using advanced algorithms as opposed to traditional methods we get a more comprehensive picture. This has made it possible to analyse RNA structural patterns, and it paves the way for brand new perspectives on diseases", says Project Manager Jan Gorodkin, who is Head of the Center for non-coding RNA in Technology and Health and Professor at the Department of Veterinary and Animal Sciences at the Faculty of Health and Medical Sciences. 

Fact box

DNA is the genetic material found in all living cells, and it contains the recipe for cell reproduction and the production of proteins, among other things. DNA is the information-carrying molecule from which RNA molecules are formed. While the molecules that code for proteins are more accessible for analysis, the remainder, the non-coding RNAs, are more difficult to analyse. This is because RNA molecules fold into structures when building blocks within the molecule pair up. New research makes it possible to ‘open’ the RNA structures and analyse large amounts of previously inaccessible information.

Concretely, technological breakthroughs have made it possible to identify approx. 500,000 new genomic regions containing brand new information on the structure of the human genome. Certain parts of the genome's DNA are genes, some of which code for proteins, while others are non-coding. Through evolution the genes have been passed on from one species to the next, but each step in this process has also introduced variations, and the coding regions are the best conserved.

"Unlike protein-coding genes, the actual sequence of the DNA in the non-coding RNA genes can vary through evolution, while the structure of the RNA molecule remains unchanged. And it is the structure that is vital to the function of the molecule. The specially developed algorithms have made it possible to uncover and analyse this type of information from the ‘dark’ genes", says Assistant Professor Stefan Seemann, who has headed the extensive analytical work.
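
The idea that sequence can change while structure stays the same can be illustrated with a small sketch. It is not the algorithm used in the study. The two sequences, the hairpin structure in dot-bracket notation, and the function names are all hypothetical, and the check assumes a balanced structure string and only the standard RNA base pairs (A-U, G-C and the G-U wobble pair).

```python
# Illustrative sketch only: hypothetical sequences and helper names,
# not the method used in the study.

# Standard RNA base pairs: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def paired_positions(structure):
    """Return the (i, j) index pairs encoded by a balanced dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return pairs


def supports_structure(seq, structure):
    """True if every paired position in the structure holds a valid base pair."""
    if len(seq) != len(structure):
        return False
    return all((seq[i], seq[j]) in PAIRS for i, j in paired_positions(structure))


def identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


STRUCTURE = "((((....))))"   # a four-pair stem closing a four-base loop
SEQ_A = "GGCAGAAAUGCC"       # hypothetical sequence from one species
SEQ_B = "AUGCGAAAGCAU"       # hypothetical sequence from another species

if __name__ == "__main__":
    print("sequence identity:", round(identity(SEQ_A, SEQ_B), 2))
    print("A folds into structure:", supports_structure(SEQ_A, STRUCTURE))
    print("B folds into structure:", supports_structure(SEQ_B, STRUCTURE))
```

In this toy case every base in the stem differs between the two sequences, yet each change on one side of the stem is matched by a compensating change on the other, so both sequences can form the same hairpin.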

The study relied heavily on computation. "The advanced algorithms come at a cost: They are calculation-intensive. Project calculations have taken more than 150 CPU core years, which corresponds to 150 years for a single processor", says Jan Gorodkin.
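
To put the figure of 150 CPU core years in perspective, the sketch below converts core years into elapsed time for different numbers of cores. The core counts are hypothetical examples, not figures from the study, and the calculation assumes the work splits perfectly across cores, which real workloads rarely achieve.

```python
# Illustrative arithmetic only: the core counts are hypothetical and
# perfect parallel scaling is assumed.

DAYS_PER_YEAR = 365.25


def wall_clock_years(core_years, cores):
    """Elapsed years needed to spend `core_years` of compute on `cores` cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return core_years / cores


if __name__ == "__main__":
    for cores in (1, 100, 1000):
        years = wall_clock_years(150, cores)
        print(f"{cores:>5} cores: {years:.2f} years ({years * DAYS_PER_YEAR:.0f} days)")
```

On a single core the job takes the full 150 years, which is why calculations of this size are run on a supercomputer.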

Fact box

The genomes of humans and other vertebrates consist of approx. three billion building blocks, of which only approx. one per cent code for proteins. The remaining approx. 99 per cent of the building blocks have previously been considered junk DNA, and as a rule they have not contributed to our understanding of diseases. However, research in recent years indicates that the so-called junk DNA contains types of genes other than the protein-coding genes, namely non-coding RNA genes. The junk DNA is also called the ‘dark matter’ of the genome, because its role is unknown.

This Research Paves the Way for New Discoveries
The activity of non-coding RNA genes is often very limited, and the researchers have therefore used particularly sensitive sequencing technology to study a section of the regions in which non-coding RNA is formed. Using this method, the researchers have managed to detect more than 600 different RNA molecules from low-activity genes in foetal brains. The discovery supports their assumption that the brain codes for many more gene products than expected. This part of the study was conducted by the research groups of Professor Niels Tommerup from the Department of Cellular and Molecular Medicine at the University of Copenhagen and Professor Flemming Pociot at Herlev Hospital.

‘The fact that we find new RNA molecules in brain tissue containing evolutionarily preserved RNA structure suggests that they must hold important functions. We can now begin to examine how these non-coding RNA genes are involved in both normal brain development and function and in the many known brain diseases’, says Professor Niels Tommerup.

The research also reveals that many of the evolutionarily conserved non-coding RNA structures are located close to genes whose functions have already been uncovered. It is precisely the interplay between these elements that can help the researchers understand the role of the genes in various forms of disease development. The analyses were conducted by Stefan Seemann in cooperation with, among others, research groups headed by Professor Christopher Workman from the Technical University of Denmark and Professor Walter L. Ruzzo from the University of Washington.

Through the study the researchers have discovered new connections, which may be used to further explore the disease associations in ‘dark matter’. At the same time, the discovered RNA structures help give the researchers a better understanding of the genome of production animals, which in time can be used to develop model animals for disease studies.

The project is funded by Innovation Fund Denmark (main grant) and the Lundbeck Foundation. The computer calculations were made possible by funding from the former Danish Center for Scientific Computing and access to the Danish supercomputer Computerome.

Read the entire study in Genome Research here: http://genome.cshlp.org/content/early/2017/05/08/gr.208652.116.full.pdf+html