23 February 2018

New Knowledge About the Human Genome Paves the Way for New Drugs

Protein research

Leading Danish and international protein researchers have mapped two large unexplored parts of the human genome. Their work paves the way for the development of many new drugs.

The development of new drugs focuses on only 60 percent of the potential drug targets, concludes a study just published in the internationally acclaimed scientific journal Nature Reviews Drug Discovery.

The study builds on extensive data analysis conducted using supercomputers, a technique called data mining. It examined huge amounts of literature within the health and medical sciences, along with other evidence sources, in order to identify the most and least studied proteins among potential drug targets. The study is the first to provide a solid, comprehensive and useful picture of all the proteins that can be used to develop new drugs.
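As a rough illustration of how literature mining can separate well-studied from little-studied proteins, the sketch below counts how many texts mention each protein name and ranks the names from least to most mentioned. It is a minimal toy example and not the method used in the study: the protein names (PROTA to PROTD), the sample abstracts and the function names are all invented for illustration.

```python
import re
from collections import Counter

# Hypothetical protein names and toy abstracts, invented for illustration only.
PROTEINS = ["PROTA", "PROTB", "PROTC", "PROTD"]
ABSTRACTS = [
    "PROTA inhibitors are widely used; PROTA mutations predict response.",
    "PROTB and PROTA are frequently altered in tumours.",
    "Loss of PROTB function is common in disease.",
    "Little is known about the receptor PROTC.",
]

def count_mentions(proteins, abstracts):
    """Count in how many abstracts each protein name appears as a whole word."""
    counts = Counter({name: 0 for name in proteins})
    for text in abstracts:
        for name in proteins:
            # \b ensures we match the whole name, not a longer identifier.
            if re.search(rf"\b{re.escape(name)}\b", text):
                counts[name] += 1
    return counts

def rank_least_studied(counts):
    """Return protein names ordered from fewest to most mentions (ties by name)."""
    return sorted(counts, key=lambda name: (counts[name], name))

counts = count_mentions(PROTEINS, ABSTRACTS)
ranking = rank_least_studied(counts)
print(ranking)
```

In this toy data the protein that is never mentioned comes first in the ranking. A real analysis would also need to handle synonyms, ambiguous names and evidence sources other than text.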

The researchers included 20,000 proteins in the study and can now conclude that 8,000 of them (two out of five) have been mapped and studied little or not at all by researchers or pharmaceutical companies. This opens up new drug research with great untapped potential.

‘We have used highly advanced computer analysis of data to shed light on the parts of the human genome that are rarely researched. We can see that they hold great potential, and we hope the analysis can motivate drug researchers to do some pioneer work. This may prove significant to future drug innovation’, says Professor Søren Brunak from the Novo Nordisk Foundation Center for Protein Research.

New Gold Veins for Drug Research
Many diseases are caused by dysfunctional proteins that have been damaged by genetic flaws. The vast majority of drugs try to prevent these proteins from being active and thus to reduce their impact on the disease in question.

It is therefore vital for drug development to be able to study and identify the proteins that are instrumental in diseases. Proteins with great potential are often referred to as drug targets, and drugs directed at them may, after extensive clinical trials, be approved for use.

The study shows that 40 percent of all potential drug targets have not been subjected to thorough and prioritised study. The researchers have therefore divided the 20,000 proteins into four categories and ranked their potential as future drug targets.

According to the analysis, the mapping also paves the way for new so-called repositioning opportunities, where already approved drugs can be tested against new targets. This means that drugs approved for only one therapy area can now usefully be tested for the treatment of other diseases.

The combination of categorisation and rankings works almost like a treasure map for drugs, and therefore the project has also received funding for stage two.

‘In stage two of the project we aim to improve our tools for studying the biological functions of drug targets based both on scientific texts and on large experiments’, says Professor Lars Juhl Jensen, who is responsible for analysing millions of articles using advanced data mining techniques.

Big Data Sheds Light on the Dark Sides of the Genome
Since the 1990s, researchers affiliated with the Human Genome Project have worked to map the human genome, and in 2014 the National Institutes of Health Common Fund took steps towards mapping the genes in the human genome that code for proteins through the project Illuminating the Druggable Genome.

The researchers initially believed that more than 100,000 genes were able to code for proteins, but the mapping showed that there are only around 20,000. The drugs available today relate to fewer than 1,000 drug targets. According to Søren Brunak, the potential of drug design based on these proteins is almost exhausted, and exploring new territory is therefore important.

The objective of sequencing the human genome is typically to determine which genes are related to a given disease. Here the researchers look at specific gene patterns in families and entire population groups to determine what causes certain diseases. 

Fact Box: Data mining is the search for patterns and structures in large amounts of data. Using algorithms or direct observation, for example, the aim is to identify relations between the data points in order to visualise the complex information better or, in the long term, to put it to use.

Read the Analysis here: https://www.nature.com/articles/nrd.2018.14

The Analysis is based on research collaboration between: 

NIH National Center for Advancing Translational Sciences, Novo Nordisk Foundation Center for Protein Research, The Faculty of Health and Medical Sciences, University of Copenhagen, The University of New Mexico, the University of Miami, the European Bioinformatics Institute and the Icahn School of Medicine at Mt. Sinai. 

Contact Information

Andreas Westergaard, Communications Advisor
Faculty of Health and Medical Sciences, University of Copenhagen
Phone: +45 53 59 32 80