Rationale for a novel direction in biocomputing:

Over the last 15 years, modern genetics has discovered more than 1000 inherited defects responsible for monogenic diseases (those caused by a single gene). These include illnesses with exotic names, such as familial adenomatous polyposis and autosomal dominant nonsyndromic sensorineural deafness. For the individuals affected, these conditions are highly debilitating and/or fatal. They often present throughout an entire family tree and can therefore be tracked down in hospital-based genetic studies by a technique known as chromosomal linkage analysis. However, for more than 80% of cervical cancers and the greater part of colon cancers in men over the age of forty, the disease origins are multifactorial and multigenic, i.e. caused by many genes and environmental factors. The causes of these diseases currently defy recognition for want of appropriate mathematics and computing power.

Precancerous colonic polyp
Figure 1: Patients with Familial Adenomatous Polyposis manifest hundreds to thousands of precancerous colonic polyps similar to the one illustrated. Without colectomy, colon cancer and death are inevitable.


The Human Genome Project has recently delivered coverage of more than 99% of the DNA nucleotides (code elements) contained in the human genetic code. In parallel, the pharmaceutical and biotech industries have spent many billions of pounds in an attempt to learn more about gene expression and the mechanisms of drug action.

These industries are currently inundated with data from many different sources and are mostly unable to take advantage of the experimental and clinical data at their disposal - again largely because they lack appropriate mathematical tools for integrating data and extracting highly complex combinations of elements. Taken individually, such elements usually cannot be deciphered by traditional statistics, yet together they contribute to a particular disease state. The same problems are common to the biomedical research community and represent major challenges to improving human medication, disease prevention, earlier diagnosis and, most importantly, the delivery of more effective healthcare.

The success of medicine over the last two to three millennia can largely be attributed to the human body's ability to cure itself (Figure 2). Modern medicine remains a largely intuitive science based upon a clinician’s ability to recognize disease symptoms. Unfortunately, an identical pathology can originate from different causes, and the clinician, particularly in general practice, cannot realistically be expected to be all-knowing and able to diagnose every disease correctly.

Medieval medical practices
Figure 2: The medieval origins of today’s medical profession.


As our technologies improve, modern medicine will become increasingly mathematically driven. Most members of the community are now familiar with home testing for diabetes-related glucose levels and pregnancy (Figure 3).

Blood sugar and pregnancy test
Figure 3: Blood sugar and pregnancy testing at home.


Currently available testing procedures in clinical biochemistry allow the general practitioner to confirm and/or improve his or her diagnosis based on a few dozen quantitative measurements derived, for example, from blood samples (Figure 4).

A test tube of blood
Figure 4: Traditional blood biochemistry to assist the general practitioner.


However, genomic technologies now allow us to screen many thousands of data points for an individual patient in a single test (Figure 5). We must then examine these data points for trends in order to detect biological markers, for example during clinical trials or when deciding which cancer patients might better be sent home for more compassionate palliative care rather than given cost-intensive, but inappropriate, chemotherapy in hospital.

cDNA biochip
Figure 5: cDNA biochips (similar to the small silicon chips in your computer) testing tens of thousands of genes in parallel in a single assay.


It is worth noting that just 1,000 different measurements from a single patient can generate more unique combinations of markers than there have been seconds in the history of the universe, even assuming the universe to have existed for 13.7 billion years. We can therefore conclude that the solutions of importance to modern medicine will be computationally intensive, no less so than modelling the universe or nuclear physics research today.
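The comparison can be checked with a few lines of arithmetic. The following is a minimal sketch, not a method taken from the Institute: it assumes that each measurement is reduced to a simple present/absent marker, so that n measurements yield 2^n possible marker subsets, and that a year lasts 365.25 days. The function names are purely illustrative.

```python
import math

# Assumptions (not from the original text): a year of 365.25 days, and each
# of the n measurements reduced to a yes/no marker.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
UNIVERSE_AGE_YEARS = 13.7e9  # figure quoted in the text


def seconds_in_universe(years: float = UNIVERSE_AGE_YEARS) -> float:
    """Approximate number of seconds elapsed in the given number of years."""
    return years * SECONDS_PER_YEAR


def marker_subsets(n: int) -> int:
    """Number of distinct subsets of n binary markers (empty set included)."""
    return 2 ** n


def smallest_panel_exceeding(limit: float) -> int:
    """Smallest n for which the number of marker subsets exceeds limit."""
    n = 0
    while marker_subsets(n) <= limit:
        n += 1
    return n


if __name__ == "__main__":
    seconds = seconds_in_universe()
    print(f"seconds elapsed: about {seconds:.2e}")
    print(f"digits in 2**1000: {len(str(marker_subsets(1000)))}")
    print(f"markers needed to exceed that: {smallest_panel_exceeding(seconds)}")
    # Even panels of exactly 8 markers chosen from 1,000 outnumber the seconds.
    print(f"8-marker panels from 1,000: about {math.comb(1000, 8):.2e}")
```

Under these assumptions, about 60 yes/no markers are already enough to outnumber the seconds elapsed, so exhaustively testing every combination of 1,000 measurements is out of the question and smarter search strategies are required.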

Many people were alarmed to learn that human beings possess a mere 30,000 genes in their genome, and that these are almost identical to those of mice. In a human-centric view of the universe we may feel superior, but the facts suggest otherwise. The almost limitless diversity in facial features, and the differences between feet and brains or between embryos and old people, lie in the combinations of biomolecules produced by the genetic code and not entirely within the DNA itself. In a living cell, such molecular interactions may potentially number in the trillions. When searching for the ‘cause of common cancers’, one can be confident that there will not be a single answer, but rather combinations of many subtle effects. Computational work therefore needs to be better funded as part of our efforts to cure human diseases and improve human well-being.

Two biomolecules interacting
Figure 6: Two biomolecules interacting via the ‘patches’ highlighted in blue.


Currently, many hundreds of millions of pounds in research expenditure derived from governments and charitable organisations are under-exploited due to a lack of appropriate analysis tools in the post-genome era.

In an effort to remedy this situation, the UK Department of Trade and Industry and the Regional Development Authority, One NorthEast, have combined forces to provide seed capital for the establishment of an Institute specializing in the emerging discipline of biosystems informatics. Not only will this initiative help bring cutting-edge technology to the northeast of England, but it will also set out to make meaningful inroads into improved medication and the understanding of multigenic disease, which is thought to account for as much as 98% of all human ailments. To achieve this aim, the Institute will focus on the analysis and modelling of biological complexity and apply this mathematical knowledge to improvements in medicine and healthcare delivery. Such activities are currently much sought after in biomedical, biotechnological and pharmaceutical research, as we attempt to better integrate and subsequently mine massive datasets.

Applications include:

  • earlier diagnosis of life-threatening disease;
  • better tailoring of medicines to an individual’s needs;
  • monitoring new treatments in clinical and pre-clinical trials;
  • predicting disease and treatment outcomes;
  • improved economic and practical efficiencies in healthcare delivery;
  • better targeting of high cost treatments to those patients most likely to benefit;
  • multivariate analysis for bioprocess engineering.

Surprisingly, the mathematical problems underlying all of these applications are tractable by similar informatics and software approaches.

The Institute will build upon significant expertise located at the Universities of Durham, Newcastle-upon-Tyne, Northumbria, Sunderland and Teesside. From its outset, the Institute will be industry-facing and collaborate on cutting-edge problems of relevance to improved medication and healthcare delivery.
