Institute for Systems Biology
Challenges of Systems Biology

Systems biology research is still in its infancy. Maturation of the field will proceed as the many challenges that it faces are addressed and successfully solved. The most pressing challenges fall roughly into the following four categories:

Experimental -- pertains to strategies for designing experiments and collecting reliable data.

Technological -- pertains to the development of new instrumentation for making rapid, highly parallel, inexpensive and accurate measurements of informational molecules and their sequence, structure, modifications or processing, localization and interactions with other components large and small.

Computational -- pertains to the development and refinement of network theory and effective engineering of simulation tools, so that descriptive networks can be replaced by more accurate dynamic models of the system’s molecular interactions.

Sociological -- pertains to effective communication across disciplines, the dynamics of research teams, difficulties obtaining funding, and the like.

Experimental challenges:

Paradoxically, systems biology research suffers from having both too much data and not enough. The volumes of data generated, for example, by large-scale microarrays and yeast two-hybrid experiments present two types of challenges:

1) Data space is infinite, so to gather relevant information about a particular system, hypotheses must be carefully formulated to search the relevant part of that space.
2) Most large data sets are replete with technical and biological noise, or various types of systematic error. Technical noise arises from methodological irreproducibility or sample variation. Biological noise (i.e., the “false positive” problem) stems from the stochasticity of cell populations and the extreme sensitivity of some of the techniques employed. Integrating global data sets from different laboratories can be a serious challenge because laboratories take different approaches to the same measurement and their data carry different levels of error. Thus, data retrieved from outside sources, though potentially useful, may be difficult to integrate with one’s own data, and may not be directly pertinent to the system being explored. Metrics are needed to evaluate, validate and integrate these large datasets; otherwise, conclusions drawn from them may be misleading. It is anticipated that universally applicable standards and attention to data quality will increase the information content of high-throughput datasets. However, extracting relevant and meaningful conclusions from the data will require application of the systems biology cycle: hypothesis formation, experimental design, modeling data into interaction networks, successive refinements of models through perturbation studies, etc.
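To make the integration problem concrete, the following is a minimal sketch of one common first step, rescaling each laboratory's measurements to a shared scale before comparing them. The function name `zscore`, the two laboratories and their readings are all hypothetical, and z-score standardization is chosen here only for illustration; it is not a method described on this page and it does not address systematic error or noise.

```python
from statistics import mean, stdev


def zscore(values):
    """Rescale one laboratory's measurements to mean 0 and unit sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    if s == 0:
        raise ValueError("cannot standardize constant measurements")
    return [(v - m) / s for v in values]


# Hypothetical readings for the same five genes from two laboratories that
# report on different scales (illustrative numbers, not real data).
lab_a = [10.0, 12.0, 14.0, 16.0, 18.0]
lab_b = [100.0, 120.0, 140.0, 160.0, 180.0]

z_a = zscore(lab_a)
z_b = zscore(lab_b)

# After rescaling, the two laboratories can be compared gene by gene.
max_gap = max(abs(a - b) for a, b in zip(z_a, z_b))
print(f"largest per-gene disagreement after rescaling: {max_gap:.3f}")
```

In this toy case the two laboratories differ only by a constant factor, so the rescaled values agree. Real datasets would also need the quality metrics and standards discussed above.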

On the other hand, biology is still “data-poor.” This is obviously true if data space is infinite. The available experimental data for each species address only a very sparse patchwork of cellular processes under a limited number of experimental conditions. The current platforms for high-throughput analyses are expensive and inaccessible to many researchers. In some cases, experimentation is limited by sample availability, as is true for many medical applications. Some experimental methods will not detect a system’s rare elements or transient interactions, with the result that important data will be missed (i.e., the “false negative” problem). Moreover, only one type of measurement is typically performed within an experimental platform, whereas what one really wants is the simultaneous measurement of multiple parameters in the same sample. To this end, miniaturization technologies such as nanosensors and microfluidic devices are being invented for the purpose of reducing sample volume, increasing sample throughput, increasing measurement accuracy and multiplexing different types of measurements on the same sample, perhaps even on a single cell!

Systems biology also suffers from the criticism that has historically beset biochemistry, namely, that for most data collection strategies, the cell must be destroyed in order for interactions between its molecular components to be assessed. This calls into question whether the inferences drawn are indeed accurate reflections of the dynamics of the living cell. This difficulty is being partially addressed by advances in less invasive technologies such as molecular imaging. As a general point, to effectively investigate the “in between” domain of parts and wholes, that is, the molecular interactions and networks, effective reporter assays must be developed so that system perturbations can be selectively targeted to specific cells in a population or specific networks in a cell, and so that their effects can be accurately and quantitatively measured as dynamic changes that occur over both space and time.

Technical challenges:

A requirement of systems biology is that to specify systems, millions of measurements must be made. These include measurements to characterize mRNAs, proteins, small molecules and other cellular components. These measurements must identify and quantify the components, characterize their variations (e.g., mRNA splicing or protein processing or modification), localize them and measure their turnover rates. The need to miniaturize, parallelize, automate and integrate the separate components of procedures, as well as to increase the throughput and reduce the cost of measurements, pushes us toward the development of measurement tools based on microfluidics and nanotechnology. These will lead to the digitalization of biology: the ability to obtain information from single molecules or the information content of single cells. We must also develop more powerful and sensitive molecular imaging techniques to characterize molecular behavior in vivo. These challenges are beginning to be met by the NanoSystems Biology Alliance.

Computational challenges:

A goal of systems biology is to formulate initial working models for the biological networks that are predictive of both the kinetic and equilibrium behavior of the system in question. This is especially true for regulatory networks, which involve macromolecular complexes of transcription factors and batteries of genes which are expressed or silenced based on the DNA-protein binding interactions between the gene’s cis-regulatory elements and components of the transcription factor complex.

Because of the complexity of the feedback loops involved, it may not be clear which biomolecular species to measure and to what accuracy. This problem might be overcome if we could obtain a clearer understanding of which DNA sequence motifs function as regulatory elements, and if we could dissect large regulatory networks into smaller subnetworks or modules by using various models of connectivity, such as Gene Ontology (GO) annotations.
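As a rough illustration of dissecting a network into modules, the sketch below groups the nodes of a small network into connected components using a breadth-first search. This is a deliberately simple stand-in for the connectivity models mentioned above; it does not use GO annotations. The function name `find_modules` and all regulator and gene names are hypothetical.

```python
from collections import defaultdict, deque


def find_modules(edges):
    """Group nodes into modules, defined here simply as connected components."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    modules = []
    for start in sorted(neighbors):
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        seen.add(start)
        queue = deque([start])
        module = []
        while queue:
            node = queue.popleft()
            module.append(node)
            for nxt in sorted(neighbors[node]):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        modules.append(sorted(module))
    return modules


# Hypothetical regulator-target pairs (names invented for illustration).
edges = [
    ("TF1", "geneA"),
    ("TF1", "geneB"),
    ("geneB", "geneC"),
    ("TF2", "geneD"),
    ("TF2", "geneE"),
]
modules = find_modules(edges)
for module in modules:
    print(module)
```

Connected components only separate parts of a network that share no interactions at all. Densely connected regulatory networks would need a finer-grained notion of module than this sketch provides.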

All-by-all genome comparisons of species of varying evolutionary relatedness might assist with the identification of non-coding sequences that are important to gene regulation. Advances in network topology theory and visualization tools might enable biologists to assemble data into network models that better portray the kinetics of molecular interactions across diverse types of elements. Simulations based on these models should enhance and refine experimental design, thereby speeding up application of the systems biology cycle. Unfortunately, we are just at the beginning in terms of the data that needs to be collected and the algorithm development that must occur. The ISB has formulated 11 computational or mathematical challenges in contemporary biology:

  1. How to fully decipher the (digital) information content of the genome
  2. How to do all-vs-all comparisons of 1000s of genomes
  3. How to extract protein and gene regulatory networks from 1 & 2
  4. How to integrate multiple high-throughput data types dependably
  5. How to visualize & explore large-scale, multi-dimensional data
  6. How to convert static network maps into dynamic mathematical models
  7. How to predict protein function ab initio
  8. How to identify signatures for cellular states (e.g. healthy vs. diseased)
  9. How to build hierarchical models across multiple scales of time & space
  10. How to reduce complex multi-dimensional models to underlying principles
  11. How to use text searching to bring the literature and experimental data together
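Challenge 6, converting a static network map into a dynamic model, can be illustrated with a toy case: a single edge "transcription factor activates gene" turned into a rate equation and integrated over time. This is a minimal sketch under stated assumptions. The saturating production term, the first-order degradation term, the forward Euler method and every parameter value (`k_syn`, `k_half`, `k_deg`) are illustrative choices, not a model taken from this page.

```python
def simulate(tf_level, k_syn=2.0, k_half=1.0, k_deg=0.5, dt=0.01, t_end=40.0):
    """Forward Euler integration of dx/dt = k_syn * tf / (k_half + tf) - k_deg * x.

    x is the level of the gene product, starting from x = 0.
    All parameter values are hypothetical.
    """
    steps = int(round(t_end / dt))
    production = k_syn * tf_level / (k_half + tf_level)  # constant for a fixed TF level
    x = 0.0
    trajectory = [x]
    for _ in range(steps):
        x += dt * (production - k_deg * x)
        trajectory.append(x)
    return trajectory


def steady_state(tf_level, k_syn=2.0, k_half=1.0, k_deg=0.5):
    """Level at which production balances degradation (dx/dt = 0)."""
    return k_syn * tf_level / (k_half + tf_level) / k_deg


traj = simulate(tf_level=1.0)
print(f"simulated level at end of run: {traj[-1]:.4f}")
print(f"predicted steady state:        {steady_state(1.0):.4f}")
```

The trajectory captures the kinetic behavior (how quickly the gene product accumulates), while `steady_state` gives the long-run level the simulation should approach. A static map records only that the edge exists.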

Sociological challenges:

How do you get biologists and engineers with expertise in experimental design and execution to join forces with computer scientists and mathematicians with expertise in algorithm development, and have everyone agree on the choice of model system, biological process, and strategies for investigating it? For systems biology research to be effective, that’s what has to happen.

First, it is helpful to have all of the team members in one location, rather than spread across several academic departments.

Second, interactions among team members will be more or less productive, depending on the amount of cross-disciplinary training and experience possessed by the team members; their willingness to learn new languages of science (e.g., non-biologists learning the language of biology) and sets of concepts; and the respect each has for the knowledge and experience of the others.

Experimentalists want to be seen as more than “technicians” and computational biologists as more than just “programmers.” Thus, the organizational culture must be developed in a way that communication works well within the teams.

Team-oriented science poses predictable difficulties in terms of ownership of data, proper attribution/credit in terms of journal authorship, and career advancement for the individual members. For people to be motivated to invest significant energy into systems biology research, there must be effective metrics for success that reward individual team members accordingly.

Because systems biology is new and considered risky, funding the research through federal grants is more difficult, given the conservative temperament of many study sections. The problem is exacerbated by the high costs involved with global data collection and analysis. A viable project might well exceed the limits of a standard R01 NIH research grant. Thus, sustainable sources of funding must be identified and procured.

Scientists pursuing systems biology research are meeting these challenges by establishing their own institutes or organizations. There is a “self selection” for interested team-oriented investigators who want to practice biology in a cross-disciplinary or inter-disciplinary context. Such organizations position themselves midway between academia and industry. They keep the freedom of relatively unconstrained intellectual pursuit and long-term goals, yet establish implementation strategies closer to what one might find in industry (e.g., interdisciplinary teams and centralized high-throughput facilities).

Even after the field of systems biology is established, fulfilling its promise will require that changes occur across a broader context of inquiry and training. For example, primary care physicians will need to become more sophisticated about genetics and its effects on preventive, predictive and personalized medicine.

Hans V. Westerhoff


© 2010, Institute for Systems Biology, All Rights Reserved