Lee Hood Group

Dr. Leroy Hood

MD, Johns Hopkins School of Medicine
PhD, Biochemistry, California Institute of Technology

"Systems biology and medicine – not only in the lab but in the everyday lives of people – challenges the imagination and will transform the 21st Century."

–Leroy Hood, MD, PhD, President and Co-Founder

The Hood group is integrating biology, technology and computation to create a predictive, personalized, preventive and participatory approach to medicine. This P4 Medicine will use a systems or holistic approach and new computational and mathematical tools to analyze the enormous amounts of molecular, cellular, phenotypic and medical data that now can be generated for each individual. By viewing medicine as an informational science, P4 medicine will draw on an understanding of the networks underlying health and disease. The goals are to treat and prevent disease by identifying perturbations in biological networks, and countering those perturbations through therapeutic intervention. Furthermore, a systems approach to biology will create knowledge with far-reaching implications for agriculture, energy production, environmental protection, and many other human activities.

Research Overview

The Hood group is developing strategies, technologies and knowledge that will lead to medicine that is predictive, personalized, preventive and participatory -- P4 Medicine. The premise of their work is that diseases result from perturbations of biological networks. These perturbations may arise from biological changes, such as mutations in the digital information of the genome, or from environmental influences, such as toxins or bacteria. These disease-perturbed networks both cause and reflect the progression of a disease. Thus, diseases can be diagnosed, treated and prevented by understanding and intervening in the networks that underlie health and illness.

Within a decade, P4 Medicine will enable the creation of a virtual cloud of billions of data points around each individual. The goal of ISB is to develop the analytic tools needed to translate this enormous data cloud into straightforward predictions about health and disease in each person.

The Hood group is working on several technology-driven projects that will help realize the promise of P4 Medicine.

It has pioneered a new approach to the identification of disease genes through the complete genome sequencing of families affected by particular diseases. Use of pedigree information makes it possible to correct a significant fraction of DNA sequencing errors, identify rare genetic variants (some of which contribute to diseases) and reduce the search space for disease genes.

It is establishing the computational infrastructure needed to analyze the thousands and eventually millions of human genome sequences that will become available over the next ten years. These computational tools will enable large-scale comparative analyses of human genomes and their attendant molecular, cellular and phenotypic data.

It is supporting a human proteome project that will parallel the human genome project. Using selected reaction monitoring (SRM) mass spectrometry, it has created targeted assays for virtually all human proteins, which opens up fascinating opportunities for identifying blood and tissue biomarkers.

It is developing clinical assays that use genomic, proteomic and cellular analyses, including the use of induced pluripotent stem (iPS) cells, to explore development and to stratify disease. Through collaborations with the non-profit P4 Medicine Institute and Ohio State Medical School, these assays are being used to improve health in two pilot projects involving wellness and heart failure.

The group is using and developing several high-throughput technologies that will be invaluable in research aimed at P4 Medicine, including:

A surface plasmon resonance instrument capable of making 1,000 measurements at a time to screen for effective antibodies and analyze other protein/protein interactions.

A NanoString instrument designed for the digital counting of RNA and miRNA molecules, which is now also being used to conduct highly sensitive protein assays. We are also exploring the use of this instrument to develop highly sensitive ELISA (enzyme-linked immunosorbent assay) protein assays.

A Fluidigm microfluidic array platform that will be used to quantify mRNAs and miRNAs in various single-cell studies.

The research ongoing in the Hood lab requires a cross-disciplinary mix of scientists. The lab also engages in important collaborations within ISB to develop P4 medicine. Its external partners include the P4 Medicine Institute of Seattle, which helps bring clinical assays to patients at diverse medical centers, and the Gladstone Institute in San Francisco and Mass General Hospital in Boston, which collaborate on neurodegenerative diseases.

Research Focus

Prion Disease
Studies in the Hood lab of neural degeneration in mice illustrate the power of the systems approach to disease. Members of the Hood team injected infectious prions -- proteins that induce disease by causing other prion proteins to assume new configurations -- into the brains of inbred mice. They then gathered data about brain gene expression as the disease progressed. Using microarrays, they compared patterns of gene expression in the diseased animals to those in normal animals at ten or more time points during the disease progression to identify differentially expressed genes (DEGs). They also looked at gene expression levels in eight different combinations of mouse-and-prion strains to eliminate by subtractive procedures biological "noise" unrelated to the disease. In this way, they reduced 7,400 DEGs identified in the original screen to 333 DEGs that encoded the core prion neurodegenerative response.
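The subtractive step can be pictured as a set operation. The sketch below is a minimal illustration, not the group's actual procedure: it assumes that a gene counts as part of the core response only if it is called differentially expressed in every mouse-and-prion strain combination. The function name, the combination labels and the gene names are hypothetical placeholders.

```python
def core_degs(degs_by_combination):
    """Return the genes called differentially expressed in every combination.

    degs_by_combination maps a strain-combination label to the set of DEGs
    found in that combination. Intersection is an assumed subtraction rule.
    """
    gene_sets = [set(genes) for genes in degs_by_combination.values()]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Toy input with placeholder names (not real data).
toy_screen = {
    "combination_1": {"gene_a", "gene_b", "gene_c"},
    "combination_2": {"gene_a", "gene_b", "gene_d"},
    "combination_3": {"gene_a", "gene_b"},
}

print(sorted(core_degs(toy_screen)))  # ['gene_a', 'gene_b']
```

Genes that appear in only some combinations (here gene_c and gene_d) are treated as noise unrelated to the shared disease process and drop out.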

The results were striking. Two-thirds of the 333 DEGs were components of four major biological networks that had been previously identified as playing a role in prion disease. The remaining DEGs defined six previously unidentified networks. The dynamics of these networks explained virtually every aspect of the cellular pathology of prion disease.

Results of the gene expression and network analyses in the mouse prion model suggested that relevant biomarkers for disease progression would be found in the blood. Use of these markers made it possible to diagnose prion disease before symptoms appear, to follow the progression of the disease, and to stratify the disease into distinct types. Thus, these studies produced critical insights into how to use blood as a window into health and disease.

Neurodegenerative diseases
Hood and his coworkers are applying the systems approach they developed for prion disease in mice (see "Prion Disease" above) to several other projects. They are studying two other neurodegenerative diseases in mice, Huntington's disease and frontotemporal dementia. They are applying a systems approach to the cancer glioblastoma in both mice and humans. They also are using systems analyses to study liver toxicity in mice caused by various chemical toxins.

Using iPS cells to study cardiomyocyte differentiation
In a new project, the group is inducing iPS cells to generate fully differentiated cardiomyocytes. The iPS cells come from two members of a family with a cardiomyopathy: one individual with the disease and a sibling without it. All the members of the family have had their genomes sequenced. By following the differentiation of iPS cells from both individuals, the group can measure gene expression activity, the epigenomic modification of genes, and populations of proteins, metabolites, mRNAs, and miRNAs. These measurements can then be assessed in relation to the known genome sequences, with the goal of integrating all the data into a network model of how normal and diseased cardiomyocytes develop.

Family Sequencing
Another area of major effort in the Hood lab, which integrates biological discovery with computational tool development, is the analysis of whole genome sequences from the members of families that have genetic diseases. We are convinced that this will allow us to readily detect the genes responsible for simple Mendelian diseases and perhaps to identify modifier genes or genes contributing to complex genetic diseases.

The family genome projects illustrate how one can address one of the central problems of systems biology: biological complexity and the signal-to-noise issues arising from large data sets. We have suggested that filters and integrators are useful devices in dealing with noise. Filters work to immediately reject some events as impossible, or at least very unlikely, by imposing assumptions about the signal. Integrators transform the information flow by aggregating individual events (of the same data types or different data types) into larger units to yield a fundamentally new type of information (see Ideker et al. (referenced below) for a discussion of these new concepts in the context of the family genome studies).
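A minimal sketch of the two ideas, under stated assumptions, may help. The filter below rejects a child genotype that cannot be explained by one allele from each parent; the integrator merges consecutive per-site labels into blocks. Both functions, the genotype encoding (pairs of allele strings) and the state labels are hypothetical illustrations and are not taken from the group's software.

```python
from itertools import groupby


def mendelian_consistent(father, mother, child):
    """Filter: True if the child's two alleles can come one from each parent.

    Genotypes are 2-tuples of alleles, for example ("A", "G").
    """
    first, second = child
    return (first in father and second in mother) or (
        second in father and first in mother
    )


def integrate_states(site_states):
    """Integrator: merge runs of identical per-site labels into blocks.

    Returns a list of (state, first_index, last_index) tuples.
    """
    blocks = []
    start = 0
    for state, run in groupby(site_states):
        length = len(list(run))
        blocks.append((state, start, start + length - 1))
        start += length
    return blocks


# A child carrying "G" when neither parent does is rejected as an unlikely event.
print(mendelian_consistent(("A", "A"), ("A", "A"), ("A", "G")))  # False

# Hypothetical per-site labels collapsed into larger units.
print(integrate_states(["state_1", "state_1", "state_2", "state_2", "state_2"]))
# [('state_1', 0, 1), ('state_2', 2, 4)]
```

The blocks carry information that no single site does: an isolated site whose label disagrees with a long surrounding block becomes a natural suspect for a sequencing error.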

Whole genome sequencing is transformational because of the data precision offered, the ability to detect rare variants, and the ability to delineate with precision parental haplotypes in the children. In principle, whole genome sequences assay all genetic markers, so coding regions as well as non-coding regions can be searched for genetic elements encoding disease. Genome-wide association studies suggest there may be many such non-coding elements, for example, regulatory motifs and splicing signals. Whole genome sequencing in the context of pedigrees enables recombination and linkage analyses that are vital for precise and confident location of candidate genes in affected families. In addition, the accuracy of the family genome studies permitted us to make the first-ever estimates of the intergenerational mutation rate (~30 mutations per child, or a mutation rate of 1 in 10^8; see Roach et al. below).
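As a rough order-of-magnitude check, the two figures quoted for the mutation rate fit together if the rate is read as about 1 new mutation per 10^8 bases and applied to a genome copy of roughly 3 billion bases. The genome length is an assumption added here for illustration, as is the choice to count over a single (haploid) genome copy.

```python
# Assumption: one genome copy is roughly 3 billion bases (round figure).
GENOME_BASES = 3_000_000_000

# Rate quoted in the text, read as 1 new mutation per 10**8 bases.
BASES_PER_MUTATION = 10**8

expected_new_mutations = GENOME_BASES / BASES_PER_MUTATION
print(expected_new_mutations)  # 30.0
```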

A number of software modules are being flexibly integrated into the workflow to automate the complex tasks related to exploring and analyzing whole genome pedigree data. Key modules are Error analysis, Mutations, Inheritance State Analysis, Phasing, and Prior probabilities. The workflow will be made freely available to the scientific community.

In the first application of this approach, the group sequenced the genomes of a family of four harboring an unknown gene with recessive inheritance causing Miller syndrome (Roach et al., 2010). No previously existing tools could be adapted to this challenge in high-throughput data analysis, so a computational workflow was developed in tandem with data analysis. Key innovations included the detection of genotyping errors and inheritance state analysis. These encompass the filters and integrators mentioned above. The combination of highly accurate sequence data and precise pedigree analysis allowed the previously difficult feat of identifying the defective gene in a recessive disorder from only a single family of four (for Miller syndrome, this is DHODH).
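To show how a pedigree narrows the search space, here is a simplified sketch of a recessive-model filter for a family of four. It is an illustration under stated assumptions, not the workflow described above: it assumes both children are affected, both parents are unaffected carriers, and genotypes are already reduced to counts of the variant allele (0, 1 or 2). The function name, record layout and gene names are hypothetical.

```python
def recessive_candidates(variants):
    """Return genes compatible with recessive inheritance in a family of four.

    Each variant is a dict with keys "gene", "father", "mother" and
    "children" (a list of variant-allele counts, one per affected child).
    """
    homozygous = set()  # both parents carriers, every child has two copies
    paternal = set()    # variant carried by the father only, present in every child
    maternal = set()    # variant carried by the mother only, present in every child
    for v in variants:
        father, mother, kids = v["father"], v["mother"], v["children"]
        if father == 1 and mother == 1 and all(k == 2 for k in kids):
            homozygous.add(v["gene"])
        elif father == 1 and mother == 0 and all(k == 1 for k in kids):
            paternal.add(v["gene"])
        elif father == 0 and mother == 1 and all(k == 1 for k in kids):
            maternal.add(v["gene"])
    # A gene hit by one paternal and one maternal variant is a
    # compound-heterozygous candidate.
    return homozygous | (paternal & maternal)


# Toy data with placeholder gene names (not real results).
toy_variants = [
    {"gene": "gene_x", "father": 1, "mother": 0, "children": [1, 1]},
    {"gene": "gene_x", "father": 0, "mother": 1, "children": [1, 1]},
    {"gene": "gene_y", "father": 1, "mother": 1, "children": [2, 2]},
    {"gene": "gene_z", "father": 1, "mother": 0, "children": [1, 0]},
]

print(sorted(recessive_candidates(toy_variants)))  # ['gene_x', 'gene_y']
```

In practice such a filter is only useful after genotyping errors have been removed, since a single miscalled genotype can either hide the true gene or admit a false candidate.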

Now the group is using complete genome sequencing of families to identify variants in genes that interact with the primary defective gene (i.e., genetic modifiers) of the neurodegenerative disorder Huntington's disease. Indeed, we are now analyzing 65 complete genome sequences from members of families affected by Huntington's disease. Future studies will tackle even more complex, multigenic disorders such as Alzheimer's and Parkinson's disease, after appropriate disease stratification into their distinct types. In parallel with these specific disease studies, the group continues to develop and refine the computational tools needed for large-scale comparative analyses of human genomes, and the eventual utilization of whole genome sequence analysis in the clinic.