Boosting the Powers of Genomic Science
With two new methods, UC San Diego scientists hope to improve genome-wide association studies
As scientists probe and parse the genetic bases of what makes a human a human (or one human different from another), and vigorously push for greater use of whole genome sequencing, they find themselves increasingly threatened by the unthinkable: Too much data to make full sense of.
In a pair of papers published in the April 25, 2013 issue of PLOS Genetics, two diverse teams of scientists, both headed by researchers at the University of California, San Diego School of Medicine, describe novel statistical models that more broadly and deeply identify associations involving bits of sequenced DNA called single nucleotide polymorphisms, or SNPs. The models, they say, lead to a more complete and accurate understanding of the genetic underpinnings of many diseases and how best to treat them.
“It’s increasingly evident that highly heritable diseases and traits are influenced by a large number of genetic variants in different parts of the genome, each with small effects,” said Anders M. Dale, PhD, a professor in the departments of Radiology, Neurosciences and Psychiatry at the UC San Diego School of Medicine. “Unfortunately, it’s also increasingly evident that existing statistical methods, like genome-wide association studies (GWAS) that look for associations between SNPs and diseases, are severely underpowered and can’t adequately incorporate all of this new, exciting and exceedingly rich data.”
Dale cited, for example, a recent study published in Nature Genetics in which researchers used traditional GWAS to raise the number of SNPs associated with primary sclerosing cholangitis from four to 16. The scientists then applied the new statistical methods to identify 33 additional SNPs, more than tripling the number of genome locations associated with the life-threatening liver disease.
Generally speaking, the new methods boost researchers’ analytical powers by combining a priori, or prior, knowledge about the function of SNPs with their pleiotropic relationships to multiple phenotypes. Pleiotropy occurs when one gene influences multiple observed traits, or phenotypes.
Dale and colleagues believe the new methods could lead to a paradigm shift in GWAS analysis, with profound implications across a broad range of complex traits and disorders.
“There is ever-greater emphasis being placed on expensive whole genome sequencing efforts,” he said, “but as the science advances, the challenges become larger. The needle in the haystack of traditional GWAS involves searching through about one million SNPs. This will increase 10- to 100-fold, to about 3 billion positions. We think these new methodologies allow us to more completely exploit our resources, to extract the most information possible, which we think has important implications for gene discovery, drug development and more accurately assessing a person’s overall genetic risk of developing a certain disease.”
An electrophoresis gel separates nucleic acid molecules into constituent DNA, RNA and proteins, based on size and electrical charge.
A new study, to be published in the Feb. 7, 2013 issue of the American Journal of Human Genetics, expands and deepens the biological and genetic links between cardiovascular disease and schizophrenia. Cardiovascular disease (CVD) is the leading cause of premature death among schizophrenia patients, who die from heart and blood vessel disorders at a rate double that of persons without the mental disorder.
“These results have important clinical implications, adding to our growing awareness that cardiovascular disease is under-recognized and under-treated in mentally ill individuals,” said study first author Ole Andreassen, MD, PhD, an adjunct professor at the University of California, San Diego School of Medicine and professor of psychiatry at the University of Oslo. “Its presence in schizophrenia is not solely due to lifestyle or medication side effects. Clinicians must recognize that individuals with schizophrenia are at risk for cardiovascular disease independent of these factors.”
Led by principal investigator Anders M. Dale, PhD, professor of radiology, neurosciences, psychiatry and cognitive science at UC San Diego School of Medicine, an international team of researchers used a novel statistical model to magnify the analytical powers of genome-wide association studies or GWAS.
These are studies in which differing bits of sequential DNA – called single nucleotide polymorphisms or SNPs – in persons and groups are compared to find common genetic variants that might be linked to a trait or disease. The researchers boosted the power of GWAS by adding information based on genetic pleiotropy, the concept that at least some genes influence multiple traits or phenotypes.
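To make the basic comparison concrete, here is a minimal sketch of how a single SNP might be tested for association with a disease: allele counts in cases and controls are placed in a 2x2 table and compared with a chi-square test. The counts are invented for illustration, the function names are hypothetical, and this is the textbook allelic test, not the statistical model used in the study.

```python
import math

def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 allele-count table.

    Assumes every row and column total is non-zero.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2

def chi2_p_value_1df(chi2):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Invented counts: copies of the variant allele vs. the reference allele
# among 1,000 cases and 1,000 controls (2,000 alleles per group).
chi2 = allelic_chi_square(case_alt=600, case_ref=1400, ctrl_alt=500, ctrl_ref=1500)
p_value = chi2_p_value_1df(chi2)
print(f"chi-square = {chi2:.3f}, p = {p_value:.2e}")
```

A genome-wide study repeats a test of this kind at every SNP, which is why the significance threshold for any one SNP has to be very strict.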
“Our approach is different in that we use all available genetic information for multiple traits and diseases, not just SNPs below a given statistical threshold,” said Dale. “This significantly increases the power to discover new genes by leveraging the combined power across multiple GWAS of pleiotropic traits and diseases.”
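One way to picture how a second trait can sharpen the search is a conditional false discovery rate: a SNP's p-value for the primary trait is judged only against SNPs that are at least as strongly associated with the secondary trait. The sketch below is a simplified empirical estimate on a ten-SNP toy data set. The p-values are invented, the function name is hypothetical, and the code illustrates the general idea rather than the authors' implementation.

```python
def conditional_fdr(p_primary, p_secondary, i):
    """Simplified empirical conditional FDR estimate for SNP i.

    Restrict attention to SNPs whose secondary-trait p-value is at least as
    small as SNP i's, then divide SNP i's primary-trait p-value by the
    fraction of that subset with a primary-trait p-value at least as small.
    """
    subset = [p1 for p1, p2 in zip(p_primary, p_secondary)
              if p2 <= p_secondary[i]]
    # The subset always contains SNP i itself, so the fraction is never zero.
    fraction = sum(1 for p1 in subset if p1 <= p_primary[i]) / len(subset)
    return min(1.0, p_primary[i] / fraction)

# Invented p-values for ten SNPs in a primary and a secondary trait.
p_primary = [1e-4, 0.30, 0.60, 0.90, 0.50, 0.20, 0.80, 0.70, 0.40, 0.10]
p_secondary = [1e-3, 0.50, 0.70, 0.20, 0.90, 0.60, 0.30, 0.80, 0.40, 0.95]

# Conditioning on the secondary trait vs. ignoring it (all secondary p = 1).
conditional = conditional_fdr(p_primary, p_secondary, 0)
unconditional = conditional_fdr(p_primary, [1.0] * len(p_primary), 0)
print(f"conditional = {conditional:.1e}, unconditional = {unconditional:.1e}")
```

In this toy example the first SNP is also the strongest signal in the secondary trait, so its estimated false discovery rate drops when that information is used.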
The scientists confirmed nine SNPs linked to schizophrenia in prior studies, but also identified 16 new loci – some of which are also associated with CVD. Among these shared risk factors: triglyceride and lipoprotein levels, waist-hip ratio, systolic blood pressure and body mass index.
“Our findings suggest that shared biological and genetic mechanisms can help explain why schizophrenia patients have a greater risk of cardiovascular disease,” said study co-author Rahul S. Desikan, MD, PhD, research fellow and radiology resident at the UC San Diego School of Medicine.
“In addition to schizophrenia, this new analysis method can be used to examine the genetic overlap between a number of diseases and traits,” Desikan said. “Examining overlap in common variants can shed insight into disease mechanisms and help identify potential therapeutic targets for common diseases.”
Genomic “Hotspots” Offer Clues to Causes of Autism, Other Disorders
An international team, led by researchers from the University of California, San Diego School of Medicine, has discovered that “random” mutations in the genome are not quite so random after all. Their study, to be published in the journal Cell on December 21, shows that the DNA sequence in some regions of the human genome is quite volatile and can mutate ten times more frequently than the rest of the genome. Genes that are linked to autism and a variety of other disorders have a particularly strong tendency to mutate.
Clusters of mutations or “hotspots” are not unique to the autism genome but instead are an intrinsic characteristic of the human genome, according to principal investigator Jonathan Sebat, PhD, professor of psychiatry and cellular and molecular medicine, and chief of the Beyster Center for Molecular Genomics of Neuropsychiatric Diseases at UC San Diego.
“Our findings provide some insights into the underlying basis of autism – that, surprisingly, the genome is not shy about tinkering with its important genes,” said Sebat. “To the contrary, disease-causing genes tend to be hypermutable.”
Next time you’re visiting a hospital or medical center and step inside to wash your hands, take a look at the label on the antibiotic soap you’re using. Often it will sport a disclaimer saying it does not work against Clostridium difficile, a bacterium so difficult to control these days that its resistance to modern medicine is duly noted in its name.
Drug-resistant microbes are a huge and growing problem. More than 60,000 Americans die each year from infections caused by bacteria unaffected by modern antibiotics. C. difficile is among the most worrisome superbugs, responsible for killing 14,000 Americans each year. Some reports say the actual mortality number is more than double that.
In a new paper, published in the journal Nature Genetics, British geneticists sequenced a variety of C. difficile strains, producing a genomic chart that maps how the drug-resistant bacteria first appeared in North America, and then spread to other parts of the world.
The researchers determined that a particularly virulent form of C. difficile emerged in Pittsburgh around 2001, most likely in a hospital where patients received heavy doses of antibiotic treatment. It quickly spread to Oregon, Arizona, New Jersey and Maryland, with major hospital outbreaks in those states.
It then spread abroad, appearing in South Korea and Switzerland after 2007.
Given the remarkable abilities of bacteria to mutate quickly and often – an evolutionary adaptation that has rendered them the single most successful life form in Earth’s history – the emergence of the Pittsburgh superbug wasn’t an unfortunate, isolated event.
Genomic data reveals that another C. difficile superbug arose independently in the United States around 2001 (exact whereabouts unknown) and subsequently spread to Montreal in 2003 and the Netherlands in 2006.
Both strains of C. difficile enjoy a single enzyme mutation that binds to fluoroquinolones, a subset of a common class of powerful antibiotics. The binding renders the antibiotics ineffective. The strains also have genes that pump the antibiotics out of their cells. Fluoroquinolones were widely prescribed in the late 1990s and early 2000s; much less so now due to drug resistance and associated side effects.
The Nature Genetics paper is another cautionary tale (one of many) about the risks and consequences of overusing antibiotics. It suggests that bacteria of all sorts, not just C. difficile, develop resistance faster, more easily and more often than previously suspected.
Unfortunately, solutions to this growing global health threat are far less forthcoming.
Methylome modifications offer new measure of our “biological” age
Women live longer than men. Individuals can appear or feel years younger – or older – than their chronological age. Diseases can affect our aging process. When it comes to biology, our clocks clearly tick differently.
In a new study, researchers at the University of California, San Diego School of Medicine, with colleagues elsewhere, describe markers and a model that quantify how aging occurs at the level of genes and molecules, providing not just a more precise way to determine how old someone is, but perhaps also a way to anticipate or treat ailments and diseases that come with the passage of time.
The findings are published in the November 21 online issue of the journal Molecular Cell.
“It’s well known that people age at different rates,” said Kang Zhang, MD, PhD, professor of ophthalmology and human genetics at the Shiley Eye Center and director of the Institute for Genomic Medicine, both at UC San Diego. “Some people in their 70s look like they’re in their 50s, while others in their 50s look like they’re in their 70s.”
However, identifying markers and precisely quantifying the actual rate of aging in individuals has been challenging. For example, researchers have looked at telomeres – repeating nucleotide sequences that cap the ends of chromosomes and which shorten with age – but have found that other factors like stress can affect them as well.
In the new Molecular Cell paper, Zhang and colleagues focus on DNA methylation, a fundamental, life-long process in which a methyl group is added to or removed from the cytosine molecule in DNA to promote or suppress gene activity and expression. The researchers measured more than 485,000 genome-wide methylation markers in blood samples of 656 persons ranging in age from 19 to 101.
“It’s a very robust way of predicting aging,” said Zhang, one that was subsequently validated on a second sampling of several hundred blood samples from another cohort of human individuals.
The scientists found that an individual’s “methylome” – the entire set of human methylation markers and changes across a whole genome – predictably varies over time, providing a way to determine a person’s actual biological age from just a blood sample.
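As a rough illustration of what predicting age from methylation can mean, the sketch below fits a straight line relating methylation level at one hypothetical marker to chronological age, then uses it to estimate a "biological" age for a new sample. The numbers are invented and deliberately fall on an exact line, the function names are hypothetical, and a single-marker least-squares fit is a stand-in: the study's own model draws on many markers and is not reproduced here.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented training data: fraction of methylated copies at one hypothetical
# marker, paired with each donor's chronological age in years.
marker_levels = [0.20, 0.30, 0.40, 0.50, 0.60]
ages = [25, 40, 55, 70, 85]
slope, intercept = fit_line(marker_levels, ages)

def predict_age(level):
    """Estimated age for a sample with the given methylation level."""
    return slope * level + intercept

# A new sample: a 55-year-old whose marker level is 0.45.
chronological_age = 55
predicted_age = predict_age(0.45)
age_gap = predicted_age - chronological_age
print(f"predicted age = {predicted_age:.1f}, gap = {age_gap:+.1f} years")
```

The gap between predicted and chronological age is the kind of quantity that could, in principle, indicate whether someone is aging faster or slower than their years suggest.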
“It’s the majority of the methylome that accurately predicts age, not just a few key genes,” said co-senior author Trey Ideker, PhD, a professor of medicine and chief of the Division of Medical Genetics in the UC San Diego School of Medicine and professor of bioengineering in the Jacobs School of Engineering. “The methylation state decays over time along the entire genome. You look in the body, into the cells, of young people and methylation occurs very distinctly in some spots and not in others. It’s very structured. Over time, though, methylation sites get fuzzier; the boundaries blur.”
They do not, however, blur at the same rate in everybody. At the molecular level of the methylome, the researchers said it was clear that individual bodies age at varying rates, and even within the same body, different organs age differently. Moreover, cancer cells age differently than their surrounding normal cells. The findings, according to the study authors, have broad practical implications. Most immediately, they could be used in forensics to determine a person’s age based only upon a blood or tissue sample.
More profoundly, said Zhang, the methylome provides a measure of biological age – how quickly or slowly a person is experiencing the passage of time. That information has potentially huge medical import. “For example, you could serially profile patients to compare therapies, to see if a treatment is making people healthier and ‘younger.’ You could screen compounds to see if they retard the aging process at the tissue or cellular level.”
Ideker said assessing an individual’s methylome state could improve preventive medicine by identifying lifestyle changes that might slow molecular aging. He noted, however, that much more research remains to be done.
“The next step is to look to see whether methylation can predict specific health factors, and whether this kind of molecular diagnosis is better than existing clinical or physical markers. We think it’s very promising,” Ideker said.
Most recent advances in sequencing have celebrated the big picture: the successful mapping, for example, of entire genomes or large, significant series of genes.
In a paper published in the July 22 issue of Nature Biotechnology, an international team of researchers that includes Gregory A. Daniels at the UC San Diego Moores Cancer Center and Louise C. Laurent in the UCSD School of Medicine’s Department of Reproductive Medicine uses a novel sequencing method called Smart-Seq to deeply scrutinize the genetic information contained in a single cell.
The achievement is important. Many clinically relevant cells exist in only small numbers and require singular analysis. “Cancer researchers around the world will now be able to analyze these cells more systematically to enable them to produce better methods of diagnosis and therapy in the future,” said senior study author Rickard Sandberg of the Ludwig Institute for Cancer Research and Karolinska Institutet in Sweden.
Pictured: A prostate cancer cell
It’s not just our DNA that makes us susceptible to disease and influences its impact and outcome. Scientists are beginning to realize more and more that important changes in genes that are unrelated to changes in the DNA sequence itself – a field of study known as epigenetics – are equally influential.
A research team at the University of California, San Diego – led by Gary S. Firestein, MD, professor in the Division of Rheumatology, Allergy and Immunology at UC San Diego School of Medicine – investigated a mechanism usually implicated in cancer and in fetal development, called DNA methylation, in the progression of rheumatoid arthritis (RA). They found that epigenetic changes due to methylation play a key role in altering genes that could potentially contribute to inflammation and joint damage. Their study is currently published in the online edition of the Annals of the Rheumatic Diseases.
“Genomics has rapidly advanced our understanding of susceptibility and severity of rheumatoid arthritis,” said Firestein. “While many genetic associations have been described in this disease, we also know that if one identical twin develops RA that the other twin only has a 12 to 15 percent chance of also getting the disease. This suggests that other factors are at play – epigenetic influences.”
DNA methylation is one example of epigenetic change, in which a strand of DNA is modified after it is duplicated by adding a methyl group to a cytosine (C), one of the four main bases of DNA. It is one of the mechanisms used to regulate gene expression; it plays a role in organ development and is often abnormal in cancers.
While DNA methylation of individual genes has been explored in autoimmune diseases, this study represents a genome-wide evaluation of the process in fibroblast-like synoviocytes (FLS), isolated from the site of the disease in RA. FLS are cells that interact with the immune cells in RA, an inflammatory disease in the joints that damages cartilage, bone and soft tissues of the joint.
Beyond Base-Pairs: Mapping the Functional Genome
Regulatory sequences of mouse genome sequenced for first time
Popularly dubbed “the book of life,” the human genome is extraordinarily difficult to read. But without full knowledge of its grammar and syntax, the genome’s 2.9 billion base-pairs of adenine and thymine, cytosine and guanine provide limited insights into humanity’s underlying genetics.
In a paper published in the July 1, 2012 issue of the journal Nature, researchers at the Ludwig Institute for Cancer Research and the University of California, San Diego School of Medicine open the book further, mapping for the first time a significant portion of the functional sequences of the mouse genome, the most widely used mammalian model organism in biomedical research.
“We’ve known the precise alphabet of the human genome for more than a decade, but not necessarily how those letters make meaningful words, paragraphs or life,” said Bing Ren, PhD, head of the Laboratory of Gene Regulation at the Ludwig Institute for Cancer Research at UC San Diego. “We know, for example, that only one to two percent of the functional genome codes for proteins, but that there are highly conserved regions in the genome outside of protein-coding that affect genes and disease development. It’s clear these regions do something or they would have changed or disappeared.”
Chief among those regions are cis-regulatory elements, key stretches of DNA that appear to regulate the transcription of genes. Misregulation of genes can result in diseases like cancer. Using high-throughput sequencing technologies, Ren and colleagues mapped nearly 300,000 mouse cis-regulatory elements in 19 different types of tissue and cell. The unprecedented work provided a functional annotation of nearly 11 percent of the mouse genome, and more than 70 percent of the conserved, non-coding sequences shared with other mammalian species, including humans.
As expected, the researchers identified different sequences that promote or start gene activity, enhance its activity and define where it occurs in the body during development. More surprising, said Ren, was that the cis-regulatory elements are grouped into discrete clusters corresponding to spatial domains. “It’s a case of form following function,” he said. “It makes sense.”
While the research is fundamentally revealing, Ren noted it is also just a beginning, a partial picture of the functional genome. Additional studies will be needed in other types of cells and at different stages of development.
“We’ve mapped and understand 11 percent of the genome,” said Ren. “There’s still a long way to march.”
A big step writ small
A step in that direction is reported this week by Pieter C. Dorrestein, PhD, associate professor at the UC San Diego Skaggs School of Pharmacy and Pharmaceutical Sciences, and colleagues in a paper published in the Proceedings of the National Academy of Sciences.
The scientists describe a new, highly sensitive, broadly applicable and cost-effective technique using mass spectrometry to profile the metabolic activity of live microbes directly from a Petri dish without any sample preparation.
Though most people will never see it at work, the new visualization platform is a significant advance in understanding the space and time dynamics of interacting microbial colonies and communities. It’s a big step writ small, akin perhaps to the qualitative difference between studying a dinosaur fossil and watching a whole herd of frolicking sauropods.
Jonathan Sebat, PhD, assistant professor of psychiatry and cellular and molecular medicine at the UC San Diego School of Medicine, has received the 2012 Roche and Nature Medicine Award for Translational Neuroscience. He was honored today at a symposium in Switzerland.
The award highlights young researchers who have made innovative and ground-breaking scientific discoveries in the field of translational neuroscience, especially related to autism spectrum disorders.
Sebat is being recognized for his work in better understanding the genetics of autism. Specifically, he, Michael Wigler of Cold Spring Harbor Laboratory and colleagues discovered that rare spontaneously occurring copy number variants are strongly associated with autism.
“This discovery was a key turning point in autism genetics and has now focused attention squarely on rare genetic variants, and spontaneous mutations in particular,” Sebat said.
Since the 2007 paper, researchers have made rapid progress in identifying individual mutations that confer high risk of disease, including autism.
Sebat, who is also chief of the Beyster Center for Molecular Genomics of Neuropsychiatric Diseases and a member of the Institute for Genomic Medicine, both at UC San Diego, has also made key discoveries in schizophrenia, including the fact that schizophrenia and autism share some of the same genes.