Scientists have mapped the sequence of our genes – all 35,000 of them. So what now? U of T researchers are at the forefront of what some are calling the New Biology
If the future has a smell, it may be the homey aroma of baking bread.
If you walk into the labs of the Banting and Best department of medical research – possibly ground zero for the explosion of research that some call the New Biology – that comforting scent is the first thing you notice. Researchers in the institute use yeast – yep, the same stuff you make bread or beer with – as a “model organism” to help them probe the origins of life and disease. Not that yeast is the only model organism around. Other U of T scientists study mice, for instance. Still others study raw proteins, stripped of the organisms that produced them. Others are still concentrating on winkling out the secrets of human DNA. And others work in silico – studying via computer the patterns of knowledge that are emerging from the mountains of data thrown up by the sequencing of the human genome.
Earlier this year – as you will know, unless you were on an interplanetary leave of absence – two teams of scientists, one an international consortium and the other an American group, published the sequence of the human genome (the base set of genetic instructions that make us what we are). There were loud huzzahs on both sides of the Atlantic, and we are now officially in the post-genomic age. The future, once again, is upon us; the frontiers of biology and medicine are being pushed forward. Dispatches from the front speak of strange new disciplines: proteomics, functional genomics, bioinformatics, evolutionary genomics, model organism genetics.
Now that the genome sequence is more or less known, “there are a whole lot of ‘ics,’” says James Friesen (PhD 1963), chair of the Banting and Best and a driving force – along with Cecil Yip, vice-dean, research, in the Faculty of Medicine – in U of T’s planned Centre for Cellular and Biomolecular Research. “There are proteomics, functional genomics, phenomics….” He trails off with a wave of his hand. Friesen thinks “functional genomics” is the best catch-all phrase, because it includes the ideas of function and of genes. Essentially, he says, the goal is to go beyond the human genome to understand what exactly it does, now that we know what it is: “What is all that information telling the cell, and how is it put together?”
Regardless of the label you choose, the University of Toronto has been quietly gathering its forces for the new revolution in biology. There is now a host of individual researchers, as well as interlocking partnerships and collaborations that draw their members from the university and the hospital research institutes lining University Avenue, who are on the new knowledge frontier. “That’s where U of T in general and” – another wave of the hand – “the research institutes across the street have a huge amount of strength,” says Friesen.
Up until now, the key player in all of this has been the famous double helix of deoxyribonucleic acid, or DNA. It got the lion’s share of attention, not because of its intrinsic interest, but because it was perceived to be the key to biology. Understand DNA and you would understand How People Work. And not only people, but trees, insects, moulds – everything, in short, that lives on the face of the earth. Genes, the theory went, were all; know the genome and you would know biology.
Unfortunately, that was naive; knowing what one author called the Code of Codes puts you further forward, but you still have to understand the rest of the system. Some of the DNA – the five per cent or so that we call genes – contains the code to make proteins. The proteins do most of the rest of the work: they form the body’s structures, they carry the body’s messages and they are what goes wrong when we get sick. Proteins in bacteria and viruses are the targets for our drugs. Other bits of DNA contain so-called regulatory sequences that tell genes when to turn on and off, and are often guided by message proteins or other molecules in a startlingly complicated feedback loop. And, surprisingly, much of our DNA has no apparent function, which probably means we just don’t know yet what it does.
Knowing every protein on the planet
DNA has occupied the spotlight only because it was an essential first step. “If there was a magic way to study proteins, you wouldn’t have to bother with DNA at all,” says biochemist Aled Edwards of the Banting and Best. “The Human Genome Project has really quantified our ignorance of biology.” Now we have the code, but the surface has only been scratched; almost everything else about how we work still needs to be discovered.
Propping a foot against a nearby wall, a cup of coffee on the table in front of him, Edwards allows that he may have a short attention span. For more than five years, since before he arrived at U of T from McMaster University in 1997, he’s been preaching a gospel in the wilderness: what’s needed, he has been saying, is to characterize all the building blocks of proteins. Be careful what you wish for – he’s now the leader in that endeavour. And, he says, now he’s bored. “To take a dumb idea and make it accepted – that’s what I got a bang out of,” he says. “Now we’ve just got to sit down and do it.”
The “dumb idea” is this: every protein is composed of a sequence of chemicals called amino acids. But those acids form a finite number of specific substructures, perhaps as many as 50,000, called “domains.” These protein domains are where the action happens, says Edwards. They are where other molecules dock, to bring messages or lock on to form part of a larger structure. The domains are like charms on a bracelet or beads on a string – mixing and matching them is what gives a protein its function. So knowing what each one looks like is a giant step toward knowing how a given protein works. In the long run, says Edwards, knowing how proteins work will mean better and more effective medicine. “Almost every disease is caused by a screw-up in a protein,” he says.
Right now, scientists know the function of less than 20 per cent of the proteins in the body, making it difficult to find and fix “screw-ups.” “You wouldn’t go to a mechanic if he admitted he only knew what 20 per cent of the parts did,” says Edwards. “So we want to determine the three-dimensional structure of every protein on the planet.” With colleague Cheryl Arrowsmith of the Ontario Cancer Institute in Toronto, a professor of medical biophysics, Edwards is leading an international project involving the Argonne National Laboratory in the U.S. to ferret out the domain structures.
How proteins interact
Jack Greenblatt, a molecular biologist, is taking another approach. For Greenblatt, the new frontier is the way proteins interact. Sure, it may help to know how the protein snaps together, but the interplay between proteins is the central motif of biology. Essentially, much of drug research to date has been the search for molecules that interact with proteins in some useful way. But those searches were retail – one or two proteins at a time – and the value of the human genome sequence, says Greenblatt, is that it allows wholesale study of protein interactions. “We started to think about this five or six years ago,” he says, “and we realized we could get all sorts of information by looking at protein interactions on a whole-genome scale.”
It is not a trivial problem. There are about 4,000 genes in the bacterium E. coli, 6,000-odd in yeast and 35,000 or more in Homo sapiens. Even if you just look at two proteins at a time, you’re talking millions of possible links. And Greenblatt and his colleagues are trying to pull out multiple interactions – perhaps thousands at a time. “It gets to be a fairly demanding project,” he says, “but that’s what we want to do.”
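To put rough numbers on those “millions of possible links,” here is a minimal back-of-the-envelope sketch in Python. It assumes, as a simplification the researchers themselves do not make, that each gene yields exactly one protein and that a protein is never paired with itself. The function name `pairwise_links` is purely illustrative.

```python
from math import comb

# Approximate gene counts quoted in the article.
gene_counts = {"E. coli": 4_000, "yeast": 6_000, "human": 35_000}


def pairwise_links(n: int) -> int:
    """Count unordered pairs among n proteins (n choose 2).

    Assumption: one protein per gene, no self-pairs.
    """
    return comb(n, 2)


for organism, n in gene_counts.items():
    print(f"{organism}: {pairwise_links(n):,} possible two-protein links")
```

Under those assumptions the count grows roughly with the square of the number of genes, which is why even the two-at-a-time case is daunting before anyone looks for larger complexes.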
The treasures of chromosome 7
Before you can work on proteins, though, it helps to have the corresponding gene. And there’s a lot more work to be done to get the genes, says molecular geneticist Stephen Scherer, associate professor of molecular and medical genetics, and associate director of The Centre for Applied Genomics (TCAG) at the Hospital for Sick Children. For one thing, neither of the two genome projects has a complete, start-to-finish sequence of human DNA. For another – something that’s often lost sight of – neither project is exactly universal: they studied DNA from a total of a dozen or so individuals. Your mileage, as they say in the car ads, may vary. “There has been a lot of hype,” says Scherer. “We still have a lot of work to do.”
That’s why he and his colleagues are still beavering away to complete the world’s understanding of the human chromosome 7. Despite the genome projects, actually finding and sequencing the genes on a chromosome is still a time-consuming, difficult and – when you find one – thrilling chore. Chromosome 7 is the home of dozens of genes linked to disease, including some linked to autism, a suite of genes related to a form of epilepsy, and the famous CFTR gene whose mutation causes cystic fibrosis. (CFTR was one of the first disease genes found, and was discovered by TCAG director Lap-Chee Tsui and colleagues.) “There are a lot of really great diseases to study on chromosome 7,” says Scherer.
And, despite the genome projects, the treasures hidden on chromosome 7 (let alone the other chromosomes) are not exhausted. Outside Scherer’s small but tidy office hangs a much-annotated map of 7. It is, he says, the most complete of any of the chromosome maps – thanks largely to the Chromosome 7 Project – but it still has gaps. “Based on our numbers,” he says, the genome projects “missed between 20 and 25 per cent of all the genes on 7 – and we’re missing some genes, too.” Think of the genome as an encyclopedia containing the information on how to make and operate a human being, he says. Some fiend has cut the 23 books into individual words and now – with the two genome projects – “it’s roughly in order, but there are missing pages, rips in pages and some chapters in the wrong order.”
In the long run, though, Scherer sees a day when much of the data we’re now painstakingly sifting through will be used routinely. “In 15 years,” he says, “it will not be unusual for an autopsy to include DNA sequencing.” Drugs will be tailored to match individual DNA sequences. Diseases will be diagnosed by looking at the DNA of pathogens. It may even be possible to cure or prevent DNA-based diseases, such as Alzheimer’s or autism. But, for the near future anyway, it’s more of the same for geneticists: “The next three or four years will be cleaning up the genome, finishing the sequences and finding all the genes,” says Scherer. One major change is that much of the work will be done on the computer screen. “Most of the people in my group now spend at least 50 per cent of their time working on computers,” says Scherer.
How the cell behaves
Molecular biologist Tony Pawson, another of the genomic frontiersmen, doesn’t think that “wet science” is going to go the way of the dodo. But more and more work will be done in silico, if only because there’s so much data “you rapidly run out of the ability of the human mind to remember everything, let alone how it’s put together.” Biologists have been slow to join the silicon revolution, in part because there’s just too much complexity and – until now – not enough hard data. But Pawson thinks we’re on the verge of having enough information to be able to create what he calls a “virtual cell.”
“The challenge – now that we have all of the genome – is to figure out in a comprehensive way how the cell is wired together,” says Pawson. That’s what he and others at Mount Sinai Hospital’s Samuel Lunenfeld Research Institute have been working on for years: how the cell’s internal communication system works. “The next challenge is to understand how cell A differs from cell B,” he adds. “And the long-term challenge is to be able to describe these processes in sufficient detail that you can make a mathematical model of how the cell behaves.”
That mathematical model would be Pawson’s “virtual cell,” in which researchers would be able to watch signalling pathways operate, observe the complicated interplay of proteins and DNA, and see what happens when things go wrong. The genome projects have produced a flood of data on DNA; Pawson and colleagues are working on a similar project for protein interactions. The Biomolecular Interactions Network Database (BIND), he says, is intended to help answer his first challenge – understanding the cell’s wiring.
Applying computer power to biology
The BIND database will help, but “it’s not the end, it’s a beginning,” says Peter Lewis, head of U of T’s multidisciplinary program in proteomics and bioinformatics. Moving forward is going to take a new breed of specialist: the bioinformatician. These new professionals will keep track of the enormous quantities of data generated by people like Edwards, Greenblatt and Scherer. But they’ll do more than just act as high-tech librarians, says Lewis. They’ll be able to mine the data fields for new relationships and facts. As well, new experimental techniques, such as microarray analysis and mass spectrometry, are throwing up “horrendously complicated” results; bioinformaticians will play an increasing role in analysing experimental data.
The roar of the mouse genome
But, of course, there’s more to the study of biology than DNA, proteins or even the living cell. Somehow, all of that is orchestrated to make a living creature: a worm, a mouse or a human being. Janet Rossant, a professor of molecular and medical genetics and co-head of the program in development and fetal health at Mount Sinai Hospital’s Samuel Lunenfeld Research Institute, has spent years studying the development of mice.
“Model organism genetics” – studying one type of creature in the hope of understanding another one entirely – relies on the fact that evolution is thrifty. Once evolved, a gene or a protein, or even a protein complex, is rarely thrown away. So, for instance, scientists studying the fruit fly have found a cascade of protein interactions dubbed the “hedgehog pathway.”
In the fly, the hedgehog pathway governs the formation of body segments. But we humans also have a hedgehog pathway, conserved over millions of years of evolution. If our hedgehog pathway is disrupted, we become susceptible to cancers, such as basal cell carcinoma, as well as several inherited diseases.
So it makes sense to study mice, which are, after all, closer to us than flies. And the deluge of new information, including the sequencing of the mouse genome, has made it possible to do almost industrial-scale functional genomics on mice, says Rossant. She and colleagues are trying to look at the function of not one, not two, but all the genes in the mouse. The idea is to make random mutations in mice and then see if they develop the clinical picture of any human illness, such as hypertension. The mice go through a battery of tests, “essentially like going to the doctor for a checkup,” to see if they develop symptoms, says Rossant. When mice show signs of disease, it’s simple, in principle, to track down which genes are mutated and thus possibly responsible for the disease. The power of the technique rests on the ability to study many mice (in the first six months of the project, Rossant and colleagues screened 2,000) and swiftly to find the genes involved.
Strolling down the bread-scented corridors of the Banting and Best, James Friesen says he has already seen the future. For yeast researchers, the post-genomic age is now five years old; the complete genome of yeast, all 6,000 genes, has been known for that long. In fact, one project now underway at U of T is to create 6,000 varieties of yeast, each with one of the genes knocked out, and then make 36 million pairwise crosses to see how the various genes affect each other. The only problem, says Friesen, is that “the more you know, the more you don’t know.”
This is a good time to be doing post-genomic research, Friesen muses, and Toronto is a good place to be doing it. “We have one of the biggest biomedical research establishments on the continent and, indeed, in the world,” he says. That, combined with good government support for research and the welcoming nature of the city itself, has meant Friesen and his colleagues are able to hire some of the top young guns of the post-genome frontier. Some of them – like Edwards – are already on board, says Friesen, and all told between 25 and 30 new people are coming in over the next few years. “Our science has landed us in the right place,” he says.