    Class Visit to Our Lady of Lourdes

    November 20th, 2008

    Yesterday I visited Our Lady of Lourdes to talk to Rich Falkler’s 8th grade science class about my work at the WashU genome center. It’s a Catholic school in Clayton, a relatively well-to-do area situated between St. Louis City and west county. Weeks ago, I asked the teacher how much genetics the students would already know, and was impressed to learn that their curriculum included not only Mendel and Punnett squares, but also polygenic traits, sex-linked genes, and genetic engineering. In fact, the first two chapters of their Prentice Hall textbook cover heredity and modern genetics.

    The Challenge: Trying Not to Bore 8th Graders

    Although I was interested in science as a kid, I don’t remember feeling excited in my middle school science classes. I suppose that’s an age when college is a far-off unknown and there are bigger fish to fry. Certainly the curriculum of my 9th grade biology class was not as comprehensive as what these students had, but then again, that was almost fifteen years ago. I said as much to Patty, the school parent (and relative) who invited me to the class. She reminded me, in half-jest, “you went to public school.” Ouch.

    I didn’t expect to wow them, but hoped I’d do enough, at least, to avoid deterring twenty bright young minds from a future career in science. So earlier in the week I consulted with the closest thing we have to kids around here: the Technology Development group. They offered some advice about how to keep kids entertained:

    • Put statistics in terms that they understand (e.g., translate megabases to Harry Potter books)
    • Talk about money (e.g., how many new Corvettes could you buy for the price of a 454 machine)

    My friends in TechD also hooked me up with some cool lab items to pass around: 454 picotiter plates, Solexa flowcells, Agilent DNA-sizing plates, and other items. The kids really enjoyed these and fortunately none were dropped or pocketed while being passed around (perish the thought!). I also brought a flash drive with some high-res images of 3730 sequencers, 454/Solexa machines, sequence traces, etc. The class had a Mac computer with a “smart board” (small projection screen) that made these very useful.

    High Marks for Curiosity

    The teacher told me in advance that his students were very inquisitive, so I opened the floor for questions. The two adults in the room (Rich and Patty) offered up a few to get things going: what’s the difference in sophistication between what we do and FBI DNA forensics labs (a lot), how does our funding picture look with the new administration (not bad), and what exactly do I do (good question)? The students offered some excellent questions as well. They wanted to know why so many recessive alleles (like O blood type) persist in populations. One had seen a show about a girl with hirsutism (hair all over her body), and asked if the cause was genetic. Another just wanted to hear about the freakiest genetically-mutated organisms I’d seen in my time.

    That Reminds Me Of A Story…

    So as it turned out, I ended up just telling stories – about fruit flies that couldn’t fly, or the day one of our sequencers caught fire (a slight dramatization). The best story I told them was about mice on McDonald’s – an experiment in which researchers at Case Western fed two strains of mice the equivalent of a Big Mac and large Coke every day. One strain, as you’d expect, got the obesity/diabetes/heart disease package as a result. The other strain? Just fine. No weight gain, no health problems.

    “Some people are like that too,” I said. “They can eat whatever they want, all the time, and never gain a pound.”

    “Katie!” one of them said loudly, accusing a girl in the front row.

    Looking Forward to Next Year

    I talked for about 45 minutes, and fielded questions for perhaps half an hour. Earlier this week I’d worried that I wouldn’t be able to fill the time, but thanks to inquisitive students, it was easy to do so. I worried, too, about presenting topics (like evolution) that might not be as welcome in a Catholic school classroom. It was a relief to realize that this was not a problem. In fact, in response to a question I posed about human versus chimpanzee genomes, a student mentioned that “supposedly we are descended from them.” Science and religion did not seem incompatible at all.

    I left the class with a couple of posters about the human genome and DNA analysis techniques. Rich Falkler mentioned the possibility of another visit next year. I’m certainly willing, and also told him about DNA day, when our genome center’s Outreach department sends an army of “DNA Ambassadors” to area classrooms. Next year I’ll probably be one of them.


    AML: A New Era of Cancer Genomics

    November 6th, 2008

    Next-generation sequencing technologies are said to be ushering in a new era of cancer genomics. A powerful demonstration of the new paradigm for cancer research came out today in Nature. It’s the much-anticipated publication of our AML project, in which we used the Illumina/Solexa platform to sequence the entire genome of a woman who died of acute myeloid leukemia. See my colleague David Dooling’s post on Politigenomics for links to some of the news coverage. This study offers two important milestones to the field:

    1.) The first complete genome of a woman. Patient 933124 follows James Watson and J. Craig Venter into the archives of whole-genome history.

    2.) The first cancer genome to be completely sequenced on a next-generation platform.

    The basic biological problem is simple. This patient had AML. AML is a disease state that was initiated, and driven, by mutations in her genome. Her blood cells should be dominated by the clone of the most effective cancer genome. So, we sequence both tumor and germline DNA. We find any mutations in the tumor that are not in the germline, and identify which of those are novel, protein-coding variants. Among these should be, must be, the mutations that initiated and drove the development of cancer.
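
The filtering logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the pipeline used for the AML project: the `Variant` record, the `effect` labels, the `known_sites` set (standing in for a database of previously reported variants) and the function name are all hypothetical.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple

class Variant(NamedTuple):
    """Hypothetical variant record; 'effect' uses made-up annotation labels."""
    chrom: str
    pos: int
    ref: str
    alt: str
    effect: str  # e.g. "nonsynonymous", "synonymous", "noncoding"

Key = Tuple[str, int, str, str]

def candidate_somatic(tumor: Iterable[Variant],
                      germline: Iterable[Variant],
                      known_sites: Set[Key]) -> List[Variant]:
    """Keep tumor variants that are absent from the germline sample,
    absent from the set of known sites, and that change a protein."""
    germline_keys = {(v.chrom, v.pos, v.ref, v.alt) for v in germline}
    candidates = []
    for v in tumor:
        key = (v.chrom, v.pos, v.ref, v.alt)
        if key in germline_keys:   # inherited, not acquired by the tumor
            continue
        if key in known_sites:     # already reported, so not novel
            continue
        if v.effect != "nonsynonymous":  # does not alter the protein
            continue
        candidates.append(v)
    return candidates

# Toy data, invented purely for illustration.
tumor = [
    Variant("chr1", 100, "A", "G", "nonsynonymous"),
    Variant("chr2", 200, "C", "T", "nonsynonymous"),
    Variant("chr3", 300, "G", "A", "synonymous"),
]
germline = [Variant("chr2", 200, "C", "T", "nonsynonymous")]
print(candidate_somatic(tumor, germline, known_sites=set()))
```

In practice each of these three filters hides a great deal of work (alignment, variant calling, annotation, validation), which is where the challenges described next come in.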

    Major Informatic Challenges

    Yet even with new technologies, this was no simple task. The sheer volume of data generated for this project is what amazes me. It took 98 full Solexa runs (4 libraries totaling ~5.86 billion reads) to reach our target diploid coverage in the tumor, which was 90%. The data was generated over a period of several months, during which both the sequencing technologies and the informatic algorithms were constantly evolving. The AML project offered me my first view of both 454 and Solexa data, and it presented our group with numerous challenges. Disk space. Computing power. Short read alignment. Variant calling. You name it.

    The Power of the Unbiased Approach

    It seems like a lot of work just for ten mutations. That’s how many validated, somatic, nonsynonymous mutations we found in this AML genome. Eight of these ten mutations, however, implicated new genes that were not previously linked to AML. Four of the genes are in gene families strongly associated with cancer pathogenesis (PTPRT, CDH24, PCLKC, and SLC15A1). The other four genes (KNDC1, GPR123, EBI2, and GRINL1B) are not known to contribute to cancer pathogenesis, but they have potential roles in metabolic pathways that may act to promote cancer growth. These are also four genes that would almost certainly have been excluded from a candidate gene approach.

    Mutation Frequencies by 454 Read Count

    Another interesting application of massively parallel sequencing was the estimation of mutation frequencies in the tumor sample using the Roche/454 platform. For each of the 10 somatic mutations, as well as 2 germline control variants, we performed PCR-targeted 454 resequencing in samples from the primary tumor, the relapse tumor, and the germline. The idea here was to profile the relative proportions of clonal cells that made up each sample. We got some results that the first author of the study, Tim Ley, described more than once (in our meetings) as “absolutely beautiful.” All of the somatic SNPs were at 50% frequencies in the tumor, as you’d expect for heterozygotes. They hovered slightly lower (around 40%) in the relapse sample, which was known to be less pure (i.e. ~78% blasts), but if you correct for the blast count, they reach 50% as well. Intriguingly, the somatic variants were detected in the germline sample as well at frequencies of 5-13%, suggesting that the skin sample was contaminated by a small fraction of leukemic cells. The one non-beautiful result was FLT3, which had frequencies of around 35% in tumor and 31% in relapse. It may be that the FLT3 ITD mutation was not present in all tumor cells; perhaps it was introduced later than the others.
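
The blast-count correction mentioned above is simple arithmetic, and a small Python sketch makes it concrete. It assumes a diploid locus, a heterozygous mutation present in every tumor cell, no variant in the normal cells, and read counts that sample both alleles evenly. The function names are made up for illustration, and the 40% and 78% figures are the approximate values quoted in the paragraph above.

```python
def vaf_from_counts(variant_reads: int, total_reads: int) -> float:
    """Variant allele frequency: fraction of reads carrying the variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads

def purity_corrected_vaf(observed_vaf: float, tumor_fraction: float) -> float:
    """Rescale an observed frequency by the fraction of tumor cells in the
    sample (assumes non-tumor cells contribute only reference alleles)."""
    if not 0 < tumor_fraction <= 1:
        raise ValueError("tumor_fraction must be in (0, 1]")
    return observed_vaf / tumor_fraction

# Relapse sample: ~40% observed frequency, ~78% blasts.
corrected = purity_corrected_vaf(0.40, 0.78)
print(f"{corrected:.2f}")  # 0.51, close to the 50% expected for a heterozygote
```

Under the same assumptions, a frequency well below 50% after correction (as with FLT3) is what you would see if the mutation were present in only a subset of the tumor cells.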

    Yes, We Can Find Indels in Short Reads

    One of the significant bioinformatic challenges in which I became intimately involved was the detection of indels, which is theoretically possible but practically very difficult in fragment (non-paired) reads that are only ~36bp long. We ended up combining a few different approaches and found over 700 putative small indels, more than half of which were already in dbSNP. We attempted to validate 28 of these by 3730 sequencing. Two were the previously known mutations in FLT3 and NPM1. Two were false positives. The other 26 were real, but present in the germline, which was a bummer since we thought they’d be somatic. Those are the breaks. Fortunately, indel detection is one area that will be helped dramatically by improvements to the sequencing technologies, namely longer reads and paired-end protocols.

    A New Paradigm for Cancer Genomics

    I think that most of all, this work was important because it established the feasibility of sequencing entire genomes with massively parallel / short read technologies and getting valuable results from them. It also drove us to develop and apply new algorithms (like decision trees) to analyze the data. I expect that we’ll begin to see a number of whole-genome-sequencing approaches to the study of cancer and other diseases that take advantage of this new paradigm. The question of whether or not we can do science on a whole-genome scale has been answered. In the words of our next president, “yes we can!”


    Drop the pipette and go vote, man!

    November 3rd, 2008

    I regret to inform you that I am not one of the coveted swing voters in Missouri, where polls show Obama and McCain within 1% of one another. Somehow, Missourians managed to pick the winner in all but one presidential election for the past 100 years. I’m not certain that this is something to be proud of; if anything, it suggests that we vote for who we think will win, not who we’d like to win.

    It is reasonable to assume that the next president of the United States will have a substantial effect on the direction of primary research in this country. Under President Bush we’ve seen a plateau and then a decline of basic research funding, with dire consequences for the research community. With NIH applications being funded at the 8% level, the government budget for basic research is a key issue for me. Unfortunately, neither candidate has offered a satisfactory solution to the research budget crisis. Obama has promised to double the NIH budget over 10 years (not enough), and McCain’s campaign refuses to give any firm number for a budget increase (not helpful at all!). Furthermore, McCain’s live-debate promises of an “across the board spending freeze” seem to suggest that there will be no budget increase at all.

    There are, of course, a number of scientific issues where the candidates’ positions differ. The journal Science recently made a laudable effort to profile candidate positions on a dozen important issues. Unfortunately, McCain’s campaign has been less forthcoming about his positions on numerous scientific issues (e.g. stem cells, intelligent design), and this gives me pause. In these cases I can only assume that he’s withholding a position with which most scientists would disagree.

    Campaign promises are one thing. Basic understanding of effective scientific policy is another. According to a recent editorial in Nature, Obama has surrounded himself with a better cadre of scientific advisers. That speaks to a desire not only to have the best scientific advice, but to listen to it. Speaking of scientific understanding, don’t even get me started on Sarah Palin, whose casual dismissal of the importance of fruit fly research has the entire model organism community in an uproar.

    I suspect that most scientists, like me, have already made a decision for this election. Thus my message is not intended to sway anyone, but to remind scientists to GET OUT AND VOTE! Yes, there are important analyses to be done, crucial lab work to be performed, career-making manuscripts to write. Nevertheless, all of us can spare an hour or two to make our democratic voices heard.
