CSHL 2010: Genomes Get Personal

September 22, 2010 by Dan Koboldt

Last week I attended the third annual “Personal Genomes” meeting at Cold Spring Harbor. The meeting opened with a keynote talk by NHGRI director Eric Green, who reminded us that finding the pathway to genomic medicine is the central mission of NHGRI. He mentioned several of the past successful initiatives that have yielded key findings concerning human genetic variation and its relationship to phenotype: The HapMap Project (common variation), the ENCODE Project (functional variation), and the 1,000 Genomes Project (rare variation), to name a few. He showed the absolutely stunning growth of the NHGRI-hosted genome-wide association study (GWAS) catalog, which currently holds ~2,600 associations from 780 publications.

Dr. Green also discussed the dichotomy of genetic architecture underlying human diseases, and took the position that while we’ve made substantial progress studying rare, monogenic, mendelian disorders (predominantly caused by coding mutations), we face a more daunting task with common, complex, multigenic diseases because he believes that these arise from primarily noncoding mutations.

Theme 1: Human Mutation Rates

Several talks addressed the topic of mutation rate in human genomes. Donald Conrad, who will be joining the WashU Genetics Department next year, presented mutation rate as a quantitative trait based on 1,000 Genomes Project trio data. Three of the primary sources of variation in mutation rate are age (males have 3x-6x higher rates), environment, and genetic variation (e.g. inherited aging disorders).

Lee Hood gave an excellent keynote on “Systems Genetics and P4 Medicine”, part of which was a discussion of mutation rate. His group uses whole-genome sequencing (WGS) of family cohorts (in this case, the Miller syndrome family quartet), focusing on the ~2.3 Gbp of non-repetitive reference sequence. Using the family information and inheritance modeling, they identify de novo mutations in the offspring, which manifest as errors of Mendelian inheritance. Validation using a custom capture array for 60,000 candidate sites followed by deep sequencing showed that only 1/1,000 “new” mutations in the offspring were real; the vast majority proved to be sequencing errors. That works out to a mutation rate of 1.1 x 10^-8 per base per generation, or roughly 70 mutations per child.

Lynn Jorde (Univ. of Utah) later gave a talk on directly estimating human mutation rate by WGS, also using the Miller syndrome quartet. Sequencing by Complete Genomics yielded >50x coverage per subject; there were ~4 million positions in the 1.8 Gbp of “useful” reference sequence in which at least one subject differed from the reference. Only 330,000 or so SNPs were novel (not known to dbSNP), and 20% of these proved to be sequencing errors. More array validation, more calculations, and the same answer as given by Dr. Hood: a mutation rate of 1.1 x 10^-8.
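As a rough check on how a per-base rate of 1.1 x 10^-8 turns into roughly 70 new mutations per child, here is a back-of-the-envelope sketch. The haploid genome size of about 3.2 Gbp is my assumption, not a number from either talk, and the function name is made up for illustration.

```python
# Back-of-the-envelope check: per-base mutation rate -> new mutations per child.
MUTATION_RATE = 1.1e-8      # per base per generation, as reported in both talks
HAPLOID_GENOME_BP = 3.2e9   # assumption: approximate haploid human genome size
COPIES_PER_CHILD = 2        # a child inherits one haploid genome from each parent


def expected_de_novo(rate, haploid_bp, copies=2):
    """Expected number of new mutations across both inherited genome copies."""
    return rate * haploid_bp * copies


expected = expected_de_novo(MUTATION_RATE, HAPLOID_GENOME_BP, COPIES_PER_CHILD)
print(f"Expected de novo mutations per child: {expected:.0f}")
```

Under those assumptions the estimate lands close to the figure quoted in the keynote.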

Theme 2: Personal Cancer Genomes

Cancer genomes were another focus of the meeting. Sean Grimmond (Univ. of Queensland, Brisbane, Australia) presented some of his group’s work on pancreatic cancer as part of the International Cancer Genome Consortium (ICGC). Pancreatic cancer is one of the deadliest forms of cancer; about 90% of patients diagnosed die within one year. The Brisbane group has assembled a very nice workflow from sample collection to sequencing that includes pathology review, tumor dissection, QA, and microarray analysis to determine tumor cellularity. The sequencing strategy (WGS, exome, and RNA-seq) differs between high-cellularity (70-100%) and low-cellularity (~30%) tumors. The ultimate deliverable is a “tumor report” documenting cellularity estimates, microarray findings, cytogenetics, what sequencing was done, and what mutations were found.

James Brugarolas (UT Southwestern Medical Center) described the genome evaluation and functional studies of a patient with clear cell renal carcinoma. I learned a bit more about this form of cancer – 85% of tumors prove to be the “clear cell” carcinoma; common lesions include 3p loss (VHL gene) and 5q35 gain. This particular tumor underwent Illumina whole-genome sequencing to 35x coverage; some 46 somatic mutations were validated. One of these was in a gene whose protein product complexes with mTOR, the central player in a known cancer pathway. The tumor was successfully xenografted to a mouse model; some 43/46 somatic mutations were retained, and all had higher frequencies (similar to our findings on basal-like breast cancer). The xenograft let them test a few different cancer drugs – erlotinib (an EGFR inhibitor that had no effect), sunitinib (the front-line therapy for these patients, also no effect), and others. Intriguingly, however, the tumor was sensitive to an mTOR inhibitor compound.

Rick Wilson (The Genome Center at Washington University) gave a talk on whole-genome sequencing of leukemia patients at WashU. Of the 50+ leukemia patients sequenced to date, most have fewer than 20 valid protein-altering mutations. For most patients, low-resolution cytogenetic screens are the paradigm for disease classification and treatment decisions. Favorable-risk patients (17% of cases) undergo light chemotherapy. For adverse-risk patients (22% of cases), an allo-matched bone marrow transplant is the standard of care. That leaves a large body of patients (~61%) with “intermediate” risk according to cytogenetics; here, the correct treatment decision is harder to make. Better stratification of intermediate-risk patients is the first goal. Dr. Wilson related a fascinating case study, a 39-year-old female with suspected acute promyelocytic leukemia, in which rapid-turnaround WGS was able to provide an accurate diagnosis that was not obtained by conventional FISH, and ultimately guided her treatment.

Theme 3: Genome Regulation and Epigenetics

Peter Laird (Univ. Southern California, LA) led us out of the genome to the epigenome with his talk on mining the cancer methylome. He argued that the first steps in oncogenesis may be epigenetic changes, specifically, the dysregulation of genes due to abnormal methylation. Dr. Laird presented what he’s calling the first cancer methylome – a tumor sample and matched normal control that underwent bisulfite treatment and sequencing to ~30x coverage. As expected, bisulfite sequencing yielded very accurate estimates of DNA methylation (r=0.97 with Illumina Infinium) but was able to do so across the complete human genome with base-pair resolution.

Theme 4: Exome Sequencing

There is a ton of exome sequencing going on. I saw at least two posters describing “whole” exome sequencing in 1,000 cases and 1,000 controls. I put “whole” in quotes because it’s not true at this point; people really shouldn’t be going around saying that the “whole exome” was sequenced. It’s more like 80-90% of known genes. Rick Lifton spoke about some of the valuable applications of exome sequencing – finding dominant reproductive lethal mutations, unraveling recessive traits with high locus heterogeneity, characterizing somatic mutations in cancer, and identifying rare variants associated with common disease. He described recently published work in which recessive mutations in WDR62 were linked to severe brain malformations by exome sequencing. Matt Bainbridge gave a nice overview of the exome sequencing currently under way at Baylor. So yes, it turns out that groups outside of WashU are doing exome sequencing too.

A Formula for Dosing Humans with Rat Poison

May 11, 2010 by Dan Koboldt

In 1920, a mysterious epidemic broke out in the cattle populations of the United States and Canada. It was a severe disease of internal hemorrhaging that struck quickly and inexplicably; ranchers were soon distraught at the losses to their herds. Two years later, Frank Schofield connected the disease to sweet clover hay, which had been widely used as cattle fodder since the beginning of the century. But the agent behind hemorrhagic sweet clover disease remained elusive.

The turning point came in 1933, when a farmer drove to the University of Wisconsin with a truckload of spoiled hay and blood from a cow that had died after eating some of it. The farmer’s plight caught the interest of Karl Link, an associate professor of agricultural chemistry. Seven years later, Link and his colleagues announced the purification and synthesis of dicumarol, the hemorrhagic agent in spoiled sweet clover hay. It seems that a series of wet summers had led to the infection of sweet clover fields by mold. In response, the sweet clover plants produced coumarin, a natural compound that defends against fungal infection. With the support of the Wisconsin Alumni Research Foundation (WARF), Link and his colleagues synthesized over 100 analogues based on dicumarol’s structure. In 1946 they developed the highly potent form that was patented by the WARF organization. It was a compound that smelled, appropriately, like freshly mown hay. It was a toxin deadly enough to be used as a rat poison. They named it warfarin.

Blood Thinner or Rat Poison?

At first, warfarin was considered too toxic for human use. It was marketed as a rodenticide, and became a popular rat poison. In 1951, a navy recruit took a large dose of warfarin to attempt suicide. Surprisingly, he lived, and clinical trials soon thereafter showed that warfarin could be administered safely to humans. The idea of warfarin therapy became widely known in 1955, when it was given to President Eisenhower after a heart attack. Today, warfarin is the most frequently prescribed oral blood thinner, and the eleventh most-prescribed drug overall. It’s given to patients where unwanted clotting is a risk: after surgery, stroke, pulmonary embolism, or deep-vein thrombosis (DVT). Unfortunately, warfarin has a narrow therapeutic range. Too little, and it has no effect on clotting. Too much, and the patient could suffer internal hemorrhaging. To further complicate things, the correct warfarin dose is influenced by a number of factors – clinical ones (weight, age, INR), diet, heritage, etc.

Warfarin Pharmacogenetics and Clinical Trial

It became apparent that genetic factors play a critical role in effective dose of warfarin. Two genes in particular have been demonstrated to modulate warfarin response: VKORC1, which encodes a component of the vitamin K epoxide reductase (VKOR) complex that is targeted by warfarin; and CYP2C9, the cytochrome P450 enzyme primarily responsible for metabolizing the drug. Numerous other genes have been implicated as well, though none have proven more informative than VKORC1 and CYP2C9 genotypes. The clear genetic component, and the as-yet-unraveled complexities of correct dosing, are probably why warfarin has become the poster-child for pharmacogenetics.

Last month, a team led by Brian Gage at Washington University in St. Louis published an elegant formula for warfarin dosing that takes clinical and genetic factors into consideration, in conjunction with a web site (www.WarfarinDosing.org) where clinicians can use it to calculate and track patient doses. This month, the National Heart, Lung, and Blood Institute (NHLBI) announced a five-year, $3.7 million clinical trial to assess warfarin risks and benefits. The Genetics InFormatics Trial of Warfarin (GIFT), to be led by Brian Gage and his colleagues, will enroll knee- and hip-replacement patients at our own Barnes-Jewish Hospital to improve upon the warfarin dosing formula.

It’s the most interesting story I’ve heard that begins with a farmer, a cow, and the state of Wisconsin. Strange how the mysterious sweet clover disease, described as “an insidious hemorrhagic disease” by the Merck Veterinary Manual, would yield a compound so valuable for human health.

References
Lenzini P, Wadelius M, Kimmel S, Anderson JL, Jorgensen AL, Pirmohamed M, Caldwell MD, Limdi N, Burmester JK, Dowd MB, Angchaisuksiri P, Bass AR, Chen J, Eriksson N, Rane A, Lindh JD, Carlquist JF, Horne BD, Grice G, Milligan PE, Eby C, Shin J, Kim H, Kurnik D, Stein CM, McMillin G, Pendleton RC, Berg RL, Deloukas P, & Gage BF (2010). Integration of genetic, clinical, and INR data to refine warfarin dosing. Clinical pharmacology and therapeutics, 87 (5), 572-8 PMID: 20375999

Capture and Subassembly with Jay Shendure

January 15, 2010 by Dan Koboldt

Yesterday our 2010 Genetics Seminar Series kicked off with Jay Shendure (Univ. Washington) whose twelve-exome paper landed in Nature late last year. His talk covered three very different applications of next-generation sequencing: high-throughput mutational studies of core promoters, sub-assembly of Illumina reads to 454-length contigs, and exome capture to unravel Mendelian disorders.

Mutational Profiling

First, Dr. Shendure described some interesting experiments under way in his lab to elucidate the function of non-coding regulatory variants – specifically, single nucleotide changes in the core promoter that alter gene transcription. The approach is called “saturation mutagenesis” and involves generating every possible mutant in a construct, and then assaying the effect of each construct on transcription. By leveraging high-density Agilent arrays and next-generation sequencing, Shendure and his colleagues performed saturation mutagenesis in vitro in high-throughput fashion. Their process involves three steps:

1. Synthesize mutant constructs on an Agilent array. The oligos (probably ~150 bp) include the core promoter region surrounding a gene’s transcription start site (TSS). They generate a single mutation (SNP or single-base indel) per construct, and label each construct with a sequence barcode downstream of the TSS.
2. Cleave mutant templates from the array, amplify, and sequence on Illumina to measure relative construct abundance.
3. Perform in vitro transcription, then Illumina RNA-Seq, to measure the expression of each construct.

Dr. Shendure noted that there was some sequencing bias between barcodes, so they used multiple barcodes (6) per mutant construct and normalized the results. Then, by combining the construct abundance data (Seq) and the expression data (RNA-Seq) for mutants and comparing them to the results for the wild-type construct, they could assess the functional impact of each synthesized mutation on transcription.
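To make the normalization idea concrete, here is a minimal sketch of how one might combine the two read-outs. It is not the method from the talk: taking the median RNA/DNA ratio across barcodes and reporting a log2 ratio against wild type are my own assumptions, and the barcode names and counts are invented.

```python
from math import log2
from statistics import median


def construct_activity(dna_counts, rna_counts):
    """Median RNA/DNA ratio across the barcodes of one construct.

    dna_counts: barcode -> read count from sequencing the templates (abundance)
    rna_counts: barcode -> read count from RNA-Seq (expression)
    """
    ratios = [
        rna_counts.get(barcode, 0) / count
        for barcode, count in dna_counts.items()
        if count > 0  # skip barcodes never seen in the template library
    ]
    return median(ratios)


def mutation_effect(mutant, wild_type):
    """log2 of mutant activity relative to wild type; negative = reduced transcription."""
    mutant_activity = construct_activity(*mutant)
    wt_activity = construct_activity(*wild_type)
    if mutant_activity == 0:
        return float("-inf")  # no detectable transcription from the mutant
    return log2(mutant_activity / wt_activity)


# Invented counts, two barcodes per construct for brevity (the talk mentioned six).
wild_type = ({"wt_bc1": 100, "wt_bc2": 200}, {"wt_bc1": 400, "wt_bc2": 800})
mutant = ({"mut_bc1": 100, "mut_bc2": 50}, {"mut_bc1": 100, "mut_bc2": 50})

effect = mutation_effect(mutant, wild_type)
print(f"log2 effect of mutation: {effect}")
```

Using the median rather than the mean is one way to keep a single badly behaved barcode from dominating the estimate.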

As far as results go, Dr. Shendure showed a histogram: on the X-axis was each base of the core promoter region that they evaluated, and on the Y-axis, the effect of mutating that position on transcription. Most of the values were negative, indicating that mutations reduced transcriptional activity, particularly around the TATA box and the initiator (Inr) site. Essentially, the plot neatly described the footprint of RNA polymerase binding, with the mutations of largest effect centered on the TSS. Intriguingly, the single-base deletion mutants consistently showed the greatest reduction of transcription, suggesting, perhaps, that indels in promoter regions are likely to be functional variants.

Short Read Subassembly

The next area of interest was very pertinent to groups with access to next-generation sequencing, but not the 454 “length matters” platform. While Illumina read lengths are still growing (most groups currently run 75- or 100-bp protocols), they still cannot rival the ~450 bp reads consistently produced on 454 Titanium. And yet, many applications of NGS benefit from longer reads – de novo assembly, metagenomics, and the core promoter assays I’ve just described, to name a few. Thus, Shendure and his group sought to combine some Tech D cleverness with Illumina’s incredible read depth to generate localized assemblies of kilobase-length fragments.

First, they sheared DNA into fragments that were a few kilobases long, ligated adapters to the ends of each fragment, and did a round of amplification. Now they had many copies of each fragment with adapters on each end. The fragments are concatemerized, then somehow randomly sheared to variable-length pieces of the original fragment such that each piece has one of the original adapters on one end. A new adapter is ligated to the sheared end. Then there’s another round of PCR, followed by Illumina paired-end sequencing. The resulting paired-end reads (75-mers) have a “read2” that’s the same for all pieces of the same kilobase-fragment, but a read1 that comes from some random location within the fragment.

Then, it’s possible to perform a localized assembly for each kilobase fragment. It’s an interesting approach, but here’s the problem: after assembly, in their proof-of-principle experiment, they achieved a median contig size of 350 bp. Granted, the per-base quality was very high (85% of bases had Q>40), but the lengths are unimpressive. As Dr. Shendure joked, they managed to get similar read lengths to a 454 run and make it cost just as much. There’s still a lot of work to do. Or they could just pick up one of those cute little GS-Juniors.
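The bookkeeping behind the localized assembly is simple to sketch: bin the read pairs by their shared read2, then hand each bin to an assembler. The snippet below covers only the binning step. Exact string matching on read2 is a simplifying assumption (real data would need some tolerance for sequencing errors), and the sequences are toy values.

```python
from collections import defaultdict


def group_by_tag(read_pairs):
    """Bin read1 sequences by their shared read2 (the per-fragment tag read).

    read_pairs: iterable of (read1, read2) sequence strings.
    Returns a dict mapping each read2 to the list of read1s from that fragment.
    """
    groups = defaultdict(list)
    for read1, read2 in read_pairs:
        groups[read2].append(read1)
    return dict(groups)


# Toy read pairs: two parent fragments, identified by two distinct read2 tags.
pairs = [
    ("ACGTACGT", "TAGAAAAA"),
    ("GTACGTTT", "TAGAAAAA"),
    ("CCCCGGGG", "TAGCCCCC"),
]

groups = group_by_tag(pairs)
for tag, reads in groups.items():
    print(tag, len(reads), "reads to assemble locally")
```

Each bin is then a small, independent assembly problem, which is what makes the approach tractable at Illumina depth.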

Human Exomes and Mendelian Disease

Finally, Dr. Shendure gave an overview of last year’s elegant Nature paper, in which exome sequencing of four individuals, followed up by careful downstream informatics, correctly identified the causative gene. Their defined “exome” was 30 Mb, which they targeted using two solid-phase array capture chips. Illumina sequencing of the exome capture generated about 6.4 gigabases per individual. Exome sequencing makes a lot of sense in certain Mendelian disorders, where (1) the pattern of inheritance, e.g. autosomal recessive, is known, and (2) the causative mutations occur in a single gene.

By sequencing the exomes of multiple individuals, isolating what we’d call “tier 1” variants – nonsynonymous, nonsense, splice site, or frameshift-indel – and then removing all known common variants from public databases, Dr. Shendure and colleagues can reduce 20,000 gene candidates down to a handful. It worked out beautifully in the Nature paper – all four individuals had rare, tier 1 mutations in the same gene.
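The filtering logic can be sketched in a few lines. This is an illustration of the idea rather than the pipeline from the paper: the record layout, the effect labels, and the gene and variant names are all hypothetical.

```python
# Effect labels treated as "tier 1" in this sketch (hypothetical vocabulary).
TIER1_EFFECTS = {"nonsynonymous", "nonsense", "splice_site", "frameshift_indel"}


def candidate_genes(individuals, known_variant_ids):
    """Genes carrying a novel tier 1 variant in every affected individual.

    individuals: list of variant lists, one per person; each variant is a dict
                 with "id", "gene", and "effect" keys.
    known_variant_ids: set of variant ids already in public databases.
    """
    per_person = []
    for variants in individuals:
        genes = {
            v["gene"]
            for v in variants
            if v["effect"] in TIER1_EFFECTS and v["id"] not in known_variant_ids
        }
        per_person.append(genes)
    return set.intersection(*per_person) if per_person else set()


# Toy data: two affected individuals and one known (database) variant.
person1 = [
    {"id": "v1", "gene": "GENE_A", "effect": "nonsense"},
    {"id": "rs_known", "gene": "GENE_B", "effect": "nonsynonymous"},
    {"id": "v2", "gene": "GENE_C", "effect": "synonymous"},
]
person2 = [
    {"id": "v3", "gene": "GENE_A", "effect": "nonsynonymous"},
    {"id": "rs_known", "gene": "GENE_B", "effect": "nonsynonymous"},
]
known = {"rs_known"}

shared = candidate_genes([person1, person2], known)
print("Candidate genes:", sorted(shared))
```

Note that the two individuals need not share the same variant, only the same gene, which is why the intersection is taken over gene names.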

But in another cohort (4 individuals from 3 kindreds with Miller syndrome, a rare developmental disorder) Dr. Shendure and colleagues discovered the danger of overfiltering. They removed all variants from dbSNP 129, but when they limited the scope to only mutations predicted to be “damaging” or “deleterious”, the number of genes dropped to zero. Apparently the deleteriousness of at least one of the causal mutations wasn’t predicted correctly.

Obviously, the need is for better filters of common variants. But with projects like the 1,000 Genomes in full swing, I wonder, will filtering out using dbSNP get better, or worse? Already, as Shendure pointed out, certain genes have basically a SNP reported at every position. I know that TP53 does. What’s more, with the advent of next-generation sequencing, I hate to tell you, but people are going to be reporting a lot of false positives. I guarantee it. So when you filter all of the variants, you might actually remove the ones you’re looking for.

References
Ng SB, Turner EH, Robertson PD, Flygare SD, Bigham AW, Lee C, Shaffer T, Wong M, Bhattacharjee A, Eichler EE, Bamshad M, Nickerson DA, & Shendure J (2009). Targeted capture and massively parallel sequencing of 12 human exomes. Nature, 461 (7261), 272-6 PMID: 19684571

Cancer Genomics Meeting in St. Louis

December 3, 2009 by Dan Koboldt

The Genome Center at Washington University is currently hosting a remarkable two-day event focused on the study of cancer genomics. Yesterday there was a symposium on the School of Medicine campus, featuring speakers from major genome centers around the world, who delivered an excellent series of talks on recent advances in cancer genome research. Here were the highlights.

Mutational Signatures in Lung Cancer (Peter Campbell, WTSI)

First up was Peter Campbell from Wellcome Trust Sanger Institute, who presented their first cancer genome – a small-cell lung cancer cell line called NCI-H209. Using the ABI SOLiD platform, they sequenced NCI-H209 and a matched B-cell sample from the same individual to 30-40x coverage. Extensive PCR-based validation yielded almost 23,000 somatic substitutions and over a hundred structural events (indels and rearrangements). As expected, the mutational spectrum was enriched for G->T and C->A changes associated with adduct formation on guanine nucleotides induced by benzopyrene, the chemical mutagen found in tobacco smoke. Dr. Campbell also described some of the complex rearrangements observed in the paired-end sequencing data, which were particularly convincing when overlaid with spectral karyotyping images.

Next-Gen Sequencing Strategies to Study Cancer Genomes (Elaine Mardis, WU)

Next was our own Elaine Mardis, who gave an excellent overview of the strategies developed here to apply NGS to cancer genomes. She described five key elements to success in this arena:

1. Genomic characterization prior to sequencing. For example, at WashU we type tumor and normal samples on genome-wide SNP arrays, which yield tumor purity/ploidy estimates, LOH information, and a dense set of SNPs for tracking the coverage of genomes by Illumina sequencing.
2. Resource characterization. The tissue preservation method, DNA/RNA quality and quantity, and pathology information are all critical components. Also important are high-quality clinical data (diagnosis, chemotherapy/radiation protocols, and outcome), informed consent, IRB approval, and additional cases of the same cancer subtype for recurrence screening.
3. Data production capacity. US genome centers seem to have this, either in the form of Illumina (WashU and Broad) or ABI SOLiD (Baylor). It’s not just the throughput of the machines, either – it’s the ability to construct sequencing libraries from ever-shrinking DNA inputs. Tumor samples are precious, and the ability to use only a tiny amount of DNA or RNA while achieving informative results is one of the key areas of focus of tech development groups.
4. Informatics and bioinformatics. We have entire groups devoted to LIMS, pipeline automation, medical genomics, and sequence data submission. Other important elements of bioinformatics that Elaine touched on were data display interfaces for collaborators and high-end data storage and computational infrastructure.
5. Validation and recurrent site screening. This is the essential coup de grace for tumor genome characterization, in which we validate somatic mutations and identify those that are recurrent in other samples of the same subtype, the best indication that we currently have of pathological relevance.

Elaine also discussed the rapid scaling up of TCGA (which is adding 20 tumor types thanks to ARRA funds) and other projects, which will only exacerbate the challenges of scale that NGS platforms have already presented.

Integrating Genomics with Biology (Richard Gibbs, Baylor)

Richard Gibbs gave an action-packed talk on some relevant work going on at Baylor, both for cancer and inherited diseases. They are applying an intriguing if controversial multiple-platform strategy for whole genome sequencing: deep (20-30x) coverage on ABI SOLiD and light (6-10x) coverage on 454. “We’re just telling people that if you do it twice, you’ll get it right,” Dr. Gibbs said. One interesting project is an investigation of Charcot-Marie-Tooth (CMT) syndrome, a recessive inherited disorder where the locus is unknown. Whole-genome sequencing of an affected individual on ABI SOLiD identified a few dozen novel missense mutations; among them lurked the causal variant, which was found to segregate with the disease in a family cohort.

Dr. Gibbs also gave an overview of their investigations into heritable variants in pediatric cancers (in collaboration with MD Anderson). There’s also a lot of work under way for TCGA, not just the 6K capture project, but also adjunct analyses of gene expression, DNA copy number, microRNA, and DNA methylation data being generated on TCGA samples.

Insights into Rare Tumors (Steven Jones, BC Cancer Agency)

Steven Jones from BC Cancer Agency retold the story of the rare tongue adenocarcinoma that I heard at AGBT 2009. What I didn’t know about BCCA is that under the Canadian universal healthcare system, they see all of the cancer patients in the surrounding population of over 4 million citizens. One of these was a rare one – an 80-year-old man with adenocarcinoma of the tongue. It was removed surgically, of course, but in a short time metastasized to the lungs. The clinician prescribed erlotinib, an EGFR inhibitor, but unfortunately the patient did not respond. To help the patient, and also make some advances in tech development, Jones and his colleagues did whole-genome sequencing and RNA-Seq of the tumor samples and matched normals. There were just four somatic mutations: two in known cancer genes and two in zinc finger proteins (these remain unexplained). Transcriptome and copy number analysis showed that the tumor had loss of PTEN and down-regulation of SMAD4. Unfortunately, it had recently been shown that tumors lacking PTEN and TP53 don’t respond to TK inhibitors like erlotinib. However, this particular tumor showed an amplification of Ret, and as it happened, the drug bank had a single drug, sunitinib, that was known to inhibit Ret. The patient’s response, initially, was quite dramatic – all of the metastases vanished. Sadly, several months later they turned up again, and this time were resistant even to sunitinib. Still, the results of this effort were promising, because genomic information was used to keep cancer at bay, if only for a short time.

Genomic Medicine in Pediatric Brain Tumors (Ching C. Lau, Baylor)

Ching Lau of Baylor presented genomic studies of medulloblastoma (MBM), which accounts for 20% of all brain tumors and has a 60% survival rate. Classification of MBM patients in the past was relatively crude – based on the amount of residual tumor post-surgery and metastatic status. Using gene expression profiling, Lau and colleagues identified 4-5 distinct clusters. Two clusters were associated with known cancer pathways – SHH signaling and WNT activation. The same four clusters could also be isolated by unsupervised miRNA clustering. Also, gene expression analysis showed that ERBB2 expression correlates with outcome (higher expression = poor prognosis).

Finally, Dr. Lau mentioned some future directions for targeted cancer therapy. One of these that I readily admit I don’t understand: cytotoxic T-cells with Chimeric TCRs. Evidently these are T-cells that recognize and attack cancer cells in the body. There was a short movie, courtesy of Dr. Lau’s collaborators, in which we saw these specially programmed immune cells recognizing and attacking a tumor that was roughly four times their size. It was like watching ants swarm a piece of fruit on the sidewalk, and very compelling.

Evolution of a Breast Cancer Tumor (Samuel Aparicio, BC Cancer Agency)

Dr. Aparicio presented a study recently published in Nature and already discussed on Massgenomics. However, he did discuss the continuing challenge of mutation heterogeneity in tumors – we can no longer refer to mutations as present or absent, but instead should report their frequency, which represents the proportion of clones with each mutation. The question of how deep we need to sequence to find the very rare variants has yet to be answered.
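Reporting a frequency instead of a present/absent call usually starts from read counts at the mutated position. Here is a minimal sketch with invented sample names and counts. Keep in mind that the raw read fraction only approximates the proportion of clones carrying the mutation, since tumor purity and copy number also shift it.

```python
def variant_allele_frequency(variant_reads, total_reads):
    """Fraction of reads at a position that support the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


# Invented (variant reads, total reads) at one mutated site in three samples.
read_counts = {
    "normal": (0, 120),
    "primary tumor": (30, 100),
    "metastasis": (45, 90),
}

frequencies = {
    sample: variant_allele_frequency(variant, total)
    for sample, (variant, total) in read_counts.items()
}

for sample, freq in frequencies.items():
    print(f"{sample}: {freq:.2f}")
```

The deeper the coverage at a site, the smaller the frequency that can be distinguished from sequencing error, which is the depth question raised in the talk.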

Breast Cancer Genomics (Matthew Ellis, Siteman Cancer Center)

Matthew Ellis, our collaborator from the Siteman Cancer Center, presented very recent work we’ve done on a basal subtype breast cancer. A quartet of samples was sequenced in this study – the primary breast tumor, the matched normal tissue, the brain metastasis (from which the patient died), and finally, a mouse xenograft model developed in “humanized” NOD/SCID mice. We validated some 50 tier 1 mutations, all of which were detected (at some level) in all four samples. Deep read counts for these mutations in each sample revealed some interesting stories about the progression of the cancer from tumor to metastasis.

Genomic Signatures and Cancer (Todd Golub, Broad / Dana Farber)

Todd Golub of the Broad Institute and Dana-Farber Cancer Institute presented his group’s work on hepatocellular carcinoma (liver cancer), which is the fifth most common cancer worldwide. It’s a disease of growing concern on the African and Asian continents, and presents numerous challenges. Molecular classification “is a mess,” Dr. Golub said, and recurrence is common. The problem is that there are few frozen samples with long-term outcome information. Thus, Dr. Golub and his group applied the Illumina DASL assay – which enables very small, highly multiplexed, locus-specific PCR – to perform expression profiling in formalin-fixed paraffin-embedded (FFPE) samples. They achieved up to 90% success across 6,000 genes in samples that were 25 years old. Doing so opened up a vast bank of viable samples for gene expression profiling, from which Dr. Golub and colleagues made some interesting findings.

The AML Genome (Tim Ley, WashU and Siteman Cancer Center)

Tim Ley gave the last talk, which highlighted the work that he and colleagues at WashU began around a decade ago on acute myeloid leukemia (AML). Our goal, he said, was to find 95% of the mutations that occur in at least 5% of AMLs. To do so will require whole-genome sequencing of at least 30 genomes, according to statistics from my colleague Mike Wendl. Two of these (AML-1 and AML-2) are already done and published, and a number of others are currently under way. One intriguing bit of work that Dr. Ley described was on the “Mouse APL” project, a knock-in mouse with the PML-RARA gene fusion backcrossed 10+ generations to C57BL/6 mice. This yielded inbred strains of mice, some of which developed AML after ~6 months, presumably after acquiring “cooperative” mutations. One mouse was sequenced to 15x coverage, and among the handful of somatic nonsynonymous mutations found, one was recurrent, not only in the APL mice, but also in the same gene in human tumors.