    The Current State of dbSNP

    January 24th, 2012

    Contents: dbSNP Growth | Build 135 Stats | Variant Composition | Function Classes | SNPs and Indels | Coding/Noncoding Tiers
    Less than a decade ago, the leading experts estimated that there were approximately 10 million SNPs in the human genome. Those were the early days of post-genome research, when “The SNP Consortium” was formed and began BAC overlap comparisons to routinely identify and report SNPs. Believe it or not, in my old lab there were binders full of paper records documenting the evidence for each newly discovered SNP. These variants were submitted to a central repository of human sequence variation hosted at NCBI, appropriately named dbSNP.

    Growth of dbSNP

    The database has grown substantially, already exceeding the 10 million mark by 2006:

    [Figure: dbSNP growth from sequencing]

    I highlighted some of the key driving forces of this growth that I happen to know about. These include the “BAC overlap” project of the SNP Consortium and similar SNP discovery efforts (2001-2003), The HapMap Project Phases I (2003-2005) and II (2005-2007), the advent of next-generation sequencing, of course, and most recently the 1,000 Genomes Project. You probably noticed a few trends in the figure above:

    1. Less-frequent dbSNP updates. In 2003-2004 when the HapMap consortium direly needed new loci, dbSNP was updating almost every month. New build releases have slowed down considerably, probably because (1) they’re less critical, and (2) it’s a much bigger job.
    2. Overall, and quite obviously, there’s been a rapid increase in submissions over time, with some phases of near exponential growth.
    3. The relationship between submissions (blue) and unique refSNP clusters (red). You’ll note that dbSNP gets more and more submissions, of which a shrinking fraction are truly novel loci.

    Still, by 2009, there were about 18 million unique SNPs, nearly twice the predicted number. And large variant discovery projects fueled by next-generation sequencing, such as the 1,000 Genomes Project, were just ramping up.

    The Current State: dbSNP Build 135

    Downloading the dbSNP database is not for the faint of heart. Even for bioinformaticians, the file formats offered (ASN1?) are somewhat intractable compared to BED files. I prefer instead to wait until the excellent team at the UCSC Genome Browser Database releases their annotation tracks for dbSNP builds, which contain the necessary information in far more accessible formats. They have just done so for build 135, and I did some quick-and-dirty parsing to come up with some statistics.

    dbSNP Variant Composition

    You might be surprised to learn that dbSNP contains not just SNPs, but several types of DNA sequence variation:

    [Figure: dbSNP variant composition]

    In the current build there are 54,212,076 unique variants with RS numbers, of which 47.8 million, or 88%, are single nucleotide polymorphisms. The remainder comprises insertion-deletion variants (indels, 11%), multiple nucleotide polymorphisms (MNPs, 0.1%), and ~420,000 variants of other classes (named, mixed, and microsatellite). The named variants are old-school genetic markers (e.g. DS128384). Mixed polymorphisms are messy loci where multiple variant types (e.g. DNP and indel) are seen. Microsatellites, of course, are long stretches of repetitive sequences, such as di-nucleotide or tri-nucleotide repeats, whose length varies between individuals. Among these are the 15 short tandem repeats (STRs) utilized for forensic DNA profiling in CODIS, the FBI’s national DNA database.
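
    For readers who want to reproduce this kind of tally, here is a minimal sketch of the quick-and-dirty parsing, not the script actually used for this post. It assumes a tab-delimited annotation dump in which the variant class sits in the 12th column (0-based index 11); that column position and the function names are assumptions to check against the table schema you download.

```python
from collections import Counter

# Assumption: 0-based index of the variant "class" column in the
# tab-delimited dump. Verify against the schema of the file you use.
CLASS_COLUMN = 11


def count_variant_classes(lines, class_column=CLASS_COLUMN):
    """Tally variant classes from tab-delimited annotation lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= class_column:
            continue  # skip malformed rows rather than guessing
        counts[fields[class_column]] += 1
    return counts


def composition(counts):
    """Convert raw counts into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}
```

    Passing an open file handle as `lines` streams the table one row at a time, which matters for a file with tens of millions of rows. As a sanity check on the numbers above, 47.8 million out of 54,212,076 works out to about 88%.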

    dbSNP Function Classification

    Variants in dbSNP are classified by their relationship to NCBI’s view of known protein-coding genes. There are about a dozen “function class” categories, but they can be grouped together into five types of sequences:

    [Figure: Gene locations of dbSNP variants]

    You will note that the vast majority have a function classification of “Unknown”, suggesting that these are non-coding variants not immediately adjacent to NCBI protein-coding genes. Even for variants in or around genes, 90% are classified as intronic. If we break down the variants that are in coding regions according to dbSNP:

    [Figure: Breakdown of dbSNP coding variants]

    You can see that the majority of coding variants (just over half a million) are classified as “missense”, meaning that they’re predicted to cause an amino acid substitution in the encoded protein. Most of the remainder are silent (synonymous), though there are also around 40,000 variants predicted to cause premature termination (nonsense) or a shift in translation frame (frameshift) in the encoded protein.

    Homing in on SNPs and Small Indels

    For next-generation sequencing analysis, I’m generally interested in two types of variation represented in dbSNP: SNPs and small (<50 bp) indels.

    The other types are either uncommon or too large to be readily detected with short reads, and there are curated, dedicated databases that probably do a better job of representing them (e.g. Database of Genomic Variants for large indels and structural variants). Also, although the dbSNP functional classification is useful, we use an internal “tiering” system to represent variants according to their locations in the genome:

    • Tier 1 variants affect coding sequences, including exons, splice sites, and non-coding RNA genes
    • Tier 2 variants occur in evolutionarily conserved or putative regulatory sequences
    • Tier 3 variants are in non-coding, non-conserved, unique regions of the genome
    • Tier 4 variants are in repetitive regions of the genome

    Every base in the reference sequence falls into one, and only one, tier. Build 36 (hg18) of the human reference sequence is broken down to the right. There are 44 megabases of “tier 1” coding sequence in the human genome; that’s 1.53%, straight out of the textbooks. Tier 2 comprises 248 megabases, or 8.6%, which is slightly higher than the 5% expected rate of evolutionary conservation, probably because we’re fairly inclusive with what constitutes a putative regulatory element.
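
    The “one, and only one, tier” rule can be sketched as a simple priority check. This is an illustration rather than our internal implementation: the three boolean flags are hypothetical annotations, and the assumption that tiers are tested in order (1, then 2, then 3 versus 4) is mine. The 2,876 megabase genome size used in the usage note is back-calculated from the figures above (44 Mb at 1.53%), not an official number.

```python
def assign_tier(is_coding, is_conserved_or_regulatory, is_repetitive):
    """Return the single tier (1-4) for a position, testing tiers in priority order.

    The flags are hypothetical per-position annotations; the priority
    order is an assumption made for this sketch.
    """
    if is_coding:
        return 1  # coding sequence, splice sites, non-coding RNA genes
    if is_conserved_or_regulatory:
        return 2  # conserved or putative regulatory sequence
    if not is_repetitive:
        return 3  # non-coding, non-conserved, unique sequence
    return 4      # repetitive sequence


def percent_of_genome(tier_megabases, genome_megabases):
    """Share of the genome occupied by a tier, in percent."""
    if genome_megabases <= 0:
        raise ValueError("genome_megabases must be positive")
    return 100.0 * tier_megabases / genome_megabases
```

    Because each position returns from the first matching branch, a base that is both coding and repetitive lands in tier 1 only, so the four tiers partition the genome. With the back-calculated genome size, `percent_of_genome(44, 2876)` comes out at roughly 1.53.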

    Distribution of SNPs and Indels by Tier

    Next, we look at the distribution of dbSNP’s ~48 million SNPs and ~6 million small indels among the four tiers of genome space:

    [Figure: dbSNP variants by tier]

    Strikingly, fewer than 10% of variants of both types fall into regions that are “interpretable”, whereas the rest are in noncoding regions. The proportions of variants in tier 1 (1.3% of SNPs, 1% of indels) remain lower than the tier 1 fraction of the genome (1.53%, above right), presumably due to purifying selection against changes to coding sequences. Many studies have shown this through far more careful analyses that account for ascertainment bias, population allele frequency, and other factors. It’s just fascinating to see the signature of natural selection in your basic pie chart.
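
    To put a rough number on that depletion, one can divide the observed share of variants in tier 1 by the share of the genome that tier 1 occupies. The sketch below uses only the percentages quoted in this post; the function name is mine, and the ratio ignores ascertainment bias, allele frequency, and the fact that the tier sizes were computed on build 36.

```python
def enrichment_ratio(observed_percent, expected_percent):
    """Observed share of variants divided by the share expected from genome size.

    Values below 1 indicate depletion relative to a uniform distribution.
    """
    if expected_percent <= 0:
        raise ValueError("expected_percent must be positive")
    return observed_percent / expected_percent


TIER1_GENOME_PERCENT = 1.53  # tier 1 share of the genome, from the text

snp_ratio = enrichment_ratio(1.3, TIER1_GENOME_PERCENT)
indel_ratio = enrichment_ratio(1.0, TIER1_GENOME_PERCENT)
```

    Both ratios fall below 1, and the indel ratio is lower than the SNP ratio, which is at least consistent with stronger selection against indels in coding sequence.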

    I’m uncertain why the distributions in tiers 3 and 4 differ between variant types above, but there are likely a number of contributing factors. From a biological perspective, indels are both less frequent and subjected to greater natural selection than SNPs. From a technical perspective, SNP discovery algorithms are far more mature than indel discovery algorithms, owing in part to the difficulties of detecting the latter in relatively short sequence reads. We are currently, and have always been, better at finding SNPs than indels. With luck, the “accuracy gap” between SNPs and indels will diminish as sequencing technologies and detection algorithms continue to evolve.

    References

    Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, & Sirotkin K (2001). dbSNP: the NCBI database of genetic variation. Nucleic acids research, 29 (1), 308-11 PMID: 11125122


    Genomic Structural Variation: Methods & Protocols

    January 18th, 2012

    The draft human genome sequence, completed more than a decade ago, was an important starting point for understanding genetic variation in humans. Intensive efforts to characterize single-nucleotide polymorphisms (SNPs), and later the discovery of extensive copy number variation (CNV) and structural variation, have highlighted the complex and dynamic nature of our genome.

    Earlier this month, Springer Press published Genomic Structural Variants, an outstanding new volume edited by Lars Feuk of Uppsala University in Sweden. This book provides an in-depth description of key developments in our understanding of structural variation and its implications for human disease.

    Most, if not all, of these advances have been driven by technological innovation. The editor writes, “Over the past decade, the introduction of array-based technologies has revolutionized genomics and genetic diagnostics… Now, we are on the brink of a paradigm shift in genetics with the advent of massively parallel sequencing in research and diagnostics.”

    The contributors include James Lupski, Stephen Scherer, Ira Hall, Aaron Quinlan, Deanna Church, Bauke Ylstra, Richard K. Wilson, and a number of other distinguished scientists. Here are the topics covered, with my own brief summary and a link to each article.

    Genome Architecture
    What Genomic Disorders Have Taught Us
    SV in Subtelomeres
    Complex Regions of the Genome

    Nature of Structural Variation
    SV Effect on Gene Expression
    CNV Population Genetics
    SV and Somatic Mosaicism

    Detection of SV
    Interpreting SVs in Personal Genome Sequencing
    Massively Parallel Sequencing Approaches for SVs
    Array-based Prenatal Diagnosis
    GSV in Mammals

    SV and CNV in Human Disease
    Microdeletion and Microduplication Syndromes
    SV in Intellectual Disability
    CNV in Autism Spectrum Disorder
    CNV and Psychiatric Disease Risk

    Methods and Resources
    Online SV Resources
    SNP Array Algorithms for CNV
    Targeted Screening of CNVs
    Array-CGH of FFPE Tissues

    What Have Studies of Genomic Disorders Taught Us About Our Genome?

    Alexandra D. Simmons, Claudia M. B. Carvalho and James R. Lupski
    An overview of high-resolution analysis methods and what they’ve taught us about the architectural features, structure, and rearrangement mechanisms of the genome.   Article | PDF

    Microdeletion and Microduplication Syndromes

    Lisenka E. L. M. Vissers and Paweł Stankiewicz
    An overview of how microdeletions and microduplications form in the genome and the wide variety of phenotypes — including Mendelian and complex diseases — that they can cause.   Article | PDF

    Structural Genomic Variation in Intellectual Disability

    Rolph Pfundt and Joris A. Veltman
    A review of detection and interpretation of copy number variations in mental retardation, with a focus on diagnostic application and interpretation.   Article | PDF

    Copy Number Variation and Psychiatric Disease Risk

    Rebecca J. Levy, Bin Xu, Joseph A. Gogos and Maria Karayiorgou
    An update on the substantial progress toward understanding the role of rare CNVs in the etiology of complex psychiatric diseases, such as schizophrenia.   Article | PDF

    Detection and Characterization of Copy Number Variation in Autism Spectrum Disorder

    Christian R. Marshall and Stephen W. Scherer
    A description of the history of genomic structural variation in ASD and how CNV discovery has been used to pinpoint novel ASD-susceptibility loci.   Article | PDF

    Structural Variation in Subtelomeres

    M. Katharine Rudd
    A guide to the composition and structural variation of subtelomeres, and how FISH and array technologies have been applied to characterize them.   Article | PDF

    Array-Based Approaches in Prenatal Diagnosis

    Paul D. Brady, Koenraad Devriendt, Jan Deprest and Joris R. Vermeesch
    An overview of the recent developments on the use of array CGH in the prenatal setting and a discussion of how to best implement it.   Article | PDF

    Structural Variation and Its Effect on Expression

    Louise Harewood, Evelyne Chaignat and Alexandre Reymond
    A discussion of the profound and dramatic effect that SVs can have on the expression of genes mapping within them, nearby, and elsewhere in the genome.   Article | PDF

    The Challenges of Studying Complex and Dynamic Regions of the Human Genome

    Edward J. Hollox
    A review of key advances in the understanding of the variable structure of our genome, and a discussion of methods that may allow us to analyse this structure in fine detail.   Article | PDF

    Population Genetic Nature of Copy Number Variation

    Per Sjödin and Mattias Jakobsson
    An update on recent progress on understanding CNVs, and discussion of population genetics, recombination, mutation, selection, and demography of these variants.   Article | PDF

    Detection and Interpretation of Genomic Structural Variation in Mammals

    Ira M. Hall and Aaron R. Quinlan
    A summary of the current state of knowledge of SV in mammals, and an exploration of the key biological insights that can be gained by applying NGS methods to model organisms.   Article | PDF

    Structural Genetic Variation in the Context of Somatic Mosaicism

    Jan P. Dumanski and Arkadiusz Piotrowski
    A review combining evidence of structural variation in the context of somatic cells, highlighting the methodological aspects of detection, challenges, and opportunities related to this field.   Article | PDF

    Online Resources for Genomic Structural Variation

    Tam P. Sneddon and Deanna M. Church
    A description of current structural variation online resources highlighting how major databases have addressed the challenges in capturing, storing, and displaying SV data.   Article | PDF

    Algorithm Implementation for CNV Discovery Using Affymetrix and Illumina SNP Array Data

    Laura Winchester and Jiannis Ragoussis
    A review of approaches to detect SVs by SNP array intensities, the importance of quality control, and some guidelines for implementation.   Article | PDF

    Targeted Screening and Validation of Copy Number Variations

    Shana Ceulemans, Karlijn van der Ven and Jurgen Del-Favero
    A description of methods used for SV screening and validation, including FISH, qPCR, paralogue ratio test, molecular copy-number counting, multiplex PCR, and others.   Article | PDF

    High-Resolution Copy Number Profiling by Array CGH Using DNA Isolated from Formalin-Fixed, Paraffin-Embedded Tissues

    Hendrik F. van Essen and Bauke Ylstra
    A series of protocols tailored to array CGH of FFPE solid malignancies: from sectioning FFPE blocks to specific cynosures for pathological revision, DNA isolation, quality testing, and amplification.   Article | PDF

    Characterizing and Interpreting Genetic Variation from Personal Genome Sequencing

    Anna C. V. Johansson and Lars Feuk
    An overview of whole-genome sequences completed to date and the challenge of interpreting the whole-genome sequence data both from a technical and clinical perspective.   Article | PDF

    Massively Parallel Sequencing Approaches for Characterization of Structural Variation

    Daniel C. Koboldt, David E. Larson, Ken Chen, Li Ding and Richard K. Wilson
    Our own contribution to this volume is a review of methods and software applications for detecting, assembling, and characterizing SVs by next-generation sequencing.
    Article | PDF

    References
    Levy RJ, Xu B, Gogos JA, & Karayiorgou M (2012). Copy number variation and psychiatric disease risk. Methods in molecular biology (Clifton, N.J.), 838, 97-113 PMID: 22228008

    Johansson AC, & Feuk L (2012). Characterizing and interpreting genetic variation from personal genome sequencing. Methods in molecular biology (Clifton, N.J.), 838, 343-67 PMID: 22228021

    Koboldt DC, Larson DE, Chen K, Ding L, & Wilson RK (2012). Massively parallel sequencing approaches for characterization of structural variation. Methods in molecular biology (Clifton, N.J.), 838, 369-84 PMID: 22228022

    Marshall CR, & Scherer SW (2012). Detection and characterization of copy number variation in autism spectrum disorder. Methods in molecular biology (Clifton, N.J.), 838, 115-35 PMID: 22228009


    A Tumor Evolved: Relapsed Acute Myeloid Leukemia

    January 13th, 2012

    Contents: AML Sequencing | Capture Validation | Somatic Alterations | Mutation Profile | Evolution and Clonality | Convergent on IDH2 | Carl Zimmer Article | References
    Acute myeloid leukemia (AML) is a cancer of myeloid blood cells, in which abnormal white blood cells accumulate in the bone marrow and interfere with normal blood cell production. This is a highly malignant tumor affecting 13,000 adults in the United States each year; if left untreated, it progresses rapidly and leads to death within weeks or months. The standard treatment is chemotherapy: induction therapy to achieve remission, followed by consolidation therapy to eliminate any residual disease. Most of the 8,800 annual deaths in the United States are of patients who relapse with a tumor that has undergone clonal evolution at the cytogenetic level. These relapsed tumors, unlike primary AML, are resistant to chemotherapy and progress rapidly.

    Whole-genome sequencing of Relapsed AML

    By sequencing the complete genomes of primary tumor, relapsed tumor, and matched normal (skin) samples from 8 AML patients, our group was able to study clonal evolution of AML tumors at the genetic level. The patients comprised 5 different French-American-British hematological subtypes; the time to relapse after initial diagnosis ranged from 235 to 961 days. One of these 8 cases was from patient UPN 933124, which we informally call “AML1” and whose primary tumor was the first cancer genome to be published, also by our group, back in 2008. Whole-genome sequencing of the 8 cases (24 samples) achieved >25x haploid coverage with >97% of diploid alleles represented.

    Custom capture and deep sequencing validation

    One technical achievement of this project was the high-throughput, deep sequencing validation strategy. For each patient, we designed a custom hybridization capture array (Nimblegen) targeting all candidate somatic events from the primary tumor and relapse sample (median: 539 per case). This enabled genome-wide validation of all variants in uniquely mapped regions, providing a sufficiently large set of data points for clonality analyses. Such a strategy is especially critical in AML, as tumors harbor relatively few somatic coding mutations (~21 per tumor in this study; the range is typically 5 to 50). Fragmented DNA from the primary tumor, relapse sample, and skin (normal) sample was individually hybridized with the validation probeset. Captured libraries then underwent deep sequencing on the Illumina platform, achieving a median of 590-fold coverage for each site. Such high redundancy of coverage enabled us to accurately compute allele frequency, and thereby, the fraction of tumor cells harboring each mutation.
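
    Why does depth matter so much for allele frequency? A rough way to see it is the binomial standard error of the estimate, which shrinks with the square root of the read depth. The sketch below is a textbook approximation under the assumption of independent reads, not the statistical model used in the paper; the function names are mine.

```python
import math


def variant_allele_frequency(variant_reads, total_reads):
    """Fraction of reads at a site that support the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


def vaf_standard_error(vaf, total_reads):
    """Binomial standard error of an allele-frequency estimate at a given depth.

    Assumes reads are independent draws, which real data only approximate.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return math.sqrt(vaf * (1.0 - vaf) / total_reads)


# Compare whole-genome depth (25x) with capture validation depth (590x)
# for a variant whose true allele frequency is 50%.
se_wgs = vaf_standard_error(0.5, 25)
se_capture = vaf_standard_error(0.5, 590)
```

    At 25x the standard error is about 10 percentage points, too coarse to separate subclones whose allele frequencies differ by a few percent; at 590x it drops to roughly 2 points.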

    Patterns of somatic alteration

    We validated a total of 4,315 somatic events genome-wide. These included tier 1 (coding), tier 2 (conserved/regulatory), and tier 3 (unique noncoding) variants.

    [Figure: Validated somatic mutations in relapsed acute myeloid leukemia (Credit: Ding et al, Nature 2012)]

    As expected, tier 1 mutations comprised the smallest category and tier 3 the largest in all tumors. This generally reflects the proportion of the genome in each tier (tier 1 is only about 1.5%), though we and other groups have observed that the mutation rates of many tumors are lower in coding regions, likely due in part to transcription-coupled DNA repair. We utilized exome data for 200 AML cases sequenced by the Cancer Genome Atlas research network to identify recurrently mutated genes. These included:

    • Known AML genes, including DNMT3A, FLT3, NPM1, IDH1, IDH2, WT1, RUNX1, PTPRT, PHF6, and ETV6
    • Novel recurrently-mutated genes, including WAC, SMC3, DIS3, DDX41, and DAXX.

    Details on the recurrently mutated genes, and structural and functional analysis of somatic rearrangements, are provided as supplementary materials.

    Mutational Profile of Relapsed AML

    AML1 (UPN 933124) exemplifies the analysis approach we applied to relapsed AML. There were 413 validated somatic events in AML1, of which 78 were relapse-specific, 5 were primary-tumor-specific, and 330 were shared between tumors. Deep sequencing validation revealed some interesting allele frequency patterns for validated mutations:
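
    The shared / primary-specific / relapse-specific split is essentially a set comparison. Here is a minimal sketch, assuming each validated event is keyed by a hypothetical (chromosome, position, reference, variant) tuple; in practice, presence or absence was judged from deep read counts rather than simple membership.

```python
def classify_events(primary_events, relapse_events):
    """Partition validated somatic events by the tumor(s) in which they appear."""
    primary = set(primary_events)
    relapse = set(relapse_events)
    return {
        "shared": primary & relapse,
        "primary_specific": primary - relapse,
        "relapse_specific": relapse - primary,
    }
```

    The three groups are disjoint by construction, so their sizes add up to the total number of events, just as 330 + 5 + 78 = 413 for AML1.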

    [Figure: Mutation allele frequency in AML1, showing tumor clones (Credit: Ding et al, Nature 2012)]

    You will note that most of the somatic events found in the primary tumor were also present in the relapse and vice-versa. Assuming that all mutations are heterozygous (which is likely), the observed allele frequencies suggest that shared mutations are present in virtually all tumor cells. The allele frequencies are higher in the primary tumor because its tumor cellularity, or purity (93.7%) was higher than that of the relapse (84.5%). Notably, we estimated that the tumor content of the “normal” skin sample was 29% due to infiltrating leukemic cells. This illustrates a key challenge in studying leukemia: most of the somatic mutations are observed at moderate allele frequency in the matched normal. A simplistic approach to somatic mutation detection, in which one simply subtracts all variants called in the normal from those called in the tumor, is poorly suited here. VarScan 2 and SomaticSniper are two mutation-detection algorithms developed by our group capable of addressing this problem.
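
    The relationship between purity and allele frequency can be sketched in a few lines. This assumes heterozygous mutations in copy-number-neutral diploid regions, as in the reasoning above; the function names are mine, and this is not how VarScan 2 or SomaticSniper work internally.

```python
def expected_het_vaf(tumor_fraction):
    """Expected allele frequency of a heterozygous mutation carried by every tumor cell.

    Assumes a diploid, copy-number-neutral site.
    """
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    return tumor_fraction / 2.0


def tumor_cell_fraction(vaf, purity):
    """Estimated fraction of tumor cells carrying a heterozygous mutation seen at `vaf`."""
    if purity <= 0:
        raise ValueError("purity must be positive")
    return min(2.0 * vaf / purity, 1.0)  # cap at 1.0 to absorb sampling noise


primary_vaf = expected_het_vaf(0.937)  # primary tumor purity from the text
relapse_vaf = expected_het_vaf(0.845)  # relapse purity from the text
skin_vaf = expected_het_vaf(0.29)      # tumor content of the "normal" skin sample
```

    Under these assumptions, a mutation present in every tumor cell would sit near 47% in the primary tumor, 42% in the relapse, and about 14.5% in the skin sample. A caller that discards anything seen in the normal would throw such a mutation away, which is why the comparison has to be made on allele frequencies rather than on presence or absence.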

    Tumor Clonality and Evolution

    A clustering analysis of mutant allele frequencies suggested that there were four clones (tumor subpopulations) in the primary tumor defined by distinct sets of mutations.

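    1. Clone 1 (mutations at a median allele frequency of 46.86% in the primary tumor) was the founding clone.
    2. Clone 2 (median allele frequency 24.89% in the primary tumor) was derived from clone 1.
    3. Clone 3 (median allele frequency 16.00% in the primary tumor) was derived from clone 1.
    4. Clone 4 (median allele frequency 2.39% in the primary tumor) likely arose from clone 3.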
    [Figure: Model of tumor cell evolution in AML1 (Credit: Ding et al, Nature 2012)]

    The evidence suggests that a relatively minor subpopulation of tumor cells (clone 4) survived chemotherapy and arose to become the dominant clone at relapse. In the process, it gained additional mutations, possibly via the DNA damage induced by chemotherapy. Four other AML cases in this study were consistent with this model of tumor evolution. We also observed another model, in which the dominant clone in the primary tumor gained mutations found only at relapse:

    [Figure: Dominant clone model of tumor evolution (Credit: Ding et al, Nature 2012)]

    Three of the cases in this study were consistent with this model of tumor evolution.

    Convergent Evolution of IDH2 Mutations

    Here’s an interesting side-story to this study that wasn’t really discussed in the paper. Two cases harbored mutations in isocitrate dehydrogenase 2 (IDH2), a gene known to be recurrently mutated in AML, glioblastoma, and other tumors. In patient AML28 (UPN 573988), we detected a C to A substitution at chr15:88432938 (build36) causing an amino acid change (R140L) in IDH2. This mutation was also present in the relapse sample at a moderate frequency. However, we also detected a G to A mutation at chr15:88432939 (one base downstream) affecting the same arginine residue but changing it to tryptophan (R140W) instead of leucine. These two mutations were mutually exclusive. Thus, two subclones of the same tumor both acquired activating mutations of IDH2 at the same residue, but through different nucleotide changes.
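
    To see how two different nucleotide changes can hit the same residue, here is a small sketch that applies each substitution to the arginine codon. It rests on assumptions not stated in this post: that the affected codon is CGG, that IDH2 is transcribed from the reverse strand (so a forward-strand C to A is a G to T in the coding sequence), and that the two changes fall on the second and first codon positions respectively. The helper name is mine.

```python
# Codon table restricted to the three codons used in this example.
CODON_TO_AA = {"CGG": "R", "CTG": "L", "TGG": "W"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

ASSUMED_R140_CODON = "CGG"  # assumption, not taken from the post


def apply_reverse_strand_change(codon, codon_index, ref_plus, alt_plus):
    """Apply a forward-strand substitution to a codon of a reverse-strand gene.

    The forward-strand bases are complemented to obtain the coding-strand change.
    """
    ref_coding = COMPLEMENT[ref_plus]
    alt_coding = COMPLEMENT[alt_plus]
    if codon[codon_index] != ref_coding:
        raise ValueError("reference base does not match the codon")
    return codon[:codon_index] + alt_coding + codon[codon_index + 1:]


# Forward-strand C>A at the second codon position (assumed).
leucine_codon = apply_reverse_strand_change(ASSUMED_R140_CODON, 1, "C", "A")
# Forward-strand G>A at the first codon position (assumed).
tryptophan_codon = apply_reverse_strand_change(ASSUMED_R140_CODON, 0, "G", "A")
```

    Under these assumptions the first change yields CTG (leucine) and the second yields TGG (tryptophan), matching the R140L and R140W calls: adjacent bases, one codon, two different amino acid substitutions.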

    Zimmer Article on Discover Blogs

    Carl Zimmer has a nice article on this study on his blog at Discover Blogs. Go now, and read this paper.

    References
    Ding, L., Ley, T., Larson, D., Miller, C., Koboldt, D., Welch, J., Ritchey, J., Young, M., Lamprecht, T., McLellan, M., McMichael, J., Wallis, J., Lu, C., Shen, D., Harris, C., Dooling, D., Fulton, R., Fulton, L., Chen, K., Schmidt, H., Kalicki-Veizer, J., Magrini, V., Cook, L., McGrath, S., Vickery, T., Wendl, M., Heath, S., Watson, M., Link, D., Tomasson, M., Shannon, W., Payton, J., Kulkarni, S., Westervelt, P., Walter, M., Graubert, T., Mardis, E., Wilson, R., & DiPersio, J. (2012). Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing Nature DOI: 10.1038/nature10738

     

     


    Genetic Basis of an Aggressive Pediatric Leukemia

    January 12th, 2012

    Contents: Early T-cell ALL | Whole-genome Sequencing | Genetic Architecture of ETP-ALL | A Stem-cell Leukemia

    Acute lymphoblastic leukemia (ALL) is the most common pediatric cancer, comprising two forms: B-cell ALL (85% of cases) and T-cell ALL (15% of cases). In this week’s issue of Nature, Jinghui Zhang and colleagues report the whole-genome sequencing of 12 cases of early T-cell precursor acute lymphoblastic leukemia (ETP-ALL), a recently described and aggressive subtype of T-ALL whose genetic basis was unknown. This is the first major publication of the Pediatric Cancer Genome Project, a collaborative effort between St. Jude Children’s Research Hospital and Washington University in St. Louis.

    Early T-cell precursor acute lymphoblastic leukemia

    ETP-ALL is associated with a high risk of treatment failure, and bears some distinct characteristics:

    1. Lack of expression of T-lineage cell surface markers CD1a and CD8
    2. Weak or absent expression of CD5
    3. Aberrant expression of myeloid and hematopoietic stem cell markers (such as CD13, CD33, CD34, CD117)
    4. “Early” cells that can differentiate into T-cell and myeloid lineages (but not B-cell).
    5. Gene expression profiles reminiscent of the mouse early T-cell precursor

    ETP-ALL tumors exhibit an unusually high burden of DNA copy number alterations, but no unifying genetic alterations have been identified.

    Whole-Genome Sequencing of ETP-ALL

    Zhang et al performed whole-genome sequencing on tumor samples and matched normals from 12 children with ETP-ALL. Tumor samples from two WGS cases also underwent transcriptome sequencing (RNA-seq). To extend their findings, the authors assembled a recurrence cohort of 94 T-cell ALL cases (52 ETP and 42 non-ETP). Three of the ETP samples in the extension cohort also underwent exome sequencing. On average, the authors identified 1,140 somatic mutations, including 154 that altered protein sequence, and 12 somatic structural rearrangements. More than half of the missense mutations were predicted to be deleterious, suggesting an enrichment for driver mutations involved in leukemogenesis. Notably, 51% of the validated SVs had breakpoints in protein-coding genes, including several with roles in hematopoiesis or leukemogenesis.

    Genetic Architecture of ETP-ALL

    Mutation discovery in the WGS cohort followed by recurrence testing in the extension cohort enabled the authors to identify several genetic patterns in ETP-ALL.

    Lesion Type | Pathway | Frequency | Genes
    Activating mutations | Cytokine receptor and RAS signaling | 67% | NRAS, KRAS, FLT3, IL7R, JAK3, SH2B3, BRAF
    Inactivating lesions | Hematopoietic development | 58% | GATA3, ETV6, RUNX1, IKZF1, EP300
    Inactivating lesions | Histone modification | 48% | EZH2, EED, SUZ12, SETD2, EP300

    ETP-ALL is a stem-cell leukemia

    A detailed comparison of gene expression signatures between ETP-ALL tumors and normal human hematopoietic progenitor cells revealed a somewhat surprising finding: ETP-ALL expression patterns were less consistent with early T-cell precursors, as might have been expected, but more similar to the expression profile of normal hematopoietic stem cells and granulocyte macrophage precursors. They were also enriched for genes expressed in leukemic stem cells of poor-prognosis AML. The evidence from this study suggests that the genetic alterations in ETP-ALL cause “gross maturational arrest” resulting in a poorly-differentiated, stem-cell-like leukemia. This observation raises the possibility that treatment regimens for AML, such as high-dose cytarabine, may be beneficial in treating this deadly malignancy.

    References
    Zhang, J., Ding, L., Holmfeldt, L., Wu, G., Heatley, S., Payne-Turner, D., Easton, J., Chen, X., Wang, J., Rusch, M., Lu, C., Chen, S., Wei, L., Collins-Underwood, J., Ma, J., Roberts, K., Pounds, S., Ulyanov, A., Becksfort, J., Gupta, P., Huether, R., Kriwacki, R., Parker, M., McGoldrick, D., Zhao, D., Alford, D., Espy, S., Bobba, K., Song, G., Pei, D., Cheng, C., Roberts, S., Barbato, M., Campana, D., Coustan-Smith, E., Shurtleff, S., Raimondi, S., Kleppe, M., Cools, J., Shimano, K., Hermiston, M., Doulatov, S., Eppert, K., Laurenti, E., Notta, F., Dick, J., Basso, G., Hunger, S., Loh, M., Devidas, M., Wood, B., Winter, S., Dunsmore, K., Fulton, R., Fulton, L., Hong, X., Harris, C., Dooling, D., Ochoa, K., Johnson, K., Obenauer, J., Evans, W., Pui, C., Naeve, C., Ley, T., Mardis, E., Wilson, R., Downing, J., & Mullighan, C. (2012). The genetic basis of early T-cell precursor acute lymphoblastic leukaemia Nature, 481 (7380), 157-163 DOI: 10.1038/nature10725
