Cancer hallmarks, omic data, and data resources
Anthony Gitter
Cancer Bioinformatics (BMI 826/CS 838)
January 22, 2015

What computational analysis contributes to cancer research
  • Predicting driver alterations
  • Defining properties of cancer (sub)types
  • Predicting prognosis and therapy
  • Integrating complementary data
  • Detecting affected pathways and processes
  • Explaining tumor heterogeneity
  • Detecting mutations and variants
  • Organizing, visualizing, and distributing data

Convergence of driver events

  • Amid the complexity and heterogeneity, there is some order
  • A finite number of major pathways are affected by drivers (Vogelstein2013, Hanahan2011)

Similar pathway effects
  • Tumor 1: EGFR mutation makes the receptor hypersensitive
  • Tumor 2: KRAS is hyperactive
  • Tumor 3: NF1 is inactivated and no longer modulates KRAS
  • Tumor 4: BRAF is over-responsive to KRAS signals
(Vogelstein2013)

Detecting affected pathways
(Figure: Ding2014)

Pathway enrichment

  • DAVID

Pathway discovery
  • Stimulate receptor
  • 31% of pathway is activated
  • 98% of activity is not covered
(Figure: BioCarta EGF Signaling Pathway; phosphorylation data from Alejandro Wolf-Yadlin)

Hallmarks of cancer
(Figure: Hanahan2011)

Sustaining proliferative signaling
  • Cells receive signals from the local environment telling them to grow (proliferate)
  • Specialized receptors detect these signals
  • Feedback in pathways carefully controls the response to these signals

Evading growth suppressors
  • Override tumor suppressor genes
  • Some proteins control the cell's decision to grow or switch to an alternate track
      • Apoptosis: programmed cell death
      • Senescence: halt the cell cycle
  • External or internal signals can affect these decisions

Cell cycle

(Figure: Biology of Cancer)

Resisting cell death
  • One self-defense mechanism against cancer
  • Apoptosis triggers include:
      • DNA damage sensors
      • Limited survival cues
      • Overactive signaling proteins
  • Necrosis causes cells to explode
      • Destroys a (pre)cancerous cell
      • Releases chemicals that can promote growth in other cells
(Figure: ODay)

Enabling replicative immortality
  • Cells typically have a limited number of divisions
  • Immortalization: unlimited replicative potential
  • Telomeres protect the ends of DNA
      • Shorten over time
      • Encode the number of cell divisions remaining
      • Can be artificially upregulated in cancer
(Patton2013)

Telomere shortening
(Figure: Wall Street Journal)

Inducing angiogenesis
  • Tumors must receive nutrients like other cells
  • Certain proteins promote growth of blood vessels
(Figure: LKT Laboratories)

Activating invasion and metastasis
  • Cancer progresses through the aforementioned stages
  • Epithelial-mesenchymal transition (EMT)

Emerging hallmarks
(Figure: Hanahan2011)

Genome instability and mutation
  • Cancer cells mutate more frequently
  • Increased sensitivity to mutagens
  • Loss of telomeres increases copy number alterations

Model systems in oncology
  • Cell lines: cells that reproduce in a lab indefinitely (e.g., HeLa cells)
  • Genetically engineered mice: manipulate mice to make them predisposed to cancer
  • Xenograft: implant human tumor cells into mice

Omic data types
  • DNA (genome)
      • Mutations
      • Copy number variation
      • Other structural variation
  • RNA expression (transcriptome)
      • Gene expression (mRNA)
      • MicroRNA expression (miRNA)
  • Protein (proteome)
      • Protein abundance
      • Protein state (e.g., phosphorylation)
      • Protein-DNA binding
  • DNA state and accessibility (epigenome)
      • DNA methylation (methylome)
      • Histone modification / chromatin marks
      • DNase I hypersensitivity

Next-generation sequencing (NGS)
  • Revolutionized high-throughput data collection
  • *-seq strategy
      • Decide what you want to measure in cells
      • Figure out how to select or synthesize the right DNA
      • Dump it into a DNA sequencer
  • ~100 different *-seq applications
(Figure: NODAI)

*-seq examples
(Figure: Rizzo2012)

Generating DNA templates
(Figure: Rizzo2012)

Generating reads
(Figure: Rizzo2012)

Assembly and alignment
(Figure: Rizzo2012)

Microarrays
  • High-throughput measurement of gene expression, protein-DNA binding, etc.
  • Mostly replaced by *-seq
  • Fixed probes as opposed to DNA reads

Microarray quantification
(Figures: University of Utah, Wikipedia, Wikimedia)

DNA mutations
  • Whole-exome sequencing is most prevalent in cancer
      • Only covers exons that form genes, less expensive
  • Whole-genome sequencing is becoming more widespread as sequencing costs continue to decrease
(Figure: DNA Link)

Copy number variation
  • Often represented relative to the normal 2 copies
  • Ranges from a few bases to whole chromosomes
  • Quantitative, not discrete, representation
(Figure: MindSpec)

Gene expression

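  • Transcript (messenger RNA) abundance
(Figures: Appling lab, Graz)

Genome-wide gene expression
  • Quantitative state of the cell
[Table: example expression matrix of genes (Gene 1, Gene 2, Gene 20000) by samples (Brain, Heart, Blood (normal), Blood (infected)). Cell values in extraction order: 1, 15, 87, 85, 35, 32, 2, 2, 5, 0, 65, 3.]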
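A genome-wide expression profile is naturally stored as a gene-by-sample matrix, and two samples are often compared on a log2 scale. The sketch below is only an illustration: the gene names and abundance values are made up for this example (they are not the numbers from the slide), and the pseudocount of 1 is an assumed convention to avoid dividing by zero.

```python
import math

# Hypothetical expression values in arbitrary units; not taken from the slide.
SAMPLES = ["Brain", "Heart", "Blood (normal)", "Blood (infected)"]
EXPRESSION = {
    "Gene A": [10.0, 12.0, 7.0, 31.0],
    "Gene B": [40.0, 38.0, 3.0, 3.0],
    "Gene C": [0.0, 5.0, 63.0, 15.0],
}


def log2_fold_change(expression, samples, case, control, pseudocount=1.0):
    """Return {gene: log2((case + pseudocount) / (control + pseudocount))}."""
    case_idx = samples.index(case)
    control_idx = samples.index(control)
    return {
        gene: math.log2(
            (values[case_idx] + pseudocount) / (values[control_idx] + pseudocount)
        )
        for gene, values in expression.items()
    }


changes = log2_fold_change(EXPRESSION, SAMPLES, "Blood (infected)", "Blood (normal)")
for gene, lfc in changes.items():
    # Positive values mean higher abundance in the infected sample.
    print(f"{gene}: {lfc:+.2f}")
```

With these made-up values, Gene A is up fourfold in infected blood (log2 fold change of +2), Gene B is unchanged, and Gene C is down fourfold.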

miRNA expression
  • microRNA (miRNA): ~22 nucleotides
  • Does not code for a protein
  • Regulates gene expression levels by binding mRNA
(Figure: NIH)

Protein abundance
  • Protein abundance is analogous to gene expression
      • Not perfectly correlated with gene expression
      • Harder to measure
  • Mass spectrometry is almost proteome-wide
      • Vaporize molecules
      • Determine what was vaporized based on mass/charge
(Figure: David Darling)
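To make "mass/charge" concrete: a mass spectrometer reports the ratio m/z, not the mass itself. The sketch below assumes positive-ion mode in which charge comes only from added protons, so a molecule of neutral mass M carrying z protons appears at (M + z * m_p) / z. The 1000 Da peptide is a hypothetical example, and the function name is chosen here for illustration.

```python
PROTON_MASS = 1.007276  # Da, approximate mass of one proton


def mass_to_charge(neutral_mass, charge):
    """m/z of a molecule that gained `charge` protons (assumed positive-ion mode)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# The same hypothetical 1000 Da peptide lands at different m/z per charge state.
for z in (1, 2, 3):
    print(z, round(mass_to_charge(1000.0, z), 4))
```

Because one molecule can appear at several m/z values, analysis software has to infer the charge state before it can recover the mass.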

Protein state
  • Chemical groups added to mature protein
  • Phosphorylation is the most-studied
  • Analogous to Boolean state
(Figure: Pierce)

Protein arrays
  • Currently more common in cancer datasets
  • Measure a limited number of specific proteins using antibodies
  • Protein abundance or state
(Figures: R&D, MD Anderson)

Transcriptional regulation
  • ChIP-seq directly measures transcription factor (TF) binding but requires a matching antibody

  • Various indirect strategies (Wang2012)

Predicting regulator binding sites
  • Motifs are signatures of the DNA sequence recognized by a TF
  • TFs block DNA cleavage
  • Combining accessible DNA and DNA motifs produces binding predictions for hundreds of TFs (Neph2012)

DNA methylation
  • Methylation is a DNA modification (state change)
  • Hyper-methylation suppresses transcription
  • Methylation is almost always at C (cytosine)
(Figures: Wikimedia, Learn NC)

Clinical data
  • Age, sex, cancer stage, survival
  • Kaplan-Meier plot
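A Kaplan-Meier plot is built from follow-up times plus a flag saying whether each patient's death was observed or the record was censored. The sketch below is a minimal product-limit estimator on hypothetical data. It assumes the usual convention that patients censored at a given time still count as at risk for deaths at that same time.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with at least one death.

    times: follow-up durations; events: 1 if death was observed, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical cohort: months of follow-up and event flags.
curve = kaplan_meier([5, 8, 8, 12, 15, 20], [1, 1, 0, 1, 0, 1])
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients never lower the curve directly. They only shrink the at-risk count used for later deaths, which is why the steps grow larger toward the end of follow-up.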

(Kaplan-Meier plot figure: Wikipedia)

Large cancer datasets
  • Tumors
      • The Cancer Genome Atlas (TCGA)
      • Broad Firehose and FireBrowse access to TCGA data
      • International Cancer Genome Consortium (ICGC)
  • Cell lines
      • Cancer Cell Line Encyclopedia (CCLE)
      • Catalogue of Somatic Mutations in Cancer (COSMIC)

Cancer gene lists
  • COSMIC Gene Census
  • Vogelstein2013 drivers

Interactive tools for cancer data
  • cBioPortal
  • TumorPortal
  • Cancer Regulome
  • Cancer Genomics Browser
  • StratomeX

Gene and protein information
  • TP53 example
      • GeneCards
      • UniProt
      • Entrez Gene

Pathway and function enrichment
  • Database for Annotation, Visualization and Integrated Discovery (DAVID)
  • Molecular Signatures Database (MSigDB)

Gene expression data
  • Gene Expression Omnibus (GEO)
  • ArrayExpress

Protein interaction networks
  • iRefIndex and iRefWeb
  • Search Tool for the Retrieval of Interacting Genes/Proteins (STRING)
  • High-quality INTeractomes (HINT)
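Pathway enrichment asks whether a gene list, such as the mutated genes in a tumor cohort, overlaps a pathway's gene set more than chance would predict. One common formulation is a one-sided hypergeometric test, sketched below with hypothetical counts. This is a generic illustration and not necessarily the statistic that DAVID or MSigDB tools compute.

```python
from math import comb


def enrichment_p_value(n_background, n_pathway, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn at random.

    n_background: genes in the background set
    n_pathway:    background genes annotated to the pathway
    n_selected:   genes in the list being tested
    n_overlap:    selected genes that fall in the pathway
    """
    total = comb(n_background, n_selected)
    upper = min(n_pathway, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 5 of 200 selected genes hit a 50-gene pathway
# drawn from a 20000-gene background (about 0.5 expected by chance).
p = enrichment_p_value(20000, 50, 200, 5)
print(f"p = {p:.3g}")
```

In practice many pathways are tested at once, so the resulting p-values need a multiple-testing correction before they are interpreted.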

Transcriptional regulation
  • Encyclopedia of DNA Elements (ENCODE)

DNA binding motifs
  • TRANSFAC
  • JASPAR
  • UniPROBE

miRNA binding
  • miRBase
  • TargetScan
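DNA binding motifs from databases like those listed above are typically represented as per-position base counts, which can be turned into log-odds scores and slid along a sequence to predict binding sites. The sketch below uses a made-up 4-bp count matrix, and assumes a uniform background and a pseudocount of 0.5. It scans only the forward strand of a sequence containing A, C, G, and T and at least as long as the motif.

```python
import math

# Hypothetical position count matrix (one list entry per motif position).
COUNTS = {
    "A": [8, 0, 0, 9],
    "C": [1, 0, 10, 0],
    "G": [0, 10, 0, 1],
    "T": [1, 0, 0, 0],
}


def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Convert base counts into log2(probability / background) scores."""
    length = len(next(iter(counts.values())))
    matrix = []
    for pos in range(length):
        total = sum(counts[base][pos] + pseudocount for base in "ACGT")
        matrix.append(
            {
                base: math.log2((counts[base][pos] + pseudocount) / total / background)
                for base in "ACGT"
            }
        )
    return matrix


def best_match(sequence, matrix):
    """Return (score, start) of the highest-scoring window on the forward strand."""
    width = len(matrix)
    scored = [
        (sum(matrix[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(scored)


PWM = log_odds_matrix(COUNTS)
score, position = best_match("TTAGCATTGG", PWM)
print(position, round(score, 2))
```

Here the best window starts at index 2 ("AGCA"), which matches the consensus of the hypothetical matrix. A motif match alone does not show that the TF binds there in a given cell, which is why such scans are combined with DNA accessibility data.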
