Overview of Epigenome Analysis

What is Epigenetic Analysis?

The epigenome refers to acquired modifications of the genome that are carried through development and cell differentiation and that regulate gene expression without altering the nucleotide sequence. Epigenetic changes include cytosine DNA methylation, modification of nucleosome-forming histones, and regulation of chromatin structure. These mechanisms have also been suggested to be involved in carcinogenesis. As the field has advanced from genetic screening of candidate genes to genome-wide analysis via Next Generation Sequencing (NGS), there has been a growing trend toward analyzing the epigenetic information of various cell lines, clinical specimens, and even individual cells.

DNA Methylation

  • DNA Methylation:

    CpG methylation is a process by which DNA methyltransferase covalently adds a methyl group to the 5-carbon of a cytosine located at a CpG site, resulting in 5-methylcytosine (5-meC). A CpG-rich region is known as a CpG island; CpG islands are found in about 70% of human promoters.1)

    CpG methylation usually interferes with transcription factor (TF) binding and changes the chromatin structure, thereby inhibiting gene expression.
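
    How CpG-rich a region is can be illustrated with a short calculation. The following is a minimal Python sketch, not a method taken from the text above: the function names are hypothetical, and the thresholds (length of at least 200 bp, GC fraction of at least 0.5, observed/expected CpG ratio of at least 0.6) are commonly used criteria that are assumed here for illustration.

```python
def cpg_stats(seq: str) -> dict:
    """Return length, GC fraction and CpG observed/expected ratio of a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")              # number of CpG dinucleotides
    expected = (c * g) / n             # CpG count expected from base composition
    obs_exp = cpg / expected if expected else 0.0
    return {"length": n, "gc_fraction": (c + g) / n, "cpg_obs_exp": obs_exp}


def is_cpg_island_like(seq: str, min_len: int = 200,
                       min_gc: float = 0.5, min_oe: float = 0.6) -> bool:
    """Apply assumed thresholds (illustrative defaults, not from the article)."""
    stats = cpg_stats(seq)
    return (stats["length"] >= min_len
            and stats["gc_fraction"] >= min_gc
            and stats["cpg_obs_exp"] >= min_oe)


if __name__ == "__main__":
    print(cpg_stats("CGCGCGCG"))
    print(is_cpg_island_like("CG" * 100))
```

    The thresholds are exposed as parameters so that they can be adjusted to whichever definition of a CpG island is being followed.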

  • Histone Modification:

    The histone core is an octamer composed of two copies each of four subunits: H2A, H2B, H3, and H4. This core, together with the 147 bp of DNA wrapped around it, forms the nucleosome. The H1 subunit and linker DNA help pack multiple nucleosomes together to form chromatin. Methylation and acetylation of the histone tail (the N-terminus of the histone protein) can alter chromatin structure and the binding of histone-TF complexes, thereby regulating transcription. Open chromatin regions with methylation/acetylation at the promoter (H3K4me3, H3K27ac) and enhancer (H3K4me1) allow genes to be transcriptionally active.2)

  • Methylated DNA Immunoprecipitation (MeDIP):

    MeDIP is a purification process in which a 5-meC-specific antibody targets methylated CpG islands. The purified product can then be used for various assays, including genomic tiling arrays and NGS-based screening (MeDIP-seq). A related NGS approach, methyl-CpG-binding domain sequencing (MBD-seq), captures methylated DNA with a methyl-CpG-binding domain protein rather than an antibody.

    MeDIP-seq provides a global analysis of methylation, but it can suffer from CpG density-dependent bias or repetitive sequence-based bias.4)

  • Bisulfite Treatment:

    Treating single-stranded DNA with bisulfite (followed by desulfonation) converts non-methylated cytosine into uracil.

    Because methylated cytosine takes much longer to convert, it remains largely unchanged, which provides an easy way to identify methylated regions after PCR amplification: a nonmethylated site is read as thymine, whereas a methylated site is still read as cytosine.
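
    The conversion logic can be sketched in silico. The following minimal Python example assumes complete conversion of non-methylated cytosines and no conversion of methylated ones, which is an idealization; the function names and the example sequence are hypothetical.

```python
def bisulfite_convert(seq: str, methylated_positions=frozenset()) -> str:
    """Convert every non-methylated C to U; methylated C (given by 0-based index) stays C."""
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            out.append("U")            # non-methylated cytosine becomes uracil
        else:
            out.append(base)           # methylated cytosine and other bases are unchanged
    return "".join(out)


def after_pcr(converted: str) -> str:
    """During PCR, uracil is copied as thymine."""
    return converted.replace("U", "T")


if __name__ == "__main__":
    template = "ACGTCGA"               # cytosines at index 1 and 4
    converted = bisulfite_convert(template, methylated_positions={1})
    print(converted)                   # methylated C at index 1 is kept
    print(after_pcr(converted))
```

    Comparing the PCR product with the original sequence shows which cytosines were protected: positions that still read C were methylated, and positions that now read T were not.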

  • MSP (Methylation-specific PCR):

    Whether a particular region of a gene is methylated can be determined by PCR with two different primer sets after bisulfite conversion. One primer set is designed to anneal only to the nonmethylated sequence, whereas the other binds only when the target is methylated. Gel electrophoresis and real-time PCR can be used to assess methylation qualitatively and semi-quantitatively, respectively.

  • Bisulfite Sequencing:

    The region targeted for methylation analysis is amplified via PCR after bisulfite treatment and subsequently cloned into plasmid vectors. The cloned region of interest is then sequenced to identify all methylated CpG sites. Whole-Genome Bisulfite Sequencing (WGBS), first introduced in 2008, combines NGS with bisulfite treatment to measure cytosine methylation on a genome-wide scale.7)
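
    Reading methylation from sequenced clones can be sketched as follows. This minimal Python example assumes that every read is already aligned to the top strand of the reference and has the same length as the reference; the function names and the toy data are hypothetical.

```python
def cpg_positions(reference: str) -> list:
    """Return 0-based indices of the C in every CpG of the reference."""
    ref = reference.upper()
    return [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]


def methylation_levels(reference: str, reads: list) -> dict:
    """For each CpG, return the fraction of reads that still show C (methylated)."""
    levels = {}
    for pos in cpg_positions(reference):
        c = sum(1 for r in reads if r[pos].upper() == "C")   # protected, so methylated
        t = sum(1 for r in reads if r[pos].upper() == "T")   # converted, so nonmethylated
        levels[pos] = c / (c + t) if (c + t) else None       # None when no informative read
    return levels


if __name__ == "__main__":
    reference = "ACGTTCGA"             # CpG sites at index 1 and 5
    clones = ["ACGTTTGA", "ACGTTTGA", "ATGTTCGA", "ACGTTTGA"]
    print(methylation_levels(reference, clones))
```

    Reads that show neither C nor T at a CpG (for example because of a sequencing error) are simply ignored at that position in this sketch.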

  • PBAT (post-bisulfite adaptor tagging):

    PBAT is a modified technique based on whole-genome bisulfite sequencing (WGBS) that requires less sample. Standard WGBS requires microgram quantities of DNA, because bisulfite treatment applied after adaptors have been tagged to the templates causes substantial fragmentation of the adaptor-tagged DNA. In PBAT, bisulfite conversion is performed first, and adaptors are then added by two rounds of random primer extension. As a result, as little as 125 pg of DNA can be enough to assess the methylome efficiently.9)

  • RRBS (reduced representation bisulfite sequencing):

    RRBS uses restriction enzymes (such as MspI, which recognizes the CCGG sequence) and size selection to measure the methylation level of CpG sequences in a reduced fraction of the genome. Although it targets only selected regions and has CpG density-dependent bias, it is able to cover more than 70% of promoters and 80% of CpG islands. This method has been used to analyze the methylome of single cells, as well as of laser microdissection samples.
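
    The digestion and size-selection steps can be sketched in silico. The following minimal Python example cuts a sequence at MspI sites (C^CGG) on one strand and keeps fragments within a length window. The 40 to 220 bp window is an assumed default for illustration and is not stated in the text above; the function names are hypothetical.

```python
import re


def mspi_fragments(seq: str) -> list:
    """Cut a sequence at every CCGG site, between the first and second C."""
    seq = seq.upper()
    cuts = [m.start() + 1 for m in re.finditer("CCGG", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments: list, min_len: int = 40, max_len: int = 220) -> list:
    """Keep fragments whose length lies in the window (assumed default window)."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


if __name__ == "__main__":
    toy = "AACCGGTTTTCCGGAA"
    frags = mspi_fragments(toy)
    print(frags)
    print(size_select(frags, min_len=5, max_len=8))
```

    Because every internal fragment begins with CGG and ends with C, each selected fragment carries CpG sites at its ends, which is why the reduced library is enriched for CpG-rich regions.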

  • Infinium Methylation Assay:

    The Infinium Methylation Assay by Illumina requires as little as 500 ng of sample and uses ‘BeadChip’ technology, which provides unbiased, high-resolution, array-based quantitative measurement of methylation at the single-CpG-site level. After bisulfite treatment, each CpG site is interrogated by either the Infinium I assay, which uses two separate probes per site (one for the methylated and one for the nonmethylated sequence), or the Infinium II assay, which uses a single probe to detect both states. The more recently developed Infinium MethylationEPIC Kit can detect over 850,000 methylation sites at single-nucleotide resolution. The kit covers more than 90% of the content of the Illumina HumanMethylation450K BeadChip and adds CpG sites outside of CpG islands. Methylation is reported as a β value derived from the fluorescence intensity of each probe (0.00: nonmethylated, 1.00: methylated).13)
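
    A β value can be computed from the methylated and nonmethylated signal intensities of a site. The following minimal Python sketch uses the commonly cited form β = M / (M + U + offset); the offset of 100 and the clamping of negative intensities are assumptions made for illustration, and the function name and intensities are hypothetical.

```python
def beta_value(methylated: float, unmethylated: float, offset: float = 100.0) -> float:
    """Return a beta value in [0, 1] from two probe intensities.

    offset is an assumed stabilizing constant for low-intensity probes.
    """
    m = max(methylated, 0.0)       # negative background-corrected intensities are clamped to 0
    u = max(unmethylated, 0.0)
    denom = m + u + offset
    if denom == 0:
        raise ValueError("intensities and offset are all zero")
    return m / denom


if __name__ == "__main__":
    # Hypothetical (methylated, unmethylated) intensity pairs for three CpG sites
    for m, u in [(9000, 300), (4000, 4200), (150, 8800)]:
        print(m, u, round(beta_value(m, u), 3))
```

    With a positive offset, β never quite reaches 1.00 even for a fully methylated site, which is the price paid for stable values when both intensities are low.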

  • Pyrosequencing:

    This method uses the sequencing-by-synthesis principle: light emitted by luciferase in response to the pyrophosphate released during polymerase extension is detected.14)

    After bisulfite treatment, the extracted genomic DNA (gDNA) is amplified by PCR using 5'-biotinylated primers. The PCR product is then bound to streptavidin-coated beads. The amplified DNA is denatured to remove the non-biotinylated strand, and a sequencing primer is added to hybridize to the biotinylated strand. As each dNTP is added during extension, the released pyrophosphate triggers a light signal that appears as a peak in the pyrogram. The height of the peak correlates with the number of nucleotides incorporated. The methylation level at a site is calculated as C/(C+T)*100 (%) (available analysis software: PyroMark System by QIAGEN).
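
    The percentage formula can be written out directly. The following minimal Python sketch takes the C and T peak heights at one CpG position and applies C/(C+T)*100; the function name and the peak heights are hypothetical, and real analysis software applies additional quality checks that are not modeled here.

```python
def percent_methylation(c_peak: float, t_peak: float) -> float:
    """Return C/(C+T)*100 for the peak heights measured at one CpG position."""
    if c_peak < 0 or t_peak < 0:
        raise ValueError("peak heights must be non-negative")
    total = c_peak + t_peak
    if total == 0:
        raise ValueError("no signal at this position")
    return 100.0 * c_peak / total


if __name__ == "__main__":
    # Hypothetical (C, T) peak heights at three CpG positions of one amplicon
    peaks = [(30.0, 70.0), (12.5, 37.5), (80.0, 20.0)]
    per_site = [percent_methylation(c, t) for c, t in peaks]
    print(per_site)
    print(sum(per_site) / len(per_site))   # mean methylation across the sites
```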

  • Clinical Case:

    Hierarchical clustering of methylation data has provided substantial information for classifying cancer subtypes. Various types of cancer, including glioma, have been shown to carry methylation markers at CpG islands (CpG island methylator phenotype: CIMP) in certain subtypes. Hence, genes associated with CIMP have received strong attention in efforts to assess their causal relationship with carcinogenesis.17) 18)

    In this light, various techniques, including MeDIP and the Infinium Methylation Assay for global analysis and MALDI-TOF MS and pyrosequencing for screening, have been used for “epigenotyping” colon cancer (2010)19), stomach cancer (2011)16), and thyroid cancer (2013)20). For instance, in contrast to stomach cancer with low methylation, stomach cancer with hypermethylation has been shown to be caused by Epstein-Barr (EB) virus infection. Furthermore, cell-free DNA in the blood plasma of gastric cancer patients has shown significant levels of hypermethylation, which highlights the need for methylation analysis using MSP and pyrosequencing.16) 22)

Histone Modification and Open Chromatin Regional Analysis

  • Chromatin Immunoprecipitation (ChIP):

    ChIP is a purification process that uses specific antibodies against proteins bound along the DNA (TFs, Mediators, etc.) to extract the DNA regions of interest. Histones and TFs bound to DNA are fixed by applying formaldehyde. After cross-linking, the DNA-protein complex is sheared into ~500 bp DNA fragments by sonication or nuclease digestion. The cross-linked DNA is then selectively immunoprecipitated using a protein-specific antibody. The cross-links are reversed by heating, and the DNA is purified for real-time PCR, genome tiling array, or next generation sequencing (ChIP-Seq) applications. Although ChIP-Seq has typically required about 10^7 cells, the more recently developed Nano-ChIP-Seq reduces the number needed for assessing histone modification to 10^3. In 2013, this method was successfully used for analyzing TFs and histone modifications at a global level in clinical specimens.23) 24) 25)

  • Open Chromatin Regional Analysis:

    Apart from inferring chromatin state from histone modifications and from TFs and other proteins bound to the nucleotide sequence, several sequencing techniques, including Formaldehyde-Assisted Isolation of Regulatory Elements (FAIRE)-seq, DNase-seq, and ATAC-seq, have been used to analyze open chromatin regions directly.

  • FAIRE-seq:

    This method enriches open chromatin regions after formaldehyde cross-linking and sonication. A subsequent phenol-chloroform extraction selectively recovers the most accessible DNA. By sequencing the extracted DNA, it is further possible to identify candidate binding TFs.29)

  • DNase-seq/ATAC-seq:

    DNase-seq is a method used to identify the location of regulatory regions based on genome-wide sequencing of regions hypersensitive to cleavage by DNase I. After biotinylated linkers are attached, the DNA can be extracted via streptavidin binding and sequenced.
    ATAC-seq is another sequencing-based method, which uses Tn5 transposase to tag accessible regions of the genome. Compared with DNase-seq, ATAC-seq offers several advantages: library preparation requires less time (3 hr in total) and less sample (1/100 of DNase-seq). These two techniques are also called “footprint sequencing” for their ability to determine protein-binding sites at single-nucleotide resolution. In this way, one can predict the type of DNA-binding protein and its specific binding position. ATAC-seq has recently been used to identify and analyze TFs that could be responsible for gene expression and regulation in T cells. This method could provide key insight into gene activity and may thus contribute to personalized medicine for patients with various disease subtypes.28) 30) 31)


  1. Deaton, A. M. and Bird, A.: Genes & development, 25: 1010-1022, 2011.
  2. Kimura, H.: Journal of human genetics, 58: 439-445, 2013.
  3. Down, T. A. et al.: Nature biotechnology, 26: 779-785, 2008.
  4. Laird, P. W.: Nature reviews. Genetics, 11: 191-203, 2010.
  5. Frommer, M. et al.: Proc. Natl. Acad. Sci. USA, 89: 1827-1831, 1992.
  6. Herman, J. G. et al.: Proc. Natl. Acad. Sci. USA, 93: 9821-9826, 1996.
  7. Cokus, S. J. et al.: Nature, 452: 215-219, 2008.
  8. Miura, F. et al.: Nucleic acids research, 40: e136, 2012.
  9. Smallwood, S. A. et al.: Nature methods, 11: 817-820, 2014.
  10. Gu, H. et al.: Nature protocols, 6: 468-481, 2011.
  11. Guo, H. et al.: Nature protocols, 10: 645-659, 2015.
  12. Schillebeeckx, M. et al.: Nucleic acids research, 41: e116, 2013.
  13. Dedeurwaerder, S. et al.: Epigenomics, 3: 771-784, 2011.
  14. Michels, K. B. et al.: Nature methods, 10: 949-955, 2013.
  15. Colella, S. et al.: BioTechniques, 35: 146-150, 2003.
  16. Matsusaka, K. et al.: Cancer research, 71: 7187-7197, 2011.
  17. Noushmehr, H. et al.: Cancer cell, 17: 510-522, 2010.
  18. Hughes, L. A. et al.: Cancer research, 73: 5858-5868, 2013.
  19. Yagi, K. et al.: Clinical cancer research : an official journal of the American Association for Cancer Research, 16: 21-33, 2010.
  20. Kikuchi, Y. et al.: Frontiers in genetics, 4: 271, 2013.
  21. Kaneda, A. et al.: Cancer research, 62: 6645-6650, 2002.
  22. Takane, K. et al.: Cancer Med, 3: 1235-1245, 2014.
  23. Landt, S. G. et al.: Genome research, 22: 1813-1831, 2012.
  24. Adli, M. and Bernstein, B. E.: Nature protocols, 6: 1656-1668, 2011.
  25. Muratani, M. et al.: Nature communications, 5: 4361, 2014.
  26. Simon, J. M. et al.: Nature protocols, 7: 256-267, 2012.
  27. Song, L. and Crawford, G. E.: Cold Spring Harbor protocols, 2010: pdb prot5384, 2010.
  28. Buenrostro, J. D. et al.: Nature methods, 10: 1213-1218, 2013.
  29. Waki, H. et al.: PLoS genetics, 7: e1002311, 2011.
  30. Neph, S. et al.: Nature, 489: 83-90, 2012.
  31. Davie, K. et al.: PLoS genetics, 11: e1004994, 2015.
