Description
These tracks display a synthesis of evidence from different assays
as part of the four Open Chromatin track sets.
This track displays open chromatin regions and/or transcription factor binding
sites identified in
multiple cell types
by one or more complementary methodologies: DNaseI hypersensitivity (HS)
(Duke DNaseI HS),
Formaldehyde-Assisted Isolation of Regulatory Elements (FAIRE)
(UNC FAIRE),
and chromatin immunoprecipitation (ChIP) for select regulatory factors
(UTA TFBS).
Each methodology was performed on the same cell type using identical growth
conditions. (Note: Data for some or all ChIP experiments may not be available
for all cell types.) Regions that overlap between methodologies identify
regulatory elements that are cross-validated, indicating high-confidence
regions. In addition, multiple lines of evidence suggest that regions detected
by a single assay (e.g., DNase-only or FAIRE-only) are also biologically
relevant (Song et al., submitted).
DNaseI HS data:
DNaseI is an enzyme that has long been used to map general
chromatin accessibility, and DNaseI "hypersensitivity" is a feature of active
cis-regulatory sequences. The use of this method has led to the discovery of
functional regulatory elements that include promoters, enhancers, silencers,
insulators, locus control regions, and novel elements. DNaseI hypersensitivity
signifies chromatin accessibility following binding of trans-acting factors in
place of a canonical nucleosome.
FAIRE data:
FAIRE (Formaldehyde-Assisted Isolation of Regulatory
Elements) is a method to isolate and identify nucleosome-depleted regions of
the genome. FAIRE was initially discovered in yeast and subsequently shown to
identify active regulatory elements in human cells (Giresi et al.,
2007). Similar to DNaseI HS, FAIRE appears to identify functional regulatory
elements that include promoters, enhancers, silencers, insulators, locus
control regions, and novel elements.
ChIP data:
ChIP (Chromatin Immunoprecipitation) is a method to
identify the specific location of proteins that are directly or indirectly
bound to genomic DNA. By identifying the binding location of sequence-specific
transcription factors, general transcription machinery components, and
chromatin factors, ChIP can help in the functional annotation of the open
chromatin regions identified by DNaseI HS mapping and FAIRE.
Input data:
As a background control experiment, the input genomic DNA sample
that was used for ChIP was sequenced. Crosslinked chromatin
was sheared and the crosslinks were reversed without carrying out the
immunoprecipitation step. This sample was otherwise processed in a manner
identical to the ChIP sample as described below. The input track is
useful in revealing potential artifacts arising from the sequence
alignment process such as copy number differences between the
reference genome and the sequenced samples, as well as regions of
poor sequence alignability.
Display Conventions and Configuration
This track contains multiple subtracks representing different cell types
that display individually on the browser. Instructions for configuring tracks
with multiple subtracks are
here.
To facilitate analyses, each region has been assigned an Open Chromatin Code (OC Code),
based on the assay(s) by which it was detected, and a color, based on its level of validation
(which was determined by the combination of its OC Code and its statistical significance):
- Validated, OC Code = 1:
- Black:
Regions identified as peaks by both the DNaseI HS assay and the FAIRE assay. Peaks
for DNaseI HS have DNase peak calling p-values
< 0.05
(-log10(p-value) > 1.3), and peaks for FAIRE have FAIRE
peak calling p-values < 0.1 (-log10(p-value) > 1.0).
- Open Chromatin, OC Code = 2 or 3:
- Blue (high significance):
Regions not identified as peaks by both DNaseI HS and FAIRE (that is, called as
peaks by only one of the two assays), but for which the
combination of peak calling p-values from these assays using Fisher's combined
probability test results in a p-value
< 0.01
(-log10(p-value) > 2).
- DNase, OC Code = 2:
- Green (low significance):
Regions identified by DNaseI HS as peaks (DNase peak calling p-value
< 0.05
(-log10(p-value) > 1.3))
and not identified by FAIRE as peaks (FAIRE peak calling p-value >= 0.1
(-log10(p-value) <= 1.0)),
and with a Fisher's combined DNaseI HS and FAIRE p-value >= 0.01
(-log10(p-value) <= 2).
- Blue (high significance):
see Open Chromatin above.
- FAIRE, OC Code = 3:
- Dark Red (low significance):
Regions identified by FAIRE as peaks (FAIRE peak calling p-value
< 0.1
(-log10(p-value) > 1.0))
and not identified by DNaseI HS as peaks (DNase peak calling p-value >= 0.05
(-log10(p-value) <= 1.3)), and with a
Fisher's combined DNaseI HS and FAIRE p-value >= 0.01
(-log10(p-value) <= 2).
- Blue (high significance):
see Open Chromatin above.
- ChIP-seq, OC Code = 4:
- Pink:
Regions identified by ChIP-seq as peaks (at least one of the peak calling p-values for
the three ChIP experiments is
< 0.05 (-log10(p-value) > 1.3)),
indicating binding sites for one
or more of RNA Pol II, CTCF, and c-Myc (described here), and not identified by DNaseI HS
or FAIRE as peaks. For RNA Pol II, only sites that
overlap transcription start sites annotated in the UCSC Genes track are considered.
All signal values, -log10(p-values), and the OC Code are
displayed on the detail page for each element and are available in the
corresponding bed file.
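The color rules above can be read as a small decision procedure. The following is a minimal sketch of that reading, not the pipeline's actual code: the function name `classify_region`, its parameters, and the returned tuple layout are hypothetical, and the p-values are assumed to have been computed already. Regions called by neither DNaseI HS nor FAIRE and without a ChIP-seq peak are not covered by the rules as written, so the sketch returns `None` for them.

```python
# Thresholds taken from the display conventions described above.
DNASE_P = 0.05      # DNaseI HS peak calling threshold
FAIRE_P = 0.1       # FAIRE peak calling threshold
COMBINED_P = 0.01   # Fisher's combined p-value threshold for "high significance"


def classify_region(dnase_p, faire_p, combined_p, chip_peak=False):
    """Return (oc_code, label, color) for one region, or None if no rule applies.

    Hypothetical helper: argument names and return format are illustrative only.
    chip_peak should be True if at least one ChIP experiment has p-value < 0.05.
    """
    dnase_peak = dnase_p < DNASE_P
    faire_peak = faire_p < FAIRE_P

    if dnase_peak and faire_peak:
        return 1, "Validated", "black"

    if dnase_peak or faire_peak:
        oc_code = 2 if dnase_peak else 3
        if combined_p < COMBINED_P:
            # Single-assay peak, but strong combined evidence.
            return oc_code, "Open Chromatin", "blue"
        if dnase_peak:
            return 2, "DNase", "green"
        return 3, "FAIRE", "dark red"

    if chip_peak:
        # ChIP-seq peak not called by DNaseI HS or FAIRE.
        return 4, "ChIP-seq", "pink"

    return None
```

For example, `classify_region(0.01, 0.5, 0.03)` falls into the green DNase category, while the same region with a combined p-value of 0.004 would be shown in blue.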
Metadata for a particular subtrack can be found by clicking the down arrow in the list of subtracks.
Methods
For each site, the maximum F-Seq Density Signal value has been calculated
for each assay that was performed in that cell type. F-Seq employs Parzen
kernel density estimation to create base pair scores (Boyle et al., 2008b).
Significant regions, or peaks, were determined by fitting the data to a gamma
distribution to calculate p-values. Contiguous regions where p-values were
below a 0.05 (DNaseI HS, ChIP) or 0.1 (FAIRE) threshold were considered
significant. See the assay-specific description pages
(Duke DNaseI HS,
UNC FAIRE and
UTA TFBS)
for more details.
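The final step described above, grouping contiguous positions whose p-values fall below the threshold, can be sketched as follows. This is an illustration only and is not taken from F-Seq: it assumes per-base p-values have already been derived from the gamma fit, and the function name `significant_regions` and the 0-based, half-open `(start, end)` output are choices made for this example.

```python
def significant_regions(p_values, threshold):
    """Return (start, end) index pairs for runs of consecutive p-values < threshold.

    Intervals are 0-based and half-open, so p_values[start:end] is the run.
    Hypothetical helper; p_values is assumed to hold one p-value per base.
    """
    regions = []
    start = None  # start index of the run currently being extended, if any
    for i, p in enumerate(p_values):
        if p < threshold:
            if start is None:
                start = i
        elif start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        # The last run extends to the end of the input.
        regions.append((start, len(p_values)))
    return regions


# Toy per-base p-values, using the DNaseI HS / ChIP threshold of 0.05.
example = [0.5, 0.04, 0.01, 0.2, 0.03, 0.03]
print(significant_regions(example, 0.05))
```

With a FAIRE-style threshold, the same function would be called with `0.1` instead of `0.05`.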
A Fisher's combined p-value for DNaseI HS and FAIRE was calculated using
Fisher's combined probability test. First, a test statistic is calculated
using the formula:
$$X^2 = -2 \sum_{i=1}^{k} \ln(p_i)$$
where the $p_i$ are the p-values calculated for DNaseI HS
and FAIRE (so $k = 2$). For independent p-values, $X^2$ follows a chi-squared
distribution with $2k$ degrees of freedom (here 4) under the null hypothesis;
thus a combined p-value can be assigned to this test statistic.
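As a worked illustration of Fisher's combined probability test, the sketch below computes the combined p-value using only the Python standard library. It relies on the fact that a chi-squared distribution with an even number of degrees of freedom, 2k, has a closed-form upper tail: exp(-x/2) multiplied by the sum of (x/2)^j / j! for j from 0 to k-1. The function name `fisher_combined_p` is hypothetical, and the p-values are assumed to be independent; this is not the code used to build the track.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X^2 = -2 * sum(ln p_i) is compared against a chi-squared
    distribution with 2k degrees of freedom, where k = len(p_values).
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    half_x = -sum(math.log(p) for p in p_values)  # this is X^2 / 2

    # Closed-form chi-squared upper tail for even degrees of freedom.
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half_x / j
        total += term
    return math.exp(-half_x) * total


# Two assays (DNaseI HS and FAIRE), each with p = 0.05.
combined = fisher_combined_p([0.05, 0.05])
print(combined, -math.log10(combined))
```

For two p-values this reduces to p1 * p2 * (1 - ln(p1 * p2)). In the example, two p-values of 0.05 combine to roughly 0.017, which is above the 0.01 cutoff used for the high-significance (blue) category.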
Enhancer and Insulator Functional assays: A subset of DNaseI and FAIRE
regions were cloned into functional tissue culture reporter assays to test for
enhancer and insulator activity. Coordinates and results from these
experiments can be found
here.
Release Notes
This is release 2 (Feb 2012) of this track and is based upon these three
Open Chromatin tracks:
Duke DNaseI HS,
UNC FAIRE,
and
UTA TFBS.
Release 2 brings in synthesis analysis for 10 samples:
Gliobla, GM12891, GM12892, GM18507, GM19239, HeLa-S3/IFNa4h, HTR8svn, Medullo, PanIslets, Urothelia.
Credits
These data and annotations were created by a collaboration of multiple
institutions (contact:
Terry Furey):
References
Bhinge AA, Kim J, Euskirchen GM, Snyder M, Iyer VR.
Mapping the chromosomal targets of STAT1 by Sequence Tag Analysis of Genomic Enrichment (STAGE).
Genome Res. 2007 Jun;17(6):910-6.
Birney E, Stamatoyannopoulos JA, Dutta A, Guigó R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE et al.
Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project.
Nature. 2007 Jun 14;447(7146):799-816.
Boyle AP, Davis S, Shulha HP, Meltzer P, Margulies EH, Weng Z, Furey TS, Crawford GE.
High-resolution mapping and characterization of open chromatin across the genome.
Cell. 2008 Jan 25;132(2):311-22.
Boyle AP, Guinney J, Crawford GE, Furey TS.
F-Seq: a feature density estimator for high-throughput sequence tags.
Bioinformatics. 2008 Nov 1;24(21):2537-8.
Buck MJ, Nobel AB, Lieb JD.
ChIPOTle: a user-friendly tool for the analysis of ChIP-chip data.
Genome Biol. 2005;6(11):R97.
Crawford GE, Davis S, Scacheri PC, Renaud G, Halawi MJ, Erdos MR, Green R, Meltzer PS, Wolfsberg TG, Collins FS.
DNase-chip: a high-resolution method to identify DNase I hypersensitive sites using tiled microarrays.
Nat Methods. 2006 Jul;3(7):503-9.
Crawford GE, Holt IE, Whittle J, Webb BD, Tai D, Davis S, Margulies EH, Chen Y, Bernat JA, Ginsburg D et al.
Genome-wide mapping of DNase hypersensitive sites using massively parallel signature sequencing (MPSS).
Genome Res. 2006 Jan;16(1):123-31.
Giresi PG, Kim J, McDaniell RM, Iyer VR, Lieb JD.
FAIRE (Formaldehyde-Assisted Isolation of Regulatory Elements) isolates active regulatory elements from human chromatin.
Genome Res. 2007 Jun;17(6):877-85.
Giresi PG, Lieb JD.
Isolation of active regulatory elements from eukaryotic chromatin using FAIRE (Formaldehyde Assisted Isolation of Regulatory Elements).
Methods. 2009 Jul;48(3):233-9.
Li H, Ruan J, Durbin R.
Mapping short DNA sequencing reads and calling variants using mapping quality scores.
Genome Res. 2008 Nov;18(11):1851-8.
Song L, Crawford GE.
DNase-seq: a high-resolution technique for mapping active gene regulatory elements across the genome from mammalian cells.
Cold Spring Harb Protoc. 2010 Feb;2010(2):pdb.prot5384.
Data Release Policy
Data users may freely use ENCODE data, but may not, without prior consent,
submit publications that use an unpublished ENCODE dataset until nine months
following the release of the dataset. This date is listed in the Restricted
Until column on the track configuration page and the download page. The
full data release policy for ENCODE is available here.