Note: these data have been converted via liftOver from the Mar. 2006 (NCBI36/hg18) version of the track.
Description
This track displays a chromatin state segmentation for each of
nine human cell types.
A common set of states across the cell types was learned by
computationally integrating ChIP-seq data for
nine factors plus input
using a Hidden Markov Model (HMM). In total, fifteen states were used to
segment the genome, and these states were then grouped and colored to
highlight predicted functional elements.
Display Conventions and Configuration
This track is a composite track that contains multiple subtracks. Each subtrack represents data
for a different cell type and displays individually on the browser. Instructions for configuring tracks
with multiple subtracks are
here.
The fifteen states of the HMM, their associated segment color, and the
candidate annotations are as follows:
- State 1 - Bright Red - Active Promoter
- State 2 - Light Red - Weak Promoter
- State 3 - Purple - Inactive/poised Promoter
- State 4 - Orange - Strong enhancer
- State 5 - Orange - Strong enhancer
- State 6 - Yellow - Weak/poised enhancer
- State 7 - Yellow - Weak/poised enhancer
- State 8 - Blue - Insulator
- State 9 - Dark Green - Transcriptional transition
- State 10 - Dark Green - Transcriptional elongation
- State 11 - Light Green - Weak transcribed
- State 12 - Gray - Polycomb-repressed
- State 13 - Light Gray - Heterochromatin; low signal
- State 14 - Light Gray - Repetitive/Copy Number Variation
- State 15 - Light Gray - Repetitive/Copy Number Variation
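For scripted analysis, the table above can be held as a simple lookup. The following is a minimal sketch only: the names STATES and group_states are hypothetical and not part of the track, and colors are kept as the names listed above because this page gives no RGB values.

```python
from collections import defaultdict

# (segment color, candidate annotation) per state, copied from the list above.
STATES = {
    1: ("Bright Red", "Active Promoter"),
    2: ("Light Red", "Weak Promoter"),
    3: ("Purple", "Inactive/poised Promoter"),
    4: ("Orange", "Strong enhancer"),
    5: ("Orange", "Strong enhancer"),
    6: ("Yellow", "Weak/poised enhancer"),
    7: ("Yellow", "Weak/poised enhancer"),
    8: ("Blue", "Insulator"),
    9: ("Dark Green", "Transcriptional transition"),
    10: ("Dark Green", "Transcriptional elongation"),
    11: ("Light Green", "Weak transcribed"),
    12: ("Gray", "Polycomb-repressed"),
    13: ("Light Gray", "Heterochromatin; low signal"),
    14: ("Light Gray", "Repetitive/Copy Number Variation"),
    15: ("Light Gray", "Repetitive/Copy Number Variation"),
}


def group_states(states, field):
    """Group state numbers by color (field=0) or by annotation (field=1)."""
    groups = defaultdict(list)
    for number, info in sorted(states.items()):
        groups[info[field]].append(number)
    return dict(groups)


by_color = group_states(STATES, 0)
by_annotation = group_states(STATES, 1)

print(by_color["Light Gray"])            # [13, 14, 15]
print(by_annotation["Strong enhancer"])  # [4, 5]
```

Grouping by annotation collapses the fifteen states into twelve candidate element classes, since states 4/5, 6/7, and 14/15 share an annotation.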
Metadata for a particular subtrack can be found by clicking the down arrow in the list of subtracks.
Methods
ChIP-seq data from the Broad Histone
track were used to generate this track. Data for
nine factors plus input
and nine cell types
were binarized separately at a 200 base pair resolution based on a Poisson
background model. The chromatin states were learned from these binarized data
using a multivariate Hidden Markov Model (HMM) that explicitly models the
combinatorial patterns of observed modifications (Ernst and Kellis, 2010).
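The binarization step can be illustrated with a small sketch. This is not the pipeline used to build the track: it assumes that the Poisson rate is the mean read count per 200 base pair bin and that a bin is called "present" when the upper-tail probability of its count falls below a threshold. The threshold value (1e-4), the function names, and the toy counts are all assumptions chosen for illustration.

```python
import math


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def binarize(counts, p_threshold=1e-4):
    """Mark each bin 1 if its count is unlikely under the background, else 0.

    counts: read counts per 200 bp bin for one mark in one cell type.
    p_threshold: assumed significance cutoff (not stated on this page).
    """
    lam = sum(counts) / len(counts)  # assumed background rate: mean per bin
    return [1 if poisson_sf(c, lam) < p_threshold else 0 for c in counts]


# Toy read counts for ten consecutive bins (made-up numbers).
counts = [0, 1, 2, 1, 0, 15, 1, 0, 2, 1]
marks = binarize(counts)
print(marks)  # [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
```

Running this once per mark and per cell type gives one binary vector per bin, which is the kind of input a multivariate HMM over combinations of marks would consume.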
To learn a common set of states across the nine cell types, first the genomes were concatenated
across the cell types. For each of the nine cell types, each 200 base pair interval
was then assigned to its most likely state under the model. Detailed information about the model
parameters and state enrichments can be found in Ernst et al. (2011).
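The assignment of each interval to its most likely state can be sketched as posterior decoding in an HMM whose emissions are independent Bernoulli variables, one per mark. The page does not say whether posterior or Viterbi decoding was used, so treating "most likely state" as the maximum posterior state is an assumption here, as are the two-state, two-mark toy parameters and all function names.

```python
def emission_prob(mark_probs, obs):
    """P(obs | state), treating each binary mark as an independent Bernoulli."""
    p = 1.0
    for q, x in zip(mark_probs, obs):
        p *= q if x else (1.0 - q)
    return p


def normalize(values):
    total = sum(values)
    return [v / total for v in values]


def posterior_decode(obs_seq, init, trans, emit):
    """Return the max-posterior state (1-based) for each bin."""
    n, T = len(init), len(obs_seq)
    e = [[emission_prob(emit[s], o) for s in range(n)] for o in obs_seq]

    # Forward pass, rescaled at every bin to avoid underflow.
    fwd = [normalize([init[s] * e[0][s] for s in range(n)])]
    for t in range(1, T):
        fwd.append(normalize([
            e[t][s] * sum(fwd[t - 1][r] * trans[r][s] for r in range(n))
            for s in range(n)
        ]))

    # Backward pass, rescaled the same way.
    bwd = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        bwd[t] = normalize([
            sum(trans[s][r] * e[t + 1][r] * bwd[t + 1][r] for r in range(n))
            for s in range(n)
        ])

    # The posterior is proportional to forward * backward at each bin.
    return [
        max(range(n), key=lambda s: fwd[t][s] * bwd[t][s]) + 1
        for t in range(T)
    ]


# Toy model: state 1 emits both marks often, state 2 rarely.
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.9, 0.8], [0.05, 0.1]]
bins = [(1, 1), (1, 1), (0, 0), (0, 0), (0, 0)]  # binarized marks per bin

print(posterior_decode(bins, init, trans, emit))  # [1, 1, 2, 2, 2]
```

Under the concatenation approach described above, a single set of parameters would be learned from all nine cell types, and a decoding step like this one would then be run separately on each cell type's binarized data.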
Release Notes
This is release 1 (Jun 2011) of this track. It was lifted over from the
NCBI36/hg18 version of the track, and is therefore based on the NCBI36/hg18
release of the Broad Histone
track. It is anticipated that the HMM methods will be run on the newer
datasets in the GRCh37/hg19 version of the
Broad Histone track, and, once that
happens, the new data will replace this liftOver.
Credits
The ChIP-seq data were generated at the
Broad Institute and in the
Bradley E. Bernstein lab at the Massachusetts General Hospital/Harvard Medical School,
and the chromatin state segmentation was produced in
Manolis Kellis's Computational Biology group at the Massachusetts Institute of Technology.
Contact: Jason Ernst.
Data generation and analysis were supported by funds from the NHGRI (ENCODE),
the Burroughs Wellcome Fund, Howard Hughes Medical Institute, NSF, Sloan
Foundation, Massachusetts General Hospital and the Broad Institute.
References
Ernst J, Kellis M.
Discovery and characterization of chromatin states for systematic annotation of the human genome.
Nat Biotechnol. 2010 Aug;28(8):817-25.
Ernst J, Kheradpour P, Mikkelsen TS, Shoresh N, Ward LD, Epstein CB, Zhang X, Wang L, Issner R, Coyne M et al.
Mapping and analysis of chromatin state dynamics in nine human cell types.
Nature. 2011 May 5;473(7345):43-9.
Data Release Policy
Data users may freely use ENCODE data, but may not, without prior
consent, submit publications that use an unpublished ENCODE dataset until
nine months following the release of the dataset. This date is listed in
the Restricted Until column on the track configuration page and
the download page. The full data release policy for ENCODE is available
here.
There is no restriction on the use of segmentation data.