Caltech RNA-seq Downloadable Files

RNA-seq from ENCODE/Caltech (Track settings)

Additional resources:
• files.txt - lists the name and metadata for each download.
• md5sum.txt - lists the md5sum output for each download.
• downloads server - alternative access to downloadable files (may include obsolete data).
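
A downloaded file can be checked against md5sum.txt along the following lines. This is a minimal sketch, not part of the original page: it assumes md5sum.txt uses the usual `md5sum` layout (digest, whitespace, file name), and the helper names `md5_of_file`, `parse_md5sum_lines` and `verify` are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for multi-GB BAM/FASTQ files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5sum_lines(lines):
    """Map file name -> expected digest (assumed layout: '<digest>  <name>')."""
    expected = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            # A leading '*' marks binary mode in md5sum output; drop it.
            expected[parts[-1].lstrip("*")] = parts[0].lower()
    return expected


def verify(path, name, expected):
    """True if the local file's digest matches the listed digest for `name`."""
    return md5_of_file(path) == expected.get(name)
```

Typical use would be to read md5sum.txt once with `parse_md5sum_lines(open("md5sum.txt"))` and then call `verify` for each downloaded file. File names containing spaces are not handled by this sketch.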

24 files	Cell Line	Treatment	View	UCSC Accession	GEO Accession	Size	File Type	Submitted	RESTRICTED Until	Additional Details
	10T1/2	EqS_2.0pct_60hr	Alignments	wgEncodeEM002734		4.5 MB	bam.bai	2012-03-19	2012-12-19	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-18; subId=6081; labExpId=11155; mapAlgorithm=TH131; tableName=wgEncodeCaltechRnaSeq10t12C3hFR2x75Th131Il200E2p60hAlnRep1; md5sum=4df4596aa4decb7aa3bc14ac96ff1edc;
	10T1/2	EqS_2.0pct_60hr	Alignments	wgEncodeEM002734		16 GB	bam	2012-03-19	2012-12-19	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-18; subId=6081; labExpId=11155; mapAlgorithm=TH131; tableName=wgEncodeCaltechRnaSeq10t12C3hFR2x75Th131Il200E2p60hAlnRep1; md5sum=c3a4c88ae962d417dd0fa8cd8c29cbfb;
	10T1/2	EqS_2.0pct_60hr	FastqRd1	wgEncodeEM002734	GSM929772	11 GB	fastq	2012-03-19	2012-12-19	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-18; subId=6081; labExpId=11155; mapAlgorithm=TH131; md5sum=1875f7f7418e55e6877eed73bc7c236d;
	10T1/2	EqS_2.0pct_60hr	FastqRd2	wgEncodeEM002734	GSM929772	12 GB	fastq	2012-03-19	2012-12-19	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-18; subId=6081; labExpId=11155; mapAlgorithm=TH131; md5sum=77cc9fff44e01327d0d78f72f20c074d;
	10T1/2	EqS_2.0pct_60hr	Raw signal	wgEncodeEM002734	GSM929772	223 MB	bigWig	2012-04-16	2013-01-16	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-18; subId=6081; labExpId=11155; mapAlgorithm=TH131; tableName=wgEncodeCaltechRnaSeq10t12C3hFR2x75Th131Il200E2p60hRawRep1; md5sum=bde824d8d8f03f4a7828c6712e9515d1;
	10T1/2	EqS_2.0pct_60hr	Signal	wgEncodeEM002734	GSM929772	213 MB	bigWig	2012-03-19	2012-12-19	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-18; subId=6081; labExpId=11155; mapAlgorithm=TH131; tableName=wgEncodeCaltechRnaSeq10t12C3hFR2x75Th131Il200E2p60hSigRep1; md5sum=b8a8bbb10114c22ac0608a45f6db15ef;
	10T1/2	None	Alignments	wgEncodeEM002735		3.9 MB	bam.bai	2012-03-20	2012-12-20	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-18; subId=6082; labExpId=11154; mapAlgorithm=TH131; tableName=wgEncodeCaltechRnaSeq10t12C3hFR2x75Th131Il200AlnRep1; md5sum=af80d8822d5e51bd01ee2a8b48fc1c10;
	10T1/2	None	Alignments	wgEncodeEM002735		20 GB	bam	2012-03-20	2012-12-20	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-18; subId=6082; labExpId=11154; mapAlgorithm=TH131; tableName=wgEncodeCaltechRnaSeq10t12C3hFR2x75Th131Il200AlnRep1; md5sum=8323ccba1b81240540517a9387bb57a4;
	10T1/2	None	FastqRd1	wgEncodeEM002735	GSM929773	13 GB	fastq	2012-03-20	2012-12-20	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-18; subId=6082; labExpId=11154; mapAlgorithm=TH131; md5sum=9a2fb7568049535186e918f3952a4f7e;
	10T1/2	None	FastqRd2	wgEncodeEM002735	GSM929773	17 GB	fastq	2012-03-20	2012-12-20	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-18; subId=6082; labExpId=11154; mapAlgorithm=TH131; md5sum=4e24f9b9391cf7b741ce74baed38a1ef;
	10T1/2	None	Raw signal	wgEncodeEM002735	GSM929773	133 MB	bigWig	2012-04-16	2013-01-16	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-18; subId=6082; labExpId=11154; mapAlgorithm=TH131; tableName=wgEncodeCaltechRnaSeq10t12C3hFR2x75Th131Il200RawRep1; md5sum=212ff9da9d6be6a39d5d5628c4afa373;
	10T1/2	None	Signal	wgEncodeEM002735	GSM929773	122 MB	bigWig	2012-03-20	2012-12-20	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-18; subId=6082; labExpId=11154; mapAlgorithm=TH131; tableName=wgEncodeCaltechRnaSeq10t12C3hFR2x75Th131Il200SigRep1; md5sum=0de4fbe228884dc8d329f8f33383dff8;
	C2C12	EqS_2.0pct_60hr	Alignments	wgEncodeEM002733		4.5 MB	bam.bai	2012-01-31	2012-10-31	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-17; subId=4936; labExpId=10986; mapAlgorithm=TH131; tableName=wgEncodeCaltechRnaSeqC2c12C3hFR2x75Th131Il200E2p60hAlnRep1; md5sum=6ed497015084c626890e277d642f647d;
	C2C12	EqS_2.0pct_60hr	Alignments	wgEncodeEM002733		17 GB	bam	2012-01-31	2012-10-31	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-17; subId=4936; labExpId=10986; mapAlgorithm=TH131; tableName=wgEncodeCaltechRnaSeqC2c12C3hFR2x75Th131Il200E2p60hAlnRep1; md5sum=816519a92e72bbae60d742bf5ffcc4c1;
	C2C12	EqS_2.0pct_60hr	FastqRd1	wgEncodeEM002733	GSM929775	11 GB	fastq	2012-02-01	2012-11-01	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-17; subId=4936; labExpId=10986; mapAlgorithm=TH131; md5sum=45ca606b2883e1fd96b5ff1115d2ba75;
	C2C12	EqS_2.0pct_60hr	FastqRd2	wgEncodeEM002733	GSM929775	15 GB	fastq	2012-02-01	2012-11-01	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-17; subId=4936; labExpId=10986; mapAlgorithm=TH131; md5sum=d1cbc37896d3b5ca149ab523e21332c3;
	C2C12	EqS_2.0pct_60hr	Raw signal	wgEncodeEM002733	GSM929775	214 MB	bigWig	2012-04-16	2013-01-16	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-18; subId=4936; labExpId=10986; mapAlgorithm=TH131; tableName=wgEncodeCaltechRnaSeqC2c12C3hFR2x75Th131Il200E2p60hRawRep1; md5sum=7e27b5248e448491538355a6fbc8ff26;
	C2C12	EqS_2.0pct_60hr	Signal	wgEncodeEM002733	GSM929775	203 MB	bigWig	2012-02-01	2012-11-01	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-18; subId=4936; labExpId=10986; mapAlgorithm=TH131; tableName=wgEncodeCaltechRnaSeqC2c12C3hFR2x75Th131Il200E2p60hSigRep1; md5sum=3bb0070418d4520398fee5113d4b4b20;
	C2C12	None	Alignments	wgEncodeEM002732		4.4 MB	bam.bai	2012-01-31	2012-10-31	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-17; subId=4910; labExpId=10985; mapAlgorithm=TH131; tableName=wgEncodeCaltechRnaSeqC2c12C3hFR2x75Th131Il200AlnRep1; md5sum=ed93b0d7103a9425cdd2c02b4ab9ae1b;
	C2C12	None	Alignments	wgEncodeEM002732		22 GB	bam	2012-01-31	2012-10-31	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-17; subId=4910; labExpId=10985; mapAlgorithm=TH131; tableName=wgEncodeCaltechRnaSeqC2c12C3hFR2x75Th131Il200AlnRep1; md5sum=4cb0b035327226fca5a0451be168e94b;
	C2C12	None	FastqRd1	wgEncodeEM002732	GSM929774	16 GB	fastq	2012-01-31	2012-10-31	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-18; subId=4910; labExpId=10985; mapAlgorithm=TH131; md5sum=d35aaa9b2b504a0f0144cfbdeb1d2e1f;
	C2C12	None	FastqRd2	wgEncodeEM002732	GSM929774	16 GB	fastq	2012-01-31	2012-10-31	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-18; subId=4910; labExpId=10985; mapAlgorithm=TH131; md5sum=e6987968c2c4c0a90d44fa8a63ae86e6;
	C2C12	None	Raw signal	wgEncodeEM002732	GSM929774	216 MB	bigWig	2012-04-16	2013-01-16	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-18; subId=4910; labExpId=10985; mapAlgorithm=TH131; tableName=wgEncodeCaltechRnaSeqC2c12C3hFR2x75Th131Il200RawRep1; md5sum=9429e5327e11b814f451cb2772354520;
	C2C12	None	Signal	wgEncodeEM002732	GSM929774	205 MB	bigWig	2012-01-31	2012-10-31	strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; readType=2x75; insertLength=200; replicate=1; dataVersion=ENCODE Mar 2012 Freeze; dateResubmitted=2012-04-17; subId=4910; labExpId=10985; mapAlgorithm=TH131; tableName=wgEncodeCaltechRnaSeqC2c12C3hFR2x75Th131Il200SigRep1; md5sum=3d9eaf139f68b043ba0a65aeed18553d;
24 files									Restriction Policy
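
The Additional Details column packs its metadata into a `key=value;` string. The sketch below shows one way such a string might be split into a dictionary. The function name `parse_details` is hypothetical, and the sample string is an abbreviated copy of fields from the table above.

```python
def parse_details(text):
    """Split 'key=value; key=value;' into a dict, ignoring empty fields."""
    fields = {}
    for item in text.split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece after the trailing ';'
        key, _, value = item.partition("=")
        fields[key.strip()] = value.strip()
    return fields


# Abbreviated example taken from the rows above.
row = ("strain=C3H; sex=F; age=immortalized; rnaExtract=longPolyA; "
       "readType=2x75; insertLength=200; replicate=1; "
       "dataVersion=ENCODE Mar 2012 Freeze; mapAlgorithm=TH131;")
details = parse_details(row)
print(details["readType"], details["insertLength"])  # prints: 2x75 200
```

Values are kept as strings; converting `insertLength` or `replicate` to integers is left to the caller.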

Description

Rationale for the Mouse ENCODE project

Our knowledge of the function of genomic DNA sequences comes from three basic approaches. Genetics uses changes in behavior or structure of a cell or organism in response to changes in DNA sequence to infer function of the altered sequence. Biochemical approaches monitor states of histone modification, binding of specific transcription factors, accessibility to DNases and other epigenetic features along genomic DNA. In general, these are associated with gene activity, but the precise relationships remain to be established. The third approach is evolutionary, using comparisons among homologous DNA sequences to find segments that are evolving more slowly or more rapidly than expected given the local rate of neutral change. These are inferred to be under negative or positive selection, respectively, and we interpret these as DNA sequences needed for a preserved (negative selection) or adaptive (positive selection) function.

The ENCODE project aims to discover all the DNA sequences associated with various epigenetic features, with the reasonable expectation that these will also be functional (best tested by genetic methods). However, it is not clear how to relate these results to those from evolutionary analyses. The mouse ENCODE project aims to make this connection explicitly and with a moderate breadth. Assays identical to those being used in the ENCODE project are performed in mouse cell types that are similar or homologous to those studied in the human project. Thus we will be able to discover which epigenetic features are conserved between mouse and human, and we can examine the extent to which these overlap with the DNA sequences under negative selection. The contribution of DNA with a function preserved in mammals versus that with a function in only one species will be discovered. The results will have a significant impact on our understanding of the evolution of gene regulation.

Reference transcriptome measurements with RNA-seq

RNA-seq is a method for mapping and quantifying the transcriptome of any organism that has a genomic DNA sequence assembly (Mortazavi et al., 2008). RNA-seq is performed by reverse-transcribing an RNA sample into cDNA, followed by high-throughput DNA sequencing, which was done here on the Illumina HiSeq sequencer. The transcriptome measurements shown on these tracks were performed on polyA-selected RNA from total cellular RNA. PolyA-selected RNA was fragmented by magnesium-catalyzed hydrolysis and then converted into cDNA by random priming and amplified. Paired-end 2x75 bp reads were obtained from each end of a cDNA fragment. Reads were aligned to the mm9 mouse reference genome using TopHat (Trapnell et al., 2009), a program specifically designed to align RNA-seq reads and discover splice junctions de novo. All sequence and alignment files are available on the downloads page.

Display Conventions and Configuration

This track is a multi-view composite track that contains multiple data types (views). For each view, there are multiple subtracks that display individually on the browser. Instructions for configuring multi-view tracks are here. The following views are in this track:

Alignments: The Alignments (BAM file) view shows reads aligned to the genome. Alignments are colored by cell type. See the Bowtie Manual (Langmead et al., 2009) for information about the SAM Bowtie output (including other tags) and the SAM Format Specification for information on the SAM/BAM file format.
Raw Signal: Density graph (wiggle) of signal enrichment based on a normalized aligned read density (Reads Per Million, RPM). The RPM measure assists in visualizing the relative amount of a given transcript across multiple samples. This is used to display all reads in this track.
Signal (Unique Reads): Density graph (wiggle) of signal enrichment based on processed data. This is used to display uniquely mapped reads in this track.

Additional views are available on the Downloads page.
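
As an illustration of the RPM measure used in the Raw Signal view, per-base read depth can be scaled by the total number of mapped reads. This is a sketch of the generic Reads Per Million definition only; the page does not spell out the exact normalization pipeline used for these tracks, and the function name and example numbers are hypothetical.

```python
def reads_per_million(raw_depth, total_mapped_reads):
    """Scale raw per-position read depth to Reads Per Million (RPM)."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    # RPM = depth * 1,000,000 / total mapped reads in the sample
    return [depth * 1_000_000 / total_mapped_reads for depth in raw_depth]


# Hypothetical sample: 20 million mapped reads, depths at three positions.
print(reads_per_million([0, 20, 40], 20_000_000))  # prints: [0.0, 1.0, 2.0]
```

Because every sample is scaled to the same nominal library size, signal heights become comparable between samples sequenced to different depths.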

Methods

Experimental Procedures

Cells were grown according to the approved ENCODE cell culture protocols. Cells were lysed in RLT buffer (Qiagen RNeasy kit), and processed on RNeasy midi columns according to the manufacturer's protocol, with the inclusion of the "on-column" DNase digestion step to remove residual genomic DNA. A quantity of 75 µg of total RNA was selected twice with oligo-dT beads (Dynal) according to the manufacturer's protocol to isolate mRNA from each of the preparations. A quantity of 100 ng of mRNA was then processed according to the protocol in Mortazavi et al. (2008), and prepared for sequencing on the Illumina GAIIx or HiSeq platforms according to the protocol for the ChIP-Seq DNA genomic DNA kit (Illumina). Paired-end libraries were size-selected around 200 bp (fragment length). Libraries were sequenced with the Illumina HiSeq according to the manufacturer's recommendations. Paired-end reads of 75 bp length were obtained.

Data Processing and Analysis

Reads were mapped to the reference mouse genome (version mm9 with or without the Y chromosome, depending on the sex of the cell line, and without the random chromosomes in all cases) using TopHat (version 1.3.1). TopHat was used with default settings with the exception of specifying an empirically determined mean inner-mate distance and supplying known ENSEMBL version 63 splice junctions.
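
The mean inner-mate distance given to TopHat (its `-r/--mate-inner-dist` option) is the expected gap between the two reads of a pair, that is, the fragment length minus the combined read lengths. The sketch below computes only the nominal value implied by the table metadata (readType=2x75, insertLength=200). The value actually used was determined empirically and is not stated on this page, so the result here is an illustration, not the authors' setting. The function name is hypothetical.

```python
def nominal_inner_distance(fragment_length, read_length, reads_per_pair=2):
    """Expected gap between mates: fragment length minus the sequenced bases."""
    return fragment_length - reads_per_pair * read_length


# Nominal values from the file table: 200 bp fragments, 2x75 bp reads.
print(nominal_inner_distance(200, 75))  # prints: 50
```

An empirical estimate would typically come from the observed distance between properly paired mates in a pilot alignment and may differ from this nominal figure.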

Credits

Wold Group: Brian Williams, Georgi Marinov, Diane Trout, Lorian Schaeffer, Gordon Kwan, Katherine Fisher, Gilberto De Salvo, Ali Mortazavi, Henry Amrhein, Brandon King

Contacts: Georgi Marinov (data coordination/informatics/experimental), Diane Trout (informatics) and Brian Williams (experimental)

References

Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10(3):R25. PMID: 19261174; PMC: PMC2690996

Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008 Jul;5(7):621-8. PMID: 18516045

Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009 May 1;25(9):1105-11. PMID: 19289445; PMC: PMC2672628

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column, above. The full data release policy for ENCODE is available here.
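
The nine-month window can be reproduced with simple date arithmetic. The sketch below is an assumption-laden illustration: it counts nine calendar months from the Submitted date, which matches the dates shown in the table, and it clamps to the last day of the month when the target month is shorter. The clamping rule and the function names are not taken from the ENCODE policy text.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return `start` shifted by whole calendar months, clamping the day."""
    index = start.year * 12 + (start.month - 1) + months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def restricted_until(submitted):
    """Nine calendar months after the submission date."""
    return add_months(submitted, 9)


print(restricted_until(date(2012, 3, 19)))  # prints: 2012-12-19
print(restricted_until(date(2012, 4, 16)))  # prints: 2013-01-16
```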