- Frazee AC, Langmead B, Leek JT. ReCount: a multi-experiment resource of analysis-ready RNA-seq gene count datasets. BMC Bioinformatics 12:449
- VLDS poster, 6/11 (coming soon)
All columns of the table below are sortable: clicking a column title sorts the table by that column, alphabetically or numerically, while keeping each row intact. The columns are as follows:
Study
With a few exceptions, the datasets are named for the first author of the paper from which the .fastq files were obtained. The Katz paper contained both mouse and human reads, so two separate datasets were created. The "maqc" dataset was built from reads obtained from the MicroArray Quality Control Project. The "modencodeworm" and "modencodefly" datasets were generated using reads from papers associated with the modENCODE Consortium.
PMID
The papers from which we collected the .fastq files are accessible via the clickable PubMed ID.
Species
The species of the samples under study.
Number of biological replicates
The number of distinct biological replicates included in the dataset. Gene counts from technical replicates were pooled. The number of technical replicates pooled to give counts for each biological replicate is available in the ExpressionSet and phenotype table for each dataset.
Number of uniquely aligned reads
In each Myrna run, some reads were discarded because they did not align, and others were discarded because they aligned repetitively. The count in this column is the number of reads that were not discarded. *For Montgomery and Pickrell, the read count is for both datasets combined, since both were analyzed in the same Myrna run.
ExpressionSet
Click "link" to download an .RData file containing the gene count table and phenotype data in an ExpressionSet. When the R object is loaded into the workspace, the ExpressionSet will be named study.eset, where "study" is replaced with the dataset name given in the first column of the table. To use the ExpressionSets, you will need to install Bioconductor and run the command library(Biobase). For some preliminary information on using ExpressionSets, please click here.
Count table
Click "link" to download a .txt file containing the raw gene counts output by Myrna.
Phenotype table
Click "link" to download a .txt file containing phenotype information for each sample in the count table. Each phenotype table contains a sample.id column and a num.tech.reps column. The sample.id is either the HapMap ID of the sample (if applicable) or the SRX number of the sample, which can be used to search for the sample in NCBI's Sequence Read Archive (SRA). The num.tech.reps column gives the number of technical replicates that were pooled to obtain gene counts for that sample.
Notes
Brief description of the experiment.
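As a quick consistency check on a downloaded count table and phenotype table, the sample identifiers in the two files can be compared. The following Python sketch is illustrative only and rests on assumptions not stated on this page: that both .txt files are tab-delimited with a header row, that the count table has one column per sample named by its sample.id, and that the gene identifier column is called "gene" (a hypothetical name). The small tables in the example are made up.

```python
import csv
import io

def read_table(handle):
    """Read a tab-delimited table with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))

def compare_sample_ids(count_rows, pheno_rows, gene_column="gene"):
    """Return (ids only in the count table, ids only in the phenotype table).

    Assumes every count-table column except `gene_column` is a sample,
    and that the phenotype table has a `sample.id` column.
    """
    count_ids = [name for name in count_rows[0] if name != gene_column]
    pheno_ids = [row["sample.id"] for row in pheno_rows]
    only_counts = [s for s in count_ids if s not in pheno_ids]
    only_pheno = [s for s in pheno_ids if s not in count_ids]
    return only_counts, only_pheno

# Tiny made-up tables standing in for the downloaded .txt files.
count_text = "gene\tS1\tS2\nG1\t10\t0\nG2\t3\t7\n"
pheno_text = "sample.id\tnum.tech.reps\nS1\t2\nS2\t1\n"
count_rows = read_table(io.StringIO(count_text))
pheno_rows = read_table(io.StringIO(pheno_text))
only_counts, only_pheno = compare_sample_ids(count_rows, pheno_rows)
```

If both returned lists are empty, every sample in the count table has a phenotype row and vice versa. For real files, replace the io.StringIO objects with opened file handles and adjust the column names to match the actual headers.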
Please note that to use the ExpressionSets below, you will need to install Bioconductor and run the command library(Biobase).
|Study||PMID||Species||Number of biological replicates||Number of uniquely aligned reads||ExpressionSet||Count table||Phenotype table||Notes|
|bodymap||22496456||human||19||2,197,622,796||link||link||link||Illumina Human BodyMap 2.0 -- tissue comparison|
|cheung||20856902||human||41||834,584,950||link||link||link||HapMap - CEU|
|gilad||20009012||human||6||41,356,738||link||link||link||liver; males and females|
|montgomery||20220756||human||60||*886,468,054||link||link||link||HapMap - CEU|
|pickrell||20220758||human||69||*886,468,054||link||link||link||HapMap - YRI|
|sultan||18599741||human||4||6,573,643||link||link||link||cell type comparison|
|katz.mouse||21057496||mouse||4||14,368,471||link||link||link||control vs. CUG-BP1 knockdown myoblasts|
|yang||20363980||mouse||1||27,883,862||link||link||link||hybrid cell line, X always inactive|
|bottomly||21455293||mouse||21||343,445,340||link||link||link||2 inbred mouse strains|
|nagalakshmi||18451266||yeast||4||7,688,602||link||link||link||priming technique comparison|
|hammer||20452967||rat||8||158,178,477||link||link||link||experimental vs. control at 2 time points|
|modencodeworm||19181841||worm||46||1,451,119,823||link||link||link||developmental time course|
|developmental time course|
**These studies originally contained tables with unpooled technical replicates. The unpooled tables are available under the "original" links, while tables with pooled technical replicates are available under the "pooled" links.
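To illustrate the difference between unpooled and pooled tables, here is a minimal Python sketch of pooling. It assumes that pooling means summing the per-gene counts of all technical replicates belonging to the same biological sample; the page does not spell out the operation, and the function and argument names are hypothetical.

```python
from collections import defaultdict

def pool_technical_replicates(counts, run_to_sample):
    """Sum per-gene counts over the technical replicates of each sample.

    counts:        {run_id: {gene_id: count}}, one entry per technical replicate
    run_to_sample: {run_id: sample_id}, maps each run to its biological sample
    Returns (pooled_counts, num_tech_reps), both keyed by sample_id.
    """
    pooled = defaultdict(lambda: defaultdict(int))
    num_tech_reps = defaultdict(int)
    for run_id, gene_counts in counts.items():
        sample_id = run_to_sample[run_id]
        num_tech_reps[sample_id] += 1  # analogous to the num.tech.reps column
        for gene_id, n in gene_counts.items():
            pooled[sample_id][gene_id] += n
    return {s: dict(g) for s, g in pooled.items()}, dict(num_tech_reps)
```

The second return value plays the role of the num.tech.reps column in the phenotype tables: it records how many runs contributed to each pooled sample.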
Datasets created without truncation
The count tables and ExpressionSets in the above table were created by truncating all reads longer than 35bp to 35bp. Count tables and ExpressionSets created without truncation are available for download here.
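The truncation step can be sketched in Python as follows. This is an illustration, not the code that was actually run: it assumes that truncation keeps the first 35 bases of each read (the page does not say which end is trimmed) and that a FASTQ record is represented as a four-element tuple. The function name is hypothetical.

```python
def truncate_fastq_record(record, max_len=35):
    """Truncate one FASTQ record to at most `max_len` bases.

    record is (header, sequence, plus_line, quality). The sequence and
    quality strings are cut to the same length so they stay in step.
    Assumption: the first `max_len` bases are kept.
    """
    header, sequence, plus_line, quality = record
    return (header, sequence[:max_len], plus_line, quality[:max_len])
```

Reads of 35bp or shorter pass through unchanged, so datasets whose reads are already that short would be unaffected by this step.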
Ensembl 61 gene/exon information
Below are links to files containing information about the genes and exons in Ensembl 61 and about the union/intersect intervals used in Myrna's gene model when creating the above count tables. (These are the genes.txt, exons.txt, and the .ivals file for each chromosome from the respective organism's Ensembl 61 Myrna reference jar.)
- human [ genes | exons | intervals ]
- mouse [ genes | exons | intervals ]
- yeast [ genes | exons | intervals ]
- rat [ genes | exons | intervals ]
- worm [ genes | exons | intervals ]
- fly [ genes | exons | intervals ]
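For readers unfamiliar with union and intersect intervals, the Python sketch below shows one plausible reading of the two ideas. The exact definitions Myrna uses are not given on this page, so this assumes that "union" means the bases covered by any transcript of a gene and "intersect" means the bases covered by every transcript of that gene. Intervals are half-open (start, end) pairs, and all function names are hypothetical.

```python
def merge(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def union_footprint(transcripts):
    """Bases covered by at least one transcript (each a list of exons)."""
    return merge([exon for exons in transcripts for exon in exons])

def intersect_two(a, b):
    """Intersect two sorted, merged interval lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

def intersect_footprint(transcripts):
    """Bases covered by every transcript; needs at least one transcript."""
    result = merge(transcripts[0])
    for exons in transcripts[1:]:
        result = intersect_two(result, merge(exons))
    return result
```

For a gene with two transcripts whose exons are [(0, 10), (20, 30)] and [(5, 15), (20, 25)], the union footprint is [(0, 15), (20, 30)] and the intersect footprint is [(5, 10), (20, 25)].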
Below are links to Myrna manifest files used to create the count tables with Myrna.
- Katz - human
- MAQC (links to locally stored files)
- Montgomery/Pickrell (count tables created with same Myrna run)
- Katz - mouse
- modENCODE - worm
- modENCODE - fly
Getting Started with ExpressionSets
Please click here for a few R commands that are useful when working with ExpressionSets.
Code Used
- Commands passed to Myrna
- R code used to create ExpressionSets (requires Bioconductor and additional files)
- R code used in the "example applications" section of the paper (requires Bioconductor)