available_rnaseq_workflows {GenomicDataCommons}    R Documentation

Get RNA-seq quantification from the NCI GDC.

Description

gdc_rnaseq is a high-level function for accessing NCI GDC RNA-seq data and summarizing it as a SummarizedExperiment. available_rnaseq_workflows lists the workflow types that gdc_rnaseq accepts.

Usage

available_rnaseq_workflows()

gdc_rnaseq(project_id, workflow_type)

Arguments

project_id

character() vector with one or more project ids. Available project_ids can be found using ids(projects()). Note that not all projects contain RNA-seq data.

workflow_type

character(1) with the workflow type. Possible values can be listed using available_rnaseq_workflows().
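As a minimal sketch of how the two arguments might be chosen, the snippet below lists project ids and checks a candidate workflow type before any download is attempted. It needs network access to the GDC, and the variable names project_ids and workflow are only illustrative.

```r
library(GenomicDataCommons)

# List project ids known to the GDC (requires network access).
project_ids <- ids(projects())
head(project_ids)

# Check a candidate workflow type against the supported values
# before passing it to gdc_rnaseq().
workflow <- "HTSeq - Counts"
workflow %in% available_rnaseq_workflows()
```

Because not every project contains RNA-seq data, a valid project id and workflow type do not by themselves guarantee that gdc_rnaseq will find files.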

Details

The RNA-seq data are downloaded using gdcdata, with files taken from the local cache when available. The resulting files are read and combined without any transformation. It is up to the user to perform further normalization or transformation if needed.
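Since the values are returned untransformed, a downstream normalization step is left to the user. The sketch below shows one common choice, log2 counts per million, on a made-up count matrix in base R. It is an illustration, not something gdc_rnaseq does. With a real result, the matrix would presumably come from the SummarizedExperiment's assay.

```r
# Toy count matrix standing in for raw counts (3 genes x 2 samples);
# the values are made up for illustration.
counts <- matrix(
  c(10, 20, 70,
    5, 15, 30),
  nrow = 3,
  dimnames = list(c("gene1", "gene2", "gene3"), c("sampleA", "sampleB"))
)

# Divide each column (sample) by its library size and scale to
# counts per million.
library_sizes <- colSums(counts)
cpm <- sweep(counts, 2, library_sizes, "/") * 1e6

# Log-transform with a pseudocount of 1 to avoid log2(0).
log_cpm <- log2(cpm + 1)
```

Counts per million only corrects for sequencing depth. Other normalization schemes may be more appropriate depending on the workflow type and the analysis.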

Clinical information for each file (see gdc_clinical for details) is loaded into the colData slot. Quality control mapping information is also stored in the colData with column names beginning with "qc__".
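The "qc__" prefix makes it straightforward to separate quality control columns from the clinical columns. The sketch below does this on a toy SummarizedExperiment built by hand. Every column name and value in it, apart from the "qc__" prefix itself, is hypothetical and not taken from real GDC output.

```r
library(SummarizedExperiment)

# Toy object standing in for the result of gdc_rnaseq(); all values and
# column names other than the "qc__" prefix are made up.
counts <- matrix(1:6, nrow = 3,
                 dimnames = list(paste0("gene", 1:3), c("s1", "s2")))
sample_info <- DataFrame(
  gender = c("female", "male"),
  qc__total_reads = c(1e6, 2e6),
  qc__pct_mapped = c(95.1, 97.3),
  row.names = c("s1", "s2")
)
se <- SummarizedExperiment(assays = list(counts = counts),
                           colData = sample_info)

# Split the sample annotations into QC columns and everything else.
col_data <- colData(se)
is_qc <- startsWith(colnames(col_data), "qc__")
qc_data <- col_data[, is_qc, drop = FALSE]
clinical_data <- col_data[, !is_qc, drop = FALSE]
```

Using drop = FALSE keeps both results as tables even when only one column matches.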

Value

A SummarizedExperiment object populated with the expression values, with the gene ids in the rowData and the clinical data associated with each sample in the colData.

References

See https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/Expression_mRNA_Pipeline/ for details of data processing that occurs at the GDC.

Examples

available_rnaseq_workflows()

## Not run: 
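# Download and summarize HTSeq count data for the TCGA-ACC project
tcga_se <- gdc_rnaseq('TCGA-ACC', 'HTSeq - Counts')
# Print a summary of the resulting SummarizedExperiment
tcga_se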

## End(Not run)


[Package GenomicDataCommons version 1.4.3 Index]