easyRNASeq package {easyRNASeq}    R Documentation

Count summarization and normalization pipeline for Next Generation Sequencing data.

Description

Summarizes read counts per feature of interest (e.g. exons, transcripts or genes) and normalizes the summarized counts using third-party packages such as DESeq or edgeR.

Methods

The main function, easyRNASeq, summarizes the counts per feature of interest for as many samples as provided and returns a count matrix (N x M), where N is the number of features and M the number of samples. The counts can be corrected to RPKM, in which case a matrix of corrected values with the same dimensions is returned instead. Alternatively, a RangedSummarizedExperiment can be returned; this is expected to become the default in the upcoming version of easyRNASeq (as of 1.5.x). If the necessary sample information is provided, the data can be normalized using either DESeq or edgeR, and the corresponding package object is returned. For more details and step-by-step functions, see:

ShortRead methods for pre-processing the data.
easyRNASeq annotation methods for getting the annotation.
easyRNASeq coverage methods for computing the coverage from a Short Read Alignment file.
easyRNASeq summarization methods for summarizing the data.
easyRNASeq correction methods for correcting the data (i.e. generating RPKM).
edgeR methods for post-processing the data.
DESeq methods for post-processing the data.
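
To make the RPKM correction mentioned above concrete, the following is a minimal base-R sketch on a toy N x M count matrix. It is not the package's implementation: the object names (counts, feature_width, lib_size) are hypothetical, and the library size is assumed here to be the per-sample sum of the summarized counts, which may differ from what easyRNASeq uses internally.

```r
# Toy count matrix: N = 3 features (rows) by M = 2 samples (columns).
# matrix() fills column by column, so sample1 = 100, 200, 700.
counts <- matrix(c(100, 200, 700,
                   300, 300, 400),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("sample1", "sample2")))

# Hypothetical feature widths in base pairs, in the same order as the rows.
feature_width <- c(1000, 2000, 500)

# Assumption: library size is the total of the summarized counts per sample.
lib_size <- colSums(counts)

# RPKM = reads / (feature width in kb * library size in millions)
#      = reads * 1e9 / (feature width in bp * library size)
# sweep() divides each column by its library size; the division by
# feature_width then recycles down the rows of each column.
rpkm <- sweep(counts, 2, lib_size, "/") / feature_width * 1e9

rpkm
```

The result has the same dimensions and dimnames as the input, which mirrors the statement above that the corrected matrix keeps the shape of the count matrix.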

Author(s)

Nicolas Delhomme, Bastian Schiffthaler, Ismael Padioleau

See Also

The class RNAseq specification: RNAseq

The default output class specification: RangedSummarizedExperiment

The imported packages: biomaRt BiocParallel edgeR genomeIntervals Biostrings BSgenome DESeq GenomicRanges IRanges Rsamtools ShortRead

The suggested packages: parallel GenomicFeatures

The following classes and functions that are made available from other packages:

Examples


  # get the example annotation file - we retrieve a gtf file from GitHub
  library(curl)
  invisible(curl_download(paste0("https://github.com/UPSCb/UPSCb/raw/",
  "master/tutorial/easyRNASeq/Drosophila_melanogaster.BDGP5.77.with-chr.gtf.gz"),
            "Drosophila_melanogaster.BDGP5.77.with-chr.gtf.gz"))

  # get the example data files - we retrieve a set of example bam files
  # from GitHub using curl, as well as their index.
  invisible(sapply(c("ACACTG","ACTAGC"),function(bam){
    curl_download(paste0("https://github.com/UPSCb/UPSCb/raw/",
      "master/tutorial/easyRNASeq/",bam,".bam"),paste0(bam,".bam"))
    curl_download(paste0("https://github.com/UPSCb/UPSCb/raw/",
      "master/tutorial/easyRNASeq/",bam,".bam.bai"),paste0(bam,".bam.bai"))
  }))

  # create the AnnotParam
  annotParam <- AnnotParam(
    datasource="Drosophila_melanogaster.BDGP5.77.with-chr.gtf.gz",
    type="gtf")

  # create the synthetic transcripts
  annotParam <- createSyntheticTranscripts(annotParam,verbose=FALSE)

  # create the RnaSeqParam
  rnaSeqParam <- RnaSeqParam(annotParam=annotParam,countBy="gene")

  # get the bamfiles
  bamFiles <- getBamFileList(dir(pattern="^[AT].*\\.bam$",full.names=TRUE))

  # get a RangedSummarizedExperiment containing the counts table
  sexp <- simpleRNASeq(
      bamFiles=bamFiles,
      param=rnaSeqParam,
      verbose=TRUE
  )

  # get the counts
  assays(sexp)$genes



[Package easyRNASeq version 2.16.0 Index]