ID-translation {TCGAutils}    R Documentation

Translate study identifiers from barcode to UUID and vice versa

Description

These functions take a character vector of identifiers and use the GDC API to translate TCGA barcodes to Universally Unique Identifiers (UUIDs) and vice versa. These relationships are not one-to-one; therefore, a data.frame is returned for all inputs. The UUID to TCGA barcode translation applies only to file and case UUIDs. API queries for this translation service with other types of UUIDs are not fully supported. Please double check any results before using these features for analysis. Case / submitter identifiers are translated by default; see the id_type argument for details.
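
Because the mappings are not one-to-one, it can help to check a returned data.frame for identifiers that map to more than one counterpart. The following is a minimal base R sketch of such a check. The data.frame is a toy stand-in with placeholder values, and the column names barcode and uuid are assumptions made for illustration, not the column names the functions return.

```r
# Toy stand-in for a translation result. Values and column names are
# placeholders chosen for illustration only.
res <- data.frame(
  barcode = c("TCGA-XX-0001", "TCGA-XX-0001", "TCGA-XX-0002"),
  uuid = c("uuid-a", "uuid-b", "uuid-c"),
  stringsAsFactors = FALSE
)

# Count how many rows each barcode contributes.
counts <- table(res$barcode)

# Barcodes that appear in more than one row (one-to-many mappings).
multi <- names(counts[counts > 1])
multi

# Rows to inspect by hand before any downstream analysis.
res[res$barcode %in% multi, ]
```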

Usage

UUIDtoBarcode(id_vector, id_type = c("case_id", "file_id"),
  end_point = "participant", legacy = FALSE)

barcodeToUUID(barcodes, id_type = c("case_id", "file_id"), legacy = FALSE)

translateBuild(from, to = "UCSC")

extractBuild(string, build = c("UCSC", "NCBI"))

Arguments

id_vector

A character vector of UUIDs corresponding to either files or cases

id_type

Either case_id or file_id indicating the type of id_vector entered (default "case_id")

end_point

The cutoff point of the barcode that should be returned; applies only to file_id type queries. See Details for options.

legacy

(logical, default FALSE) whether to search the legacy archives

barcodes

A character vector of TCGA barcodes

from

A build version name

to

The name of the desired version

string

A single character string

build

A vector of build version names (default UCSC, NCBI)

Details

The end_point options reflect endpoints in the Genomic Data Commons API. These are summarized as follows:

Only these keywords need to be used to target the specific barcode endpoint. These endpoints only apply to "file_id" type translations to TCGA barcodes (see id_type argument).
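
A TCGA barcode is a sequence of hyphen-separated fields, so a "cutoff point" amounts to keeping a leading run of those fields. The sketch below illustrates the idea in base R for the participant level (first three fields) and the sample level (first four fields). The helper truncate_barcode is hypothetical and is not part of TCGAutils or the GDC API; it only shows the shape of the barcodes at each level.

```r
# Hypothetical helper (not part of TCGAutils): keep the first `n_fields`
# hyphen-separated fields of each barcode.
truncate_barcode <- function(barcodes, n_fields) {
  parts <- strsplit(barcodes, "-", fixed = TRUE)
  vapply(
    parts,
    function(p) paste(p[seq_len(min(n_fields, length(p)))], collapse = "-"),
    character(1)
  )
}

full <- c("TCGA-B0-5117-11A-01D-1421-08",
          "TCGA-E9-A295-10A-01D-A16D-09")

# Participant level: project, tissue source site, participant
truncate_barcode(full, 3)
# "TCGA-B0-5117" "TCGA-E9-A295"

# Sample level: adds the sample field
truncate_barcode(full, 4)
# "TCGA-B0-5117-11A" "TCGA-E9-A295-10A"
```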

Value

A data.frame of TCGA barcode identifiers and UUIDs

builds

Two functions are available to search for build versions, either from NCBI or UCSC: translateBuild translates between UCSC and NCBI build versions, and extractBuild uses grep patterns to find the first build within a string.
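
The two ideas described here, a lookup between naming schemes and a pattern search within a string, can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the helpers translate_build_sketch and extract_build_sketch are hypothetical, the lookup table covers only three well-known build pairs, and the regular expression is one plausible pattern rather than the one the package uses.

```r
# Illustrative lookup table (NCBI/GRC name -> UCSC name). This is not the
# package's internal table and covers only a few builds.
build_map <- c(GRCh38 = "hg38", GRCh37 = "hg19", NCBI36 = "hg18")

# Hypothetical helper: translate in either direction using the table above.
translate_build_sketch <- function(from, to = c("UCSC", "NCBI")) {
  to <- match.arg(to)
  if (to == "UCSC") {
    unname(build_map[from])
  } else {
    # Invert the table so UCSC names become the lookup keys.
    ncbi <- names(build_map)
    names(ncbi) <- build_map
    unname(ncbi[from])
  }
}

# Hypothetical helper: return the first build-like token in a string,
# exactly as it appears in the string, or NA if none is found.
extract_build_sketch <- function(string) {
  m <- regexpr("(hg|grch|ncbi)[0-9]+", string, ignore.case = TRUE)
  if (m == -1L) NA_character_ else regmatches(string, m)
}

translate_build_sketch("GRCh37")          # "hg19"
translate_build_sketch("hg18", "NCBI")    # "NCBI36"
extract_build_sketch("example.nocnv_grch38.seg.txt")  # "grch38"
```

Unknown build names yield NA from the lookup, which makes unmatched inputs easy to spot.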

Author(s)

Sean Davis, M. Ramos

Examples

library(TCGAutils)

## Translate UUIDs >> TCGA Barcode

uuids <- c("0001801b-54b0-4551-8d7a-d66fb59429bf",
"002c67f2-ff52-4246-9d65-a3f69df6789e",
"003143c8-bbbf-46b9-a96f-f58530f4bb82")

UUIDtoBarcode(uuids, id_type = "file_id", end_point = "sample")

UUIDtoBarcode("ae55b2d3-62a1-419e-9f9a-5ddfac356db4", id_type = "case_id")

## Translate TCGA Barcode >> UUIDs

fullBarcodes <- c("TCGA-B0-5117-11A-01D-1421-08",
"TCGA-B0-5094-11A-01D-1421-08",
"TCGA-E9-A295-10A-01D-A16D-09")

sample_ids <- TCGAbarcode(fullBarcodes, sample = TRUE)

barcodeToUUID(sample_ids)

participant_ids <- c("TCGA-CK-4948", "TCGA-D1-A17N",
"TCGA-4V-A9QX", "TCGA-4V-A9QM")

barcodeToUUID(participant_ids)


## Translate an NCBI build name >> UCSC build name

translateBuild("GRCh35", "UCSC")


## Extract the build version from a file name

extractBuild(
"SCENA_p_TCGAb29and30_SNP_N_GenomeWideSNP_6_G05_569110.nocnv_grch38.seg.txt"
)


[Package TCGAutils version 1.0.1 Index]