convert_protein_ids {SWATH2stats} | R Documentation |
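This function converts the protein identifiers in a data frame or file to another identifier type, by default from UniProt/Swiss-Prot accessions to HGNC symbols.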
convert_protein_ids(data_table, column.name = "Protein",
  species = "hsapiens_gene_ensembl", host = "www.ensembl.org",
  mart = "ENSEMBL_MART_ENSEMBL", ID1 = "uniprotswissprot",
  ID2 = "hgnc_symbol", id.separator = "/",
  copy_nonconverted = TRUE, verbose = FALSE)
data_table | A data frame or file name.
column.name | The column name where the original protein identifiers are present. Default: "Protein".
species | The species of the protein identifiers in the term used by biomaRt (e.g. "hsapiens_gene_ensembl", "mmusculus_gene_ensembl", "drerio_gene_ensembl", etc.). Default: "hsapiens_gene_ensembl".
host | Path of the biomaRt database (e.g. "www.ensembl.org", "dec2017.archive.ensembl.org"). Default: "www.ensembl.org".
mart | The type of mart (e.g. "ENSEMBL_MART_ENSEMBL", etc.). Default: "ENSEMBL_MART_ENSEMBL".
ID1 | The type of the original protein identifiers (e.g. "uniprotswissprot", "ensembl_peptide_id"). Default: "uniprotswissprot".
ID2 | The type of the converted protein identifiers (e.g. "hgnc_symbol", "mgi_symbol", "external_gene_name"). Default: "hgnc_symbol".
id.separator | Separator between protein identifiers of shared peptides. Default: "/".
copy_nonconverted | Option defining if the identifiers that cannot be converted should be copied. Default: TRUE.
verbose | Option to write a file containing the version of the database used. Default: FALSE.
Returns the data frame with an added column of the converted protein identifiers.
Protein identifiers from shared peptides should be separated by a forward slash (the default id.separator). To query an archived Ensembl release instead of the current one, pass its host name in the host argument (e.g. "dec2017.archive.ensembl.org").
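The handling of shared-peptide identifiers described above can be illustrated offline. The following is only a sketch and not the package's implementation: convert_one is a hypothetical helper, and lookup is a hand-written stand-in for the mapping that would normally be retrieved through biomaRt. How the package itself treats tokens that cannot be converted when copy_nonconverted = FALSE may differ from what is shown here.

```r
# Hand-written stand-in for the biomaRt lookup (assumption, not queried).
lookup <- c(Q01581 = "HMGCS1", P49327 = "FASN",
            P63261 = "ACTG1", P60709 = "ACTB")

# Hypothetical helper: convert one, possibly shared, identifier string.
convert_one <- function(id, lookup, id.separator = "/",
                        copy_nonconverted = TRUE) {
  # Split the string into its individual tokens
  parts <- strsplit(id, id.separator, fixed = TRUE)[[1]]
  # Look every token up by name; tokens without a mapping become NA
  converted <- unname(lookup[parts])
  missing <- is.na(converted)
  if (copy_nonconverted) {
    converted[missing] <- parts[missing]   # keep the original token
  } else {
    converted <- converted[!missing]       # drop unconverted tokens
  }
  paste(converted, collapse = id.separator)
}

ids <- c("Q01581", "P49327", "2/P63261/P60709")
result <- vapply(ids, convert_one, character(1),
                 lookup = lookup, USE.NAMES = FALSE)
print(result)
```

In this sketch the leading "2" of the shared entry has no mapping, so it is copied unchanged while copy_nonconverted is TRUE.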
Peter Blattmann
# The last entry is a shared peptide whose identifiers are separated by "/"
data_table <- data.frame(Protein = c("Q01581", "P49327", "2/P63261/P60709"), Abundance = c(100, 3390, 43423))
convert_protein_ids(data_table)
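A call with non-default arguments might look like the sketch below, which requests mouse gene symbols from an archived Ensembl release. The table mouse_table and the list mouse_args are hypothetical names, and the accessions and abundances are illustrative values only. The species, host and ID2 values are taken from the argument descriptions above. Because the query needs the SWATH2stats package and network access, the call is guarded and only runs in an interactive session.

```r
# Hypothetical input table; accessions and abundances are illustrative.
mouse_table <- data.frame(Protein = c("P60710", "P63260"),
                          Abundance = c(1200, 860))

# Arguments that differ from the defaults
mouse_args <- list(
  data_table = mouse_table,
  species = "mmusculus_gene_ensembl",     # mouse dataset
  host = "dec2017.archive.ensembl.org",   # archived Ensembl release
  ID2 = "mgi_symbol"                      # convert to MGI symbols
)

# Only query biomaRt when the package is installed and the session is interactive
if (requireNamespace("SWATH2stats", quietly = TRUE) && interactive()) {
  converted <- do.call(SWATH2stats::convert_protein_ids, mouse_args)
  head(converted)
}
```

Pinning host to an archived release keeps the conversion reproducible when the current Ensembl annotation changes.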