calculate_PVCA {proBatch} | R Documentation |
Calculate variance distribution by variable
calculate_PVCA(
  data_matrix,
  sample_annotation,
  feature_id_col = "peptide_group_label",
  sample_id_col = "FullRunName",
  factors_for_PVCA = c("MS_batch", "digestion_batch", "Diet", "Sex", "Strain"),
  pca_threshold = 0.6,
  variance_threshold = 0.01,
  fill_the_missing = -1
)
data_matrix |
features (in rows) vs samples (in columns) matrix, with
feature IDs in rownames and file/sample names as colnames.
See "example_proteome_matrix" for more details (to call the description,
use |
sample_annotation |
data frame with:
.
See |
feature_id_col |
name of the column with feature/gene/peptide/protein
ID used in the long format representation |
sample_id_col |
name of the column in |
factors_for_PVCA |
vector of factors from |
pca_threshold |
the minimum proportion of the total variability (a value between 0 and 1) that the selected principal components need to explain |
variance_threshold |
the minimum proportion of variance (weight) that a factor needs to explain to be reported separately (factors below this threshold will be lumped together) |
fill_the_missing |
numeric value determining how missing values
should be substituted. If |
data frame of weights of Principal Variance Components
matrix_test <- example_proteome_matrix[1:150, ]
pvca_df <- calculate_PVCA(matrix_test, example_sample_annotation,
  factors_for_PVCA = c('MS_batch', 'digestion_batch', "Diet", "Sex", "Strain"),
  pca_threshold = .6, variance_threshold = .01, fill_the_missing = -1)
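
The two threshold arguments are easier to picture with a small base R sketch. This is not proBatch's implementation: the helpers `n_components_for` and `lump_small_weights`, the label "other", and the simulated matrix and weights are all hypothetical, and the sketch assumes that `pca_threshold` is compared against the cumulative share of variance of the principal components and that `variance_threshold` is compared against each factor's weight.

```r
# Illustrative only: base R, simulated data, hypothetical helper names.
set.seed(1)

# Toy features-by-samples matrix: 50 features (rows) x 12 samples (columns),
# with feature IDs as rownames and sample names as colnames.
toy_matrix <- matrix(
  rnorm(50 * 12), nrow = 50,
  dimnames = list(paste0("feature_", 1:50), paste0("sample_", 1:12))
)

# Hypothetical helper: smallest number of principal components whose
# cumulative share of variance reaches `pca_threshold`.
n_components_for <- function(data_matrix, pca_threshold = 0.6) {
  # Samples become rows, so the PCA describes variation between samples.
  pca <- prcomp(t(data_matrix), center = TRUE, scale. = FALSE)
  explained <- pca$sdev^2 / sum(pca$sdev^2)
  which(cumsum(explained) >= pca_threshold)[1]
}

# Hypothetical helper: keep weights that are at least `variance_threshold`
# and sum the remaining ones into a single entry named `other_label`.
lump_small_weights <- function(weights, variance_threshold = 0.01,
                               other_label = "other") {
  keep <- weights >= variance_threshold
  lumped <- weights[keep]
  if (any(!keep)) {
    lumped[other_label] <- sum(weights[!keep])
  }
  lumped
}

n_pc <- n_components_for(toy_matrix, pca_threshold = 0.6)

# Made-up weights that sum to 1; "Sex" and "Strain" fall below 0.01.
toy_weights <- c(MS_batch = 0.45, Diet = 0.30, Sex = 0.005,
                 Strain = 0.004, resid = 0.241)
lumped <- lump_small_weights(toy_weights, variance_threshold = 0.01)
print(n_pc)
print(lumped)
```

Under these assumptions, raising `pca_threshold` keeps more principal components in the analysis, and raising `variance_threshold` moves more factors into the lumped entry.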