Preprocessing of raw data
preprocess_data(df, Column_gene_name = "Gene.names", Column_score = "Score", Column_ID = "Protein.IDs", Column_Npep = NULL, Column_intensity_pattern = "^Intensity.", bait_gene_name, condition = NULL, bckg_bait = bait_gene_name, bckg_ctrl = "WT", log = TRUE, filter_time = NULL, filter_bio = NULL, filter_tech = NULL, min_score = 0, filter_gene_name = TRUE, ...)
Argument | Description |
---|---|
df | Data frame with protein intensities |
Column_gene_name | Column with gene names |
Column_score | Column with protein identification score |
Column_ID | Column with protein IDs |
Column_Npep | Column with number of theoretically observable peptides per protein |
Column_intensity_pattern | Pattern (regular expression) used to identify the columns of df that contain protein intensity values |
bait_gene_name | The gene name of the bait |
condition | data.frame with columns "column", "bckg", "bio", "time" and "tech" indicating, for each intensity column ("sample"), its corresponding background ("bckg"), biological replicate ("bio"), experimental condition ("time") and technical replicate ("tech"). |
bckg_bait | Name of the bait background as found in |
bckg_ctrl | Name of the control background as found in |
log | logical, use geometric mean to average technical replicates |
filter_time | vector of experimental conditions to exclude from analysis |
filter_bio | vector of biological replicates to exclude from analysis |
filter_tech | vector of technical replicates to exclude from analysis |
min_score | threshold on identification score |
filter_gene_name | logical, filter out proteins with an empty gene name |
... | Additional parameters passed to function |
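
The arguments above can be illustrated with a minimal sketch in base R. It builds a toy input, selects the intensity columns with the default pattern, and assembles a `condition` data.frame with the five columns described above. The toy data, the `Intensity.<bckg>_<bio>_<time>_<tech>` column naming scheme, the bait name "Cd5" and the helper `geometric_mean()` are assumptions made for illustration only; they are not taken from the package. The averaging step only shows what a geometric mean over technical replicates looks like, not how `preprocess_data()` implements it internally.

```r
# Toy input (hypothetical values): identifier columns plus four intensity columns.
# Assumed naming scheme: Intensity.<bckg>_<bio>_<time>_<tech>
df <- data.frame(
  Gene.names  = c("Cd5", "Lck", ""),
  Protein.IDs = c("P1", "P2", "P3"),
  Score       = c(120, 85, 10),
  Intensity.Cd5_R1_t0_T1 = c(1e6, 2e5, 3e4),
  Intensity.Cd5_R1_t0_T2 = c(4e6, 1e5, 2e4),
  Intensity.WT_R1_t0_T1  = c(1e4, 2e5, 3e4),
  Intensity.WT_R1_t0_T2  = c(2e4, 1e5, 1e4),
  stringsAsFactors = FALSE
)

# Select intensity columns with the default pattern.
# Note: the unescaped "." matches any character; "^Intensity\\." would
# require a literal dot.
intensity_cols <- grep("^Intensity.", names(df), value = TRUE)

# Split the column labels into background, biological replicate,
# experimental condition and technical replicate (assumed naming scheme).
labels <- sub("^Intensity.", "", intensity_cols)
parts  <- do.call(rbind, strsplit(labels, "_", fixed = TRUE))

condition <- data.frame(
  column = intensity_cols,
  bckg   = parts[, 1],
  bio    = parts[, 2],
  time   = parts[, 3],
  tech   = parts[, 4],
  stringsAsFactors = FALSE
)

# Geometric mean across the technical replicates of the bait background
# (illustrative helper, not part of the package).
geometric_mean <- function(x) exp(mean(log(x)))
bait_cols <- condition$column[condition$bckg == "Cd5"]
bait_avg  <- apply(df[, bait_cols], 1, geometric_mean)

# Hypothetical call, assuming the package providing preprocess_data() is attached:
# res <- preprocess_data(df, bait_gene_name = "Cd5", condition = condition,
#                        bckg_bait = "Cd5", bckg_ctrl = "WT")
```

Deriving `condition` from the column names keeps the sample annotation in one place; if the intensity columns follow a different naming scheme, the `condition` data.frame can equally be written by hand.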