present_prediction_t: Present Liblinear prediction results


View source: R/predict_targetted.R

Description

Combines Liblinear prediction results with the window and position information of the candidate sites. Calculates the score threshold at the user-specified specificity level.
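As a rough illustration of what "combining" scores with site information means, the sketch below attaches a score to each candidate site and flags the sites at or above a threshold. The column names, the example values, and the `>=` comparison are assumptions made for illustration; they are not taken from the PTMscape source.

```r
# Hypothetical candidate table: one row per candidate site.
# Windows are shortened here (flanking size 2) to keep the example readable.
candidates <- data.frame(
  protein_id = c("P1", "P1", "P2"),
  position   = c(15L, 42L, 7L),
  window     = c("AKSLE", "GTSPR", "MLSQV"),
  stringsAsFactors = FALSE
)

# Hypothetical prediction scores, one per candidate, in the same row order.
scores <- c(0.91, 0.12, 0.67)

# Hypothetical threshold; in "reference" mode it would be user supplied.
score_threshold <- 0.5

# The scores must line up one-to-one with the candidate rows.
stopifnot(nrow(candidates) == length(scores))

# Attach the scores and flag sites whose score reaches the threshold.
result <- cbind(candidates, pred_score = scores)
result$predicted_ptm <- result$pred_score >= score_threshold

print(result)
```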

Usage

present_prediction_t(flag_for_score_threshold_chosen = "reference",
  score_threshold, ptm_site, flanking_size = 12, SPIDER = T,
  pred_candidate_df_Rds_name, pred_score_file_name, positive_info_file,
  known_protein_fasta_file, n_fold = 2, lower_bound = -1, upper_bound = 1,
  liblinear_dir, feature_file_path, cvlog_path_name, specificity_level,
  output_label)

Arguments

flag_for_score_threshold_chosen

A string indicating whether to use the reference score threshold or to derive one from the user-supplied training data; defaults to "reference".

score_threshold

A numeric value between 0 and 1 giving the reference score threshold (supply it when in "reference" mode).

ptm_site

The amino acid this PTM involves, in upper-case single letter representation.

flanking_size

The number of residues on each side of the center residue; the total window size is 2*flanking_size+1. Defaults to 12.
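To make the window arithmetic concrete, here is a minimal sketch that cuts a window of length 2*flanking_size+1 around a site. The helper `extract_window` and the use of "-" to pad windows that run past the sequence ends are assumptions for illustration; the package may treat terminal sites differently.

```r
# Hypothetical helper: return the window centered on position `center`
# (1-based) with `flanking_size` residues on each side.
extract_window <- function(sequence, center, flanking_size = 12, pad = "-") {
  # Pad both ends so that windows near the termini keep the full length.
  padding <- strrep(pad, flanking_size)
  padded  <- paste0(padding, sequence, padding)
  # Position `center` in the original sits at `center + flanking_size` in the
  # padded string, so the window starts at `center` in padded coordinates.
  substr(padded, center, center + 2 * flanking_size)
}

protein <- "MKTAYSAKQR"

# Serine at position 6 with two flanking residues on each side.
extract_window(protein, center = 6, flanking_size = 2)   # "AYSAK"

# With the default flanking size the window always has 25 characters.
nchar(extract_window(protein, center = 6))               # 25
```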

SPIDER

A boolean variable indicating the usage of SPIDER3 features, default set to TRUE.

pred_candidate_df_Rds_name

An Rds file containing the candidate data frame of the proteins to be predicted.

pred_score_file_name

A text file containing the predicted score for the proteins of interest.

positive_info_file

A text file containing the positive PTM sites info in required format.

known_protein_fasta_file

A text file containing the protein sequences of interest and their known PTM sites in FASTA format.

n_fold

Number of folds used for training and prediction; defaults to 2.

lower_bound

The lower bound of the scaled data range; defaults to -1.

upper_bound

The upper bound of the scaled data range; defaults to 1.
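The two bounds describe a linear rescaling of feature values into a fixed range. The sketch below shows ordinary min-max scaling of one feature vector; the helper `scale_to_range` and its handling of constant features (mapped to the midpoint of the range) are assumptions for illustration, and the package may scale its features with a different routine.

```r
# Hypothetical helper: linearly map the values of x onto
# [lower_bound, upper_bound].
scale_to_range <- function(x, lower_bound = -1, upper_bound = 1) {
  x_min <- min(x)
  x_max <- max(x)
  # A constant feature carries no information; place it at the midpoint
  # (an assumption made here to avoid division by zero).
  if (x_max == x_min) {
    return(rep((lower_bound + upper_bound) / 2, length(x)))
  }
  lower_bound + (upper_bound - lower_bound) * (x - x_min) / (x_max - x_min)
}

scale_to_range(c(2, 4, 6))                        # -1  0  1

# Apply the same scaling to every column of a feature matrix.
features <- cbind(f1 = c(2, 4, 6), f2 = c(10, 10, 30))
scaled   <- apply(features, 2, scale_to_range)
```

When separate training and prediction sets are involved, the minimum and maximum are normally taken from the training data and reused for the prediction data, rather than recomputed per set as in this short sketch.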

liblinear_dir

Absolute path of Liblinear tool.

feature_file_path

Absolute path of the feature files.

cvlog_path_name

The path and name of the log files, which hold the details of Liblinear procedures.

specificity_level

A numeric value indicating the specificity the user requires the classifier to achieve, default set to 0.99. Used only when not in "reference" mode.
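One common way to turn a specificity level into a score threshold is to pick the score below which the required fraction of known negative sites falls. The sketch below does this on simulated scores; the helper `threshold_at_specificity`, the simulated data, and the rule "predict positive when the score is strictly above the threshold" are assumptions for illustration, not the package's documented procedure.

```r
# Hypothetical helper: smallest observed negative score such that at least
# `specificity_level` of the negatives score at or below it.
threshold_at_specificity <- function(negative_scores, specificity_level = 0.99) {
  sorted <- sort(negative_scores)
  # Number of negatives that must fall at or below the threshold.
  k <- max(1, ceiling(specificity_level * length(sorted)))
  sorted[k]
}

set.seed(1)
negative_scores <- runif(1000)   # simulated scores of known negative sites

threshold <- threshold_at_specificity(negative_scores, specificity_level = 0.99)

# Fraction of negatives correctly rejected when sites scoring strictly
# above the threshold are called positive.
achieved_specificity <- mean(negative_scores <= threshold)
```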

output_label

A string used to tag the output files produced when determining the score threshold.

Examples

present_prediction_t(flag_for_score_threshold_chosen = "cv",
                     score_threshold = NULL,
                     ptm_site = "S", flanking_size = 12,
                     pred_candidate_df_Rds_name = "ps_predict_candidate.Rds",
                     pred_score_file_name = "ps_predict_predict.tsv",
                     positive_info_file = "known_ps.tsv",
                     known_protein_fasta_file = "known_fasta.tsv",
                     n_fold = 2,
                     lower_bound = -1,
                     upper_bound = 1,
                     liblinear_dir = "/data/ginny/liblinear-2.11/",
                     feature_file_path = "/data/ginny/test_package/",
                     cvlog_path_name = "/data/ginny/test_package/cvlog.txt",
                     specificity_level = 0.99,
                     output_label = "ps_target")

ginnyintifa/PTMscape documentation built on Nov. 9, 2021, 10:39 p.m.