benchmarkMotifs: Benchmark linear motif instance found using QSLIMFinder...
In vitkl/SLIMFinderR: Short Linear Motif Search Using QSLIMFinder, Protein Interaction Network and Binding Domain Inference

Description Usage Arguments Value Author(s) See Also

Benchmark linear motif instances found using QSLIMFinder (SLIMFinder)

Retrieve motifs by ID from the output of linear motif benchmarking


  benchmarkMotifs(occurence_file = "../viral_project/qslimfinder.Full_IntAct3.FALSE/result/occurence.txt",
  main_file = "../viral_project/qslimfinder.Full_IntAct3.FALSE/result/main_result.txt",
  domain_res_file = "../viral_project/processed_data_files/domain_res_count_20171019.RData",
  motif_setup = "../viral_project/processed_data_files/QSLIMFinder_instances_h2v_qslimfinder.Full_IntAct3.FALSE_clust201802.RData",
  neg_set = c("all_instances", "all_proteins", "random")[1],
  domain_results_obj = "res_count",
  motif_input_obj = "forSLIMFinder_Ready", motif_setup_obj2 = NULL,
  occurence_filt = NULL, one_from_cloud = T,
  dbfile_main = "../viral_project/data_files/instances_all.gff",
  dburl_main = "http://elm.eu.org/instances.gff?q=None&taxon=Homo%20sapiens&instance_logic=",
  dbfile_query = "../viral_project/data_files/instances_query.gff",
  dburl_query = "http://elm.eu.org/instances.gff?q=all&taxon=irus&instance_logic=",
  query_res_query_only = T, motif_types = c("DOC", "MOD", "LIG", "DEG",
  "CLV", "TRG"), all_res_excl_query = T, merge_motif_variants = F,
  seed = 21, N = 100, replace = T, within1sequence = T,
  query_predictor_col = "Sig", all_predictor_col = "Sig",
  normalise = T, minoverlap = 2, minoverlap_redundant = 5,
  merge_domain_data = T, merge_by_occurence_mcols = c("query",
  "interacts_with"), merge_by_domain_res_cols = c("IDs_interactor_viral",
  "IDs_interactor_human", "IDs_domain_human", "Taxid_interactor_human",
  "Taxid_interactor_viral"),
  merge_by_non_query_domain_res_cols = c("IDs_interactor_human_A",
  "IDs_interactor_human_B", "IDs_domain_human_B",
  "Taxid_interactor_human_A", "Taxid_interactor_human_B"),
  filter_by_domain_data = "p.value < 0.05", motif_pval_cutoff = 1,
  select_predictor_per_range = max,
  non_query_domain_res_file = "../viral_project/processed_data_files/predict_domain_human_clust20180819.RData",
  non_query_domain_results_obj = NULL, non_query_domains_N = 0,
  non_query_set_only = F, query_domains_only = F,
  min_non_query_domain_support = 0,
  min_top_domain_support4motif_nq = 0, select_top_domain = F, ...)

queryOCCByMCOL(res, keytype = "IDs_domain_human", key = "IPR032440")

mBenchmarkMotifs(datasets = c("qslimfinder.Full_IntAct3.FALSE"),
  descriptions = c("human network (full IntAct) searched \nfor motifs present in viral proteins"),
  dir = "./", motif_setup_months = "201802", ...)
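
The default `neg_set = c("all_instances", "all_proteins", "random")[1]` shown in the usage is a common R idiom: the vector lists the accepted values and the `[1]` index selects the first one, so the default evaluates to `"all_instances"`. A minimal sketch of the idiom follows. The function `pick_neg_set` is hypothetical and is not part of SLIMFinderR; whether the package validates the value in this way is an assumption.

```r
# Hypothetical helper (not part of SLIMFinderR) illustrating the default-value idiom.
pick_neg_set <- function(neg_set = c("all_instances", "all_proteins", "random")[1]) {
  # Validate the supplied value against the documented choices.
  choices <- c("all_instances", "all_proteins", "random")
  if (!neg_set %in% choices) {
    stop("neg_set must be one of: ", paste(choices, collapse = ", "))
  }
  neg_set
}

default_choice <- pick_neg_set()          # uses the first listed value
random_choice  <- pick_neg_set("random")  # explicit choice of a random negative set
```

Passing `neg_set = "random"` is what makes the `seed`, `N`, `replace` and `within1sequence` arguments relevant.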

`occurence_file`	a path to a tsv (txt) file containing QSLIMFinder (SLIMFinder) occurrence output
`main_file`	a path to a tsv (txt) file containing QSLIMFinder (SLIMFinder) main output
`domain_res_file`	path to RData containing objects generated by what_we_find_VS_ELM.Rmd script (specifically `domain_results_obj` object)
`motif_setup`	path to RData containing objects generated by PPInetwork2SLIMFinder pipeline (specifically `motif_input_obj` object)
`domain_results_obj`	character, name of the object containing domain enrichment results (class == XYZinteration_XZEmpiricalPval)
`motif_input_obj`	character, name of the object of class InteractionSubsetFASTA_list containing: FASTA sequences for interacting proteins, molecular interaction data they correspond to. Each element of a list contains input for individual QSLIMFinder run.
`motif_setup_obj2`	alternative way to provide motif_input_obj (class InteractionSubsetFASTA_list) directly. This object should not require matching domain-protein pairs. It must already have been processed by `domainProteinPairMatch`. Can be useful when repeating benchmarking.
`occurence_filt`	QSLIMFinder (SLIMFinder) occurrence output, filtered to the occurrences that could have been found given motif_input_obj.
`one_from_cloud`	use only one top motif from motif cloud
`dbfile_main`	a path to a gff (txt) file containing ELM database motif occurrences (proteins in the main set)
`dburl_main`	url where to get ELM database containing motif occurrences (proteins in the main set)
`dbfile_query`	a path to a gff (txt) file containing ELM database motif occurrences (proteins in the query set)
`dburl_query`	url where to get ELM database containing motif occurrences (proteins in the query set)
`query_res_query_only`	return only GRanges for query proteins, passed to "GRangesINinteractionSubsetFASTA". Do not change the default value.
`motif_types`	character vector of motif types
`all_res_excl_query`	If TRUE, "all" results in the output are all occurrences excluding those in the query proteins. If FALSE, "all" results include occurrences in all proteins. Not implemented.
`merge_motif_variants`	If FALSE (default), merge motif occurrences only if motifs are variants of the same motif (such as TRG_NLS).
`seed`	when using random negative sets (`neg_set = "random"`): seed for RNG for sampling
`N`	when using random negative sets (`neg_set = "random"`): number of samples
`replace`	when using random negative sets (`neg_set = "random"`): sample starts of GRanges with replacement (see randomGRanges)
`within1sequence`	when using random negative sets (`neg_set = "random"`): resample GRanges within one sequence or across sequences (see randomGRanges). If sequence 1 has two motifs of lengths 4 and 7 and `within1sequence = TRUE`, two motifs of the same lengths (4 and 7) will be sampled from the same protein. If `within1sequence = FALSE`, two motifs of the same lengths (4 and 7) will be sampled from any protein in the set used for benchmarking.
`query_predictor_col`	"Sig" or "p.value" or "domain_motif_pval"
`all_predictor_col`	"Sig"
`normalise`	logical, normalise the predictor value in case the predictor does not span the full range from 0 to 1
`minoverlap`	integer, passed to `findOverlaps`
`minoverlap_redundant`	for removing motif classes that match the same occurrence
`merge_domain_data`	If TRUE, merge domain enrichment results with motif occurrences
`merge_by_occurence_mcols`	columns of mcols (metadata of GRanges) that contain IDs of [1] protein with motif, [2] proteins with domain, e.g. c("query", "interacts_with")
`merge_by_domain_res_cols`	columns of domain enrichment results that contain IDs of [1] protein with motif, [2] proteins with domain, [3] domain, [4] and [5] Taxid for proteins with motif and domain respectively, e.g. c("IDs_interactor_viral", "IDs_interactor_human", "IDs_domain_human", "Taxid_interactor_human","Taxid_interactor_viral"). If Taxid columns are not present - omit.
`merge_by_non_query_domain_res_cols`	columns of domain enrichment results for non-query proteins that contain IDs of [1] protein with motif, [2] proteins with domain, [3] domain, [4] and [5] Taxid for proteins with motif and domain respectively, e.g. c("IDs_interactor_human_A", "IDs_interactor_human_B", "IDs_domain_human_B", "Taxid_interactor_human_A","Taxid_interactor_human_B"). If Taxid columns are not present - omit.
`filter_by_domain_data`	criteria to filter domain data and restrict motif search datasets (for example, "p.value < 0.05" or "fdr_pval < 0.05 & domain_count_per_IDs_interactor_viral > 1")
`select_predictor_per_range`	function (such as min) that selects the predictor value when multiple values (such as those returned by multiple datasets or multiple integrated domains) describe the same range
`non_query_domain_res_file`	path to RData file containing the result of domain enrichment analysis for non-query proteins
`non_query_domain_results_obj`	character, name of the object containing domain enrichment results for non-query proteins (class == XYZinteration_XZEmpiricalPval), when provided will be used for filtering datasets.
`non_query_domains_N`	the number of non-query proteins with predicted domains for each dataset. Used only when non_query_domain_results_obj is not NULL
`non_query_set_only`	If TRUE sequence sets searched for motif are filtered to contain only proteins from non_query_domain_results_obj (interacting partners of a seed), if FALSE - both from non_query_domain_results_obj and domain_res_obj. Used only when non_query_domain_results_obj is not NULL.
`query_domains_only`	If TRUE proteins whose sequences will be used for motif search must be predicted to bind the same domains in a seed protein as domains predicted for query protein. Used only when non_query_domain_results_obj is not NULL
`min_non_query_domain_support`	Minimal number of non-query proteins with the same motif as the query that are predicted to bind the same domain. Used to filter domains and proteins that do not predict top domains. Used only when non_query_domain_results_obj is not NULL.
`min_top_domain_support4motif_nq`	Similar to min_non_query_domain_support. Minimal number of non-query proteins with the same motif as the query which have the same top-1 domain predicted.
`select_top_domain`	If TRUE, top domain is selected using a product of domain p-values for all proteins with the same motif (min p-value) found using the same dataset. Used only when non_query_domain_results_obj is not NULL.
`...`	other arguments passed to `findOverlaps`
`res`	object of class `benchmarkMotifsResult`, the output of benchmarkMotifs
`keytype`	character, name of the column that contains key identifiers
`key`	character, identifiers for which to retrieve the result
`datasets`	character vector, names of the datasets ("Vidal" in "./SLIMFinder_Vidal/result/occurence.txt" or "" in "./SLIMFinder/result/occurence.txt")
`descriptions`	character vector, description of the datasets (title of the ROC plot)
`dir`	character, base directory. For example, "./" in "./SLIMFinder_Vidal/result/occurence.txt"
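
The `within1sequence` behaviour described in the arguments can be illustrated with a small base R sketch. It is not the package's `randomGRanges` implementation: the function `resample_motifs`, its data frame input and the toy sequence lengths are all assumptions made for illustration. Motif lengths are kept fixed and only the start positions (and, when `within1sequence = FALSE`, the host sequences) are redrawn.

```r
# Hypothetical helper (not SLIMFinderR's randomGRanges): resample motif start
# positions while keeping motif lengths fixed.
# motifs:   data.frame with columns seq (sequence name) and width (motif length)
# seq_lens: named integer vector of sequence lengths
resample_motifs <- function(motifs, seq_lens, within1sequence = TRUE, seed = 21) {
  set.seed(seed)
  n <- nrow(motifs)
  # Either keep each motif on its original sequence, or draw any sequence
  # from the benchmarking set.
  target <- if (within1sequence) {
    as.character(motifs$seq)
  } else {
    sample(names(seq_lens), n, replace = TRUE)
  }
  # Last start position at which a motif of this width still fits.
  max_start <- unname(seq_lens[target]) - motifs$width + 1
  # Each start is drawn independently, so two motifs may share a start.
  start <- vapply(max_start, function(m) sample.int(m, 1), integer(1))
  data.frame(seq = target, start = start, end = start + motifs$width - 1,
             width = motifs$width, stringsAsFactors = FALSE)
}

# Toy input: sequence P1 carries two motifs of lengths 4 and 7.
motifs   <- data.frame(seq = c("P1", "P1"), width = c(4L, 7L),
                       stringsAsFactors = FALSE)
seq_lens <- c(P1 = 50L, P2 = 80L, P3 = 120L)

within <- resample_motifs(motifs, seq_lens, within1sequence = TRUE)
across <- resample_motifs(motifs, seq_lens, within1sequence = FALSE)
```

In `within`, both resampled motifs stay on P1. In `across`, each motif may land on any of the three sequences. Repeating the call with different seeds would give the `N` random samples.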

object of class benchmarkMotifsResult containing: occurence (GRanges, all, query, just after filtering by motif setup), instances_all (GRanges, known instances in all proteins or all excluding the query proteins), instances_query (GRanges, known instances in query proteins), predictions_all (for ROCR), labels_all (for ROCR), predictions_query (for ROCR), labels_query (for ROCR), overlapping_GRanges_all (GRanges, known instances that we also found), overlapping_GRanges_query (GRanges, known instances that we also found), N_query_prot_with_known_instances, N_query_known_instances, N_all_prot_with_known_instances, N_all_known_instances
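
The prediction and label elements are meant for ROCR, whose `prediction()` and `performance()` functions would normally be used on them. The base R sketch below shows the same idea without extra packages: a score vector is rescaled to the 0 to 1 range (one possible reading of `normalise = T`) and the area under the ROC curve is computed from ranks. The toy vectors, the helper names `normalise01` and `auc_rank`, and the convention that a higher score marks a known instance are all assumptions, not package behaviour.

```r
# Toy data (made up for illustration): one score and one 0/1 label per candidate.
predictions <- c(0.9, 0.8, 0.65, 0.6, 0.2, 0.1)
labels      <- c(1,   1,   0,    1,   0,   0)

# Min-max rescaling to the 0..1 range.
normalise01 <- function(x) (x - min(x)) / (max(x) - min(x))

# AUC via the rank-sum (Mann-Whitney) identity,
# assuming a higher score means "more likely a known instance".
auc_rank <- function(score, label) {
  r <- rank(score)                 # average ranks are used for ties
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

scaled <- normalise01(predictions)
auc    <- auc_rank(scaled, labels)
```

Because the rescaling is monotone, the AUC is the same for the raw and the rescaled scores. Normalisation only matters when thresholds are compared across predictors.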

GenomicRanges containing motifs for a given key

list of objects of class (benchmarkMotifsResult)

Vitalii Kleshchevnikov

ELMdb2GRanges, findOverlapsBench

vitkl/SLIMFinderR documentation built on May 3, 2019, 8:08 p.m.
