pgxLoader: Load data from Progenetix database

pgxLoader    R Documentation

Description

This function loads biosample metadata, individual metadata, CNV variant data, or precomputed CNV frequencies from the Progenetix database.

Usage

pgxLoader(
  type = NULL,
  output = NULL,
  filters = NULL,
  codematches = FALSE,
  filterLogic = "AND",
  limit = 0,
  skip = NULL,
  biosample_id = NULL,
  individual_id = NULL,
  save_file = FALSE,
  filename = NULL,
  domain = "http://progenetix.org",
  dataset = "progenetix"
)

Arguments

type

A string specifying the type of data to return. Available options are "biosample", "individual", "variant", or "frequency". The first two options return the corresponding metadata, "variant" returns CNV variant data, and "frequency" returns precomputed CNV frequencies based on data in Progenetix.

output

A string specifying the output data format. When type is "variant", available options are NULL, "pgxseg", "seg", "coverage", or "pgxmatrix". When type is "frequency", available options are "pgxfreq" or "pgxmatrix".
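
The type/output pairings listed above can be checked before a query is sent. The following is a minimal sketch in R; the helper check_output() and the lookup table valid_outputs are hypothetical names introduced here for illustration and are not part of the package. It covers only the two types for which this page lists output options.

```r
# Hypothetical lookup table built from the options documented above.
# Not part of the package.
valid_outputs <- list(
  variant   = c("pgxseg", "seg", "coverage", "pgxmatrix"),
  frequency = c("pgxfreq", "pgxmatrix")
)

# Return TRUE if `output` is a documented option for `type`.
# NULL is accepted for type = "variant" only, as listed above.
check_output <- function(type, output = NULL) {
  if (!(type %in% names(valid_outputs))) {
    stop("no output options are listed for type '", type, "'")
  }
  if (is.null(output)) {
    return(type == "variant")
  }
  output %in% valid_outputs[[type]]
}

check_output("variant", "pgxseg")   # a documented pairing
check_output("frequency", "seg")    # not a documented pairing
```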

filters

Identifiers for cancer type, literature, cohorts, and age, such as c("NCIT:C7376", "pgx:icdom-98353", "PMID:22824167", "pgx:cohort-TCGAcancers", "age:>=P50Y").

codematches

A logical value determining whether to exclude samples annotated with child concepts of the specified filters. This applies to filters from the cancer type/tissue encoding systems (NCIt, icdom/t, Uberon). If TRUE, only samples encoded exactly by the specified filters are returned. Do not use this parameter when the filters include identifiers unrelated to cancer type, such as PMID and cohort identifiers. Default is FALSE.
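
The difference between the two settings can be pictured with a toy example. The sketch below is an illustration only: the codes, the one-level hierarchy, the sample table, and the helper select_samples() are all made up for this purpose and do not reflect real ontology content or the package's implementation.

```r
# Toy data, not real ontology content: one parent code with two child codes.
children <- list("TOY:parent" = c("TOY:child1", "TOY:child2"))

# Toy samples, each annotated with exactly one code.
samples <- data.frame(
  id   = c("s1", "s2", "s3", "s4"),
  code = c("TOY:parent", "TOY:child1", "TOY:child2", "TOY:other"),
  stringsAsFactors = FALSE
)

# Select sample ids for a filter code.
# With codematches = FALSE, direct child codes are accepted as well.
select_samples <- function(filter, codematches = FALSE) {
  accepted <- filter
  if (!codematches) {
    accepted <- c(accepted, children[[filter]])
  }
  samples$id[samples$code %in% accepted]
}

select_samples("TOY:parent")                      # "s1" "s2" "s3"
select_samples("TOY:parent", codematches = TRUE)  # "s1"
```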

filterLogic

A string specifying the logic for combining multiple filters when querying metadata (type = "biosample" or "individual"). Available options are "AND" and "OR". Default is "AND". Age filters are an exception: they are always combined with other filters using AND logic, even if filterLogic = "OR"; in that case OR applies only among the other filters. Note that when type = "frequency", the combining logic is always "OR" and is not changed by this parameter.
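
The combining rule described above, including the age exception, can be sketched with set operations on toy data. The sample ids and the helper combine_filters() below are hypothetical and only illustrate the documented behaviour; they are not how the package or the server implements it.

```r
# Toy data, not real query results: sample ids matched by each filter.
matches <- list(
  "NCIT:C7376"    = c("s1", "s2", "s3"),
  "PMID:22824167" = c("s2", "s3", "s4"),
  "age:>=P50Y"    = c("s2", "s4", "s5")
)

# Combine per-filter matches: non-age filters follow filterLogic,
# age filters are always applied with AND (intersection).
combine_filters <- function(matches, filterLogic = "AND") {
  is_age <- startsWith(names(matches), "age:")
  set_op <- if (filterLogic == "OR") union else intersect
  other  <- matches[!is_age]
  result <- if (length(other) > 0) Reduce(set_op, other) else NULL
  for (age_ids in matches[is_age]) {
    result <- if (is.null(result)) age_ids else intersect(result, age_ids)
  }
  result
}

combine_filters(matches, "AND")  # "s2"
combine_filters(matches, "OR")   # "s2" "s4"
```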

limit

An integer specifying the maximum number of biosample, individual, or variant profiles returned for each filter. Default is 0 (return all).

skip

An integer specifying how many blocks of profiles, each holding limit profiles, to skip for each filter. For example, with skip = 2 and limit = 500, the first 2 * 500 = 1000 profiles are skipped and the next 500 profiles are returned. Default is NULL (no skip).
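
The arithmetic in the example above can be written out as a short sketch. The helper page_indices() and its total argument are hypothetical; the sketch assumes that skip counts whole blocks of limit profiles, as the example suggests, and says nothing about how the server actually orders or pages results.

```r
# Positions of the profiles returned for a given limit and skip,
# out of `total` matching profiles (hypothetical helper).
page_indices <- function(total, limit = 0, skip = NULL) {
  if (limit == 0) return(seq_len(total))   # limit = 0 means return all
  if (is.null(skip)) skip <- 0             # NULL means no skip
  first <- skip * limit + 1                # skip whole blocks of `limit`
  if (first > total) return(integer(0))
  seq(first, min(first + limit - 1, total))
}

idx <- page_indices(total = 2000, limit = 500, skip = 2)
range(idx)   # 1001 1500
length(idx)  # 500
```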

biosample_id

Identifiers used in the Progenetix database to identify biosamples.

individual_id

Identifiers used in the Progenetix database to identify individuals.

save_file

A logical value determining whether to save the segment variant data to a file instead of returning it directly. Only used when type is "variant" and output is "pgxseg" or "seg". Default is FALSE.

filename

A string specifying the path and name of the file to be saved. Only used if save_file is TRUE. Default is "variants.seg/pgxseg" in the current working directory.

domain

A string specifying the domain of the database. Default is "http://progenetix.org".

dataset

A string specifying the dataset to query. Default is "progenetix". The other available option is "cancercelllines".

Value

Data from the Progenetix database. The content and format depend on type and output.

Examples

library(pgxRpi)

## Query metadata
biosamples <- pgxLoader(type = "biosample", filters = "NCIT:C3512")
## Query segment variants
seg <- pgxLoader(type = "variant", output = "pgxseg", biosample_id = "pgxbs-kftvgx4y")
## Query CNV frequency
freq <- pgxLoader(type = "frequency", output = "pgxfreq", filters = "NCIT:C3512")

progenetix/pgxRpi documentation built on May 7, 2024, 2:57 p.m.