getHighdimData: Obtain a high dimensional dataset for a certain study

Description Usage Arguments Details Value Note Author(s) See Also Examples

Description

This function will retrieve a highdimensional dataset for a specific concept of a certain study. It retrieves the data from the server and parses it, converting the high dimensional data to a data.frame.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
getHighdimData(study.name, concept.match = NULL, concept.link = NULL, projection = NULL,
                data.constraints = list(), assay.constraints = list(),
                progress.parse = .make.progresscallback.parse(),
                trial.name = NULL,
                patient.set = NULL,
                ontology.term = NULL,
                assay.ids = NULL,
                patient.ids = NULL,
                search.keywords = NULL,
                chromosome = NULL,
                genes = NULL,
                gene.ids = NULL,
                proteins = NULL,
                protein.ids = NULL,
                pathways = NULL,
                pathway.ids = NULL,
                gene.signatures = NULL,
                gene.signature.ids = NULL,
                snps = NULL,
                assay.or = NULL,
                data.or = NULL)

Arguments

study.name

a character string giving the name of a study.

concept.match

a character string containing the concept name that should be matched. The getHighdimData function will search within the requested study for the first concept which name matches the given string value. It uses the name column of the result from getConcepts to perform the matching.

concept.link

a character string containing the link pointing to the location of the data on the server. Candidate values for this argument can be found in api.link.self.href column of the getConcepts results. It is the most exact way to refer to a concept, and it overwrites the concept.match argument.

projection

a character string specifying what part of the dataset should be returned: should all data be returned or only a certain type of data, for example only the log values for an mRNA data set. Candidate values can be obtained, see details section on how. (note: the types of data present may differ between studies).

data.constraints

A list of raw data constraints. These are converted to JSON with toJSON and passed to the server as-is. For a more user friendly way to specify constraints, see the assay constraint parameters.

assay.constraints

Just like data.constraints but for assay constraints. See the data constraint parameters for a more user friendly syntax to pass such constraints.

progress.parse

(The default should be fine for most users) This argument functions will be called before, during and after the parsing of the downloaded data.

The remaining parameters are constraints that limit the amount of data that is returned.

Assay constraints:

trial.name

A single character string with the trial name.

patient.set

A number indicating the patient set.

ontology.term

A single character string containing the concept path.

assay.ids

A numeric vector containing the id's of the assays you want to retrieve.

patient.ids

A character vector with the patient ids that you want to retrieve.

assay.or

Should be a named list containing multiple of the above assay constraints. The returned assays are those that match at least one of the specified constraints.

Data constraints:

chromosome

Either the name of a chromosome, or a named list containing names chromosome, start and end. chromosome should be a single character vector with the chromosome name, start and end are single numbers indicating the region of the chromosome you want to select.

genes

A character vector of gene symbols.

gene.ids

A character vector of NCBI gene accessions.

proteins

A character vector of protein names.

protein.ids

A character vector of UniProt ids.

pathways

A character vector of pathway names.

pathway.ids

A character vector of pathway names in the form <database>:<db specific id>.

gene.signatures

A character vector of gene signature names.

gene.signature.ids

A character vector of gene signature ids.

snps

A character vector of SNPs to select.

data.or

Similar to assay.or. A named list containing multiple of the above data constraints. The returned data will match at least one of those constraints.

Details

Retrieving and parsing the data may take some time, depending on the size of the data requested and on your connection characteristics.

A dataframe containing the high dimensional data will only be returned if a projection is specified.

To discover what projections are possible, see highdimInfo. If you call getHighDimData while only providing a study name and concept, i.e. getHighdimData(study.name, concept.match) or getHighDimData(study.name, concept.link) it will also tell you the valid projections.

Value

If a projection is specified, this function returns a list containing the highdimensional data, with the following components:

data

a dataframe containing the high dimensional data.

labelToBioMarkerMap

a hash describing which (column) labels refer to which bioMarker. This component will only be present if biomarker data is available for this particular dataset. See hash for more details about how to use hashes.

If no projection is specified this function returns a list of the projections available for the requested study. No highdimensional data is returned. It will also print the instruction to set the projection argument and a list of the projections that are available.

Note

To be able to access a transmart database, you need to be connected to the server the database is on. If you haven't connected to the server yet, establish a connection using the connectToTransmart function.

Author(s)

Tim Dorscheidt, Jan Kanis, Rianne Jansen. Contact: development@thehyve.nl

See Also

hash, highdimInfo, getStudies, getConcepts.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
## Not run: 

  # if a concept contains high dimensional data, use the following command to obtain this data:
  getHighdimData(study.name = "GSE8581", concept.match = "Lung")

  # you will be told that one of the listed projections needs to be selected:
  "No valid projection selected.
  Set the projection argument to one of the following options:
    default_real_projection
    zscore
    log_intensity
    all_data"

  # the following will return the actual data:
  data <- getHighdimData(study.name = "GSE8581", concept.match = "Lung", projection = "zscore")
  
## End(Not run)

transmart/RInterface documentation built on May 31, 2019, 7:45 p.m.