import_single_Vispa2Matrix: Import a single integration matrix from file
In calabrialab/ISAnalytics: Analyze gene therapy vector insertion sites data identified from genomics next generation sequencing reads for clonal tracking studies

Description

This function reads and imports an integration matrix (ideally produced by VISPA2) and converts it to a tidy format.

Usage

import_single_Vispa2Matrix(
  path,
  separator = "\t",
  additional_cols = NULL,
  transformations = NULL,
  sample_names_to = pcr_id_column(),
  values_to = "Value",
  to_exclude = lifecycle::deprecated(),
  keep_excluded = lifecycle::deprecated()
)

Arguments

`path`	The path to the file on disk
`separator`	The column delimiter used, defaults to `\t`
`additional_cols`	Either `NULL`, a named character vector or a named list. See details.
`transformations`	Either `NULL` or a named list of purrr-style lambdas where names are column names the function should be applied to.
`sample_names_to`	Name of the output column holding the sample identifier. Defaults to `pcr_id_column()`
`values_to`	Name of the output column holding the quantification values. Defaults to `Value`.
`to_exclude`	Deprecated. Use `additional_cols` instead
`keep_excluded`	Deprecated. Use `additional_cols` instead

Details

Additional columns

Additional columns are annotation columns present in the integration matrix being imported that are not any of the following:

part of the mandatory IS vars (see mandatory_IS_vars())
part of the annotation IS vars (see annotation_IS_vars())
the sample identifier column
the quantification column

When specified, additional_cols tells the function how to treat those columns during import. Provide a named character vector whose names are the additional column names and whose values are each one of the following:

"char" for character (strings)
"int" for integers
"logi" for logical values (TRUE / FALSE)
"numeric" for numeric values
"factor" for factors
"date" for a generic date format; note that functions that need to read and parse files will try to guess the format, and parsing may fail
One of the date/datetime formats accepted by lubridate; use ISAnalytics::date_formats() to view the accepted formats
"_" to drop the column

For more details see the "How to use import functions" vignette: vignette("workflow_start", package = "ISAnalytics")
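
As a minimal sketch of the type tags listed above, the snippet below builds a named character vector suitable for additional_cols. The column names (ProjectNote, ReadCount, SeqDate, Scratch) are hypothetical and only serve as an illustration; the validation step is not part of the package and covers only the simple tags, not the specific lubridate format names.

```r
# Map each additional column to the type it should be parsed as.
# All column names here are hypothetical placeholders.
additional_cols <- c(
  ProjectNote = "char",
  ReadCount = "int",
  SeqDate = "date",
  Scratch = "_" # "_" drops the column on import
)

# Illustrative check against the simple type tags from the list above
# (specific lubridate format names are also accepted but not checked here).
simple_tags <- c("char", "int", "logi", "numeric", "factor", "date", "_")
stopifnot(
  !is.null(names(additional_cols)),
  all(additional_cols %in% simple_tags)
)

# With ISAnalytics installed and matrix_path defined as in the Examples,
# the vector would be passed to the importer like this:
# imported <- import_single_Vispa2Matrix(
#   matrix_path,
#   additional_cols = additional_cols
# )
```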

Transformations

Lambdas passed in the transformations argument must be transformations, that is, functions that take a vector as input and return a vector of the same length.

If the transformation list contains column names that are not present in the data frame, they are simply ignored.
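
The following sketch shows what a transformations list might look like, assuming the matrix contains columns named chr and GeneName (these names are assumptions made for illustration). The length check uses rlang::as_function() to turn a formula lambda into a regular function; it is only a sanity check for this example and not a description of how the package applies the lambdas internally.

```r
# Named list of purrr-style lambdas; names are the columns to transform.
# "chr" and "GeneName" are assumed column names.
transformations <- list(
  chr = ~ gsub("^chr", "", .x), # strip a leading "chr" prefix
  GeneName = ~ toupper(.x)      # upper-case gene symbols
)

# A proper transformation returns a vector as long as its input.
if (requireNamespace("rlang", quietly = TRUE)) {
  strip_chr <- rlang::as_function(transformations$chr)
  input <- c("chr1", "chr2", "chrX")
  output <- strip_chr(input)
  stopifnot(length(output) == length(input))
  print(output)
}

# With ISAnalytics installed and matrix_path defined as in the Examples:
# imported <- import_single_Vispa2Matrix(
#   matrix_path,
#   transformations = transformations
# )
```

A lambda such as ~ unique(.x) would not qualify, since its output can be shorter than its input.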

Value

A data frame object in tidy format
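
To make "tidy format" concrete, here is a base R sketch that reshapes a toy wide matrix (one column per sample) into a long table with one row per site and sample. It only illustrates the shape of the result and is not the package's implementation. The toy data, the sample names, the SampleID column name (a stand-in for whatever pcr_id_column() returns) and the choice to drop missing values are all assumptions.

```r
# Toy wide matrix: one row per integration site, one column per sample.
wide <- data.frame(
  chr = c("1", "2"),
  integration_locus = c(12345L, 67890L),
  strand = c("+", "-"),
  SAMPLE_A = c(10, NA),
  SAMPLE_B = c(NA, 5),
  stringsAsFactors = FALSE
)

id_cols <- c("chr", "integration_locus", "strand")
sample_cols <- setdiff(names(wide), id_cols)

sample_names_to <- "SampleID" # hypothetical stand-in for pcr_id_column()
values_to <- "Value"          # documented default of values_to

# Stack the sample columns into (sample, value) pairs, one block per sample.
long <- do.call(rbind, lapply(sample_cols, function(s) {
  block <- wide[id_cols]
  block[[sample_names_to]] <- s
  block[[values_to]] <- wide[[s]]
  block
}))

# Assumption of this sketch: keep only observed quantifications.
long <- long[!is.na(long[[values_to]]), ]
rownames(long) <- NULL
print(long)
```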

Required tags

The function will explicitly check for the presence of these tags:

All columns declared in mandatory_IS_vars()
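
A check of this kind can be sketched in base R as below. The required vector is an assumed stand-in for the value of mandatory_IS_vars(), and the header is made up for illustration; inspect the real value in your own session before relying on these names.

```r
# Assumed stand-in for the value returned by mandatory_IS_vars().
required <- c("chr", "integration_locus", "strand")

# Column names as they might appear in a matrix header (illustrative).
header <- c("chr", "integration_locus", "GeneName", "SAMPLE_A")

# Report any mandatory column that the header lacks.
missing_cols <- setdiff(required, header)
if (length(missing_cols) > 0) {
  message("Missing mandatory columns: ", paste(missing_cols, collapse = ", "))
}
```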

Examples

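# Generate the default example folder structure
fs_path <- generate_default_folder_structure(type = "correct")
# Path to a sequence count matrix inside that structure
matrix_path <- fs::path(
    fs_path$root, "PJ01", "quantification",
    "POOL01-1", "PJ01_POOL01-1_seqCount_matrix.no0.annotated.tsv.gz"
)
# Import the matrix in tidy format and preview the first rows
matrix <- import_single_Vispa2Matrix(matrix_path)
head(matrix)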

calabrialab/ISAnalytics documentation built on Dec. 10, 2024, 10:50 p.m.