get_unsupervised_graph_from_files: Builds an unsupervised graph starting from a list of input...

View source: R/unsupervised.R

Description

This function is similar to get_unsupervised_graph, except that the input is a list of file names containing the vertices of the graph. Each file should be a tab-separated table of nodes, similar to the tab input of get_unsupervised_graph. The function will rbind all the files in files.list and build a single graph from the union of all the nodes. Optionally, file-level metadata can also be incorporated into the vertex properties of the resulting graph, using the metadata.tab parameter.

Usage

get_unsupervised_graph_from_files(
  files.list,
  col.names,
  filtering.threshold,
  metadata.tab = NULL,
  metadata.filename.col = NULL,
  use.basename = TRUE,
  process.clusters.data = TRUE,
  clusters.data.out.dir = "./",
  downsample.to = 1000,
  method = c("forceatlas2", "umap"),
  ...
)

Arguments

files.list

The list of files to process. The function will first determine the set of columns that are common to all the files. Only these common columns will appear as vertex properties in the output graph.
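The common-column step can be illustrated with base R. This is a minimal sketch under stated assumptions, not the package's implementation: the helper read_common_columns, the temporary files, and their column names are all hypothetical.

```r
# Hypothetical helper (not part of scgraphs): read tab-separated files
# and keep only the columns shared by all of them.
read_common_columns <- function(files.list) {
  tabs <- lapply(files.list, read.table, header = TRUE, sep = "\t",
                 check.names = FALSE, stringsAsFactors = FALSE)
  # Columns present in every file
  common <- Reduce(intersect, lapply(tabs, names))
  # Stack the tables, restricted to the common columns
  do.call(rbind, lapply(tabs, function(x) x[, common, drop = FALSE]))
}

# Two made-up cluster tables; only CD3 and CD4 appear in both.
f1 <- tempfile(fileext = ".txt")
f2 <- tempfile(fileext = ".txt")
write.table(data.frame(CD3 = c(1.2, 0.4), CD4 = c(0.9, 0.1), CD8 = c(0.2, 1.5)),
            f1, sep = "\t", row.names = FALSE, quote = FALSE)
write.table(data.frame(CD3 = 0.7, CD4 = 0.3, CD19 = 2.1),
            f2, sep = "\t", row.names = FALSE, quote = FALSE)

combined <- read_common_columns(c(f1, f2))
```

In this sketch, CD8 and CD19 are dropped because each occurs in only one file, so they could not become vertex properties of a combined graph.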

col.names

A character vector indicating which columns of the input tables should be used to calculate distances.

filtering.threshold

The threshold used to filter edges in the graph.

metadata.tab

Optional. If specified, a table of file-level metadata, to be added as vertex properties in the graph. Each row should specify metadata for a single file, with the columns of metadata.tab representing metadata values. All the vertices derived from that file will have the corresponding metadata values. Please note that the names name, Label, type and sample are used internally by this package, and therefore cannot be used as metadata vertex properties.
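Because of the reserved names, it can help to check a metadata table before passing it in. The following is a minimal sketch in base R; the example table, its column names, and the choice of filename as the file-name column are assumptions made for illustration.

```r
# Names the documentation says are used internally by the package
reserved <- c("name", "Label", "type", "sample")

# Hypothetical metadata table: one row per file
metadata.tab <- data.frame(filename = "a.txt", type = "PBMC", donor = "D1",
                           stringsAsFactors = FALSE)
metadata.filename.col <- "filename"

# Metadata columns (excluding the file-name column) that clash with reserved names
clashes <- intersect(setdiff(names(metadata.tab), metadata.filename.col), reserved)
if (length(clashes) > 0) {
  message("Rename these metadata columns before use: ",
          paste(clashes, collapse = ", "))
}
```

Here the column type would need to be renamed (for example to sample.type) before the table is used.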

metadata.filename.col

The name of the column in metadata.tab that contains the file names to be matched to the files in files.list.

use.basename

The resulting graph will contain an additional vertex property called sample, identifying which file each vertex was derived from. If use.basename is TRUE, the basename will be used; otherwise the full path as specified in files.list will be used. Moreover, if use.basename is TRUE, the matching with the file names contained in metadata.tab will be based on the basename only.
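The effect of use.basename on file matching can be sketched in base R as follows. This is an illustration of the behaviour described above, not the package's code; the paths, the metadata table, and the intermediate variable names are hypothetical.

```r
# Hypothetical inputs: full paths in files.list, bare file names in metadata.tab
files.list <- c("/data/run1/sampleA.clustered.txt",
                "/data/run2/sampleB.clustered.txt")
metadata.tab <- data.frame(
  filename = c("sampleB.clustered.txt", "sampleA.clustered.txt"),
  condition = c("treated", "control"),
  stringsAsFactors = FALSE
)
metadata.filename.col <- "filename"
use.basename <- TRUE

# Value that would be stored in the `sample` vertex property for each file
sample.names <- if (use.basename) basename(files.list) else files.list

# Key used to look up each file in metadata.tab
key <- metadata.tab[[metadata.filename.col]]
if (use.basename) key <- basename(key)

# Metadata row for each file, in the order of files.list
idx <- match(sample.names, key)
meta.cols <- setdiff(names(metadata.tab), metadata.filename.col)
file.metadata <- metadata.tab[idx, meta.cols, drop = FALSE]
```

With use.basename set to FALSE in this sketch, the full paths would not match the bare file names in metadata.tab and idx would contain NA values, which is why the two inputs need to use the same naming convention.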

process.clusters.data

If this is TRUE, this function will look for a file with extension .all_events.rds for each file in files.list (see the documentation of grappolo::cluster_fcs_files). This file contains single-cell data (i.e. each row represents a cell, as opposed to the files in files.list, where each row represents a cluster). Each file will be processed using the write_clusters_data function. This processing is used for downstream data visualization, but it is not strictly necessary to create the graph. If use.basename is TRUE, the basename of the files in files.list will be used for processing.

clusters.data.out.dir

Only used if process.clusters.data == TRUE. The output directory where the clusters data will be written.

downsample.to

The target number of events for downsampling. Only used if process.clusters.data == TRUE. This is only used for downstream data visualization and does not affect the construction of the graph.

method

The method to use: either "forceatlas2", to build a force-directed layout graph using ForceAtlas2, or "umap", to use UMAP instead.
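A default written as a character vector of choices is usually resolved with match.arg in R. Assuming this function follows that convention (the source is not shown here, so this is an assumption), omitting method would select the first choice. The wrapper choose_method below is hypothetical and only demonstrates the idiom.

```r
# Hypothetical wrapper showing the standard match.arg() idiom
choose_method <- function(method = c("forceatlas2", "umap")) {
  match.arg(method)
}

default.method <- choose_method()        # no argument: the first choice is used
umap.method <- choose_method("umap")     # explicit choice
```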

...

Additional arguments passed to build_graph or build_umap_graph, depending on the choice of method.

Value

See the return value of get_unsupervised_graph.


ParkerICI/scgraphs documentation built on April 30, 2021, 1:10 p.m.