data_importer: Import scRNAseq data

View source: R/data_importer.R

data_importer    R Documentation

Import scRNAseq data

Description

This function imports multiple scRNAseq runs and stores them in a list after filtering as suggested in the Seurat workflow. It takes a data frame describing the samples (see below), finds the cell-barcode matrices generated by the CellRanger pipeline in a folder, and returns a list of individual Seurat objects. This is useful when integrating multiple datasets for comparison.

Usage

data_importer(
  sample_df,
  data_folder = "",
  min.cells = 3,
  min.features = 200,
  verbose = T,
  species = "mouse"
)

Arguments

sample_df

A data frame containing ID numbers for individual sequencing runs, sample names, and arbitrary sample-level metadata. This data frame must contain at least two columns named 'sample_name' and 'run_id'. Other columns must be uniquely named and can contain metadata such as genotype, treatment, sex, etc. The character string under 'sample_name' will be appended to cell barcodes in the created Seurat objects to enable downstream integration of multiple datasets.

data_folder

A string giving the relative path of the folder containing the data files. In this folder, the filtered cell-barcode matrices for individual runs must be stored in separate subfolders, and the subfolder names must correspond to the 'run_id' column of 'sample_df'.

min.cells

A numeric value for filtering out genes that are detected in fewer than this number of cells (see Seurat package). It defaults to 3.

min.features

A numeric value for filtering out cells that have fewer than this number of features (see Seurat package). It defaults to 200.

verbose

Logical (TRUE/FALSE). Turns messages about the data set currently being processed on or off.

species

A character string, either "human" or "mouse". This affects the calculation of the mitochondrial gene percentage: mouse mitochondrial gene symbols are lower case (e.g. mt-...), whereas human ones are upper case (e.g. MT-...).
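The effect of 'species' on mitochondrial gene matching can be illustrated with a small base R sketch. The helper name mito_pattern and the toy gene symbols are assumptions made for illustration and are not part of SCseqtools. The internals of data_importer are not shown on this page, so this only demonstrates the case difference described above.

```r
# Hypothetical helper (not part of SCseqtools): pick the regular expression
# that matches mitochondrial gene symbols for the given species.
mito_pattern <- function(species = c("mouse", "human")) {
  species <- match.arg(species)  # errors on anything other than the two options
  if (species == "mouse") "^mt-" else "^MT-"
}

# Toy gene symbols, for illustration only.
genes <- c("mt-Co1", "mt-Nd1", "Actb", "MT-CO1", "GAPDH")

# Logical vectors marking which symbols match each species' prefix.
is_mito_mouse <- grepl(mito_pattern("mouse"), genes)
is_mito_human <- grepl(mito_pattern("human"), genes)

genes[is_mito_mouse]
genes[is_mito_human]
```

In a Seurat workflow, a pattern of this kind is typically passed to PercentageFeatureSet() through its 'pattern' argument. Choosing the wrong species would then match no mitochondrial genes and report a percentage of zero.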

Value

A list object containing Seurat objects. The list items will be named with data from 'run_id'.

Examples


## Not run: 

sample_info <- data.frame(run_id = c("1111X","1112X","1113X"),
                          sample_name = c("ctrl", "drug1", "drug2"),
                          genotype = c("WT","WT","KO"),
                          sex = c("F","F","M"))

obj_list <- data_importer(sample_df = sample_info,
                          data_folder = "../data/",
                          min.cells = 3, min.features = 200,
                          verbose = TRUE,
                          species = "mouse")


## End(Not run)
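Before running a call like the one above, it may help to confirm that 'sample_df' has the required columns and that each 'run_id' has a matching subfolder. The following is a minimal base R sketch under those assumptions. The helper check_inputs and the temporary demo folder are hypothetical and are not part of SCseqtools.

```r
# Hypothetical pre-flight check (not part of SCseqtools).
check_inputs <- function(sample_df, data_folder) {
  # The documentation requires at least these two columns.
  required <- c("sample_name", "run_id")
  missing_cols <- setdiff(required, names(sample_df))
  if (length(missing_cols) > 0) {
    stop("sample_df is missing column(s): ",
         paste(missing_cols, collapse = ", "))
  }
  # Column names must be unique.
  if (anyDuplicated(names(sample_df)) > 0) {
    stop("sample_df column names must be unique")
  }
  # Each run is expected in its own subfolder named after its run_id.
  run_dirs <- file.path(data_folder, sample_df$run_id)
  found <- dir.exists(run_dirs)
  if (!all(found)) {
    stop("No folder found for run_id(s): ",
         paste(sample_df$run_id[!found], collapse = ", "))
  }
  invisible(TRUE)
}

# Demonstration with throwaway folders in a temporary directory.
demo_folder <- file.path(tempdir(), "scrnaseq_demo")
sample_info <- data.frame(run_id = c("1111X", "1112X"),
                          sample_name = c("ctrl", "drug1"),
                          stringsAsFactors = FALSE)
for (id in sample_info$run_id) {
  dir.create(file.path(demo_folder, id),
             recursive = TRUE, showWarnings = FALSE)
}

check_inputs(sample_info, demo_folder)
```

Failing early with a message that names the missing column or folder is usually easier to debug than an error raised partway through reading the matrices.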



atakanekiz/SCseqtools documentation built on April 18, 2023, 12:55 a.m.