VDJ_integrate_bulk: A function to integrate bulk and single cell data

Description

Integrate bulk and single-cell data by reannotating the germline genes and integrating the bulk sequences into the existing single-cell clonotypes.

Usage

VDJ_integrate_bulk(
  sc.VDJ,
  bulk.tsv,
  bulk.tsv.sequence.column,
  bulk.tsv.sample.column,
  bulk.tsv.barcode.column,
  bulk.tsv.isotype.column,
  organism,
  scRNA_seqs_annotations,
  bulkRNA_seqs_annotations,
  igblast.dir,
  trim.FR1,
  tie.resolvement,
  seq.identity
)

Arguments

sc.VDJ

VDJ dataframe of the single-cell data, created with the Platypus VDJ_build function.

bulk.tsv

A tab-separated file of the bulk sequences with at least four columns: the sequence, a sample ID, a barcode, and the isotype.

bulk.tsv.sequence.column

column name of the bulk tsv that contains the nucleotide sequence

bulk.tsv.sample.column

column name of the bulk tsv that contains the sample_id that matches the sample_id in sc.VDJ

bulk.tsv.barcode.column

column name of the bulk tsv that contains the barcode/identifier of the recovered sequence

bulk.tsv.isotype.column

column name of the bulk tsv that contains the isotype of the recovered sequence
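
As a minimal sketch of the expected input, the following writes a small bulk table to a tab-separated file and checks that the four required columns are present. The records, the file location, and the column names ("sequence", "sample_id", "barcode", "isotype") are illustrative assumptions; any column names work as long as they are passed through the corresponding bulk.tsv.*.column arguments.

```r
# Hypothetical bulk records, for illustration only (not real data).
bulk <- data.frame(
  sequence  = c("CAGGTGCAGCTGGTGCAGTCTGGG", "GAGGTGCAGCTGGTGGAGTCTGGG"),
  sample_id = c("S1", "S1"),                # should match sample_id in sc.VDJ
  barcode   = c("bulk_0001", "bulk_0002"),  # any unique identifier per sequence
  isotype   = c("IgG1", "IgM"),
  stringsAsFactors = FALSE
)

# Write the table as a tab-separated file with a header row.
bulk_path <- file.path(tempdir(), "bulk_rna.tsv")
write.table(bulk, bulk_path, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back and verify that the required columns exist.
required <- c("sequence", "sample_id", "barcode", "isotype")
check <- read.delim(bulk_path, stringsAsFactors = FALSE)
missing_cols <- setdiff(required, colnames(check))
if (length(missing_cols) > 0) {
  stop("Missing columns in bulk tsv: ", paste(missing_cols, collapse = ", "))
}
```

The resulting path would then be supplied as bulk.tsv, with the four column names given to the matching arguments.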

organism

"human" or "mouse"

scRNA_seqs_annotations

A tab-separated file of the single-cell sequences reannotated with Change-O AssignGenes.py. If NULL, this function runs Change-O AssignGenes.py itself, which requires Change-O to be installed and igblast.dir to be set. Default is NULL.

bulkRNA_seqs_annotations

A tab-separated file of the bulk sequences reannotated with Change-O AssignGenes.py. If NULL, this function runs Change-O AssignGenes.py itself, which requires Change-O to be installed and igblast.dir to be set. Default is NULL.

igblast.dir

directory where the igblast executables are located. For example, after following the instructions for setting up the IgPhyML environment in the AntibodyForests vignette, this is $(conda info --base)/envs/igphyml/share/igblast
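
The path in the example above can be assembled in R once the conda base directory is known. This is a small sketch, assuming the environment is named "igphyml" as in the vignette; the helper igblast_dir_for_env is hypothetical and not part of AntibodyForests.

```r
# Hypothetical helper (not part of AntibodyForests): build the igblast
# directory inside a conda environment from the conda base directory.
igblast_dir_for_env <- function(conda_base, env = "igphyml") {
  file.path(conda_base, "envs", env, "share", "igblast")
}

# Example with an assumed conda base directory.
igblast_dir <- igblast_dir_for_env("/opt/conda")

# On a machine with conda on the PATH, the base could be queried instead:
# conda_base <- system2("conda", c("info", "--base"), stdout = TRUE)
# igblast_dir <- igblast_dir_for_env(conda_base)
```

Checking dir.exists(igblast_dir) before calling VDJ_integrate_bulk is a cheap way to catch a wrong path early.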

trim.FR1

boolean - whether to trim the FR1 region from the sequences and germline. This is recommended to account for variation in primer design during sequencing (Default is TRUE)

tie.resolvement

How to resolve a bulk sequence that matches multiple clonotypes:
"all" - assign the bulk sequence to all matching clonotypes (Default)
"none" - do not assign the bulk sequence to any clonotype
"random" - randomly assign the bulk sequence to one of the matching clonotypes
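
The three tie.resolvement options can be illustrated with a short sketch. The helper resolve_ties below is hypothetical and is not the package's internal code; it also assumes that a sequence with a single matching clonotype is not a tie and is always assigned to that clonotype.

```r
# Hypothetical illustration of the tie.resolvement options
# (not the AntibodyForests implementation).
resolve_ties <- function(matching_clonotypes,
                         tie.resolvement = c("all", "none", "random")) {
  tie.resolvement <- match.arg(tie.resolvement)
  # Assumption: zero or one match is not a tie, so return it unchanged.
  if (length(matching_clonotypes) <= 1) {
    return(matching_clonotypes)
  }
  switch(tie.resolvement,
    all    = matching_clonotypes,            # keep every matching clonotype
    none   = character(0),                   # drop the ambiguous sequence
    random = sample(matching_clonotypes, 1)  # pick one clonotype at random
  )
}

matches <- c("clonotype1", "clonotype7")
resolve_ties(matches, "all")
resolve_ties(matches, "none")
resolve_ties(matches, "random")
```

With "random", results differ between runs unless a seed is set with set.seed() beforehand.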

seq.identity

sequence identity threshold for clonotype assignment (Default: 0.85)
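
To give a feel for what a 0.85 threshold means, the sketch below computes identity as the fraction of matching positions between two equal-length sequences. This is an assumption made for illustration; the page does not state which region is compared or how identity is computed inside VDJ_integrate_bulk, and the helper seq_identity is hypothetical.

```r
# Hypothetical identity measure: fraction of identical positions between two
# equal-length sequences. The package may compute identity differently.
seq_identity <- function(a, b) {
  stopifnot(nchar(a) == nchar(b))
  a_chars <- strsplit(a, "")[[1]]
  b_chars <- strsplit(b, "")[[1]]
  mean(a_chars == b_chars)
}

# Two 10-nucleotide toy sequences that differ at the last two positions.
identity <- seq_identity("ACGTACGTAC", "ACGTACGTTT")

# Compare against the default threshold.
passes_threshold <- identity >= 0.85
```

Here 8 of 10 positions match, so the identity is 0.8 and the pair would fall below the default threshold under this assumed measure.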

Value

The VDJ dataframe of both the bulk and single-cell data

Examples

## Not run: 
VDJ <- VDJ_integrate_bulk(sc.VDJ = AntibodyForests::small_vdj,
  bulk.tsv = "bulk_rna.tsv",
  bulk.tsv.sequence.column = "sequence",
  bulk.tsv.sample.column = "sample_id",
  bulk.tsv.barcode.column = "barcode",
  bulk.tsv.isotype.column = "isotype",
  organism = "human",
  igblast.dir = "anaconda3/envs/igphyml/share/igblast",
  tie.resolvement = "random",
  seq.identity = 0.85)

## End(Not run)

AntibodyForests documentation built on April 4, 2025, 4:45 a.m.