VDJ_integrate_bulk: A function to integrate bulk and single cell data
In AntibodyForests: Delineating Inter- And Intra-Antibody Repertoire Evolution

Description

Integrate bulk and single-cell data by reannotating the germline genes and integrating the bulk sequences into the existing single-cell clonotypes.

Usage

VDJ_integrate_bulk(
  sc.VDJ,
  bulk.tsv,
  bulk.tsv.sequence.column,
  bulk.tsv.sample.column,
  bulk.tsv.barcode.column,
  bulk.tsv.isotype.column,
  organism,
  scRNA_seqs_annotations,
  bulkRNA_seqs_annotations,
  igblast.dir,
  trim.FR1,
  tie.resolvement,
  seq.identity
)

Arguments

`sc.VDJ`	VDJ dataframe of the single-cell data, created with the Platypus VDJ_build function.
`bulk.tsv`	A tab-separated file of the bulk sequences with at least the columns containing the sequence, a sample ID, a barcode, and the isotype.
`bulk.tsv.sequence.column`	column name of the bulk tsv that contains the nucleotide sequence
`bulk.tsv.sample.column`	column name of the bulk tsv that contains the sample_id that matches the sample_id in sc.VDJ
`bulk.tsv.barcode.column`	column name of the bulk tsv that contains the barcode/identifier of the recovered sequence
`bulk.tsv.isotype.column`	column name of the bulk tsv that contains the isotype of the recovered sequence
`organism`	"human" or "mouse"
`scRNA_seqs_annotations`	A tab-separated file of the single-cell sequences reannotated with Change-O AssignGenes.py. If NULL, this function runs Change-O AssignGenes.py itself, which requires Change-O to be installed and igblast.dir to be set. Default is NULL.
`bulkRNA_seqs_annotations`	A tab-separated file of the bulk sequences reannotated with Change-O AssignGenes.py. If NULL, this function runs Change-O AssignGenes.py itself, which requires Change-O to be installed and igblast.dir to be set. Default is NULL.
`igblast.dir`	directory where the igblast executables are located. For example, after setting up the IgPhyML environment as described in the AntibodyForests vignette: $(conda info --base)/envs/igphyml/share/igblast
`trim.FR1`	boolean - whether to trim the FR1 region from the sequences and germline. Trimming is recommended to account for variation in primer design during sequencing (Default is TRUE)
`tie.resolvement`	How to resolve a bulk sequence for which multiple clonotypes match:
	"all" - assign the bulk sequence to all matching clonotypes (Default)
	"none" - do not assign the bulk sequence to any clonotype
	"random" - randomly assign the bulk sequence to one of the matching clonotypes
`seq.identity`	sequence identity threshold for clonotype assignment (Default: 0.85)
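
The interplay between `seq.identity` and `tie.resolvement` can be illustrated with a minimal sketch. This is not the package's implementation: `resolve_ties` is a hypothetical helper, the identity scores are made-up inputs, and treating the threshold as inclusive (`>=`) is an assumption, since this page does not say how identity is computed or compared.

```r
# Hypothetical helper, NOT part of AntibodyForests. It only illustrates the
# three documented tie.resolvement options for a single bulk sequence.
# identities: named numeric vector with one (assumed precomputed) identity
#             score per candidate clonotype.
resolve_ties <- function(identities, seq.identity = 0.85,
                         tie.resolvement = "all") {
  # Assumption: a clonotype matches when its score reaches the threshold
  matches <- names(identities)[identities >= seq.identity]

  # No tie to resolve when zero or one clonotype matches
  if (length(matches) <= 1) return(matches)

  switch(tie.resolvement,
    all    = matches,             # keep every matching clonotype
    none   = character(0),        # drop the bulk sequence
    random = sample(matches, 1),  # pick one matching clonotype at random
    stop("unknown tie.resolvement: ", tie.resolvement)
  )
}

# Made-up identity scores for three hypothetical clonotypes
ids <- c(clonotype1 = 0.90, clonotype2 = 0.88, clonotype3 = 0.50)
resolve_ties(ids, 0.85, "all")     # both clonotypes above the threshold
resolve_ties(ids, 0.85, "none")    # empty result: tie left unassigned
resolve_ties(ids, 0.85, "random")  # one of clonotype1 / clonotype2
```

Raising `seq.identity` makes ties less frequent, because fewer clonotypes pass the threshold. With "all", one bulk sequence may end up in several clonotypes.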

Value

The VDJ dataframe of both the bulk and single-cell data

Examples

## Not run: 
VDJ <- VDJ_integrate_bulk(sc.VDJ = AntibodyForests::small_vdj,
  bulk.tsv = "bulk_rna.tsv",
  bulk.tsv.sequence.column = "sequence",
  bulk.tsv.sample.column = "sample_id",
  bulk.tsv.barcode.column = "barcode",
  bulk.tsv.isotype.column = "isotype",
  organism = "human",
  igblast.dir = "anaconda3/envs/igphyml/share/igblast",
  tie.resolvement = "random",
  seq.identity = 0.85)

## End(Not run)
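
For orientation, a bulk input file with the four required columns might be laid out as in the sketch below. All values are placeholders rather than real repertoire data. The column names simply mirror those used in the example call, and the isotype labels and sample ID are assumptions, since this page does not specify their expected format. In practice the sample ID must match a `sample_id` present in `sc.VDJ`.

```r
# Toy bulk table; every value here is a placeholder for illustration only.
bulk <- data.frame(
  sequence  = c("CAGGTGCAGCTGGTGCAG", "GAGGTGCAGCTGGTGGAG"),  # nucleotide sequences
  sample_id = c("S1", "S1"),          # assumed to match a sample_id in sc.VDJ
  barcode   = c("bulk_1", "bulk_2"),  # identifier of each recovered sequence
  isotype   = c("IgG1", "IgM"),       # label format is an assumption
  stringsAsFactors = FALSE
)

# Write a tab-separated file with a header row and no row names
bulk_file <- file.path(tempdir(), "bulk_rna.tsv")
write.table(bulk, bulk_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back and confirm the required columns are present
check <- read.delim(bulk_file, stringsAsFactors = FALSE)
required <- c("sequence", "sample_id", "barcode", "isotype")
stopifnot(all(required %in% colnames(check)))
```

The path in `bulk_file` could then be passed as `bulk.tsv`, with the four column names supplied through the corresponding `bulk.tsv.*.column` arguments.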

AntibodyForests documentation built on April 4, 2025, 4:45 a.m.