sc_lb: single cell data

Description

This dataset contains mutations, V-gene family, sample ID, number of mutations and germline divergence score. V mutations div_germ sample status label_subtype

Usage

Format

A data frame with 623 rows and 35 variables:

alnstart

alignment start on the subject

b_cell_subset

B-cell subset

cdr1

CDR1 sequence, in AA

cdr1_seq

CDR1 sequence, in nt

cdr2

CDR2 sequence, in AA

cdr2_seq

CDR2 sequence, in nt

cdr3

CDR3 sequence, in AA

cdr3_seq

CDR3 sequence, in nt

chain_type

type of the chain, either heavy or light

compartment

the compartment

dh

D-gene information, with subfamily

dh2

D-gene information, 2 digits precision (allele information)

evalue

BLAST e-value; less than 10e-3 is good

fr1

FR1 sequence, in AA

fr1_seq

FR1 sequence, in nt

fr2

FR2 sequence, in AA

fr2_seq

FR2 sequence, in nt

fr3

FR3 sequence, in AA

fr3_seq

FR3 sequence, in nt

fr4

FR4 sequence, in AA

fr4_seq

FR4 sequence, in nt

inframe

whether the CDR3 stop is IN-FRAME

jh

J-gene information, with subfamily

jh2

J-gene information, 2 digits precision (allele information)

length

CDR3 length, in nt

mutations

number of mutations in the V-region

name

sequence name

sample

sample code

status

status of the patient, early-treated (PHI) or late-treated (CHI)

stop

stop codon, boolean YES or NO

strand

DNA strand, either '+' or '-'

vh

V-gene information, with subfamily

vh2

V-gene information, 2 digits precision (allele information)

wgxg

mask to state whether the CDR3 is well defined or not, boolean YES or NO

wgxg_2

mask to state whether the CDR3 is well defined or not [this delimitation is not IN-FRAME], boolean YES or NO
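
The quality-related variables above (evalue, stop, wgxg) can be combined to keep only reliable sequences before summarising mutations by patient status. The following is a minimal sketch and not part of the package: it runs on a small toy data frame with made-up values, because the real sc_lb object is only available once igfuns is installed. It also assumes that the YES/NO flags are stored as character strings and that evalue is numeric; the names toy, keep, clean and mut_by_status are hypothetical.

```r
# Toy stand-in for sc_lb: a handful of the 35 columns, with made-up values.
# Assumption: YES/NO flags are character strings and evalue is numeric.
toy <- data.frame(
  name      = c("seq1", "seq2", "seq3", "seq4", "seq5", "seq6"),
  status    = c("PHI", "PHI", "PHI", "CHI", "CHI", "CHI"),
  evalue    = c(1e-40, 0.5, 1e-30, 1e-25, 1e-30, 1e-20),
  stop      = c("NO", "NO", "NO", "YES", "NO", "NO"),
  wgxg      = c("YES", "YES", "YES", "YES", "NO", "YES"),
  mutations = c(3, 12, 5, 7, 20, 15),
  stringsAsFactors = FALSE
)

# Keep rows with a good BLAST e-value (below 10e-3), no stop codon,
# and a well-defined CDR3 according to the wgxg mask.
keep  <- toy$evalue < 10e-3 & toy$stop == "NO" & toy$wgxg == "YES"
clean <- toy[keep, ]

# Mean number of V-region mutations per patient status (PHI vs CHI).
mut_by_status <- aggregate(mutations ~ status, data = clean, FUN = mean)
print(mut_by_status)
```

With the real data, toy would be replaced by sc_lb, provided the columns have the types assumed here; checking them first with str(sc_lb) is advisable.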


thierrycnam/igfuns documentation built on May 4, 2020, 3:21 a.m.