Resample data frame using values from the column with number of clonesets.

Description

Resample a data frame using the values in the column that holds the cloneset counts. The count of each cloneset (i.e., each row of a MiTCR data frame) is its number of reads (usually the "Read.count" column) or UMIs (i.e., barcodes, usually the "Umi.count" column).

Usage

resample(.data, .n = -1, .col = "read.count")

Arguments

.data

Data frame containing the column .col, or a list of such data frames.

.n

Number of values / reads / UMIs to choose.

.col

Which column to use as the quantities of clonotypes. See "Details".

Details

resample. Using a multinomial distribution, compute the number of occurrences of each cloneset, then remove clonotypes with zero counts and return the resulting data frame. The probability passed to rmultinom for each cloneset is that cloneset's proportion of the total in the .col column. This is a rough simulation of how clonotypes are sampled from an organism. For now it does not work very well, so use downsample instead.
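As an illustration of the multinomial step described for resample, here is a minimal sketch in base R. The helper name resample_sketch and the toy data frame are hypothetical and are not part of the package; the actual implementation may differ.

```r
# Hypothetical helper for illustration only, not the package's code.
resample_sketch <- function(df, n, col = "read.count") {
  counts <- df[[col]]
  # rmultinom() normalises 'prob' internally, so raw counts can be passed;
  # one draw of size n gives a new count for every cloneset.
  new_counts <- as.vector(rmultinom(1, size = n, prob = counts))
  df[[col]] <- new_counts
  # Remove clonesets that received zero counts.
  df[new_counts > 0, , drop = FALSE]
}

set.seed(42)
# Toy repertoire (made-up values).
toy <- data.frame(clone = c("A", "B", "C", "D"),
                  read.count = c(500, 300, 150, 50))
res <- resample_sketch(toy, 100)
sum(res$read.count)  # always equals the requested n, here 100
```

Because the counts are drawn with probabilities only, a cloneset can end up with more counts than it had in the input when n is close to the original total.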

downsample. Choose .n clones (not clonotypes!) from the input repertoires without any probabilistic simulation, by exactly computing each chosen clone. Its output has the same form as that of resample (repertoires), but is more consistent and biologically plausible.
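One possible reading of the downsample description is that every individual read (or UMI) is treated as a separate item and exactly .n of them are picked without replacement, so no cloneset can end up with more counts than it started with. The sketch below follows that assumption; the helper name downsample_sketch and the toy data are hypothetical, and the package's own code may work differently.

```r
# Hypothetical helper for illustration only, not the package's code.
# Assumes the counts in 'col' are non-negative whole numbers.
downsample_sketch <- function(df, n, col = "read.count") {
  counts <- df[[col]]
  stopifnot(n <= sum(counts))
  # One entry per individual read, labelled with the row it belongs to.
  pool <- rep(seq_len(nrow(df)), times = counts)
  # Pick n positions in the pool without replacement.
  picked <- pool[sample.int(length(pool), n)]
  # Count how many picked reads fall into each row.
  new_counts <- tabulate(picked, nbins = nrow(df))
  df[[col]] <- new_counts
  # Remove clonesets that received zero counts.
  df[new_counts > 0, , drop = FALSE]
}

set.seed(1)
# Toy repertoire (made-up values).
toy2 <- data.frame(clone = c("A", "B", "C", "D"),
                   read.count = c(500, 300, 150, 50))
res2 <- downsample_sketch(toy2, 100)
sum(res2$read.count)  # always equals the requested n, here 100
```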

Value

Data frame in which the values of the .col column sum to .n.

See Also

rmultinom

Examples

## Not run: 
# Get 100K reads (not clones!).
immdata.1.100k <- resample(immdata[[1]], 100000, .col = "read.count")

## End(Not run)
