optimize_combinations | R Documentation |
This function uses Shannon entropy to identify a set of compatible barcode combinations with the least heterogeneity in barcode usage.
optimize_combinations(combination_m, nb_lane, index_number,
thrs_size_comb, max_iteration, method)
combination_m | A matrix of compatible barcode combinations. |
nb_lane | The number of lanes to be used for sequencing (i.e. the number of libraries divided by the multiplex level). |
index_number | The total number of distinct DNA barcodes in the dataset. |
thrs_size_comb | The maximum size of the set of compatible combinations to be used for the greedy optimization. |
max_iteration | The maximum number of iterations during the optimizing step. |
method | The choice of the greedy search: 'greedy_exchange' or 'greedy_descent'. |
N/k compatible combinations, i.e. one per lane for N libraries at multiplex level k, are selected by maximizing the Shannon entropy of barcode usage. It can be shown that the maximum entropy attainable for a selection of N barcodes among n, with possible repetitions, is:
S_{max} = -(n-r)\frac{\lfloor N/n\rfloor}{N}\log\left(\frac{\lfloor N/n\rfloor}{N}\right) - r\frac{\lceil N/n\rceil}{N}\log\left(\frac{\lceil N/n\rceil}{N}\right)
where r denotes the remainder of the division of N by n, while \lfloor N/n\rfloor and \lceil N/n\rceil denote the floor and ceiling of N/n, respectively.
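As a rough illustration of the bound above, the following sketch evaluates S_{max} for given N and n. The helper name max_entropy is hypothetical and not part of the package. The sketch assumes the natural logarithm and the usual convention that 0 * log(0) = 0, which applies when N < n.

```r
# Hypothetical helper (not part of the package): maximum attainable
# Shannon entropy for N barcode draws among n distinct barcodes.
max_entropy <- function(N, n) {
  r  <- N %% n               # remainder of the division of N by n
  lo <- floor(N / n) / N     # frequency of each of the (n - r) least-used barcodes
  hi <- ceiling(N / n) / N   # frequency of each of the r most-used barcodes
  # Assumed convention: 0 * log(0) = 0, so zero-frequency terms drop out.
  term <- function(count, p) if (count == 0 || p == 0) 0 else -count * p * log(p)
  term(n - r, lo) + term(r, hi)
}

max_entropy(36, 48)  # every barcode used at most once: equals log(36)
max_entropy(96, 48)  # every barcode used exactly twice: equals log(48)
```

When N is a multiple of n, the bound reduces to log(n), i.e. perfectly uniform barcode usage.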
Case 1: number of lanes < number of compatible DNA-barcode combinations
This function seeks compatible DNA-barcode combinations of highest entropy. In brief, it uses a randomized greedy descent algorithm to find an optimized selection. The resulting selection may not be globally optimal, but it is typically close to optimal and much less redundant in its use of DNA barcodes than a randomly chosen set of compatible combinations.
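To make the idea concrete, here is a minimal sketch of a randomized greedy search over combinations. It is not the package's implementation: usage_entropy and greedy_sketch are hypothetical names, the swap rule is an assumption, and every row of combination_m is assumed to already be a compatible combination.

```r
# Shannon entropy of barcode usage across a selection of combinations
# (rows = combinations, entries = barcode identifiers).
usage_entropy <- function(selection) {
  p <- table(as.vector(selection)) / length(selection)
  -sum(p * log(p))
}

# Hypothetical randomized greedy search (assumes nb_lane < nrow(combination_m)):
# swap one selected row for an unselected one, and keep the swap only if
# the entropy does not decrease.
greedy_sketch <- function(combination_m, nb_lane, max_iteration = 50) {
  chosen <- sample(nrow(combination_m), nb_lane)
  best <- usage_entropy(combination_m[chosen, , drop = FALSE])
  for (i in seq_len(max_iteration)) {
    candidate <- chosen
    pool <- setdiff(seq_len(nrow(combination_m)), chosen)
    # Index into pool explicitly so a pool of length 1 is handled correctly.
    candidate[sample(nb_lane, 1)] <- pool[sample(length(pool), 1)]
    s <- usage_entropy(combination_m[candidate, , drop = FALSE])
    if (s >= best) {
      chosen <- candidate
      best <- s
    }
  }
  combination_m[chosen, , drop = FALSE]
}

# Toy input: all pairs drawn from 6 barcodes, 3 lanes to fill.
set.seed(1)
toy <- t(combn(6, 2))
selected <- greedy_sketch(toy, nb_lane = 3)
usage_entropy(selected)  # cannot exceed log(6), the bound for N = n = 6
```

Because each swap is accepted only when the entropy does not decrease, the search can stall in a local optimum, which is why the result is not guaranteed to be globally optimal.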
Case 2: number of lanes >= number of compatible DNA-barcode combinations
In such a case, there are not enough compatible DNA-barcode combinations and redundancy is inevitable.
A matrix containing an optimized set of combinations of compatible barcodes.
get_all_combinations, get_random_combinations, experiment_design
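# Draw a matrix of random compatible combinations from the package's Illumina index set
m <- get_random_combinations(DNABarcodeCompatibility::IlluminaIndexes, 3, 4)
# Optimize the selection for 12 lanes and 48 distinct barcodes
optimize_combinations(m, nb_lane = 12, index_number = 48)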