summarize_sample: Summarize a processed STR sample
In ShawHahnLab/chiimp: Computational, High-throughput Individual Identification through Microsatellite Profiling

summarize_sample

R Documentation

Summarize a processed STR sample

Description

Converts an STR sample data frame, as produced by analyze_sample, into a concise list of consistent attributes suitable for binding together across samples for a dataset. The summary covers a single locus, as in analyze_sample, but takes the form of a fixed-length list. The Allele1 entries correspond to the sequence with the highest count, and the Allele2 entries to the sequence with the second-highest count. See the Functions section below for how specific variants of this function behave.

Usage

summarize_sample(
  sample_data,
  sample_attrs,
  min_locus_reads = cfg("min_locus_reads")
)

summarize_sample_guided(
  sample_data,
  sample_attrs,
  min_locus_reads = cfg("min_locus_reads")
)

Arguments

`sample_data`	data frame of processed data for one sample as produced by analyze_sample.
`sample_attrs`	list of sample attributes, such as the rows produced by prepare_dataset.
`min_locus_reads`	numeric threshold for the minimum number of counts that must be present, in total across entries passing all filters, for potential alleles to be considered.

Details

Entries in the returned list:

- For Allele1 and Allele2:
  - Seq: sequence text for each allele.
  - Count: integer count of occurrences of this exact sequence.
  - Length: integer sequence length.
- Homozygous: If the sample appears homozygous (if so, the Allele2 entries will be NA).
- Ambiguous: If a potential allele was ignored due to ambiguous bases in sequence content (such as "N").
- Stutter: If a potential allele was ignored due to apparent PCR stutter.
- Artifact: If a potential allele was ignored due to apparent PCR artifact (other than stutter).
- CountTotal: The total number of sequences in the original sample data.
- CountLocus: The number of sequences matching all criteria for the specified locus in the original sample data.
- ProminentSeqs: The number of entries above the specified threshold after all filtering. This should be either one (for a homozygous sample) or two (for a heterozygous sample), but conditions such as cross-sample contamination or excessive PCR stutter can lead to more than two.
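
The relationship between the allele entries, Homozygous, and ProminentSeqs can be illustrated with a small base-R sketch. This is not chiimp's implementation: toy_summary, its min_count threshold, and the combined field names (Allele1Seq and so on) are assumptions made for illustration, and the input is a hand-made table of already-filtered sequences rather than real analyze_sample output.

```r
# Toy stand-in for a sample summary; NOT chiimp's implementation.
# `seqs` is assumed to hold already-filtered sequences with columns Seq, Count.
toy_summary <- function(seqs, min_count = 10) {
  seqs$Length <- nchar(seqs$Seq)
  # Order candidate sequences by descending read count.
  seqs <- seqs[order(seqs$Count, decreasing = TRUE), ]
  # Keep only entries at or above the (hypothetical) count threshold.
  prominent <- seqs[seqs$Count >= min_count, ]
  # Take a field from the nth prominent row, or NA if there is no such row.
  pick <- function(n, field) {
    if (nrow(prominent) >= n) prominent[[field]][n] else NA
  }
  list(
    Allele1Seq = pick(1, "Seq"),
    Allele1Count = pick(1, "Count"),
    Allele1Length = pick(1, "Length"),
    Allele2Seq = pick(2, "Seq"),
    Allele2Count = pick(2, "Count"),
    Allele2Length = pick(2, "Length"),
    Homozygous = nrow(prominent) == 1,
    ProminentSeqs = nrow(prominent)
  )
}

seqs <- data.frame(
  Seq = c("TAGATAGATAGA", "TAGATAGA", "TAGATAGATAGATAGA"),
  Count = c(40, 3, 55),
  stringsAsFactors = FALSE
)
summary_list <- toy_summary(seqs)
```

With the default threshold, two sequences count as prominent, so the toy sample reads as heterozygous. Raising min_count to 50 leaves a single prominent sequence, and the Allele2 entries become NA.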

Value

list of attributes describing the sample.
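
Because every summary has the same fixed set of entries, per-sample lists can be stacked into one table. The sketch below shows one way to do that in base R, using hand-written lists in place of real summarize_sample output. The sample names, field names, and values are made up for illustration, and chiimp may combine results differently internally.

```r
# Hand-written stand-ins for per-sample summaries (illustrative values only).
summaries <- list(
  sampleA = list(
    Allele1Seq = "TAGATAGATAGA", Allele1Count = 55L,
    Allele2Seq = NA_character_, Allele2Count = NA_integer_,
    Homozygous = TRUE
  ),
  sampleB = list(
    Allele1Seq = "TAGATAGATAGATAGA", Allele1Count = 48L,
    Allele2Seq = "TAGATAGATAGA", Allele2Count = 37L,
    Homozygous = FALSE
  )
)

# Turn each fixed-length list into a one-row data frame, then stack the rows.
results <- do.call(
  rbind,
  lapply(summaries, function(s) as.data.frame(s, stringsAsFactors = FALSE))
)
```

Each row of results then describes one sample, with NA in the Allele2 columns for the homozygous one.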

Functions

summarize_sample(): Default version of sample summary.
summarize_sample_guided(): Summarize a processed STR sample using known lengths. If ExpectedLength1 and optionally ExpectedLength2 are given in sample_attrs, the min_locus_reads threshold is ignored. See also analyze_sample_guided.
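
As a rough illustration of the guided idea, the following base-R sketch picks the highest-count sequence at each expected length and applies no read-count threshold. The helper toy_pick_by_length and the example table are hypothetical and only mimic the behavior described above. The actual logic lives in summarize_sample_guided and analyze_sample_guided and may differ.

```r
# Hypothetical helper; NOT the chiimp implementation of the guided summary.
# For each expected length, keep the most abundant sequence of that length.
toy_pick_by_length <- function(seqs, expected_lengths) {
  # Drop missing lengths and duplicates (e.g. a homozygous expectation).
  expected_lengths <- unique(expected_lengths[!is.na(expected_lengths)])
  picks <- lapply(expected_lengths, function(len) {
    candidates <- seqs[seqs$Length == len, ]
    candidates[which.max(candidates$Count), ]
  })
  do.call(rbind, picks)
}

seqs <- data.frame(
  Seq = c("TAGATAGATAGATAGA", "TAGATAGATAGA", "TAGATAGACAGA", "TAGATAGA"),
  Count = c(55, 40, 6, 3),
  stringsAsFactors = FALSE
)
seqs$Length <- nchar(seqs$Seq)

# Sample attributes carrying the known allele lengths.
sample_attrs <- list(ExpectedLength1 = 12, ExpectedLength2 = 8)
picked <- toy_pick_by_length(
  seqs,
  c(sample_attrs$ExpectedLength1, sample_attrs$ExpectedLength2)
)
```

In this toy data the length-8 sequence has only 3 reads, yet it is still selected because the expected length, not a count threshold, drives the choice.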

ShawHahnLab/chiimp documentation built on Aug. 20, 2023, 1:41 a.m.
