sequence_length_summary_covariate: Summarize Sequence Lengths by Covariate
In AnimalSequences: Analyse Animal Sequential Behaviour and Communication

View source: R/sequence_length_summary_covariate.R

Description

This function calculates summary statistics for sequence lengths (the number of elements in each sequence), grouped by a specified covariate. For each covariate value it reports the mean, standard deviation, median, minimum, and maximum sequence length, together with the number of distinct elements and a p-value comparing that number with shuffled sequences.

Usage

sequence_length_summary_covariate(sequences, covariate)

Arguments

`sequences`	A character vector in which each entry is one sequence, written as elements separated by spaces.
`covariate`	A vector of covariate values with the same length as `sequences`, giving one value per sequence; it is used to group the sequences.
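
As an illustration of the expected input shape, the following sketch in plain base R (not part of the package) checks that the two arguments line up and shows how a space-separated sequence splits into its elements. The object names `elements` and `seq_lengths` are made up for this example.

```r
# Hypothetical input: one string per sequence, elements separated by spaces
sequences <- c("hello world", "hello world hello", "hello world hello world")
covariate <- c("A", "B", "A")

# The arguments must describe the same sequences: one covariate value each
stopifnot(is.character(sequences), length(sequences) == length(covariate))

# Split each sequence on single spaces to recover its individual elements
elements <- strsplit(sequences, " ", fixed = TRUE)

# Number of elements per sequence; this is the "length" being summarised
seq_lengths <- lengths(elements)
seq_lengths
```

Here the three sequences contain 2, 3, and 4 elements respectively.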

Value

A data frame with the following columns:

`covariate`	The value of the covariate.
`mean_seq_elements`	The mean length of sequences for this covariate value.
`sd_seq_elements`	The standard deviation of the sequence lengths for this covariate value.
`median_seq_elements`	The median length of sequences for this covariate value.
`min_seq_elements`	The minimum length of sequences for this covariate value.
`max_seq_elements`	The maximum length of sequences for this covariate value.
`distinct_elements`	The number of distinct elements observed for this covariate value.
`pvalue_distinct_elements`	The p-value comparing the number of distinct elements for this covariate value with the number obtained from shuffled sequences.
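
To make the length columns concrete, here is a minimal base R sketch that computes comparable per-group summaries. It is an approximation for illustration, not the package's implementation: the helper name `summarise_lengths` is hypothetical, `distinct_elements` is assumed to count unique elements pooled across all sequences in a group, and the p-value column is left out because the shuffling procedure is not described on this page.

```r
# Illustrative helper (hypothetical name), not the AnimalSequences implementation
summarise_lengths <- function(sequences, covariate) {
  stopifnot(length(sequences) == length(covariate))

  # Split each sequence into elements and count them
  elements <- strsplit(sequences, " ", fixed = TRUE)
  n <- lengths(elements)

  # Indices of the sequences belonging to each covariate value
  groups <- split(seq_along(sequences), covariate)

  rows <- lapply(names(groups), function(g) {
    idx <- groups[[g]]
    data.frame(
      covariate = g,
      mean_seq_elements = mean(n[idx]),
      sd_seq_elements = sd(n[idx]),  # NA in this sketch if a group has one sequence
      median_seq_elements = median(n[idx]),
      min_seq_elements = min(n[idx]),
      max_seq_elements = max(n[idx]),
      # Assumption: unique elements pooled over the group's sequences
      distinct_elements = length(unique(unlist(elements[idx])))
    )
  })
  do.call(rbind, rows)
}

res <- summarise_lengths(
  c("hello world", "hello world hello", "hello world hello world"),
  c("A", "B", "A")
)
res
```

In this sketch, group `A` has sequences of 2 and 4 elements, so its mean and median length are both 3. Group `B` has a single sequence, so its standard deviation is `NA`; how the package itself handles single-sequence groups is not stated here.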

Examples

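# Three sequences of 2, 3, and 4 space-separated elements
sequences <- c('hello world', 'hello world hello', 'hello world hello world')
# One covariate value per sequence: the first and third belong to group 'A'
covariate <- c('A', 'B', 'A')
# Returns one row of summary statistics per covariate value
sequence_length_summary_covariate(sequences, covariate)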

AnimalSequences documentation built on Sept. 30, 2024, 9:18 a.m.