View source: R/02_data_preprocessing.R
check_input | R Documentation |
Check if input objects are ready for further analyses
check_input(seq = NULL, annotation = NULL, gene_field = "gene_id")
seq | A list of AAStringSet objects, each list element containing protein sequences for a given species. This list must have names (not NULL), and names of each list element must match the names of list elements in annotation. |
annotation | A GRangesList, CompressedGRangesList, or list of GRanges with the annotation for the sequences in seq. This list must have names (not NULL), and names of each list element must match the names of list elements in seq. |
gene_field | Character, name of the column in the GRanges objects that contains gene IDs. Default: "gene_id". |
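As an illustration of the expected input shapes, the sketch below builds a toy seq list and a matching annotation list by hand. The species names, gene IDs, sequences, and coordinates are all hypothetical, and the code assumes the Bioconductor packages Biostrings and GenomicRanges are installed.

```r
library(Biostrings)
library(GenomicRanges)

# Hypothetical toy data: two species with two genes each.
# Sequence names play the role of FASTA headers.
seq <- list(
  speciesA = AAStringSet(c(geneA1 = "MKTAYIAK", geneA2 = "MSLLTEVE")),
  speciesB = AAStringSet(c(geneB1 = "MALWMRLL", geneB2 = "MGSSHHHH"))
)

# One GRanges per species; the gene_id metadata column holds the gene IDs
# (the column that gene_field points to by default).
annotation <- list(
  speciesA = GRanges(
    seqnames = "chr1",
    ranges = IRanges(start = c(1, 500), end = c(300, 900)),
    strand = "+",
    gene_id = c("geneA1", "geneA2")
  ),
  speciesB = GRanges(
    seqnames = "chr1",
    ranges = IRanges(start = c(10, 700), end = c(400, 1200)),
    strand = "-",
    gene_id = c("geneB1", "geneB2")
  )
)

# Both lists are named, and the names agree
names(seq)
names(annotation)
```

Objects built this way should be suitable to pass as check_input(seq, annotation). If the gene IDs were stored in a differently named column, that column name would be supplied through gene_field.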
This function checks the input data for 3 required conditions:
1. The names of the seq list (i.e., names(seq)) match the names of the annotation GRangesList/CompressedGRangesList (i.e., names(annotation)).
2. For each species (list element), the number of sequences in seq is not greater than the number of genes in annotation. This ensures that users do not input the translated sequences for multiple isoforms of the same gene (generated by alternative splicing). Ideally, the number of sequences in seq should equal the number of genes in annotation, but this may not always hold true because of non-protein-coding genes.
3. For each species, sequence names (i.e., names(seq[[x]]), equivalent to FASTA headers) match gene names in annotation.
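The logic of the three conditions can be sketched in base R. This is not the package's implementation: sketch_check is a hypothetical helper, and plain character vectors stand in for the sequence names and annotation gene IDs of each species so that the example runs without Bioconductor.

```r
# Hypothetical stand-ins: per-species sequence names and annotation gene IDs
seq_names <- list(sp1 = c("g1", "g2"), sp2 = c("g3"))
gene_ids  <- list(sp1 = c("g1", "g2", "g5"), sp2 = c("g3", "g4"))

sketch_check <- function(seq_names, gene_ids) {
  # Condition 1: both lists are named and their names agree
  c1 <- !is.null(names(seq_names)) && !is.null(names(gene_ids)) &&
    setequal(names(seq_names), names(gene_ids))
  if (!c1) {
    # The per-species checks cannot be evaluated without matching names
    return(c(names_match = FALSE, counts_ok = NA, ids_match = NA))
  }

  species <- names(seq_names)

  # Condition 2: no more sequences than genes, per species
  c2 <- all(vapply(species, function(sp) {
    length(seq_names[[sp]]) <= length(unique(gene_ids[[sp]]))
  }, logical(1)))

  # Condition 3: every sequence name is a gene ID in the annotation
  c3 <- all(vapply(species, function(sp) {
    all(seq_names[[sp]] %in% gene_ids[[sp]])
  }, logical(1)))

  c(names_match = c1, counts_ok = c2, ids_match = c3)
}

sketch_check(seq_names, gene_ids)
```

In this toy input, sp1 has three annotated genes but only two sequences, which is allowed under condition 2 (for example, when g5 is a non-protein-coding gene). Supplying several isoforms per gene would typically push the sequence count above the gene count and introduce names absent from the annotation, so conditions 2 and 3 would both fail.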
TRUE if the objects pass the check.
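# Load the example datasets
data(annotation)
data(proteomes)
# Check that the sequences and annotation are consistent with each other
check_input(proteomes, annotation)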