PgR6MS: PgR6 class with Methods and Sequences.

PgR6MS R Documentation

PgR6 class with Methods and Sequences.

Description

PgR6 class with methods and sequences. End users should use pagoo instead of this class directly, since it is easier to understand. Inherits: PgR6M

Super classes

pagoo::PgR6 -> pagoo::PgR6M -> PgR6MS

Active bindings

sequences

A DNAStringSetList with the set of sequences grouped by cluster. Each group can be accessed as if it were a list element. All Biostrings methods are available.

core_sequences

Like $sequences, but only showing core sequences.

cloud_sequences

Like $sequences, but only showing cloud sequences, as defined above.

shell_sequences

Like $sequences, but only showing shell sequences, as defined above.

Methods

Public methods

Inherited methods

Method new()

Create a PgR6MS object.

Usage
PgR6MS$new(
  data,
  org_meta,
  cluster_meta,
  core_level = 95,
  sep = "__",
  DF,
  group_meta,
  sequences,
  verbose = TRUE
)
Arguments
data

A data.frame or DataFrame containing at least the following columns: gene (gene name), org (name of the organism to which the gene belongs), and cluster (group of orthologues to which the gene belongs). More columns can be added as metadata for each gene.

org_meta

(optional) A data.frame or DataFrame containing additional metadata for organisms. This data.frame must have a column named "org" with valid organism names (that is, they must match those provided in the org column of data); additional columns are used as metadata. Each row should correspond to one organism.

cluster_meta

(optional) A data.frame or DataFrame containing additional metadata for clusters. This data.frame must have a column named "cluster" with valid cluster names (that is, they must match those provided in the cluster column of data); additional columns are used as metadata. Each row should correspond to one cluster.

core_level

The initial core_level, that is, the percentage of organisms in which a cluster must be present to be considered part of the core genome. Must be a number between 85 and 100 (default: 95). You can change it later through the $core_level field once the object has been created.

sep

A separator. The default is '__' (two underscores). It is used to create a unique gid (gene identifier) for each gene. gids are created by pasting org to gene, separated by sep.

DF

Deprecated. Use data instead.

group_meta

Deprecated. Use cluster_meta instead.

sequences

Can be one of: 1) a named list of named character vectors, where the list names are organism names and the names of each character vector are gene names; 2) a named list of DNAStringSetList objects (same requirements as (1), but with BStringSet names as gene names); or 3) a DNAStringSetList (same requirements as (2), but with the DNAStringSetList names as organism names).

verbose

logical. Whether to display progress messages when loading the class.
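As an illustration of the inputs described above, the following base R sketch builds a toy data table, a sequences list in format (1), and the gids that result from pasting org to gene with sep. All organism, gene and cluster names are hypothetical, and the consistency check at the end is only an assumption about what a sensible input looks like, not a description of pagoo's internal validation.

```r
# Hypothetical toy inputs; all organism, gene and cluster names are made up.
data <- data.frame(
  gene    = c("gene001", "gene002", "gene001"),
  org     = c("orgA", "orgA", "orgB"),
  cluster = c("cluster1", "cluster2", "cluster1"),
  stringsAsFactors = FALSE
)

# Format (1) for `sequences`: a named list (one element per organism)
# of named character vectors (one element per gene).
sequences <- list(
  orgA = c(gene001 = "ATGAAATAA", gene002 = "ATGCCCTGA"),
  orgB = c(gene001 = "ATGAAGTAA")
)

# gids as described for `sep`: org pasted to gene, separated by sep.
sep <- "__"
gid <- paste(data$org, data$gene, sep = sep)

# Build the same identifiers from the sequence list and check that
# every gene listed in `data` has a matching sequence.
seq_gid <- unlist(lapply(names(sequences), function(o) {
  paste(o, names(sequences[[o]]), sep = sep)
}))
all_have_seq <- all(gid %in% seq_gid)
```

Because gids must be unique, it seems prudent to choose a sep that does not already occur in organism or gene names.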

Returns

An R6 object of class PgR6MS. It contains basic fields and methods for analyzing a pangenome, additional statistical methods, methods for making basic exploratory plots, and methods for sequence manipulation.
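To make the meaning of core_level concrete, here is a minimal base R sketch that classifies clusters as core from a toy gene table. It is not pagoo's implementation: the data are hypothetical, and comparing the percentage with `>=` without any rounding is an assumption made for illustration.

```r
# Hypothetical gene table: three organisms, three clusters.
data <- data.frame(
  org     = c("orgA", "orgB", "orgC", "orgA", "orgB", "orgA"),
  cluster = c("cluster1", "cluster1", "cluster1",
              "cluster2", "cluster2", "cluster3"),
  stringsAsFactors = FALSE
)

core_level <- 95
n_orgs <- length(unique(data$org))

# Number of distinct organisms in which each cluster is present.
orgs_per_cluster <- tapply(data$org, data$cluster,
                           function(x) length(unique(x)))

# Percentage of organisms carrying each cluster.
pct <- 100 * orgs_per_cluster / n_orgs

# Assumption: a cluster is core when its percentage reaches core_level.
is_core <- pct >= core_level
core_clusters <- names(is_core)[is_core]
```

With these toy data only cluster1 is present in all three organisms, so it is the only cluster that reaches the 95% threshold.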


Method core_seqs_4_phylo()

A field for obtaining core gene sequences is available (see $core_sequences), but when building a phylogeny from these sets it is useful to: 1) extract just one sequence per organism from each cluster, in case paralogues are present; and 2) fill gaps with empty sequences when core_level is set below 100%, which allows more genes (some not present in 100% of organisms) to be incorporated into the phylogeny. That is the purpose of this method.

Usage
PgR6MS$core_seqs_4_phylo(max_per_org = 1, fill = TRUE)
Arguments
max_per_org

Maximum number of sequences of each organism to be taken from each cluster.

fill

logical. Whether to fill each DNAStringSet with an empty DNAString for every missing organism. This applies when core_level is set below 100% and clusters lacking some organisms are therefore also treated as core.

Returns

A DNAStringSetList with core genes. The order of organisms in each cluster is preserved, so it is easier to concatenate them into a super-gene suitable for phylogenetic inference.
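The effect of max_per_org and fill can be sketched in base R with plain character vectors standing in for DNAStringSet objects. This only illustrates the idea and is not the method's actual code: the helper pick_for_phylo, the toy clusters, and the choice of keeping the first sequence found for each organism are all assumptions.

```r
# Hypothetical core clusters: character vectors of sequences named by
# organism. orgA has a paralogue in cluster1; orgB is missing in cluster2.
clusters <- list(
  cluster1 = c(orgA = "ATGAAA", orgA = "ATGAAG",
               orgB = "ATGAAC", orgC = "ATGAAT"),
  cluster2 = c(orgA = "ATGCCC", orgC = "ATGCCT")
)
orgs <- c("orgA", "orgB", "orgC")

# Hypothetical helper: keep at most `max_per_org` sequences per organism
# and, if `fill` is TRUE, add an empty sequence for missing organisms.
pick_for_phylo <- function(seqs, orgs, max_per_org = 1, fill = TRUE) {
  out <- lapply(orgs, function(o) {
    s <- unname(seqs[names(seqs) == o])
    s <- head(s, max_per_org)          # assumption: keep the first ones
    if (length(s) == 0 && fill) s <- ""
    setNames(s, rep(o, length(s)))
  })
  unlist(out)
}

# Every cluster now lists the organisms in the same order.
phylo_seqs <- lapply(clusters, pick_for_phylo, orgs = orgs)
```

Because every cluster ends up with the organisms in the same order, sequences at the same position across clusters belong to the same organism, which is what makes a later concatenation straightforward.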


Method clone()

The objects of this class are cloneable with this method.

Usage
PgR6MS$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.


pagoo documentation built on Nov. 19, 2022, 1:07 a.m.