hsp_naive: Naive hidden state prediction.
In castor: Efficient Phylogenetics on Large Trees

hsp_naive

R Documentation

Naive hidden state prediction.

Description

Reconstruct ancestral discrete states of nodes and predict unknown (hidden) states of tips on a tree based on empirical state proportions across tips on the entire tree. This method completely ignores any phylogenetic relationships and is not designed for making reliable state predictions. It is only meant for testing and debugging purposes, e.g. as a "null" model.

Usage

hsp_naive(tree, 
          tip_states, 
          Nstates     = NULL,
          check_input = TRUE)

Arguments

`tree`	A rooted tree of class "phylo". Note that the tree structure is not actually used internally.
`tip_states`	An integer vector of size Ntips, specifying the state of each tip in the tree as an integer from 1 to Nstates, where Nstates is the number of possible states (see below). `tip_states` can include `NA` to indicate an unknown tip state that is to be predicted.
`Nstates`	Either `NULL`, or an integer specifying the number of possible states of the trait. If `NULL`, then it will be computed based on the maximum non-`NA` value encountered in `tip_states`.
`check_input`	Logical, specifying whether to perform some basic checks on the validity of the input data. If you are certain that your input data are valid, you can set this to `FALSE` to reduce computation.

Details

For this function, the trait's states must be represented by integers within 1,..,Nstates, where Nstates is the total number of possible states. If the states are originally in some other format (e.g. characters or factors), you should map them to a set of integers 1,..,Nstates. You can easily map any set of discrete states to integers using the function map_to_state_space. Any NA entries in tip_states are interpreted as unknown states.
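The integer mapping described above can also be illustrated in base R. The following is only a minimal sketch using `factor`, not the `map_to_state_space` function itself; the raw state values and the variable names `raw_states` and `state_names` are made up for illustration.

```r
# Illustrative raw states (assumed data); NA marks an unknown tip state
raw_states <- c("blue", "red", NA, "green", "red")

# factor() assigns each distinct non-NA value a level; levels are sorted alphabetically
f <- factor(raw_states)

# integer codes in 1,..,Nstates; NA entries stay NA
tip_states <- as.integer(f)

# number of distinct states, and the lookup from integer code back to the original value
Nstates     <- nlevels(f)
state_names <- levels(f)

print(tip_states)
print(state_names[tip_states])
```

The vector `state_names` lets you translate predicted integer states back to the original labels afterwards.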

The function calculates the "global" empirical proportions of known states, i.e., across the entire tree, and then sets the state likelihoods of tips with unknown state and nodes to this distribution. States are "predicted" for each tip with unknown state and each node by randomly drawing states according to the same global empirical distribution. This function has asymptotic time complexity O(Ntips x Nstates).
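The procedure described above can be sketched in a few lines of base R. This is not the castor implementation; `naive_hsp_sketch` is a hypothetical helper written only to illustrate the idea. It also assumes that tips with a known state receive probability 1 for that state, and that at least one tip state is known.

```r
# Hypothetical helper, not part of castor: mimics the described "null" model in base R.
naive_hsp_sketch <- function(tip_states, Nnodes, Nstates = NULL) {
  tip_states <- as.integer(tip_states)
  Ntips <- length(tip_states)
  known <- !is.na(tip_states)
  if (is.null(Nstates)) Nstates <- max(tip_states[known])

  # global empirical proportions of the known tip states
  proportions <- tabulate(tip_states[known], nbins = Nstates) / sum(known)

  # start with the global distribution in every row (tips first, then nodes)
  likelihoods <- matrix(proportions, nrow = Ntips + Nnodes, ncol = Nstates, byrow = TRUE)

  # assumption: tips with a known state get probability 1 for that state
  known_rows <- which(known)
  likelihoods[known_rows, ] <- 0
  likelihoods[cbind(known_rows, tip_states[known_rows])] <- 1

  # "predict" unknown tips and all nodes by random draws from the global distribution
  states <- c(tip_states, rep(NA_integer_, Nnodes))
  unknown <- which(is.na(states))
  states[unknown] <- sample.int(Nstates, size = length(unknown),
                                replace = TRUE, prob = proportions)

  list(likelihoods = likelihoods, states = states)
}

# usage: 6 tips (one unknown) and 5 internal nodes
res <- naive_hsp_sketch(c(1L, 2L, 2L, NA, 1L, 2L), Nnodes = 5)
print(res$likelihoods[4, ])  # global proportions of the known states
```

Because the predicted states are random draws, repeated calls generally return different `states` unless a seed is set with `set.seed`.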

Tips must be represented in tip_states in the same order as in tree$tip.label. The vector tip_states need not include names; if it does, however, they are checked for consistency (if check_input==TRUE).
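If the states are stored as a named vector in some other order, they can be reordered to match the tip labels before calling the function. A minimal sketch, in which `tip_labels` stands in for `tree$tip.label` and the values are made up for illustration:

```r
# stand-in for tree$tip.label (assumed labels)
tip_labels <- c("A", "B", "C", "D")

# states named by tip label, but listed in a different order; NA marks an unknown state
named_states <- c(C = 2L, A = 1L, D = NA, B = 1L)

# indexing by name reorders the vector to follow the tip labels
tip_states <- named_states[tip_labels]

print(tip_states)
```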

This function is meant for reconstructing ancestral states in all nodes of a tree as well as predicting the states of tips with an a priori unknown state, according to the "null" model where phylogenetic relationships are ignored but empirical frequencies of known states are still accounted for.

Value

A list with the following elements:

`success`	Logical, indicating whether HSP was successful. If `FALSE`, some return values may be `NULL`.
`likelihoods`	A 2D numeric matrix, listing the probability of each tip and node being in each state. This matrix will have (Ntips+Nnodes) rows and Nstates columns, where Nstates was either explicitly provided as an argument or set to the maximum value found in `tip_states` (if `Nstates` was passed as NULL). The rows in this matrix will be in the order in which tips and nodes are indexed in the tree, i.e. the rows 1,..,Ntips store the probabilities for tips, while rows (Ntips+1),..,(Ntips+Nnodes) store the probabilities for nodes. Each row in this matrix will sum up to 1. Note that the return value is named this way for compatibility with other HSP functions.
`states`	Integer vector of length Ntips+Nnodes, with values in {1,..,Nstates}, specifying the "predicted" state for each tip & node. For nodes and tips with a priori unknown state, the predicted state is simply a random draw from the empirical distribution.
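Since tips come first and nodes second in both `likelihoods` and `states`, the two groups can be separated by row index. The sketch below uses a small mock list in place of a real return value (the numbers are illustrative only); with a real tree of class "phylo", Ntips would be `length(tree$tip.label)` and Nnodes would be `tree$Nnode`.

```r
# mock dimensions and return value, for illustration only
Ntips  <- 3
Nnodes <- 2
HSP <- list(
  success     = TRUE,
  likelihoods = rbind(c(1, 0), c(0.5, 0.5), c(0, 1), c(0.5, 0.5), c(0.5, 0.5)),
  states      = c(1L, 2L, 2L, 1L, 2L)
)

# rows 1,..,Ntips refer to tips, rows (Ntips+1),..,(Ntips+Nnodes) refer to nodes
tip_rows  <- seq_len(Ntips)
node_rows <- Ntips + seq_len(Nnodes)

tip_likelihoods  <- HSP$likelihoods[tip_rows, , drop = FALSE]
node_likelihoods <- HSP$likelihoods[node_rows, , drop = FALSE]
tip_predictions  <- HSP$states[tip_rows]
node_predictions <- HSP$states[node_rows]
```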

Author(s)

Stilianos Louca

References

J. R. Zaneveld and R. L. V. Thurber (2014). Hidden state prediction: A modification of classic ancestral state reconstruction algorithms helps unravel complex symbioses. Frontiers in Microbiology. 5:431.

Examples

## Not run: 
# generate random tree
Ntips = 100
tree = generate_random_tree(list(birth_rate_intercept=1),max_tips=Ntips)$tree

# simulate a discrete trait
Nstates = 5
Q = get_random_mk_transition_matrix(Nstates, rate_model="ER", max_rate=0.1)
tip_states = simulate_mk_model(tree, Q)$tip_states

# print states of first 20 tips
print(tip_states[1:20])

# set half of the tips to unknown state
tip_states[sample.int(Ntips,size=as.integer(Ntips/2),replace=FALSE)] = NA

# reconstruct all tip states using the naive method
HSP = hsp_naive(tree, tip_states=tip_states, Nstates=Nstates)

# print estimated states of first 20 tips
print(HSP$states[1:20])

## End(Not run)

castor documentation built on Jan. 21, 2026, 9:08 a.m.
