design_tree: Organize the data around the rooted binary weighted tree

View source: R/design_tree.R

design_tree R Documentation

Organize the data around the rooted binary weighted tree

Description

NB: this function is currently minimally built; input-checking functions still need to be added.

Usage

design_tree(Y, leaf_ids, mytree, weighted_edge = FALSE, Z_obs = NULL)

Arguments

Y

N by J binary data matrix; rows for subjects; columns for binary measurements/features

leaf_ids

Character strings giving the leaf node for each observation

mytree

A tree (an igraph object) that contains the node, edge, and edge-length ("weight") information.

weighted_edge

Logical. If TRUE, the branch lengths are used, in which case mytree must contain this information; if FALSE, every edge, including an imaginary edge leading to the root node, is set to have length 1.

Z_obs

Default is NULL. A two-column matrix of (id, class indicator) with one row per observation; an entry in the second column is NA if the subject in that row has an unknown class indicator. Importantly, the rows will be reordered to match the reordered Y.
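The row reordering described for Y and Z_obs can be illustrated with a minimal base-R sketch. This is not the package's implementation: the toy data, the leaf labels, and the helper names (leaf_order, leaf_index) are assumptions made for illustration, and the actual leaf order would come from mytree.

```r
# Toy data: 5 subjects, 3 binary features (values are made up for illustration)
Y <- matrix(c(1, 0, 1,
              0, 0, 1,
              1, 1, 0,
              0, 1, 0,
              1, 1, 1), nrow = 5, byrow = TRUE)

# Leaf label of each subject, and an assumed leaf order (hypothetical names)
leaf_ids   <- c("leaf_b", "leaf_a", "leaf_c", "leaf_a", "leaf_b")
leaf_order <- c("leaf_a", "leaf_b", "leaf_c")

# Z_obs: column 1 = subject id, column 2 = class indicator (NA if unknown)
Z_obs <- cbind(id = 1:5, class = c(1, NA, 2, 1, NA))

# Map each subject to an integer leaf index, then sort rows by that index
leaf_index <- match(leaf_ids, leaf_order)
ord <- order(leaf_index)

# Apply the same permutation to Y, Z_obs, and the leaf indices
Y_ordered        <- Y[ord, , drop = FALSE]
Z_ordered        <- Z_obs[ord, , drop = FALSE]
leaf_ids_ordered <- leaf_index[ord]
```

Applying one permutation to every object keeps each subject's measurements, class indicator, and leaf index in the same row after sorting.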

Value

A list of data and tree information for model fitting

  • Y A matrix of N by J; binary measurements with rows ordered by leaf groups (leaf_ids).

  • A A matrix of p by p; a 1 in a given row and column indicates that the node in that column is an ancestor of the node in that row. The ancestors of a node include the node itself.

  • A_leaves A matrix of pL by p; the submatrix of A that keeps only the rows for leaf nodes.

  • leaf_ids A vector of N integers; ordered by the leaves as specified by mytree.

  • leaf_ids_units A list of length pL; each element is a vector of the subject ids belonging to that leaf node.

  • leaf_ids_nodes a list of length p, each element is a vector of integers (between 1 and pL; id is only for leaf nodes) indicating the leaf nodes.

  • ancestors a list of length pL; each element is the vector of ancestors of that leaf (integers between 1 and p; ids are among all nodes).

  • edge_lengths a list of length pL; each element is a numeric vector of edge lengths from the root node to the leaf. It is computed from E(mytree)$weight and is NULL if E(mytree)$weight is NULL.

  • h_pau a numeric vector of length p; each value is the edge length from u to its parent (if u is a root node, then the value is 1). This vector by default is all 1s. If weighted_edge=TRUE, h_pau is set to E(mytree)$weight, the input edge weights.

  • v_units (redundant; identical to leaf_ids) a vector of length equal to the total number of rows in Y; each element is an integer between 1 and pL, indicating which leaf the observation belongs to.

  • subject_id_list a list of length p; each element is a vector of subject ids that are in the leaf descendants of node u (internal or leaf node)

  • ord the permutation that reorders the original rows to produce the final row ordering of Y.
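The ancestor structures in the list above (A, A_leaves, ancestors, and the default h_pau) can be sketched in base R for a toy five-node tree. The node numbering, the parent vector, and the helper ancestors_of are assumptions for illustration only; the package derives these quantities from the igraph object mytree, and its node ordering may differ. The final object, leaf_desc, reflects one reading of leaf_ids_nodes (the leaf descendants of each node) and is likewise an assumption.

```r
# Toy rooted binary tree with p = 5 nodes (hypothetical numbering):
#   node 1 is the root, with children 2 and 5; node 2 has children 3 and 4.
parent <- c(NA, 1, 2, 2, 1)   # parent[u] is the parent of node u; NA marks the root
p      <- length(parent)
leaves <- setdiff(seq_len(p), parent)   # nodes that are nobody's parent: 3, 4, 5
pL     <- length(leaves)

# Walk from node u up to the root, collecting u itself and all of its ancestors
ancestors_of <- function(u) {
  path <- u
  while (!is.na(parent[u])) {
    u <- parent[u]
    path <- c(path, u)
  }
  sort(path)
}

# A[u, v] = 1 if node v is an ancestor of node u (a node is its own ancestor)
A <- matrix(0, nrow = p, ncol = p)
for (u in seq_len(p)) A[u, ancestors_of(u)] <- 1

# Keep only the rows for leaf nodes: a pL by p matrix
A_leaves <- A[leaves, , drop = FALSE]

# One ancestor vector per leaf (ids among all p nodes)
ancestors <- lapply(leaves, ancestors_of)

# For each node, the leaves (indexed 1..pL) that descend from it
leaf_desc <- lapply(seq_len(p), function(v) which(A_leaves[, v] == 1))

# Unweighted default: every edge, including the imaginary one into the root, has length 1
h_pau <- rep(1, p)
```

In this toy tree the row of A_leaves for leaf node 3 has 1s in columns 1, 2, and 3, because its ancestors are the root, node 2, and itself.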


zhenkewu/lotR documentation built on April 24, 2022, 2:36 a.m.