This function simulates K-stage SMART data with `(pinfo + pnoise)` baseline variables drawn from a multivariate Gaussian distribution. The `pinfo` variables have variance 1 and pairwise correlation 0.2; the `pnoise` variables have mean 0 and are uncorrelated with each other and with the `pinfo` variables.
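The covariance structure described above can be sketched in base R. This is only an illustration under stated assumptions, not the function's actual implementation: the helper name `sketch_baseline` is hypothetical, unit variance for the noise variables is assumed (the text only fixes their mean), and the group-specific means of the `pinfo` variables are left out.

```r
# Hypothetical helper, not part of the package: draws mean-zero baseline
# covariates with the covariance structure described in the text.
sketch_baseline <- function(n, pinfo, pnoise, rho = 0.2) {
  p <- pinfo + pnoise
  # Start from the identity: unit variances, no correlation.
  # (Unit variance for the noise block is an assumption.)
  Sigma <- diag(p)
  # Informative block: pairwise correlation rho, variance reset to 1 below.
  Sigma[seq_len(pinfo), seq_len(pinfo)] <- rho
  diag(Sigma) <- 1
  # chol() returns upper-triangular U with t(U) %*% U = Sigma, so
  # Z %*% U has covariance Sigma when Z is standard normal.
  Z <- matrix(rnorm(n * p), nrow = n, ncol = p)
  X <- Z %*% chol(Sigma)
  list(X = X, Sigma = Sigma)
}

set.seed(1)
sim <- sketch_baseline(n = 2000, pinfo = 3, pnoise = 2)
round(cor(sim$X), 2)  # sample correlations; compare with sim$Sigma
```

Multiplying by the Cholesky factor keeps the sketch free of extra package dependencies; a multivariate normal sampler from another package would serve equally well.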

Subjects come from `n_cluster` latent groups of equal size, and these groups are characterized by their distinct means in the `pinfo` feature variables. Each latent group has its own optimal treatment sequence: the optimal treatment for subjects in group $g$ at stage $k$ is generated as $A_k^* = 2\,([\,g/(2k-1)\,] \bmod 2) - 1$. The assigned treatment (1 or -1) for each subject at each stage is randomly generated with equal probability. The primary outcome is observed only at the end of the trial and is generated as $R = \sum_{k=1}^{K} A_k A_k^* + N(0,1)$.
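The treatment rule and outcome model above can be written out as a short R sketch. It is an illustration, not the package's code: the helper `opt_treatment` is hypothetical, and reading the square brackets in the formula as the floor function is an assumption.

```r
# Hypothetical helper: optimal treatment for latent group g at stage k.
# The bracket [x] is read as floor(x) here, which is an assumption.
opt_treatment <- function(g, k) {
  2 * (floor(g / (2 * k - 1)) %% 2) - 1
}

set.seed(1)
n_cluster <- 4
K <- 2
n <- 8
g <- rep(seq_len(n_cluster), each = n / n_cluster)  # equal-sized latent groups

# Optimal and randomly assigned treatments, one vector per stage.
optA <- lapply(seq_len(K), function(k) opt_treatment(g, k))
A <- lapply(seq_len(K), function(k) sample(c(-1, 1), n, replace = TRUE))

# Final outcome: sum over stages of A_k * A_k^*, plus N(0, 1) noise.
agreement <- Reduce(`+`, Map(`*`, A, optA))
R <- agreement + rnorm(n)
```

Each stage contributes +1 when the assigned treatment matches the optimal one and -1 otherwise, so the noise-free part of the outcome ranges from -K to K.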

```
sim_Kstage(n, n_cluster, pinfo, pnoise, centroids = NULL, K)
```

| Argument | Description |
| --- | --- |
| `n` | sample size; should be a multiple of `n_cluster` |
| `n_cluster` | number of latent groups |
| `pinfo` | number of informative baseline variables |
| `pnoise` | number of non-informative baseline variables |
| `centroids` | centroids of the |
| `K` | number of stages |

| Value | Description |
| --- | --- |
| `X` | baseline variables. It is a matrix of dimension |
| `A` | treatment assignments for the K stages. It is a list of K vectors. |
| `R` | outcomes of the K stages. It is a list of K vectors. In this simulation setting, no intermediate outcomes are observed, so the first K-1 vectors are vectors of 0. |
| `optA` | optimal treatments for the K stages. It is a list of K vectors. |
| `centroids` | centroids of the |
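Because `A`, `R`, and `optA` are each lists of K vectors, per-subject summaries are easiest to compute with `Map()` and `Reduce()`. The sketch below uses a small hand-made list with the documented shape; the values are invented for illustration, and the helper `followed_optimal` is hypothetical rather than part of the package.

```r
# Toy object mimicking the documented return shape for K = 2 and 4 subjects.
# All values are made up for illustration.
toy <- list(
  A    = list(c(1, -1, 1, -1), c(1, 1, -1, -1)),
  optA = list(c(1, 1, 1, -1), c(1, -1, -1, -1)),
  R    = list(rep(0, 4), c(2.1, -1.8, 1.7, 2.3))
)

# Hypothetical helper: TRUE for subjects whose assigned treatment matched
# the optimal treatment at every stage.
followed_optimal <- function(dat) {
  match_by_stage <- Map(`==`, dat$A, dat$optA)
  Reduce(`&`, match_by_stage)
}

# Total outcome per subject: sum over stages (only the last one is nonzero).
total_R <- Reduce(`+`, toy$R)
opt <- followed_optimal(toy)

mean(total_R[opt])   # mean outcome among subjects on the optimal sequence
mean(total_R[!opt])  # mean outcome among the remaining subjects
```

The same two helpers should apply unchanged to an object returned by `sim_Kstage`, assuming it follows the structure listed above.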

Yuan Chen, Ying Liu, Donglin Zeng, Yuanjia Wang

Maintainer: Yuan Chen <yc3281@columbia.edu><irene.yuan.chen@gmail.com>

```
# assumes the DTRlearn2 package, which provides sim_Kstage, is installed
library(DTRlearn2)

n_train = 100
n_test = 500
n_cluster = 10
pinfo = 10
pnoise = 20
# simulate a 2-stage training set
train = sim_Kstage(n_train, n_cluster, pinfo, pnoise, K=2)
# simulate an independent 2-stage test set using the same centroids as the training set
test = sim_Kstage(n_test, n_cluster, pinfo, pnoise, train$centroids, K=2)
```

