permute_inputs: permute_inputs


View source: R/permute_inputs.R

Description

Runs the permutation sampling procedure on the two data sets given by the df1 and df2 inputs (dataframes or relative paths to dataframes), over the blocks defined in those data sets. A sampled permutation can be applied to the data set df2 to align its rows with those of the data set df1.

Usage

permute_inputs(
  df1,
  df2,
  formula,
  family,
  N,
  I,
  t,
  burn_in,
  sample_interval,
  block_name = "block",
  conda_env = "NA",
  activate_env = "NA",
  python = "python"
)

Arguments

df1

A dataframe OR a string with the path (from the R working directory) to the first data set with blocks.

df2

A dataframe OR a string with the path (from the R working directory) to the second data set with blocks. This data set MUST contain the response variable from the formula.

formula

A string with the formula specifying the response variable and the covariates. Written in the R format (e.g. 'y ~ x1 + x2 + x3').

family

A string specifying the family of distributions used in the regression step. Currently supported families are 'Normal', 'Logistic' and 'Poisson'.

N

An integer specifying the number of desired full permutations. The number of iterations will be burn_in + N*sample_interval.
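The iteration count stated above can be checked with a line of arithmetic. The helper name total_iterations is hypothetical and is not part of GFS; it only restates the formula burn_in + N*sample_interval.

```r
# Hypothetical helper (not part of GFS): total number of full iterations
# implied by N, burn_in and sample_interval, as stated in the N argument.
total_iterations <- function(N, burn_in, sample_interval) {
  burn_in + N * sample_interval
}

# 100 burn-in iterations, then 10 samples taken every 5 iterations.
total_iterations(N = 10, burn_in = 100, sample_interval = 5)  # 150
```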

I

An integer specifying how many iterations to complete in sampling the regression parameters.

t

An integer specifying the number of iterations (successful or not) in the Metropolis-Hastings step.

burn_in

An integer specifying the number of full iterations to be completed before any samples are taken.

sample_interval

An integer specifying the period of sampling. If sample_interval is 1, every full iteration after burn-in is a sample; if sample_interval is 2, every other full iteration is a sample; and so on.
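As an illustration of the sampling period, the sketch below lists which full iterations would be kept. It assumes the first sample is taken sample_interval iterations after burn-in ends, which is consistent with the total of burn_in + N*sample_interval iterations but is not confirmed by this page. The name sampled_iterations is hypothetical.

```r
# Hypothetical helper (not part of GFS): indices of the full iterations
# kept as samples, ASSUMING the k-th sample is taken at iteration
# burn_in + k * sample_interval.
sampled_iterations <- function(N, burn_in, sample_interval) {
  burn_in + seq_len(N) * sample_interval
}

sampled_iterations(N = 3, burn_in = 10, sample_interval = 2)  # 12 14 16
```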

block_name

The name of the column in the two input data sets that contains the number of the block to which each row belongs. Set to "block" by default. Column name must be the same in the two data sets.

conda_env

'NA' by default. If using the Anaconda distribution of python, specify which environment to use. Runs 'source activate conda_env'.

activate_env

'NA' by default. Enter a string with the command used to activate the desired python environment if it is neither the default environment nor an Anaconda environment.

python

The name of the installed python deployment. If a conda environment was set up with create_conda_environment, this should be "python".
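Putting the arguments together, a call might look like the sketch below. The toy data, the column names x1 and y, and the values chosen for N, I, t, burn_in and sample_interval are illustrative assumptions only. The call is guarded so that it runs only when the GFS package is installed, and it also needs a working python environment as described above.

```r
# Toy inputs (illustrative only): df1 holds a covariate, df2 holds the
# response named in the formula; both carry the same "block" column.
df1 <- data.frame(x1 = c(0.5, 1.2, 0.3, 2.0), block = c(1, 1, 2, 2))
df2 <- data.frame(y = c(1.1, 2.3, 0.4, 4.1), block = c(1, 1, 2, 2))

# Argument values below are arbitrary placeholders, not recommendations.
args <- list(
  df1 = df1, df2 = df2,
  formula = "y ~ x1", family = "Normal",
  N = 10, I = 100, t = 50,
  burn_in = 100, sample_interval = 5,
  block_name = "block"
)

# Only attempt the call when GFS is available.
if (requireNamespace("GFS", quietly = TRUE)) {
  perms <- do.call(GFS::permute_inputs, args)
}
```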

Details

This file is part of GFS.

GFS is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

GFS is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with GFS. If not, see <https://www.gnu.org/licenses/>.

Value

A dataframe with N columns and as many rows as the first data set. Each column is a full permutation, expressed with respect to the first data set, to be applied to the second data set.
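A minimal sketch of applying one sampled permutation follows. The object perms is a hand-made stand-in for the returned dataframe, and its column names are hypothetical. The sketch assumes that entry i of a column is the row index in the second data set to pair with row i of the first data set; check this against the package source before relying on it.

```r
# Toy data sets sharing a "block" column (illustrative only).
df1 <- data.frame(x1 = c(0.5, 1.2, 0.3, 2.0), block = c(1, 1, 2, 2))
df2 <- data.frame(y = c(10, 20, 30, 40), block = c(1, 1, 2, 2))

# Stand-in for the value of permute_inputs with N = 2: one column per
# sampled permutation, one row per row of df1. Rows move only within blocks.
perms <- data.frame(V1 = c(2L, 1L, 3L, 4L), V2 = c(1L, 2L, 4L, 3L))

# ASSUMPTION: perms[i, k] is the row of df2 matched to row i of df1.
# Reorder df2 by the first permutation and attach its response to df1.
aligned <- cbind(df1, y = df2$y[perms[[1]]])
aligned$y  # 20 10 30 40
```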


edwinfarley/GFS documentation built on Dec. 5, 2020, 1:43 p.m.