rnonRLII: Type II Non-Random Labeling of a Given Set of Points

View source: R/NNCTFunctions.R

rnonRLII    R Documentation

Type II Non-Random Labeling of a Given Set of Points

Description

An object of class "SpatPatterns".

Given a set of n points, dat, in a region, this function labels n_1=round(n*ult.prop,0) of them as cases and the rest as controls. It first selects k_0=round(n*init.prop,0) points as initial cases, then selects a contagious case and labels its kNNs among the remaining (non-case) points as cases, with infection probabilities inversely proportional to a power of their rank among the kNNs.

If poisson=TRUE, the initial and ultimate numbers of cases are random, k_0=rpois(1,round(n*init.prop,0)) and n_1=rpois(1,round(n*ult.prop,0)), so they equal round(n*init.prop,0) and round(n*ult.prop,0) only on average; otherwise they are exactly k_0=round(n*init.prop,0) and n_1=round(n*ult.prop,0).

More specifically, let z_1,…,z_{k_0} be the initial cases. One of the cases, say z_j, is selected as a contagious case, and its kNNs among the non-cases are found. These kNN non-case points are then labeled as cases with infection probabilities prob=rho*(1/(1:k))^pow, where rho is a scaling parameter for the infection probabilities and pow is the exponent adjusting the kNN dependence. The procedure stops when the number of cases first exceeds n_1. rho has to be in (0,1) for prob to be a vector of probabilities, and for a given rho, pow must be > ln(rho)/ln(k).

If rand.init=TRUE, the k_0 initial cases are selected randomly among the data points; otherwise, the first k_0 entries in the data set, dat, are chosen as the initial cases.
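The infection probabilities and the bound on pow can be checked directly in base R. The following is a minimal sketch; the helper name infection_probs is hypothetical and is not part of nnspat.

```r
# Hypothetical helper (not part of nnspat): infection probabilities
# for the 1st, 2nd, ..., k-th nearest neighbor of a contagious case.
infection_probs <- function(k, rho, pow) {
  stopifnot(k >= 1, rho > 0, rho < 1)
  # the bound on pow only matters for k >= 2, since log(1) = 0
  if (k >= 2) stopifnot(pow > log(rho) / log(k))
  rho * (1 / (1:k))^pow
}

prob <- infection_probs(k = 5, rho = 0.8, pow = 2)
prob
# for these values every entry lies in (0, 1) and
# the entries decrease with the NN rank
all(prob > 0 & prob < 1)
all(diff(prob) < 0)
```

With a positive pow the probabilities decay with the NN rank; a larger pow makes the decay faster, so infection concentrates on the closest neighbors.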

Algorithmically, all dat points are first treated as non-cases (i.e., controls or healthy subjects). The function then labels the points in the following steps:

step 0: If poisson=TRUE, n_1 is generated from a Poisson distribution with mean round(n*ult.prop,0) and k_0 from a Poisson distribution with mean round(n*init.prop,0), so that the average numbers of ultimate and initial cases are round(n*ult.prop,0) and round(n*init.prop,0), respectively. Otherwise, n_1=round(n*ult.prop,0) and k_0=round(n*init.prop,0).

step 1: Initially, k_0 points from dat are selected as cases. The selection is determined by the argument rand.init (default TRUE): if rand.init=TRUE, the initial cases are selected randomly from the data points, and if rand.init=FALSE, the first k_0 entries in the data set, dat, are selected as the cases.

step 2: A contagious case is then selected among the cases, and its k control NNs are randomly labeled as cases with decreasing infection probabilities prob=rho*(1/(1:k))^pow. See the description above for the details of the parameters in prob.

step 3: The procedure ends when the number of cases, n_c, exceeds n_1. Then n_c-n_1 of the cases (other than the initial cases) are randomly selected and relabeled as controls, i.e., 0s, so that the number of cases is exactly n_1.

Note that the infection probabilities of the kNNs of each contagious case increase with increasing rho, and the probability of infection decreases as farther NNs of a contagious case are considered (i.e., as the NN rank increases from 1 to k).
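Steps 1 to 3 above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the nnspat source code: the function name sketch_nonRLII and its return format are hypothetical, k_0 and n_1 are passed in directly as k0 and n1, the contagious case is assumed to be drawn uniformly from the current cases, and step 2 is assumed to repeat until the number of cases exceeds n_1.

```r
# Hypothetical base-R sketch of steps 1-3 (NOT the nnspat source code).
# dat: n x 2 matrix of points; k0, n1: initial and ultimate case counts.
sketch_nonRLII <- function(dat, k, rho, pow, k0, n1, rand.init = TRUE) {
  n <- nrow(dat)
  stopifnot(k0 >= 1, k0 <= n1, n1 < n, rho > 0, rho < 1)
  prob <- rho * (1 / (1:k))^pow      # infection probability by NN rank
  D <- as.matrix(dist(dat))          # pairwise Euclidean distances
  lab <- rep(0, n)                   # every point starts as a control

  # step 1: initial cases, chosen at random or as the first k0 rows
  init <- if (rand.init) sample.int(n, k0) else seq_len(k0)
  lab[init] <- 1

  # step 2: repeat until the number of cases first exceeds n1
  # (assumption: the contagious case is drawn uniformly from current cases)
  while (sum(lab) <= n1) {
    cases <- which(lab == 1)
    cont <- cases[sample.int(length(cases), 1)]
    ctrl <- which(lab == 0)
    kk <- min(k, length(ctrl))
    nn <- ctrl[order(D[cont, ctrl])[seq_len(kk)]]  # nearest controls first
    infected <- runif(kk) < prob[seq_len(kk)]
    lab[nn[infected]] <- 1
  }

  # step 3: relabel the excess non-initial cases as controls
  excess <- sum(lab) - n1
  later <- setdiff(which(lab == 1), init)
  lab[later[sample.int(length(later), excess)]] <- 0
  list(lab = lab, init = init)       # init holds row indices (sketch only)
}

set.seed(123)
dat <- cbind(runif(40), runif(40))
res <- sketch_nonRLII(dat, k = 5, rho = 0.8, pow = 2, k0 = 12, n1 = 20)
table(res$lab)
```

Because step 3 only removes cases that were infected after step 1, the initial cases always remain cases in the final labeling.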

See Ceyhan (2014) for more detail, where the type II non-RL pattern is case 2 of the non-RL patterns considered in Section 6, with n_1 fixed as a parameter rather than generated from a Poisson distribution and with pow=1.

Although the non-RL pattern is described for the case-control setting, it can be adapted to any two-class setting in which it is appropriate to treat one of the classes as cases (or one of the classes behaves like cases) and the other class as controls.

Usage

rnonRLII(
  dat,
  k,
  rho,
  pow,
  init.prop,
  ult.prop,
  rand.init = TRUE,
  poisson = FALSE
)

Arguments

dat

A set of points to which the non-RL procedure is applied to obtain cases and controls randomly in the type II fashion (see the description).

k

An integer representing the number of NNs considered for each contagious case, i.e., kNNs of each contagious case are candidates to be infected to become cases.

rho

A scaling parameter for the probabilities of labeling the points as cases (see the description).

pow

A parameter in the power adjusting the kNN dependence in the probabilities of labeling the points as cases (see the description).

init.prop

A real number between 0 and 1 representing the initial proportion of cases in the data set, dat. The selection of the initial cases depends on the parameter rand.init (see the description).

ult.prop

A real number between 0 and 1 representing the ultimate proportion of cases in the data set, dat after the non-RL assignment.

rand.init

A logical argument (default is TRUE) to determine the choice of the initial cases in the data set, dat. If rand.init=TRUE then the initial cases are selected randomly from the data points, and if rand.init=FALSE, the first k_0=round(n*init.prop,0) entries in the data set, dat, are labeled as the cases.

poisson

A logical argument (default is FALSE) to determine whether the numbers of initial and ultimate cases, k_0 and n_1, will be random or fixed. If poisson=TRUE then k_0 and n_1 are generated from Poisson distributions, k_0=rpois(1,round(n*init.prop,0)) and n_1=rpois(1,round(n*ult.prop,0)); otherwise they are fixed, k_0=round(n*init.prop,0) and n_1=round(n*ult.prop,0).
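The effect of the poisson argument on the two counts can be illustrated with a short base-R sketch. The helper case_counts is hypothetical and not part of nnspat; it only mirrors the formulas given for k_0 and n_1.

```r
# Hypothetical helper (not part of nnspat): initial and ultimate case counts.
case_counts <- function(n, init.prop, ult.prop, poisson = FALSE) {
  k0_mean <- round(n * init.prop, 0)
  n1_mean <- round(n * ult.prop, 0)
  if (poisson) {
    # random counts whose means are the rounded target counts
    c(k0 = rpois(1, k0_mean), n1 = rpois(1, n1_mean))
  } else {
    # fixed counts
    c(k0 = k0_mean, n1 = n1_mean)
  }
}

fixed <- case_counts(40, 0.3, 0.5)   # k0 = 12, n1 = 20
fixed
set.seed(1)
case_counts(40, 0.3, 0.5, poisson = TRUE)  # random, with means 12 and 20
```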

Value

A list with the elements

pat.type

="cc" for the case-control patterns for RL or non-RL of the given data points, dat

type

The type of the point pattern

parameters

Number of NNs, k, a scaling parameter for the infection probabilities of kNNs, rho, a parameter in the power adjusting the kNN dependence of the infection probabilities, pow, the initial proportion of cases, init.prop, and the ultimate proportion of cases, ult.prop.

dat.points

The set of points to which the non-RL procedure is applied to obtain cases and controls randomly in the type II fashion

lab

The labels of the points, 1 for cases and 0 for controls, after the type II non-RL procedure is applied to the data set, dat. Cases are denoted as red dots and controls as black circles in the plot.

init.cases

The initial cases in the data set, dat. Denoted as red crosses in the plot of the points.

cont.cases

The contagious cases in the data set, dat. Denoted as blue points in the plot of the points.

gen.points,ref.points

Both are NULL for this function, as the initial set of points, dat, is provided for the non-RL procedure.

desc.pat

Description of the point pattern

mtitle

The "main" title for the plot of the point pattern

num.points

The vector of two numbers, which are the number of cases and controls.

xlimit,ylimit

The possible ranges of the x- and y-coordinates of the generated and the reference points

Author(s)

Elvan Ceyhan

References

Ceyhan E (2014). “Segregation indices for disease clustering.” Statistics in Medicine, 33(10), 1662-1684.

See Also

rnonRLI, rnonRLIII, rnonRLIV, and rnonRL

Examples

library(nnspat)

n<-40  #try also n<-20; n<-100;
#data generation
dat<-cbind(runif(n,0,1),runif(n,0,1))

rho<-.8
pow<-2
knn<-5 #try 2 or 3
ip<-.3 #initial proportion
up<-.5 #ultimate proportion

Xdat<-rnonRLII(dat,knn,rho,pow,ip,up,poisson=FALSE) #labeled data, try poisson=TRUE
Xdat

table(Xdat$lab)

summary(Xdat)
plot(Xdat,asp=1)
plot(Xdat)

#normally distributed original data
n<-40  #try also n<-20; n<-100;
#data generation
dat<-cbind(rnorm(n,0,1),rnorm(n,0,1))

rho<-0.8
pow<-2
knn<-5 #try 2 or 3
ip<-.3 #initial proportion
up<-.5 #ultimate proportion

Xdat<-rnonRLII(dat,knn,rho,pow,ip,up,poisson=FALSE) #labeled data, try poisson=TRUE
Xdat

table(Xdat$lab)

summary(Xdat)
plot(Xdat,asp=1)
plot(Xdat)


nnspat documentation built on Aug. 30, 2022, 9:06 a.m.