nsincd: Colocalization index of d-type
In colocalization: Normalized Spatial Intensity Correlation


nsinc.d calculates the colocalization index for a whole image as Pearson's correlation coefficient between the average proportion densities of two types of signals, taken with complete spatial randomness (CSR) as reference, within a specified proximity of all signals or of all signals of the type of interest (the base signals). If a range of proximity sizes is considered, nsinc.d averages the index values over that range. For multiple-species data, the average of the index values over all pairs at each proximity size is taken as the index for the image at that neighborhood size.

nsinc.d(data, membership, dim = 2, r.min = NULL,
        r.max = NULL, r.count = NULL, r.adjust = NULL,
        box = NULL, edge.effect = TRUE, strata = FALSE,
        base.member = NULL, r.model = "full", ...)

`data`	a data frame (or object coercible by as.data.frame to a data frame) containing at least the columns `membership` and `x (xc, X or Xc)`, `y (yc, Y or Yc)` if `dim = 2` and `x (xc, X or Xc)`, `y (yc, Y or Yc)`, `z (zc, Z or Zc)` if `dim = 3`.
`membership`	a string giving the name of the column in `data` that holds the membership of the data points. The membership must have at least 2 levels.
`dim`	an integer either `= 2` or `= 3`. If `dim = 2`, the data are treated as two-dimensional; if `dim = 3`, the data are treated as three-dimensional.
`r.min`	the minimum proximity size that the user identifies as colocalization of signals. It should be numeric. If `r.model = "full"`, the function will automatically choose the smallest inter-point distance as the `r.min`; if `r.model = "r.med"`, the function will use the median inter-point distance for both `r.min` and `r.max`; if `r.model = "other"`, the user must specify `r.min`, which should be no larger than `r.max`.
`r.max`	the maximum proximity size that the user identifies as colocalization of signals. It should be numeric. If `r.model = "full"`, the function will automatically choose the largest inter-point distance as the `r.max`; if `r.model = "r.med"`, the function will use the median inter-point distance for both `r.min` and `r.max`; if `r.model = "other"`, the user must specify `r.max`, which should be between the smallest and the largest inter-point distances and no smaller than `r.min`.
`r.count`	the total count of the series of proximity sizes between `r.min` and `r.max`. If `r.max = r.min` or `r.adjust = (r.max - r.min)/2`, then `r.count = 1`, otherwise `r.count = 30` by default or is specified by the user.
`r.adjust`	a small adjustment to `r.min` and `r.max`, so that the series of proximity sizes runs between `r.min + r.adjust` and `r.max - r.adjust`. This avoids a zero standard deviation of the average proportion densities at extremely small and large r's. The value of `r.adjust` depends on the choice of `r.model` and on the values of `r.min` and `r.max`. For most scenarios, it is suggested to use `r.adjust = NULL` and let the function choose the default value. In general, the default is either `r.adjust = 0` or `r.adjust = (r.max - r.min)/(r.count + 1)`; otherwise it is a positive number specified by the user satisfying `r.adjust` ≤ `(r.max - r.min)/2`.
`box`	a one-row data frame describing the study region which must contain columns `xmin, xmax, ymin, ymax` if `dim = 2` and additionally `zmin, zmax` if `dim = 3`. If `box = NULL`, the function will detect the smallest box containing all data points and add a buffer edge in each dimension which is equal to the median of nearest neighbor distances in that dimension. If `box` is specified by the user, only the data enclosed in the specified box will be considered in the analysis and signals outside the `box` will be ignored.
`edge.effect`	a logical value indicating whether the edge effect should be corrected. The default is `TRUE`; without the correction the results are not accurate.
`strata`	a logical value indicating whether the user wants single-direction or bi-direction colocalization. The default, `strata = FALSE`, gives bi-direction colocalization, in which the proximity regions around all signals are considered. If `strata = TRUE`, only the circular regions around signals of the base membership are considered, so colocalization is single-direction. In that case `base.member` should be specified; otherwise the first membership that R detects in the membership column is used.
`base.member`	one level of the memberships that is designated as the base. It works only when `strata = TRUE`. If `strata = TRUE` and no `base.member` is specified by the user, the first membership that R detects in the membership column will be used by default for `base.member`.
`r.model`	one of `"full"`, `"r.med"` or `"other"`. The `r.model` determines the range of proximity sizes that defines colocalization. `"full"` or `"r.med"` can be used if the user has no specific proximity sizes in mind. In the `"full"` model, the proximity sizes range from the smallest to the largest inter-point distance; in the `"r.med"` model, the single proximity size is the median of the inter-point distances; in the `"other"` model, the user defines research-driven proximity sizes by specifying `r.min` and `r.max`.
`...`	parameters passed to `cor`. The user can choose a method other than Pearson for calculating the correlation, e.g. through the `method` argument of `cor`.
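The relationship between `r.model`, `r.min`, `r.max`, `r.count` and `r.adjust` described above can be illustrated with a small base-R sketch. This is not the package's internal code: the variable names (`pts`, `r.full.min`, `r.series`, and so on) are hypothetical, and the even spacing of the series is an assumption made only for illustration.

```r
# Illustration only: base R, not the internals of nsinc.d.
set.seed(1)
pts <- data.frame(x = runif(20), y = runif(20))

# All pairwise Euclidean (inter-point) distances.
d <- as.numeric(dist(pts[, c("x", "y")]))

r.full.min <- min(d)     # what r.model = "full" would use as r.min
r.full.max <- max(d)     # what r.model = "full" would use as r.max
r.med      <- median(d)  # single proximity size for r.model = "r.med"

# One of the documented defaults for r.adjust, with r.count = 30.
r.count  <- 30
r.adjust <- (r.full.max - r.full.min) / (r.count + 1)

# Assumption: an evenly spaced series between r.min + r.adjust
# and r.max - r.adjust.
r.series <- seq(r.full.min + r.adjust, r.full.max - r.adjust,
                length.out = r.count)
```

The series stays strictly inside the interval from the smallest to the largest inter-point distance, which is the stated purpose of `r.adjust`.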

For each r in the series from r.min to r.max, the function calculates the average proportion density, with CSR as reference and with edge effect corrected, of two types of signals in an r-neighborhood of all signals (or of all base signals if strata = TRUE). It then obtains the Pearson correlation coefficient for each pair of channels and averages the coefficients over all pairs at that r; for multiple-species data, this average is the index for the image at that neighborhood size. The index for the whole image, averaged over the r series, is named NSInC^d, or NSInC of type d. The index is close to 1 if the signals are colocalized, 0 if they are random, and -1 if they are dispersed. The function handles 2D and 3D data.
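To make the idea of correlating neighborhood proportions concrete, here is a deliberately simplified base-R sketch at a single proximity size. It is not the package's formula: it ignores edge-effect correction, uses a constant CSR reference (disk area over region area), and all names (`pts`, `near`, `p.red`, `p.green`, `idx`) are hypothetical.

```r
# Simplified illustration only; NOT the computation done by nsinc.d.
set.seed(42)
n <- 100
base <- data.frame(x = runif(n), y = runif(n))

# "green" signals sit very close to "red" ones, so the types are colocalized.
red   <- data.frame(base, color = "red")
green <- data.frame(x = base$x + rnorm(n, sd = 0.01),
                    y = base$y + rnorm(n, sd = 0.01),
                    color = "green")
pts <- rbind(red, green)

r <- 0.1                                   # arbitrary proximity size
d <- as.matrix(dist(pts[, c("x", "y")]))   # pairwise distances
near <- d <= r
diag(near) <- FALSE                        # a signal is not its own neighbor

is.red   <- pts$color == "red"
is.green <- pts$color == "green"

# For every signal: proportion of each type's signals within distance r.
p.red   <- rowSums(near[, is.red])   / sum(is.red)
p.green <- rowSums(near[, is.green]) / sum(is.green)

# Crude CSR reference: expected proportion in a disk of radius r inside a
# unit square. Dividing by a constant leaves Pearson's r unchanged; it is
# shown only to mirror the idea of normalizing against CSR.
csr <- pi * r^2 / 1
idx <- cor(p.red / csr, p.green / csr)
idx
```

Because both types cluster around the same locations, the two proportion vectors rise and fall together and the correlation is positive. Without edge correction, signals near the border see fewer neighbors of both types, which by itself inflates the correlation; this is one reason the edge correction matters.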

Users who have a specific proximity size in mind are encouraged to specify r.model = "other" and to give r.min and r.max the same value.
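A call with a single user-chosen proximity size might look like the sketch below. The value 0.2 is arbitrary and chosen only for illustration, and the `requireNamespace` guard is added so the snippet does nothing when the colocalization package is not installed.

```r
# Simulated 2D data with two memberships, "red" and "green".
set.seed(1234)
mydata <- rbind(
  data.frame(x = runif(300, -1, 1), y = runif(300, -1, 1), color = "red"),
  data.frame(x = runif(50, -1, 1),  y = runif(50, -1, 1),  color = "green")
)

if (requireNamespace("colocalization", quietly = TRUE)) {
  # r.min = r.max fixes one proximity size; 0.2 is an arbitrary choice.
  res <- colocalization::nsinc.d(data = mydata, membership = "color",
                                 dim = 2, r.model = "other",
                                 r.min = 0.2, r.max = 0.2)
  print(res$index)
}
```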

nsinc.d returns the colocalization index at each separate proximity size r, the average colocalization index across all r's, the data from which the index is calculated, the study region (i.e., the carrying box), the original and normalized proportions of each type of signal in an r-neighborhood of all (base) signals, the r series, and some summary information:

`method`	"nsinc.d"
`input.data.summary`	a list containing the number of membership levels and the signal counts in each channel or membership of the input data.
`post.data.summary`	a list showing the number of membership levels and the signal counts in each channel of the data after removal of signals located outside the box specified by the user. If no signals are excluded, `post.data.summary` presents the same results as `input.data.summary`.
`r.summary`	a data frame listing the `r.min`, `r.max`, `r.count`, `r.adjust` used in the calculation and the `r.model` specified by the user or the default. `r.summary` also gives the r range for the default `"full"` model, i.e., the minimum and maximum of the inter-point distance of all signals, and the median value in addition.
`strata`	a list showing the default setting of strata or the specified strata by the user. It also presents the base membership used in the function if `strata` is TRUE.
`edge.effect`	a data frame containing a logical value indicating whether edge effect is corrected or not.
`index.all`	a data frame showing the colocalization index of d-type at each r.
`index`	the averaged colocalization index of d-type across all r's.
`post.data`	a data frame representing the data after removal of signals located outside the box specified by the user. If no signals are excluded, `post.data` presents the same observations as `data`.
`study.region`	the carrying box with the size of buffer width in each dimension.
`P.all`	the data frame showing all original and normalized proportions of each type of signals in an r-neighborhood around every (base) signal. Rows are (base) signals and columns are all memberships and r's.
`r`	the r series for which the colocalization indices are calculated.
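The effect of a user-specified `box` on `post.data` and `post.data.summary` can be pictured with the following base-R sketch. It only mimics the documented behavior (signals outside the box are ignored); the names `pts`, `inside` and `post` are hypothetical, and treating the box boundaries as inclusive is an assumption.

```r
# Illustration only: base R, not the internals of nsinc.d.
set.seed(7)
pts <- data.frame(x = runif(100, -1, 1), y = runif(100, -1, 1),
                  color = sample(c("red", "green"), 100, replace = TRUE))

# A one-row data frame with the columns required for dim = 2.
box <- data.frame(xmin = -0.5, xmax = 0.5, ymin = -0.5, ymax = 0.5)

# Keep only the signals enclosed in the box (boundaries assumed inclusive).
inside <- pts$x >= box$xmin & pts$x <= box$xmax &
          pts$y >= box$ymin & pts$y <= box$ymax
post <- pts[inside, ]

table(pts$color)   # counts per membership before filtering
table(post$color)  # counts per membership after filtering
```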

Xueyan Liu, Jiahui Xu, Cheng Cheng, Hui Zhang.

Liu, X., Xu, J., Guy, C., Romero, E., Green, D., Cheng, C., Zhang, H. (2019). Unbiased and Robust Analysis of Co-localization in Super-resolution Images. Manuscript submitted for publication.

## A simulated 2D example data set.
library(colocalization)
set.seed(1234)
x <- runif(300, min = -1, max = 1)
y <- runif(300, min = -1, max = 1)
red <- data.frame(x,y, color = "red")
x <- runif(50, min = -1, max = 1)
y <- runif(50, min = -1, max = 1)
green <- data.frame(x,y, color = "green")

mydata <- rbind(red,green)
plot(mydata$x,mydata$y,col = mydata$color)


mydata.results <- nsinc.d(data = mydata, membership = "color", dim = 2)

mydata.results$index.all
mydata.results$index


## A simulated 3D example data set.
data("twolines")


library("rgl")
plot3d(twolines[,c("x","y","z")], type='s', size=0.7, col = twolines$membership)
aspect3d("iso")

twolines.results <- nsinc.d(data = twolines, membership = "membership",
                            dim = 3, r.model = "r.med")

twolines.results$index
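Single-direction colocalization, as described for `strata` and `base.member`, could be requested along the following lines. This is a sketch based only on the documented arguments; choosing "red" as the base membership is arbitrary, and the `requireNamespace` guard simply skips the call when the colocalization package is not installed.

```r
# Simulated 2D data with two memberships, "red" and "green".
set.seed(1234)
mydata <- rbind(
  data.frame(x = runif(300, -1, 1), y = runif(300, -1, 1), color = "red"),
  data.frame(x = runif(50, -1, 1),  y = runif(50, -1, 1),  color = "green")
)

if (requireNamespace("colocalization", quietly = TRUE)) {
  # Only neighborhoods around "red" signals are considered.
  res.red <- colocalization::nsinc.d(data = mydata, membership = "color",
                                     dim = 2, strata = TRUE,
                                     base.member = "red")
  print(res.red$strata)  # documented to report the base membership used
  print(res.red$index)   # averaged index across the r series
}
```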