crimeClust_bayes: Bayesian model-based partially-supervised clustering for...
In crimelinkage: Statistical Methods for Crime Series Linkage


Bayesian model-based partially-supervised clustering for crime series identification


crimeClust_bayes(crimeID, spatial, t1, t2, Xcat, Xnorm, maxcriminals = 1000,
  iters = 10000, burn = 5000, plot = TRUE, update = 100, seed = NULL,
  use_space = TRUE, use_time = TRUE, use_cats = TRUE)

`crimeID`	n-vector of criminal IDs for the n crimes in the dataset. For unsolved crimes, the value should be `NA`.
`spatial`	(n x 2) matrix of spatial locations. Represent missing locations with `NA`.
`t1`	n-vector of the earliest possible time for each crime.
`t2`	n-vector of the latest possible time for each crime. Each crime occurred between its `t1` and `t2`.
`Xcat`	(n x q) matrix of categorical crime features. Each column is a variable, such as mode of entry. The different factors (window, door, etc) should be coded as integers 1,2,...,m.
`Xnorm`	(n x p) matrix of continuous crime features.
`maxcriminals`	maximum number of clusters in the model.
`iters`	Number of MCMC samples to generate.
`burn`	Number of MCMC samples to discard as burn-in.
`plot`	(logical) Should plots be produced during the run?
`update`	Number of MCMC iterations between graphical displays.
`seed`	Seed for random number generation.
`use_space`	(logical) Should the spatial locations be used in clustering?
`use_time`	(logical) Should the event times be used in clustering?
`use_cats`	(logical) Should the categorical crime features be used in clustering?
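
As a minimal sketch of how the arguments above might be assembled, the following base R code builds them from a hypothetical data frame `crimes`. The data frame, its column names, and its values are illustrative assumptions and are not part of the package.

```r
# Hypothetical raw data; column names and values are illustrative only.
crimes <- data.frame(
  offender = c("A", NA, "B", "A"),                # NA marks an unsolved crime
  x        = c(0.8, 1.9, NA, 1.2),                # NA marks a missing location
  y        = c(0.0, 0.0, NA, 0.0),
  entry    = c("door", "window", "door", "other"),
  earliest = c(1, 2, 3, 4),
  latest   = c(1, 3, 3, 6),
  stringsAsFactors = FALSE
)

# crimeID: n-vector of criminal IDs, NA for unsolved crimes.
crimeID <- as.integer(factor(crimes$offender))

# spatial: (n x 2) matrix of locations.
spatial <- as.matrix(crimes[, c("x", "y")])

# Xcat: (n x q) matrix with each column's levels coded 1, 2, ..., m.
Xcat <- cbind(entry = as.integer(factor(crimes$entry)))

# t1, t2: earliest and latest possible times (equal when the time is known).
t1 <- crimes$earliest
t2 <- crimes$latest
stopifnot(all(t1 <= t2))
```

Here `factor()` assigns integer codes in alphabetical order of the levels, so the mapping from codes back to labels should be recorded if it matters for interpretation.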

A list whose element `p.equal` is the (n x n) matrix of posterior probabilities that each pair of crimes was committed by the same criminal.

If `plot=TRUE`, progress plots are also produced during the run.
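
To illustrate how a matrix shaped like `p.equal` can be inspected, the sketch below lists the crime pairs whose probability exceeds a cutoff, using base R only. The matrix `P` is a made-up stand-in for `fit$p.equal`, and the cutoff of 0.5 is an arbitrary illustrative choice, not a package default.

```r
# Hypothetical symmetric matrix standing in for fit$p.equal (4 crimes).
P <- matrix(c(1.0, 0.9, 0.2, 0.1,
              0.9, 1.0, 0.3, 0.2,
              0.2, 0.3, 1.0, 0.8,
              0.1, 0.2, 0.8, 1.0), nrow = 4, byrow = TRUE)

# Keep each unordered pair once (upper triangle) and apply a cutoff.
cutoff <- 0.5  # illustrative threshold, not a package default
idx <- which(upper.tri(P) & P > cutoff, arr.ind = TRUE)

# Collect the pairs and sort them from most to least probable.
pairs <- data.frame(i = idx[, "row"], j = idx[, "col"], prob = P[idx])
pairs <- pairs[order(-pairs$prob), ]
print(pairs)
```

The package's own `bayesPairs` function (see See Also) serves a similar purpose on real output.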

Brian J. Reich

Reich, B. J. and Porter, M. D. (2015), Partially supervised spatiotemporal clustering for burglary crime series identification. Journal of the Royal Statistical Society: Series A (Statistics in Society). 178:2, 465–480. http://www4.stat.ncsu.edu/~reich/papers/CrimeClust.pdf

bayesPairs

# Toy dataset with 12 crimes and three criminals.

 # Make IDs: Criminal 1 committed crimes 1-4, etc.
 id <- c(1,1,1,1,
         2,2,2,2,
         3,3,3,3)

 # spatial locations of the crimes:
 s <- c(0.8,0.9,1.1,1.2,
        1.8,1.9,2.1,2.2,
        2.8,2.9,3.1,3.2)
 s <- cbind(0,s)

 # Categorical crime features, say mode of entry (1=door, 2=other) and
 # type of residence (1=apartment, 2=other)
 Mode <- c(1,1,1,1,  #Different distribution by criminal
           1,2,1,2,
           2,2,2,2)
 Type <- c(1,2,1,2,  #Same distribution for all criminals
           1,2,1,2,
           1,2,1,2)
 Xcat <- cbind(Mode,Type)

 # Times of the crimes
 t <- c(1,2,3,4,
        2,3,4,5,
        3,4,5,6)

 # Now let's pretend we don't know the criminal for crimes 1, 4, 6, 8, and 12.
 id <- c(NA,1,1,NA,2,NA,2,NA,3,3,3,NA)

 # Fit the model (NB: use much larger iters and burn on a real problem)
 fit <- crimeClust_bayes(crimeID=id, spatial=s, t1=t,t2=t, Xcat=Xcat,
                   maxcriminals=12,iters=500,burn=100,update=100)

 # Plot the posterior probability matrix that each pair of crimes was
 # committed by the same criminal:
 # requireNamespace() suffices because image.plot is called via fields::
 if(requireNamespace("fields", quietly=TRUE)){
   fields::image.plot(1:12,1:12,fit$p.equal,
              xlab="Crime",ylab="Crime",
              main="Probability crimes are from the same criminal")
 }

 # Extract the crimes with the largest posterior probability of sharing a criminal
 bayesPairs(fit$p.equal)
 bayesProb(fit$p.equal[1,])
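
One further way to summarize a pairwise probability matrix, sketched here with base R rather than any function from the package, is to treat one minus the probability as a dissimilarity and cut a hierarchical clustering tree into hard groups. The matrix `P` is a made-up stand-in for `fit$p.equal`, and the average linkage and the cut height of 0.5 are illustrative assumptions, not choices taken from the reference.

```r
# Hypothetical symmetric matrix standing in for fit$p.equal (5 crimes).
P <- matrix(c(1.00, 0.90, 0.80, 0.10, 0.20,
              0.90, 1.00, 0.70, 0.20, 0.10,
              0.80, 0.70, 1.00, 0.30, 0.20,
              0.10, 0.20, 0.30, 1.00, 0.85,
              0.20, 0.10, 0.20, 0.85, 1.00), nrow = 5, byrow = TRUE)

# Convert probabilities to dissimilarities: likely pairs are "close".
d <- as.dist(1 - P)

# Average-linkage hierarchical clustering (an illustrative choice).
tree <- hclust(d, method = "average")

# Cut the tree at dissimilarity 0.5 to get one group label per crime.
groups <- cutree(tree, h = 0.5)
print(groups)
```

This gives a single hard partition and discards the uncertainty that the probability matrix carries, so it is best read alongside the matrix itself.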