The flatfile data format
Distance allows one to load data as a "flat file" and analyse it (and obtain abundance estimates) straight away, provided that the flat file is in the correct format. One can provide the file as, for example, an Excel spreadsheet using read.xls in gdata, or as a CSV file using read.csv.
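As a minimal sketch of the CSV route, the following writes a small table to a temporary file and reads it back with read.csv. The values and the temporary file are invented purely for illustration. Only base R is used, and the column names are those assigned in the example at the end of this page.

```r
# Hypothetical flat file contents (values invented for illustration only)
toy <- data.frame(
  Region.Label = c("A", "A", "A"),
  Area         = c(100, 100, 100),
  Sample.Label = c(1, 1, 2),
  Effort       = c(10, 10, 12),
  distance     = c(0.3, 1.1, 0.7)
)

# Write to a temporary CSV file, then read it back as one would a real file
csv.path <- tempfile(fileext = ".csv")
write.csv(toy, csv.path, row.names = FALSE)
flat <- read.csv(csv.path, stringsAsFactors = FALSE)

# Inspect the structure to confirm numeric columns were read as numbers
str(flat)
```

Calling str() straight after loading is a cheap way to catch columns that were read as strings rather than numbers.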
Each row of the data table corresponds to one observation and must have the following columns:
|distance|observed distance to object|
|Sample.Label|identifier for the sample (transect id)|
|Effort|effort for this transect (e.g. line transect length or number of times point transect was visited)|
|Region.Label|label for a given stratum (see below)|
|Area|area of the stratum|
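Before fitting anything, it can help to confirm that a data frame carries all of the required columns. The sketch below uses a hypothetical helper, missing_flatfile_columns, which is not part of the Distance package. The column names are taken from the example at the end of this page and the data are invented.

```r
# Hypothetical helper (not part of the Distance package): reports any
# required flat file columns that are missing from a data frame
missing_flatfile_columns <- function(dat) {
  required <- c("distance", "Sample.Label", "Effort", "Region.Label", "Area")
  setdiff(required, names(dat))
}

# Invented example data that lacks the Area column
obs <- data.frame(
  distance     = c(0.2, 0.9),
  Sample.Label = c("T1", "T2"),
  Effort       = c(5, 5),
  Region.Label = c("North", "North")
)

# Returns the names of the columns still to be added
missing_flatfile_columns(obs)
```

An empty result (character(0)) would indicate that every required column is present.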
Note that in the simplest case (one area surveyed only once) there is only one Region.Label and a single corresponding Area duplicated for each observation.
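The simplest case might look like the following sketch, in which a single stratum label and a single area value are recycled onto every observation. The label "StudyArea" and all numbers are invented for illustration.

```r
# Invented observations from two transects in a single surveyed area
dists <- c(0.1, 0.4, 0.8, 1.2)
simple <- data.frame(
  Region.Label = "StudyArea",      # one stratum label, recycled to every row
  Area         = 500,              # one area value, recycled to every row
  Sample.Label = c(1, 1, 2, 2),    # transect id for each observation
  Effort       = c(10, 10, 8, 8),  # effort repeats within a transect
  distance     = dists
)

# Both columns hold a single distinct value across all observations
length(unique(simple$Region.Label))
length(unique(simple$Area))
```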
The example given below was provided by Eric Rexstad.
## Not run:
library(Distance)
# Need to have the gdata library installed from CRAN, requires a system
# with perl installed (usually fine for Linux/Mac)
library(gdata)

# Need to get the file path first
# Going to the path given below, one can examine the format
minke.filepath <- system.file("minke.xlsx", package="Distance")

# Load the Excel file, note that header=FALSE and we add column names after
minke <- read.xls(minke.filepath, stringsAsFactors=FALSE, header=FALSE)
names(minke) <- c("Region.Label", "Area", "Sample.Label", "Effort", "distance")
# One may want to call edit(minke) or head(minke) at this point
# to examine the data format

# Due to the way the file was saved and the default behaviour in R
# for numbers stored with many decimal places, they are read as strings
# rather than numbers (see str(minke)), so we must coerce the Effort
# column to numeric
minke$Effort <- as.numeric(minke$Effort)

## perform an analysis using the exact distances
pooled.exact <- ds(minke, truncation=1.5, key="hr", order=0)
summary(pooled.exact)

## Try a binned analysis
# first define the bins
dist.bins <- c(0, .214, .428, .643, .857, 1.071, 1.286, 1.5)
pooled.binned <- ds(minke, truncation=1.5, cutpoints=dist.bins,
                    key="hr", order=0)

# binned with stratum as a covariate
minke$stratum <- ifelse(minke$Region.Label=="North", "N", "S")
strat.covar.binned <- ds(minke, truncation=1.5, key="hr",
                         formula=~as.factor(stratum), cutpoints=dist.bins)

# Stratified by North/South
full.strat.binned.North <- ds(minke[minke$Region.Label=="North",],
                              truncation=1.5, key="hr", order=0,
                              cutpoints=dist.bins)
full.strat.binned.South <- ds(minke[minke$Region.Label=="South",],
                              truncation=1.5, key="hr", order=0,
                              cutpoints=dist.bins)

## model summaries
model.sel.bin <- data.frame(
  name=c("Pooled f(0)", "Stratum covariate", "Full stratification"),
  aic=c(pooled.binned$ddf$criterion,
        strat.covar.binned$ddf$criterion,
        full.strat.binned.North$ddf$criterion +
          full.strat.binned.South$ddf$criterion))

# Note model with stratum as covariate is most parsimonious
print(model.sel.bin)
## End(Not run)