read_write_spc: Loading and Saving Frequency Spectra (zipfR)


Description

read.spc loads a frequency spectrum from a .spc file.

write.spc saves a frequency spectrum object to a .spc file.

Usage

read.spc(file)

write.spc(spc, file)

Arguments

file

character string specifying the pathname of a disk file. Files with extension .gz, .bz2 or .xz will automatically be compressed/decompressed (see "Details"). See section "Format" for a description of the required file format.

spc

a frequency spectrum, i.e. an object of class spc

Format

A TAB-delimited text file with column headers but no row names (suitable for reading with read.delim). The file must contain at least the following two columns:

m

frequency class m

Vm

number V_m of types in frequency class m (or expected class size E[V_m])

An optional column labelled VVm can be used to specify variances of expected class sizes (for a frequency spectrum derived from an LNRE model or by binomial interpolation).

These columns may appear in any order in the text file. All other columns will be silently ignored.
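As a rough illustration of the layout described above, the following sketch uses only base R to write and re-read a small TAB-delimited file. The data values, the extra column comment and the variable names toy, toy.file and back are made up for this example; the column order is deliberately scrambled to mirror the statement that order does not matter.

```r
## hypothetical toy data: four frequency classes, columns in arbitrary order
toy <- data.frame(
  comment = c("a", "b", "c", "d"),  # extra column, not part of the .spc format
  Vm      = c(10, 4, 2, 1),         # number of types in each frequency class
  m       = 1:4                     # frequency class
)

## write a TAB-delimited file with a header line but no row names
toy.file <- tempfile(fileext = ".spc")
write.table(toy, toy.file, sep = "\t", quote = FALSE, row.names = FALSE)

## first line holds the column headers, remaining lines hold the data
readLines(toy.file)

## read.delim can parse the file; only m and Vm are needed for a spectrum
back <- read.delim(toy.file)
back[, c("m", "Vm")]
```

According to the format description, read.spc should pick out the m and Vm columns from such a file and ignore comment; this sketch only demonstrates the file layout and does not call read.spc itself.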

Details

If the filename file ends in the extension .gz, .bz2 or .xz, the disk file will automatically be decompressed (read.spc) or compressed (write.spc).
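A minimal sketch of the compressed round trip, assuming the zipfR package is installed and its ItaUltra.spc data set is available (as in the examples on this page). The variable names gz.file and gz.spc are chosen for illustration only.

```r
library(zipfR)

## the .gz extension requests transparent compression
gz.file <- tempfile(fileext = ".spc.gz")
write.spc(ItaUltra.spc, gz.file)   # compressed on writing

## the same extension triggers decompression on reading
gz.spc <- read.spc(gz.file)

## the reloaded spectrum should match the original
stopifnot(isTRUE(all.equal(gz.spc, ItaUltra.spc)))
```

The calls are the same as for an uncompressed file; only the file name differs.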

The .spc file format does not store the values of N, V and VV explicitly. Therefore, incomplete frequency spectra and expected spectra with variances cannot be fully reconstructed from disk files. Saving such frequency spectra (or loading a spectrum with variance data) will trigger corresponding warnings.
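To see why this matters, consider what can be recomputed from the m and Vm columns alone. The sketch below uses base R and made-up numbers. It assumes that, in the absence of stored values, N and V have to be derived as sum(m * Vm) and sum(Vm) over the classes present in the file; this is an assumption for illustration, not a statement about the internals of read.spc.

```r
## hypothetical complete spectrum with five frequency classes
m  <- 1:5
Vm <- c(10, 4, 2, 1, 1)

V <- sum(Vm)        # vocabulary size: 10 + 4 + 2 + 1 + 1 = 18
N <- sum(m * Vm)    # sample size:     10 + 8 + 6 + 4 + 5 = 33

## incomplete spectrum: only classes m <= 3 are listed
keep    <- m <= 3
V.trunc <- sum(Vm[keep])            # 10 + 4 + 2 = 16
N.trunc <- sum(m[keep] * Vm[keep])  # 10 + 8 + 6 = 24

c(N = N, V = V, N.trunc = N.trunc, V.trunc = V.trunc)
```

For the complete spectrum the sums recover N and V exactly, whereas for the truncated one both values come out too small. This matches the pattern in the example output on this page, where the reloaded incomplete spectrum reports a smaller N and V than the original.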

Value

read.spc returns an object of class spc (see the spc manpage for details).

See Also

See the spc manpage for details on spc objects. See read.tfl and read.vgc for import/export of other data structures.

Examples

## save Italian ultra- frequency spectrum to external text file
fname <- tempfile(fileext=".spc")
write.spc(ItaUltra.spc, fname)
## now <fname> is a TAB-delimited text file with columns m and Vm

## we read it back in
New.spc <- read.spc(fname)

## same spectrum as ItaUltra.spc, compare:
summary(New.spc)
summary(ItaUltra.spc)

stopifnot(isTRUE(all.equal(New.spc, ItaUltra.spc))) # should be identical

## Not run: 
## DON'T do the following, incomplete spectrum will not be restored properly !!!
zm <- lnre("zm", ItaUltra.spc) # estimate model
zm.spc <- lnre.spc(zm,N(zm))   # incomplete spectrum from model
write.spc(zm.spc, fname)       # WARNINGS
bad.spc <- read.spc(fname)     # but this function cannot know something is wrong

summary(zm.spc)
summary(bad.spc) # note that N and V are completely wrong !!!

## End(Not run)

Example output

zipfR object for frequency spectrum
Sample size:     N  = 3467 
Vocabulary size: V  = 523 
Class sizes:     Vm = 333 68 37 15 11 4 4 5 ...
zipfR object for frequency spectrum
Sample size:     N  = 3467 
Vocabulary size: V  = 523 
Class sizes:     Vm = 333 68 37 15 11 4 4 5 ...
Warning message:
In write.spc(zm.spc, fname) :
  saving incomplete frequency spectrum, which cannot be restored from disk file!
zipfR object for expected frequency spectrum, incomplete (m <= 100)
Sample size:     N  = 3467 
Vocabulary size: V  = 526.1246 
Class sizes:     Vm = 350.9382 59.02895 26.29556 15.35928 10.24896 7.407276 5.646891 4.472625 ...
zipfR object for frequency spectrum
Sample size:     N  = 1848.805 
Vocabulary size: V  = 519.4727 
Class sizes:     Vm = 350.9382 59.02895 26.29556 15.35928 10.24896 7.407276 5.646891 4.472625 ...

zipfR documentation built on Nov. 13, 2020, 3:01 a.m.