read_data_frame_CBGM: Read a CBGM file
In tjfinney/ANTTV: Analysis of New Testament Textual Variation (Code)

read_data_frame_CBGM

R Documentation

Read a CBGM file

Description

Read a CBGM data file to produce a data frame of witnesses (as rows), variation sites (as columns), and reading codes (as cells).

Usage

read_data_frame_CBGM(fn)

Arguments

`fn`	A file name, which can be a URL.

Details

The input data file must have the following format:

Field nos	Description
1	record no
2-10	site addresses (e.g. 20101010)
11	reading code (e.g. a)
12	?
13	reading (e.g. χριστου)
14	witness code (Nestle-Aland; e.g. 043)
15	witness code (INTF; e.g. 200430)
16	?

Site addresses (fields 2-10) comprise:

fields 2-5: start address as four fields (book:chapter:verse:word)
fields 6-8: end address as three fields (chapter:verse:word)
field 9: combined start address field (e.g. 20101010)
field 10: combined end address field (e.g. 20101010)
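
As a rough illustration of how a combined address relates to its parts, the sketch below splits "20101010" into book, chapter, verse, and word. The digit widths (word = last three digits, verse and chapter = two digits each, book = the remainder) are an assumption inferred from the single example on this page, and `split_address` is a hypothetical helper, not a function of this package.

```r
# Hypothetical helper: split a combined address by counting digits from the right.
# Assumed layout: book | chapter (2 digits) | verse (2 digits) | word (3 digits).
split_address <- function(address) {
  n <- as.numeric(address)
  list(
    book    = n %/% 1e7,          # everything left of the last seven digits
    chapter = (n %/% 1e5) %% 100, # two digits before the verse
    verse   = (n %/% 1e3) %% 100, # two digits before the word
    word    = n %% 1e3            # last three digits
  )
}

parts <- split_address("20101010")
str(parts)
```

If the real files use different widths, only the divisors in the helper would need to change.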

Only the following input fields are kept; all others are dropped:

9 (combined start address)
10 (combined end address)
11 (reading code)
14 (Nestle-Aland witness code)
15 (INTF witness code)
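
The field selection can be pictured with a minimal sketch on one made-up record. The tab delimiter, the sample values in fields 1-8, the placeholders in fields 12 and 16, and the column names are all assumptions for illustration; the package's own reader may work differently.

```r
# One made-up 16-field record, joined with tabs (the delimiter is an assumption).
line <- paste(
  c("1", "2", "1", "1", "10", "1", "1", "10", "20101010", "20101010",
    "a", "?", "χριστου", "043", "200430", "?"),
  collapse = "\t"
)

# Read every field as character so witness codes such as "043" keep
# their leading zeros.
records <- read.table(text = line, sep = "\t", header = FALSE,
                      colClasses = "character", quote = "", comment.char = "")

# Keep fields 9, 10, 11, 14 and 15; the names below are hypothetical.
kept <- records[, c(9, 10, 11, 14, 15)]
names(kept) <- c("start", "end", "reading_code", "witness_na", "witness_intf")
print(kept)
```

Reading the fields as character rather than numeric is a deliberate choice here, since both witness codes and combined addresses act as identifiers rather than quantities.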

The output is a data frame with witnesses as rows and variation sites as columns. The reading codes "zz", "zw", and "zu" are recoded as NA (not available).

Column headings are variation site labels built from the combined start address plus an integer that distinguishes each end address associated with that start address. For example, the combined start address of the first variation site in Mark is "20101010", and it has one corresponding end address. The combined start address is converted to "Mk.1.1.10" (i.e. book.chapter.verse.word), and ".1" is then appended for the end address, giving the label "Mk.1.1.10.1".
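
The labelling, NA recoding, and reshaping described above might look roughly like the following sketch. It is not the package's implementation: the sample records, the column names, the `book_names` lookup (only Mark is documented here), the digit widths of the combined address, and the choice to number end addresses in ascending order are all assumptions.

```r
# Made-up long-format records: one row per (variation site, witness) pair.
long <- data.frame(
  start   = c("20101010", "20101010", "20101020", "20101020", "20101020"),
  end     = c("20101010", "20101010", "20101020", "20101020", "20101022"),
  reading = c("a", "zz", "b", "a", "a"),
  witness = c("01", "03", "01", "03", "01"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup from book number to abbreviation.
book_names <- c("2" = "Mk")

# Number the distinct end addresses within each start address (1, 2, ...).
end_index <- ave(as.numeric(long$end), long$start,
                 FUN = function(e) match(e, sort(unique(e))))

# Build labels of the form book.chapter.verse.word.index.
n <- as.numeric(long$start)
long$label <- paste(book_names[as.character(n %/% 1e7)],
                    (n %/% 1e5) %% 100, (n %/% 1e3) %% 100, n %% 1e3,
                    end_index, sep = ".")

# Recode the "not available" reading codes as NA.
long$reading[long$reading %in% c("zz", "zw", "zu")] <- NA

# Reshape to witnesses (rows) by variation sites (columns).
witnesses <- unique(long$witness)
sites <- unique(long$label)
wide <- matrix(NA_character_, nrow = length(witnesses), ncol = length(sites),
               dimnames = list(witnesses, sites))
wide[cbind(long$witness, long$label)] <- long$reading
wide <- as.data.frame(wide, stringsAsFactors = FALSE)
print(wide)
```

In this sketch a cell is also NA when a witness has no record at a site, so the two kinds of missing data are not distinguished.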