readPed: Read a pedigree from file

Description

Reads a text file in pedigree format, or something fairly close to it.

Usage

readPed(
  pedfile,
  colSep = "",
  header = NA,
  famid_col = NA,
  id_col = NA,
  fid_col = NA,
  mid_col = NA,
  sex_col = NA,
  marker_col = NA,
  locusAttributes = NULL,
  missing = 0,
  sep = NULL,
  colSkip = NULL,
  sexCodes = NULL,
  addMissingFounders = FALSE,
  validate = TRUE,
  ...
)

Arguments

pedfile

A file name

colSep

A column separator character, passed on as the sep argument of read.table(). The default is to separate on white space, that is, one or more spaces, tabs, newlines or carriage returns. (Note: the parameter sep is used to indicate allele separation in genotypes.)
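To illustrate the difference between whitespace separation and an explicit separator, the following sketch calls read.table() directly rather than readPed(). The file contents are invented for the example.

```r
# Invented example data: the same three-row pedigree written two ways.
ws_file  <- tempfile()
csv_file <- tempfile()
writeLines(c("id fid   mid sex", "1  0 0 1", "2 0\t0 2", "3 1 2 1"), ws_file)
writeLines(c("id,fid,mid,sex", "1,0,0,1", "2,0,0,2", "3,1,2,1"), csv_file)

# sep = "" (the read.table() default) splits on runs of spaces and tabs.
ws <- read.table(ws_file, sep = "", header = TRUE)

# A comma-separated file needs an explicit separator.
csv <- read.table(csv_file, sep = ",", header = TRUE)

# Both calls should yield the same data frame.
same_result <- identical(ws, csv)

unlink(c(ws_file, csv_file))
```

With readPed(), the comma-separated file would be read by passing colSep = ",".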

header

A logical. If NA, the program will interpret the first line as a header line if it contains both "id" and "sex" as part of some entries (ignoring case).
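The header rule can be sketched in base R as follows. The helper looks_like_header() is hypothetical and only mirrors the description above for whitespace-separated input; the actual implementation in the package may differ.

```r
# Hypothetical helper, not part of pedtools: guesses whether the first line
# of a whitespace-separated file looks like a header.
looks_like_header <- function(first_line) {
  # Split the line into entries and ignore case
  entries <- tolower(strsplit(trimws(first_line), "[[:space:]]+")[[1]])
  # "id" and "sex" must each occur as part of at least one entry
  any(grepl("id", entries, fixed = TRUE)) &&
    any(grepl("sex", entries, fixed = TRUE))
}

looks_like_header("famid ID fid mid Sex M1")  # TRUE
looks_like_header("1 0 0 1")                  # FALSE
```

When this guess would be wrong for a given file, setting header = TRUE or header = FALSE explicitly avoids relying on it.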

famid_col

Index of family ID column. If NA, the program looks for a column named "famid" (ignoring case).

id_col

Index of individual ID column. If NA, the program looks for a column named "id" (ignoring case).

fid_col

Index of father ID column. If NA, the program looks for a column named "fid" (ignoring case).

mid_col

Index of mother ID column. If NA, the program looks for a column named "mid" (ignoring case).

sex_col

Index of column with gender codes (0 = unknown; 1 = male; 2 = female). If NA, the program looks for a column named "sex" (ignoring case). If this is not found, genders of parents are deduced from the data, leaving the remaining as unknown.

marker_col

Index vector indicating columns with marker alleles. If NA, all columns to the right of all pedigree columns are used. If sep (see below) is non-NULL, each column is interpreted as a genotype column and split into separate alleles with strsplit(..., split = sep, fixed = TRUE).
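The genotype splitting described above can be tried in isolation. This is a small base R sketch using invented genotypes, not code taken from the package.

```r
# One genotype column with alleles separated by "/"
genotypes <- c("1/1", "2/2", "1/2")

# fixed = TRUE means the separator is taken literally, not as a regular expression
alleles <- strsplit(genotypes, split = "/", fixed = TRUE)

# Stack the allele pairs into a matrix with one row per individual
allele_matrix <- do.call(rbind, alleles)

# A separator with special meaning in regular expressions is safe with fixed = TRUE
pipe_split <- strsplit("A|B", split = "|", fixed = TRUE)[[1]]
```

Because the split is literal, separators such as "|" or "." need no escaping when passed as sep.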

locusAttributes

Passed on to setMarkers() (see explanation there).

missing

Passed on to setMarkers() (see explanation there).

sep

Passed on to setMarkers() (see explanation there).

colSkip

Columns to skip, given as a vector of indices or column names. If given, these columns are removed directly after read.table(), before any other processing.

sexCodes

A list with optional entries "male", "female" and "unknown", indicating how non-default entries in the sex column should be interpreted. Default values: male = 1, female = 2, unknown = 0.
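One way to picture the recoding is the base R sketch below. The function recode_sex() is hypothetical and exists only for illustration. In this sketch, entries that match none of the codes become NA, which is an assumption and not necessarily what the package does.

```r
# Hypothetical helper, not part of pedtools: maps custom sex codes to 0/1/2.
recode_sex <- function(x, sexCodes = list()) {
  # User-supplied entries override the defaults named in the documentation
  codes <- modifyList(list(male = 1, female = 2, unknown = 0), sexCodes)
  out <- rep(NA_integer_, length(x))  # assumption: unmatched entries become NA
  out[x %in% codes$unknown] <- 0L
  out[x %in% codes$male]    <- 1L
  out[x %in% codes$female]  <- 2L
  out
}

recoded <- recode_sex(c("male", "female", "?"),
                      list(male = "male", female = "female", unknown = "?"))
```

Since all entries are optional, a list such as list(unknown = "?") would leave the male and female defaults in place.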

addMissingFounders

A logical. If TRUE, any parent not included in the id column is added as a founder of corresponding sex. By default, missing founders result in an error.
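The idea can be sketched in base R with an invented sibling pair whose parents have no rows of their own. The sketch assumes that "0" marks a missing parent, as in the examples on this page. It is not the package's implementation.

```r
# Invented data: two siblings whose parents "f" and "m" are not listed
ped <- data.frame(id = c("a", "b"), fid = c("f", "f"), mid = c("m", "m"),
                  sex = c(1, 2), stringsAsFactors = FALSE)

# Parents referenced in fid/mid but absent from the id column
missing_fathers <- setdiff(ped$fid, c(ped$id, "0"))
missing_mothers <- setdiff(ped$mid, c(ped$id, "0"))

# Add them as founders (no parents) with sex matching their role
founders <- data.frame(
  id  = c(missing_fathers, missing_mothers),
  fid = "0",
  mid = "0",
  sex = rep(c(1, 2), c(length(missing_fathers), length(missing_mothers))),
  stringsAsFactors = FALSE
)

completed <- rbind(founders, ped)
```

With readPed(), the same effect is requested by setting addMissingFounders = TRUE.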

validate

A logical indicating if the pedigree structure should be validated.

...

Further parameters passed on to read.table(), e.g. comment.char and quote.

Details

If there are no headers, and no column information is provided by the user, the program assumes the following column order:

  • family ID (optional; guessed from the data)

  • individual ID

  • father's ID

  • mother's ID

  • sex

  • marker data (remaining columns)
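As an illustration of this column order, the sketch below writes a headerless file without a family ID column and labels the columns by position using base R. The file contents are invented, and the code does not call readPed().

```r
tf <- tempfile()

# Headerless file: id, father, mother, sex, then one genotype column
writeLines(c("1 0 0 1 1/1",
             "2 0 0 2 2/2",
             "3 1 2 1 1/2"), tf)

raw <- read.table(tf, header = FALSE, colClasses = "character")

# The first four columns are the pedigree columns in the assumed order
names(raw)[1:4] <- c("id", "fid", "mid", "sex")

# Everything to the right is treated as marker data
marker_cols <- setdiff(seq_along(raw), 1:4)

unlink(tf)
```

If a file deviates from this order, the positions can be given explicitly through id_col, fid_col, mid_col, sex_col and marker_col.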

Reading SNP data

Adding the argument locusAttributes = "snp-AB" sets all markers to be equifrequent SNPs with alleles A and B. The letters A and B may be replaced by other single-character letters or numbers, e.g., "snp-12" gives alleles 1 and 2.
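To make the shortcut notation concrete, the sketch below decodes such a string in base R. The helper parse_snp_label() and the element names in its result are hypothetical. The sketch only mirrors the description above and is not how the package processes the argument.

```r
# Hypothetical helper, not part of pedtools: decodes a "snp-XY" shortcut.
parse_snp_label <- function(label) {
  # Expect "snp-" followed by exactly two single-character allele names
  stopifnot(grepl("^snp-..$", label))
  alleles <- strsplit(substring(label, 5), "")[[1]]
  # "Equifrequent" means both alleles have frequency 0.5
  list(alleles = alleles, afreq = setNames(c(0.5, 0.5), alleles))
}

snp12 <- parse_snp_label("snp-12")
snpAB <- parse_snp_label("snp-AB")
```

In an actual call, the shortcut is passed directly, as in readPed(pedfile, locusAttributes = "snp-12").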

Value

A ped object or a list of such.

Examples


tf = tempfile()

### Write and read a trio
trio = data.frame(id = 1:3, fid = c(0,0,1), mid = c(0,0,2), sex = c(1,2,1))
write.table(trio, file = tf, row.names = FALSE)
readPed(tf)

# With marker data in one column
trio.marker = cbind(trio, M = c("1/1", "2/2", "1/2"))
write.table(trio.marker, file = tf, row.names = FALSE)
readPed(tf)

# With marker data in two allele columns
trio.marker2 = cbind(trio, M.1 = c(1,2,1), M.2 = c(1,2,2))
write.table(trio.marker2, file = tf, row.names = FALSE)
readPed(tf)

### Two singletons in the same file
singles = data.frame(id = c("S1", "S2"),
                     fid = c(0,0), mid = c(0,0), sex = c(2,1),
                     M = c("9/14.2", "9/9"))
write.table(singles, file = tf, row.names = FALSE)
readPed(tf)

### Two trios in the same file
trio2 = cbind(famid = rep(c("trio1", "trio2"), each = 3), rbind(trio, trio))

# With column names
write.table(trio2, file = tf, row.names = FALSE)
readPed(tf)

# Without column names
write.table(trio2, file = tf, col.names = FALSE, row.names = FALSE)
readPed(tf, famid_col = 1, id_col = 2, fid_col = 3, mid_col = 4, sex_col = 5)

### With non-standard `sex` codes
trio3 = data.frame(id = 1:3, fid = c(0,0,1), mid = c(0,0,2),
                   sex = c("male","female","?"))
write.table(trio3, file = tf, row.names = FALSE)
readPed(tf, sexCodes = list(male = "male", female = "female", unknown = "?"))

# Cleanup
unlink(tf)


magnusdv/pedtools documentation built on April 9, 2024, 7:35 a.m.