tworows2twocols: Function to reshape a data frame or file

View source: R/PediHaplotyper.r

Description

This is a function to help in formatting marker data for input into PediHaplotyper. It takes a data frame or file with one column per marker and two rows per individual and converts this to two columns per marker and one row per individual. It is more user-friendly than reshape() for this particular case.

Usage

tworows2twocols(source, target = "", sep = "", skip = 0, 
  na.strings = "NA", sep.out = "\t", na.out = "", ...)

Arguments

source

Either a data frame or a file name: the data to be transposed. If a file name, the file must be readable by read.table() with header=TRUE; all other read.table parameters keep their read.table defaults unless specified here.

target

Ignored if source is a data frame, else the name of the transposed file

sep

Ignored if source is a data frame, else the separator (passed to read.table)

skip

Ignored if source is a data frame, else the number of lines to skip (passed to read.table)

na.strings

Ignored if source is a data frame, else the strings used to represent NA values (passed to read.table)

sep.out

Ignored if source is a data frame, else the separator to be used in the target file

na.out

Ignored if source is a data frame, else the representation of NA values to be used in the target file

...

Ignored if source is a data frame, else further parameters to be passed to read.table

Details

The source data consist of one column with the names of the individuals, followed by one column per marker; for each individual two consecutive rows are present, containing the marker alleles inherited from parent 1 and parent 2. The column names are the marker names; the row names are ignored. This function reformats the data such that there is only one row per individual and two consecutive columns per marker, as required by PediHaplotyper. The name of the second column of each marker is the marker name with "_2nd" appended.
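The layout change described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper reshape_sketch and the toy data frame src are hypothetical names introduced only for illustration, and the sketch assumes the rows of each individual are already adjacent.

```r
# Hypothetical toy data in the source layout: one name column,
# one column per marker, two consecutive rows per individual.
src <- data.frame(
  name = c("ind1", "ind1", "ind2", "ind2"),
  mrkA = c(1, 2, 3, 4),
  mrkB = c(5, 6, 7, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of PediHaplotyper): pairs the odd and
# even rows and places them side by side, one marker at a time.
reshape_sketch <- function(df) {
  stopifnot(nrow(df) %% 2 == 0)
  first  <- df[seq(1, nrow(df), by = 2), , drop = FALSE]  # allele rows 1, 3, ...
  second <- df[seq(2, nrow(df), by = 2), , drop = FALSE]  # allele rows 2, 4, ...
  # Both rows of a pair must carry the same individual name.
  stopifnot(all(first[[1]] == second[[1]]))
  out <- first[, 1, drop = FALSE]
  for (m in names(df)[-1]) {
    out[[m]] <- first[[m]]
    out[[paste0(m, "_2nd")]] <- second[[m]]  # second column gets the "_2nd" suffix
  }
  rownames(out) <- NULL
  out
}

wide <- reshape_sketch(src)
names(wide)
```

The result has one row per individual, and each marker column is immediately followed by its "_2nd" counterpart.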

If source is a file name, the file is read using read.table. It must contain a header line but, in contrast to the default behaviour of read.table, the headers are not modified if they are not valid identifiers or if duplicate headers occur. The column names of the original and reshaped data frames may therefore be invalid as identifiers; this does not affect any of the PediHaplotyper functions.
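How the package preserves the headers is not stated here. As a point of comparison only, base R's read.table shows the same contrast through its check.names argument; the inline text in txt is hypothetical data.

```r
# Hypothetical input with a header that is not a valid R identifier
# ("1A") and that occurs twice.
txt <- "name 1A 1A\nind1 1 2\nind1 3 4\n"

# Default behaviour: read.table repairs the headers via make.names().
default_names <- names(read.table(text = txt, header = TRUE))

# With check.names = FALSE the headers are kept exactly as written.
kept_names <- names(read.table(text = txt, header = TRUE, check.names = FALSE))

default_names
kept_names
```

Columns with such names can still be accessed by position or with backticks and [[ ]], which is why invalid identifiers need not be a problem downstream.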

Value

The return value is always the reshaped data frame. If source and target are both file names, the target file is also created, overwriting any existing file of that name.
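The effect of sep.out and na.out on the target file can be pictured with write.table from base R. This is only a sketch under assumptions: the source does not say how the package writes the file, and the choices quote = FALSE and row.names = FALSE, as well as the data frame wide, are hypothetical.

```r
# Hypothetical reshaped data with one missing allele.
wide <- data.frame(
  name     = c("ind1", "ind2"),
  mrkA     = c(1, 3),
  mrkA_2nd = c(2, NA)
)

# Write to a temporary file using the documented defaults
# sep.out = "\t" and na.out = "" as the separator and NA representation.
target <- tempfile(fileext = ".txt")
write.table(wide, file = target, sep = "\t", na = "",
            quote = FALSE, row.names = FALSE)

lines_written <- readLines(target)
lines_written
```

With an empty string as the NA representation, a missing allele shows up as an empty field between (or after) tab separators.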

Examples

data(mrkdat) # a data frame with data for two markers and two individuals
s <- twocols2tworows(transpose(mrkdat))
s # a data frame arranged as the source of tworows2twocols
tworows2twocols(s)

PBR/PediHaplotyper documentation built on Feb. 3, 2021, 12:03 a.m.