FAData: Pedigree data information

FAData-class    R Documentation

Description

FAData objects conveniently store pedigree data along with trait information. This object is the central data structure of the FamAgg package. Basic usage and pedigree utility methods are described on this page and on the PedigreeUtils help page; familial aggregation analysis methods are described on the PedigreeAnalysis help page.

See the section about the pedigree data.frame below for a detailed description of the encoding of missing trait data or founder individuals in FamAgg.

Usage


## S4 method for signature 'FAData'
affectedIndividuals(object)

## S4 method for signature 'FAData'
age(object)

## S4 replacement method for signature 'FAData'
age(object) <- value

## S4 method for signature 'FAData'
buildPed(object, id=NULL, family = NULL, max.generations.up=3,
                            max.generations.down=16, prune=FALSE, ...)

## S4 method for signature 'FAData'
export(object, con, format="ped", ...)

FAData(pedigree, age, trait, traitName, header=FALSE, sep="\t", id.col="id",
       family.col="family", father.col="father", mother.col="mother",
       sex.col="sex")

## S4 method for signature 'FAData'
family(object, id=NULL, family=NULL,
                          return.type="data.frame")

## S4 method for signature 'FAData'
kinship(id, ...)

## S4 method for signature 'FAData'
pedigree(object, return.type="data.frame")

## S4 replacement method for signature 'FAData'
pedigree(object) <- value

## S4 method for signature 'FAData'
pedigreeSize(object)

## S4 method for signature 'FAData'
phenotypedIndividuals(object)

## S4 method for signature 'FAData'
plotPed(object, id=NULL, family=NULL, filename=NULL,
                           device="plot", symbol.related=NA,
                           proband.id=NULL, highlight.ids=NULL,
                           only.phenotyped=FALSE,
                           label1=age(object), label2=NULL, label3=NULL,
                           ...)

## S4 method for signature 'FAData'
show(object)

## S4 method for signature 'FAData'
trait(object, na.rm=FALSE)

## S4 replacement method for signature 'FAData'
trait(object) <- value

Arguments

(in alphabetic order)

age

For FAData: either a character(1) specifying the file name from which the age should be read or a named numeric vector of ages with the names corresponding to the ids of the individuals in the pedigree.

con

For export: the file name or connection to a file to which the pedigree information should be exported.

device

For plotPed: the device or file format in which the plot should be saved. See details for allowed values.

family

For buildPed: the id of the family for which the pedigree should be returned. For family: the id of the family for which the pedigree should be returned (full pedigree of the family). For plotPed: the id of the family for which the pedigree should be plotted.

family.col

For FAData: the name of the column containing the id of the families.

father.col

For FAData: the name of the column containing the id of the father.

filename

For plotPed: a character string specifying the name of the file to which the plot should be saved. If none is submitted, the plot is saved to a temporary file.

format

For export: the format in which the pedigree should be exported. At present only "ped" and "fam" are supported, i.e. the file formats from plink (http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml).

header

For FAData: only used if argument pedigree is a character(1), i.e. the file name from which the pedigree should be read. The header argument is passed to the read.table function, i.e. should be set to TRUE if the file contains column headers.

highlight.ids

For plotPed: a list of character vector(s) of ids that should be labeled. The name(s) of the character vector(s) is/are used as the text to label the individuals (the text is shown below the symbol of the individuals). Up to 3 character vectors are supported. Alternatively, a single character vector of ids can be submitted, in which case the individuals are labeled with an asterisk ("*").

id

For method kinship: the FAData object from which the kinship matrix should be extracted, for all other methods the id of the individual.

For method plotPed: the id of the individual for which the pedigree should be built (see buildPed) and plotted.

Note: id can be a numeric or a character. Numeric ids are internally converted to character.

id.col

For FAData: the name of the column containing the id of the individuals.

label1

For plotPed: labels that should be plotted below the symbol for each individual. Should be either a named vector with names corresponding to the ids of the individuals in the pedigree, or a vector with the same length as the number of individuals that are to be plotted. For the former it is sufficient to specify labels only for the individuals that should be labeled.

label2

For plotPed: see label1. The labels are plotted in the second line below the symbol if HaploPainter is used to generate the plot, or on the top left corner of the individual's symbol for kinship2 plotting.

label3

For plotPed: see label1. The labels are plotted in the third line below the symbol if HaploPainter is used to generate the plot, or on the top right corner of the individual's symbol for kinship2 plotting.

max.generations.down

For buildPed: the maximal number of generations to look for children.

max.generations.up

For buildPed: the maximal number of generations to look for ancestors.

mother.col

For FAData: the name of the column containing the id of the mother.

na.rm

For trait: whether missing values (NA) should be removed from the returned trait vector.

object

The FAData object.

only.phenotyped

For plotPed: whether only phenotyped individuals, i.e. individuals with a non-NA value in column affected (the trait information), should be plotted. Requires trait information to be present.

pedigree

For FAData: either a data.frame with the pedigree information or a character(1) specifying the file name from which the pedigree should be read. See description below for more details.

proband.id

For plotPed: character vector with the id(s) of one or more individuals that should be highlighted as probands. HaploPainter indicates probands with a "P" next to the symbol and an arrow pointing to the symbol.

prune

For buildPed: whether the smallest possible (connected) pedigree for the submitted ids should be built. This only makes sense if more than one id is submitted.

return.type

Either "data.frame" or "pedigree", specifying whether the pedigree information should be returned as a data.frame or as a pedigreeList object as defined in the kinship2 package.

sep

For FAData: only used if argument pedigree is a character(1), i.e. the file name from which the pedigree should be read. The sep argument is passed to the read.table function and specifies the field separator.

sex.col

For FAData: the name of the column specifying the sex of the individuals.

symbol.related

For plotPed: the symbol which should be used to label individuals sharing kinship with the id for which the pedigree is generated and plotted.

trait

For FAData: a numeric vector with 0, 1 and NA or a logical vector indicating unaffected (but phenotyped), affected and not phenotyped individuals.

traitName

For FAData: an optional name for the trait.

value

For age<-: a named numeric vector. The names (at least some of them) have to match the ids in the pedigree of the object.

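For pedigree<-: a data.frame with columns containing the family id, individual id, father id, mother id and sex (in this order), or a pedigree or pedigreeList object as defined in the kinship2 package.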

For trait<-: a named numeric vector with 0, 1 and NA or a logical vector with FALSE, TRUE, NA for not affected, affected and not tested. The names (at least some of them) have to match the ids in the pedigree of the object.

...

Additional arguments to be passed to the plotting functions (doPlotPed for plotPed).

Details

See sections below for a description of the individual methods.

The buildPed method is a combination of the methods getAncestors, getChildren and getMissingMate, i.e. it first gets all ancestors for the specified id(s), determines then the children of all of the ids (submitted ids and their ancestors) and at last looks for any missing mates/spouses to complete the pedigree.
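The three steps can be illustrated on a plain data.frame. The following is a minimal base R sketch, not the FamAgg implementation: the helpers parentsOf, childrenOf, walkPed and toyBuildPed are hypothetical names introduced here, and the toy pedigree is made up for illustration.

```r
## Toy pedigree (made up); founders have NA for father and mother.
ped <- data.frame(
    id     = as.character(1:11),
    father = c(NA, NA, "1", NA, "3", "3", NA, "5", NA, NA, "9"),
    mother = c(NA, NA, "2", NA, "4", "4", NA, "7", NA, NA, "10"),
    stringsAsFactors = FALSE
)

## Ids of the known parents of the given ids.
parentsOf <- function(ped, ids) {
    p <- unlist(ped[ped$id %in% ids, c("father", "mother")],
                use.names = FALSE)
    unique(p[!is.na(p)])
}

## Ids of all individuals that have one of the given ids as parent.
childrenOf <- function(ped, ids) {
    ped$id[ped$father %in% ids | ped$mother %in% ids]
}

## Apply `step` repeatedly, at most `max.generations` times.
walkPed <- function(ped, ids, step, max.generations) {
    found <- character()
    current <- ids
    for (i in seq_len(max.generations)) {
        current <- setdiff(step(ped, current), found)
        if (length(current) == 0) break
        found <- c(found, current)
    }
    found
}

## Hypothetical stand-in for buildPed: ancestors first, then children of
## all collected ids, then mates that are still missing.
toyBuildPed <- function(ped, id, max.generations.up = 3,
                        max.generations.down = 16) {
    id <- as.character(id)
    ancestors <- walkPed(ped, id, parentsOf, max.generations.up)
    core <- unique(c(id, ancestors))
    descendants <- walkPed(ped, core, childrenOf, max.generations.down)
    members <- unique(c(core, descendants))
    mates <- setdiff(parentsOf(ped, members), members)
    ped[ped$id %in% c(members, mates), ]
}

sub5 <- toyBuildPed(ped, id = 5)
sub5$id
```

In this toy data, individual 7 shares no kinship with individual 5 but is added in the last step as the second parent of individual 8, while the unrelated individuals 9 to 11 are left out.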

The plotPed function uses either the external perl program HaploPainter or the plotting capabilities of the kinship2 package. Because HaploPainter is an external tool, it is not possible to display the plot directly; instead, each plot is automatically saved to a file (either "pdf", "ps", "svg" or "png", as specified with the device parameter). HaploPainter plotting also supports device = "txt", in which case the data table is exported (in the format expected by HaploPainter) to a tab-delimited text file and the name of this text file is returned; no plot is created. Plotting with kinship2 (the default) allows displaying the plot (device="plot") or exporting it to a file (device="pdf" or device="png").

The switchPlotfun function can be used to change the plotting system.

Value

Refer to the method and function descriptions in the sections below for detailed information on the returned result objects.

Objects from the Class

FAData objects are created by the constructor function FAData and should not be directly created by a call to new.

Slots

age

A (named) numerical vector with the age of the individuals. It is suggested to use the getter and setter methods described below to access this slot.

pedigree

A data.frame with the pedigree. It is suggested to use the getter and setter methods described below to access this slot.

.kinship

The kinship matrix for the kinship of each individual in the pedigree with each other. This slot should not be accessed directly, but the kinship method should be used instead.

traitname

The name of the trait being stored in the object.

.trait

A numerical vector with the trait information: 0, 1 and NA for phenotyped but not affected, affected and not tested, respectively. This slot should not be accessed directly; the trait and trait<- methods should be used instead, as they ensure that the data is matched to the information in the pedigree.

Constructors, importing and exporting data

FAData

Constructor function to create a new FAData instance. In addition to submitting the pedigree information as a data.frame, pedigree or pedigreeList, it is possible to specify the name of the file from which the pedigree information should be read. The function recognizes and imports plink ped and fam files (http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml) as well as generic text files. For the latter, the arguments header, sep, family.col, id.col, father.col, mother.col and sex.col specify how the file is read and which columns contain which information. If argument pedigree is a data.frame, the column names "family", "id", "father", "mother" and "sex" are expected. Any additional columns are dropped.

The sex is expected to be encoded either as a numeric 1 (male), 2 (female) with any other number or NA representing unknown, or as a character vector or factor with "M", "m", "Male" or "male" for male and "F", "f", "Female" or "female" for female.
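The accepted encodings can be summarized with a small base R sketch. The helper normalizeSex is a hypothetical name used only for illustration; it mimics the rules stated above and is not the code FamAgg uses internally.

```r
## Hypothetical helper: map the accepted sex encodings to a factor with
## levels "M" and "F"; everything else becomes NA (unknown).
normalizeSex <- function(x) {
    if (is.factor(x)) x <- as.character(x)
    out <- rep(NA_character_, length(x))
    if (is.numeric(x)) {
        out[!is.na(x) & x == 1] <- "M"
        out[!is.na(x) & x == 2] <- "F"
    } else {
        out[x %in% c("M", "m", "Male", "male")] <- "M"
        out[x %in% c("F", "f", "Female", "female")] <- "F"
    }
    factor(out, levels = c("M", "F"))
}

sexNum <- normalizeSex(c(1, 2, 3, NA))
sexChr <- normalizeSex(c("male", "F", "unknown"))
sexNum
sexChr
```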

export

Export pedigree data to a file.

Accessors and subsetting

object$name

Access the column name in the pedigree of the FAData object. The function returns a named vector with the names corresponding to the ids of the individuals, or NULL if name does not correspond to a column name in the pedigree. The trait data can be accessed either by object$trait or object$affected.

age

Returns the age of the individuals as a named numeric vector. If the pedigree is set, the order of the values always corresponds to the ordering of the individuals in the pedigree, with NA for individuals whose age is unknown. If the age was never set, the method returns a vector of NAs with length equal to the number of individuals.

age<-

Setter for the age. Value has to be a named numeric vector.

pedigree

Returns the pedigree either as a data.frame or a pedigreeList object (defined in the kinship2 package) depending on the value of the parameter return.type (i.e. either return.type="data.frame" or return.type="pedigree"). If pedigree is called on any other object than a FAData object (or any object that inherits from that object), the pedigree method from the kinship2 package is called.

For the default return type (i.e. return.type="data.frame") a data.frame is returned with the following columns:

"family": the ID of the family.

"id": the ID of the individual.

"father": the ID of the individual's father. Founder individuals, i.e. individuals for whom the father and mother are not known in the data set, contain NA in this column.

"mother": the ID of the individual's mother. Founder individuals, i.e. individuals for whom the father and mother are not known in the data set, contain NA in this column.

"sex": the sex of the individuals encoded as a factor with levels "M" and "F" for male and female, or NA if not known.

If trait information is available in the object, the returned data.frame also contains a column named affected indicating whether the individual is affected (1), not affected (0) or was not tested/phenotyped (NA).
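As an illustration of this layout, the following base R sketch builds a small data.frame with the columns listed above and picks out the founders. The values are made up; only the column names and encodings follow the description on this page.

```r
## Made-up nuclear family in the layout returned by pedigree().
pedDf <- data.frame(
    family   = c("1", "1", "1", "1"),
    id       = c("1", "2", "3", "4"),
    father   = c(NA, NA, "1", "1"),
    mother   = c(NA, NA, "2", "2"),
    sex      = factor(c("M", "F", "F", "M"), levels = c("M", "F")),
    affected = c(0, 1, NA, 1),   # 0: unaffected, 1: affected, NA: not phenotyped
    stringsAsFactors = FALSE
)

## Founders are the individuals without a known father and mother.
founders <- pedDf$id[is.na(pedDf$father) & is.na(pedDf$mother)]
founders
```

A data.frame with the first five columns is the kind of input the FAData constructor expects for its pedigree argument.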

pedigree<-

Setter for the pedigree slot. Value can be a data.frame with columns containing the family id, individual id, father id, mother id and sex (in this order), or a pedigree or pedigreeList object as defined in the kinship2 package.

object[i, ]

Subsets the FAData object to individuals specified with i which can be a logical, numeric or character vector. For the latter, the elements have to be the ids of the individuals (i.e. rownames of pedigree(object)). Returns the sub-setted object. Note that subsetting other than by family might result in a non-valid pedigree (e.g. if mother or father ID are not available in the sub-setted pedigree).

trait

Get the trait vector from the object. By default, the ordering is the same as in the pedigree; setting argument na.rm=TRUE removes all NA values, so the length and positions of the returned values no longer correspond to the rows of the pedigree. Returns a named vector with the names corresponding to the ids of the individuals.

trait<-

Setter for the trait slot. Can be a named numeric vector (values 0, 1 and NA) or logical vector (values FALSE, TRUE and NA) with the names matching the ids of the individuals in the pedigree. The method internally matches and re-orders the trait vector to match the ordering of the ids in the pedigree.
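The matching and re-ordering by name can be pictured with base R's match. This is only a sketch of the idea under the assumption that names not present in the pedigree are ignored; it is not the code used by trait<-.

```r
## Order of the ids in a (made-up) pedigree.
ids <- c("1", "2", "3", "4")

## Named logical trait vector in arbitrary order; "9" is not in the pedigree.
value <- c("3" = TRUE, "1" = FALSE, "9" = TRUE)

## Re-order to the pedigree; individuals without a value get NA.
matched <- as.numeric(value[match(ids, names(value))])
names(matched) <- ids
matched
```

Here individuals 2 and 4 end up as NA (not phenotyped), and the logical values are stored as 0 and 1.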

Basic usage

affectedIndividuals

Returns a character vector with the ids of the affected individuals, i.e. the id of the individuals with a value other than 0 or NA in the trait. If no trait data is available the method returns NULL.

buildPed

Builds a pedigree for the specified id(s) containing generations defined by max.generations.up and max.generations.down and returns it as a data.frame. The pedigree contains all individuals in the family sharing kinship with the input individual(s) and mates needed to complete the pedigree. For prune=TRUE the function tries to find the smallest connected pedigree for all the submitted ids.

family

Returns the pedigree for a full family. In contrast to buildPed which constructs a (sub)pedigree for a specific individual, this method returns the pedigree of the complete family for an individual (if id is specified). The function returns either a data.frame or a pedigreeList with the pedigree for the family.

kinship

Extracts the pre-calculated kinship matrix, i.e. a symmetric matrix with the kinship between all individuals in the pedigree. The matrix is calculated using the kinship method provided by the kinship2 package [Sinnwell (2014)]. The function returns a dsCMatrix from the Matrix package.

pedigreeSize

Returns the size, i.e. the number of individuals (rows) in the pedigree.

phenotypedIndividuals

Returns a character vector with the ids of the phenotyped individuals, i.e. the id of all individuals that have a non-NA value in the trait. If no trait data is available the method returns NULL.
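The definitions of affected and phenotyped individuals used by affectedIndividuals and phenotypedIndividuals can be written out in base R on a named trait vector. The vector below is made up, and the two expressions are a sketch of the stated definitions rather than the methods' actual code.

```r
## Made-up trait vector: 0 unaffected, 1 affected, NA not phenotyped.
traitVec <- c("1" = 0, "2" = 1, "3" = NA, "4" = 1)

## Phenotyped: any non-NA value in the trait.
phenotyped <- names(traitVec)[!is.na(traitVec)]

## Affected: a value other than 0 or NA in the trait.
affected <- names(traitVec)[!is.na(traitVec) & traitVec != 0]

phenotyped
affected
```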

plotPed

Creates the pedigree for the submitted id(s) or family and plots it (i.e. saves it to the specified file). See details above for more information. Returns the file name of the file to which the pedigree plot was exported or NULL for kinship2 plotting and device="plot".

For HaploPainter plotting and device = "txt" the name of the file to which the plotting data has been exported is returned.

See doPlotPed for more information.

Pedigree analysis methods

Methods for familial aggregation and other pedigree analysis methods are described on the PedigreeAnalysis help page.

Pedigree utilities

A variety of different pedigree utilities are defined for FAData objects. For the full list of methods see the PedigreeUtils help page.

Note

The ids of individuals, fathers, mothers and families can be either numeric or character; internally, however, all ids are handled as character.

The pedigree<- setter method removes all white spaces in columns "id", "family", "father" and "mother" of the pedigree.

Author(s)

Johannes Rainer.

References

Sinnwell JP, Therneau TM & Schaid DJ (2014) The kinship2 R package for pedigree data. Human Heredity 78:91-93.

See Also

pedigree, FAProbResults, FAKinGroupResults, FAKinSumResults, FAGenIndexResults, doPlotPed, PedigreeUtils, getAll, PedigreeAnalysis

Examples

##########################
##
##  Create a new FAData object
##
## Load the Minnesota Breast Cancer record and subset to the
## first families.
data(minnbreast)
mbsub <- minnbreast[minnbreast$famid==4 | minnbreast$famid==5, ]
mbped <- mbsub[, c("famid", "id", "fatherid", "motherid", "sex")]
## Renaming column names
colnames(mbped) <- c("family", "id", "father", "mother", "sex")
## Defining the optional argument age.
Age <- mbsub$endage
names(Age) <- mbsub$id
## Create the object
fad <- FAData(pedigree=mbped, age=Age)

fad

## Extract the ids directly...
head(fad$id)

## Extract the kinship matrix
dim(kinship(fad))

## What's the size of the pedigree?
pedigreeSize(fad)

## Importing a "ped" file.
pedFile <- system.file("txt/minnbreastsub.ped.gz", package="FamAgg")
## Quick glance at the file.
readLines(pedFile, n=1)
fad <- FAData(pedFile)

head(pedigree(fad))

## Creating the FAData reading data from a txt file.
pedFile <- system.file("txt/minnbreastsub.txt", package="FamAgg")
fad <- FAData(pedigree=pedFile, header=TRUE, id.col="id",
              family.col="famid", father.col="fatherid",
              mother.col="motherid")
## Adding the age
age(fad) <- Age
fad
## List all families in the pedigree along with the number of
## individuals
table(fad$family)

##########################
##
##  Basic usage
##
## Extracting the pedigree information
ped <- pedigree(fad)
## By default the pedigree is returned as a data.frame.
class(ped)
head(ped)

## In addition, we can extract the pedigree as a pedigreeList
pedigree(fad, return.type="pedigree")

## Return the ids of all ancestors of individual 6
## up to 3 generations
getAncestors(fad, id="6")

## Build the pedigree for individual 6: this includes all of its
## children and all of its ancestors up to the maximal number of
## specified generations.
buildPed(fad, id=6)
## Which is a sub-pedigree of the complete family:
family(fad, id=6)

## In addition we can specify manually some ids in the pedigree and
## generate the smallest possible pedigree containing all ids:
buildPed(fad, id=c(6, 23, 28), prune=TRUE)

## Get the list of all ids sharing kinship with individuals
## 5 and 9
shareKinship(fad, id=c("5", "9"))

## Subset the fad to family "4"
subFad <- fad[fad$family == "4", ]
subFad

## Export the pedigree from this family to a ped file
tmpFile <- tempfile()
export(subFad, con=tmpFile, format="ped")

head(read.table(tmpFile, sep="\t"))

##########################
##
##  Plotting
##
## Plot the pedigree for individual 6.
plotPed(fad, id=6)

## Alternatively, export it to a temporary file
pfile <- plotPed(fad, id=6, device="pdf")
pfile

## Highlighting some of the individuals:
## first get to know which other individuals are in the pedigree
plotPed(fad, id=6, highlight.ids=list(hello=c(1, 2, 4)))


##########################
##
##  Adding trait data
##
fad <- FAData(pedigree=mbped, age=Age)
tcancer <- mbsub$cancer
names(tcancer) <- mbsub$id
trait(fad) <- tcancer
## Now we can plot the pedigree also showing the affected status.
plotPed(fad, id=6)


## Alternatively, create the FAData with the trait data
fad <- FAData(pedigree=mbped, trait=mbsub$cancer, traitName="cancer")
plotPed(fad, id=6)


EuracBiomedicalResearch/FamAgg documentation built on March 12, 2023, 7:45 p.m.