Pedigree-utils: Basic pedigree utilities

PedigreeUtilsR Documentation

Basic pedigree utilities

Description

Utility functions to access, modify or subset pedigrees. Most of these functions can be applied to simple data.frame in pedigree format or pedigree or pedigreeList objects defined in the kinship2 package.

Usage


## S4 method for signature 'missing'
cliques(object, ...)

connectedSubgraph(graph, nodes, mode="all", all.nodes=TRUE, ifnotfound)

## S4 method for signature 'FAData'
countGenerations(object, id=NULL, direction="down", ...)

## S4 method for signature 'FAData'
estimateGenerations(object, family=NULL, ...)

## S4 method for signature 'FAData'
findFounders(object, family=NULL, id = NULL, ...)

## S4 method for signature 'FAData'
generationsFrom(object, id=NULL, ...)

## S4 method for signature 'FAData'
getAncestors(object, id=NULL, max.generations=3, ...)

## S4 method for signature 'FAData'
getChildren(object, id=NULL, max.generations=16, ...)

## S4 method for signature 'FAData'
getCommonAncestor(object, id, method="min.dist")

## S4 method for signature 'FAData'
getFounders(object, ...)

## S4 method for signature 'FAData'
getMissingMate(object, id=NULL, ...)

## S4 method for signature 'FAData'
getSiblings(object, id=NULL, ...)

## S4 method for signature 'FAData'
getSingletons(object, ...)

ped2graph(ped)

## S4 method for signature 'FAData'
removeSingletons(object, ...)

## S4 method for signature 'data.frame'
removeSingletons(object, ...)

subPedigree(ped, id=NULL, all=TRUE)

## S4 method for signature 'FAData'
shareKinship(object, id, rmKinship=0)

Arguments

(in alphabetic order)

all

For subPedigree: if all individuals have to be present in the sub-pedigree.

all.nodes

For connectedSubgraph: if all nodes have to be present in the resulting graph, or only those that are connected with each other.

direction

For countGenerations: whether the number of ancestor ("up") generations or offspring ("down") generation should be counted.

family

A character or numeric representing the family id. For doFindFounders: the id of the family in the pedigree for which the founders should be identified. Uses the first family in the pedigree if not specified. For estimateGenerations: optional id of the family if generation numbers should only be calculated for one family. Otherwiseif, the generations are estimated for all families (separately) in the object.

graph

An igraph graph object.

id

A character or numeric vector length 1 or longer specifying the id(s) of the individual(s). For generationsFrom and findFounders only a single id should be submitted.

ifnotfound

For connectedSubgraph: if not defined, the function throws an error if no subgraph can be specified. If defined, its value is returned if no subgraph was found.

max.generations

For getAncestors and getChildren: the maximal number of ancestor or offspring generations that should be returned.

method

For getCommonAncestor: the method by which the closest common ancestor sould be identified. Either "min.dist" (ancestor with the minimal distance to any of the individuals) or "smallest.mean.dist" (ancestor with the smallest mean distance to any of the individuals).

mode

For connectedSubgraph: either "all", "in", "out" specifying how distances and paths between individual nodes should be determined. See help for function shortest_paths in package igraph for more details.

nodes

For connectedSubgraph: A character vector of node (vertex) names for which the subgraph should be defined.

object

For cliques: passed to the cliques function from the igraph package. For all other methods: either a FAData object (or any object inheriting from it), a data.frame, pedigree or pedigreeList objects (the latter being defined in the kinship2 package).

ped

Either a data.frame or a pedigree object specifying the pedigree. If a data.frame is submitted, the columns id, family, father, mother and sex are required.

rmKinship

For shareKinship: threshold for inclusion in being reported. Pairs of individuals with a kinship less or equal than this value will be omitted. This can be used to remove very distant relatives.

...

For cliques: additional arguments passed to the cliques function from the igraph package.

Value

Refer to the method and function description above for detailed information on the returned result object.

Basic pedigree utilities

countGenerations

Count the generations up- or down the pedigree for the specified individual(s), i.e. determine the number of ancestor or offspring generations defined in the pedigree for the specified individual(s). Returns a named numeric vector, names corresponding to the individual's id, with the number of generations for each specified individual.

findFounders

Identifies the founder couple with the largest number of offspring generations in the pedigree. The provided pedigree object/data.frame can contain pedigrees of multiple families, thus, to identify the founder pair for a family its ID can be provided with the family parameter. Alternatively, the ID of an individual can be specified, in which case the founder pair of the (full) pedigree of the specified individual is identified. If two or more couples have the same, largest number of offspring generations, the first couple is selected. Returns a character vector of length 2 with the ids of the founder individuals.

getFounders

Returns the ids of all founders in the pedigree. A founder is an individual from which neither father nor mother is known in the pedigree.

getSingletons

Returns the ids of all singletons, i.e. individuals in the pedigree that are not connected to any other individual (have no parents in the pedigree and no children).

getAncestors

Identify and return the ids of ancestor generations (up to max.generations) for the specified individual(s).

getChildren

Identify and return the ids of offspring generations (up to max.generations) for the specified individual(s).

getCommonAncestor

Finds the closest common ancestor between specified individuals (2 or more ids are required). Returns a character vector with the ids of the ancestors or NA if no common ancestor was found.

getMissingMate

The function evaluates if in the sub-pedigree defined by the specified ids one or more mates (spouse) are missing and if so it returns their ids.

getSiblings

Get siblings for the specified id(s). Returns their ids as character, or numeric vector.

removeSingletons

Removes all unconnected individuals (i.e. singletons) from the pedigree. Returns a data.frame with the pedigree cleaned from all singletons. Note that, due to internal sanitizing, columns "father" and "mother" in the resulting data.frame have a NA for individuals for which the father or mother is not known in the pedigree.

subPedigree

Finds the smallest pedigree containing all specified individuals. Depending on the input, a data.frame, pedigree or pedigreeList.

Advanced pedigree methods

estimateGenerations

Estimates generation levels/numbers for each, or only one, family in the object. Generation numbers are always relative to the founder couple (defined by findFounders). Returns (always) a named list of generation numbers. The names of the list represent the family id, the names of the numeric vector of generations the id of the individuals in the family.

generationsFrom

Determine generations starting from the specified individual. Siblings including their mates and all other in the same generation () are assigned generation 0, ancestor generations (all their parents etc) negative generation numbers, decreasing with ancestor level and their offspring positive numbers, increasing with each generation. Generations are only estimated within the family of the individual, also, if the pedigree consists of un-connected sub-pedigree, generation numbers will only be calculated for the sub-pedigree containing the specified individual. The function returns a named numeric vector of generation numbers, the names corresponding to the ids of the individuals in the specified individual's family. Not connected individuals in the family get a NA generation number.

shareKinship

Finds all related individuals (individuals sharing kinship > rmKinship with the individual) for the specified individual(s) in the pedigree and returns their ids as a character vector.

Graph theory related functions

cliques

Wrapper method passing all arguments to the cliques function from the igraph package.

connectedSubgraph

Finds the (eventually smallest) connected subgraph of all specified nodes. Returns an igraph object representing the subgraph of the specified nodes.

ped2graph

Transforms the pedigree into a (directed) graph with the direction of the edges being always from parent to child. An igraph object.

Author(s)

Johannes Rainer.

See Also

pedigree, FAData, FAProbResults, FAKinGroupResults, FAKinSumResults, PedigreeAnalysis kinshipPairs

Examples

##########################
##
##  Defining a small pedigree
##
## load the Minnesota Breast Cancer record and subset to the
## first families.
data(minnbreast)
mbsub <- minnbreast[minnbreast$famid==4 | minnbreast$famid==5, ]
mbped <- mbsub[, c("famid", "id", "fatherid", "motherid", "sex")]
## renaming column names
colnames(mbped) <- c("family", "id", "father", "mother", "sex")

## plot the pedigree for family 4 to get an overview.
switchPlotfun(method="ks2paint")
fam4 <- mbped[mbped$family==4, ]
doPlotPed(individual=fam4$id, father=fam4$father, mother=fam4$mother,
          gender=fam4$sex, device="plot")

## find the closest common ancestor between individuals 23, 3 and 8
getCommonAncestor(fam4, id=c(23, 3, 8))

## create the smallest sub-pedigree for individuals 21, 22 and 25
subPedigree(fam4, id=c(21, 22, 25))
## plot that
fam4sub <- subPedigree(fam4, id=c(21, 22, 25))
doPlotPed(individual=fam4sub$id, father=fam4sub$father, mother=fam4sub$mother,
          gender=fam4sub$sex, device="plot")

#########################
##
##  Basic pedigree utils
##
## Note: the same methods can be applied to a data.frame representing
## a pedigree, or a FAData, pedigree or pedigreeList object.

## Find the founder couple for family 4
findFounders(fam4, family=4)

## Alternatively, find the founders for the pedigree in which ibdividual 20 is a
## member
findFounders(fam4, id = 20)

## Return all founders in the pedigree.
getFounders(fam4)

## Get all founders without children (i.e. singletons).
getSingletons(fam4)

## Clean the pedigree from all singletons
fam4noS <- removeSingletons(fam4)
nrow(fam4)
nrow(fam4noS)

## Count the offspring generations for individual "4"
countGenerations(fam4, id="4")

## Get the ids of all ancestors for that individual
getAncestors(fam4, id="4")

## Get the ids of the children of this individual
getChildren(fam4, id="4", max.generations=1)

## Get the ids of the complete offspring for this individuals
getChildren(fam4, id="4")

## Create a FAData object from the pedigree data.frame
fad <- FAData(fam4)
## get the list of all ids sharing kinship with individuals
## 5 and 9
shareKinship(fad, id=c("5", "9"))

## Count the numbers of generations of ancestors for individual 12
countGenerations(fad, id="12", direction="up")

## Count the numbers of offspring generations for individuals 2 and 29
countGenerations(fad, id=c("2", "29"))

## Get all brothers/sisters for individual 9
getSiblings(fad, id="9")

## Determine generation levels starting from individual "9"
generationsFrom(fad, id="9")

## Estimate generations relative to the founder couple for each
## family in the submitted object, a data.frame in the example below
estimateGenerations(mbped)


#########################
##
##  Graph utilities
##
## Convert the pedigree into a graph
pgraph <- ped2graph(fam4)
plot(pgraph)

## Make a subgraph containing nodes 10, 22, 12 and 14
sgraph <- connectedSubgraph(pgraph, c("10", "22", "12", "14"))
plot(sgraph)


EuracBiomedicalResearch/FamAgg documentation built on March 12, 2023, 7:45 p.m.