FAGenIndexResults: Genealogical Index

FAGenIndexResults-classR Documentation

Genealogical Index

Description

The genealogical index [Hill, 1980], also referred to as the genealogical index of familiality (GIF) in the literature, is a method to identify familial clustering of diseases or other traits. For a given trait, the method computes the mean kinship between affected in the whole pedigree along with mean kinships of randomly drawn sets of individuals. The distribution of average kinship values among the control sets is used to estimate the probability that the observed level of kinship among the cases is due to chance.

Usage


## S4 method for signature 'FAGenIndexResults'
plotPed(object, id=NULL, family=NULL,
                                     filename=NULL, device="plot", ...)

## S4 method for signature 'FAGenIndexResults'
plotRes(object, id=NULL, family=NULL,
                                     addLegend=TRUE, type="density", ...)

## S4 method for signature 'FAGenIndexResults'
result(object, method="BH")

## S4 method for signature 'FAGenIndexResults'
runSimulation(object, nsim=50000,
                                           perFamilyTest=FALSE,
                                           controlSetMethod="getAll",
                                           rm.singletons=TRUE, strata=NULL, ...)

## S4 replacement method for signature 'FAGenIndexResults'
trait(object) <- value

Arguments

(in alphabetic order)

addLegend

For plotRes: if a legend should be added to the plot.

controlSetMethod

For runSimulation: the method (i.e. name of the function) that should be used to define the set of (eventually matched) control individuals from which the random samples are taken. Supported functions are getAll, getSexMatched and getExternalMatched. For perFamilyTest=TRUE also getGenerationMatched and getGenerationSexMatched are supported. Note: for getExternalMatched, a numeric, character or factor vector to be used for the matching has to be submitted to runSimulation as additional argument match.using.

device

For plotPed: see plotPed for more details.

family

For plotPed: the family for which the pedigree should be plotted. For plotRes: the family for which the genealogical index analysis simulation results should be shown. Only supported if perFamilyTest=TRUE.

filename

For plotPed: the file name to which the pedigree plot should be exported. See plotPed for more details.

id

For plotPed: the id of an indiviual from a family for which the pedigree should be plotted. For plotRes: the id of an individual from a family for which the genealogical index analysis simulation results should be shown. Only supported if perFamilyTest=TRUE.

method

The multiple hypothesis testing method. All methods supported by p.adjust are allowed.

nsim

Number of simulations.

perFamilyTest

For runSimulation: whether the test should be performed on the whole pedigree (default) or separately within each family. In the latter case the test evaluates the presence of clustered affected individuals within each family.

rm.singletons

For runSimulation: whether unconnected individuals in the pedigree (i.e. singletons) should be removed.

object

The FAGenIndexResults object.

strata

For runSimulation: a numeric, character of factor characterizing each individual in the pedigree. The length of this vector and the ordering has to match the pedigree. This vector allows to perform stratified random sampling. See details or examples for more information.

type

For plotRes: either "density" (the default) or "hist" specifying whether the distribution of expected values from the simulation should be visualized as a density plot or histogram.

value

For trait<-: can be a named numeric, character or factor vector. The names (at least some of them) have to match the ids in the pedigree of the object.

...

For plotPed: additional arguments to be submitted to the internal buildPed call and to plotPed.

For runSimulation: additional arguments passed to the choosen controlSetMethod function (e.g. match.using for getExternalMatched).

Details

This implementation differs from the original method from Hill as it allows, in addition to perform per family analyses, to use also stratified sampling and allows a more flexible definition of the set of matched control individuals. The controlSetMethod parameter allows to specify a method to define the matched control set (e.g. matched by sex or matched by any externally provided vector).

Stratified sampling allows to even further fine tune the selection of matched controls. Assuming that in a pedigree the group of affected consists of 5 females and 3 male individuals, passing the sex of all individuals to the function (e.g. strata=fad$sex, with fad being the FAData object containing the pedigree to be analyzed) results in random sets with the same proportion of male/female individuals (i.e. 5 females, 3 males).

Note that, if strata is specified, all individuals with a missing value in strata (also affected individuals) are excluded from the analysis.

Note that by default singletons (i.e. unconnected individuals in the pedigree) are removed from the pedigree prior the analysis. Set rm.singletons=FALSE if you do not want them to be removed.

By default, the genealogical index is calculated on the whole pedigree, but it is also possible to evaluate within-family clustering of cases by specifying perFamilyTest=TRUE. In that case, it is also possible to use the getGenerationMatched and getGenerationSexMatched functions to define the set of matched controls from which random samples will be taken.

A call to the setter methods trait<- resets any simulation results present in the sim slot, thus, the object can be re-used to perform a simulation analysis using the new trait data.

Value

Refer to the method and function description above for detailed information on the returned result object.

Objects from the Class

FAGenIndexResults objects are created calling the genealogicalIndexTest method on a FAData object.

Extends

Class FAData directly.

Slots

controlSetMethod

A character specifying the name of the method used to define the set of control individuals from which random samples were taken.

nsim

Number of simulations.

perFamilyTest

Logical indicating whether a per-family test was performed.

sim

The result of the simulation. This slot should not be accessed directly, use the result method to extract result information.

Methods and Functions

plotPed Plots a pedigree for one of the affected individuals in the simulation results. The id of the selected affected individual (specified with argument id) is highlighted in red. See plotPed for more details.

plotRes Plots the results from a genealogical index simulation analysis. The distribution of the mean kinship values of the randomly drawn controls are displayed as a grey density plot, the observed mean kinship value of all affected as a blue vertical line.

result

Returns the result from the simulation as a data.frame with columns: "trait_name": the name of the trait. "total_phenotyped": total number of individuals in the pedigree phenotyped in the analyzed trait. "total_affected": total number ofindividuals in the pedigree that are affected in the analyzed trait (i.e. number of cases). "entity_id": the id for the analyzed entity, being either the whole pedigree (in which case the id will be "1") or the id of the family (if perFamilyTest=TRUE). "entity_ctrls": the number of (matched) control individuals from which the random samples were drawn. "entity_affected": the number of affected individuals in the entity. This number can differ from the number of affected, if strata was specified and some of the affected have a missing value in strata. "genealogical_index": the genealogical index of familiality (gif), i.e. the mean kinship value between all affected in the entity (pedigree or family). To be consistent with the original implementations, the genealogical index is the mean kinship multiplied with 100000. "pvalue": the p-value for the significance of the mean kinship. "padj": the p-value adjusted for multiple hypothesis testing (with the method specified with argument method).

The returned data.frame is sorted by column "pvalue", its rownames correspond to column "entity_id".

runSimulation

Performs the simulation analysis based on the pedigree and trait information stored in the object. Returns a FAGenIndexResults object with the results of the simulation.

trait<-

Set the trait information. This method will reset all simulation results saved in the sim slot.

Note

Subsetting (using the [ operator) is not supported.

Author(s)

Johannes Rainer

References

Hill, J. R. (1980) A survey of cancer sites by kinship in the Utah Mormon population. In Cairns J, Lyon JL, Skolnick M (eds): Cancer Incidence in Defined Populations. Banbury Report 4. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press, pp 299–318.

See Also

FAData, trait, probabilityTest, kinshipGroupTest, kinshipSumTest, familialIncidenceRateTest, fsirTest, plotPed

Examples

##########################
##
##  Perform the simulation analysis
##
## Load the Minnesota Breast Cancer data set.
data(minnbreast)

## Subset to some families and generate a pedigree data.frame
mbsub <- minnbreast[minnbreast$famid == 4 | minnbreast$famid == 14 |
                  minnbreast$famid == 6 | minnbreast$famid == 8, ]
PedDf <- mbsub[, c("famid", "id", "fatherid", "motherid", "sex")]
colnames(PedDf) <- c("family", "id", "father", "mother", "sex")

## Generate the FAData.
fad <- FAData(pedigree=PedDf)

## Specify the trait.
tcancer <- mbsub$cancer
names(tcancer) <- mbsub$id

## Perform the test with default settings, i.e. use all individuals
## in the pedigree as control set from which random samples are drawn
## and perform the analysis on the whole pedigree.
gi <- genealogicalIndexTest(fad, trait=tcancer, traitName="cancer",
                            nsim=1000,)
## Just show some information
gi

## Show the results
result(gi)

## Plot the observed mean kinship and the distribution of the mean kinship of
## random samples.
plotRes(gi)

## Plot the pedigree for one of the families. All individuals
## used as matched control set are highlighted in red.
plotPed(gi, family="8")

## Repeat the analysis using the sex as strata. This will result in stratified
## random sampling with the number of female and male individuals selected in
## each permutation corresponding to the numbers below
table(gi$sex[affectedIndividuals(gi)])
giStrata <- runSimulation(gi, nsim=1000, strata=gi$sex)
result(giStrata)


## Alternatively, we can use "getSexMatched" as the function to define the set
## of control individuals. Just, in the present case both male and females
## individuals will be selected since also there are male and female individuals
## among the affected cases.
giPerFam <- runSimulation(gi, nsim=1000, controlSetMethod="getSexMatched",
                          perFamilyTest=TRUE)
result(giPerFam)

## For those families in which there are only female cases, random samples
## were drawn among only female individuals (within the same family). These
## are highlighted in red in the pedigree plot:
plotPed(giPerFam, family="14", cex=0.5)

## Plot the simulation result for this family:
plotRes(giPerFam, family="14")


EuracBiomedicalResearch/FamAgg documentation built on March 12, 2023, 7:45 p.m.