alleHap-package: Allele Imputation and Haplotype Reconstruction from Pedigree...

Description

Tools to simulate alphanumeric alleles, impute missing genetic data and reconstruct non-recombinant haplotypes from pedigree databases in a deterministic way. Allelic simulations can take many factors into account (such as the number of families, markers and alleles per marker, the probability and proportion of missing genotypes, the recombination rate, etc.). Genotype imputation can be applied to simulated datasets or to real databases (previously loaded in .ped format). Haplotype reconstruction can be carried out even with missing data, since the program first imputes each family's genotypes (without a reference panel) and then reconstructs the corresponding haplotypes for each family member. Both steps rely on the fact that each individual (due to meiosis) must have exactly two alleles per marker, one inherited from each parent, so imputation and reconstruction results can be calculated deterministically.
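
The "one allele from each parent" constraint can be illustrated with a minimal base R sketch for a single marker. The helper maternal_allele() and the toy genotypes are hypothetical and are not part of alleHap; the package's actual imputation rules may differ and cover more cases.

```r
# Hypothetical helper (NOT an alleHap function): the allele a child must have
# received from the mother, given the child's and the father's genotypes.
# Returns NA when the transmitted allele cannot be determined unambiguously.
maternal_allele <- function(child, father) {
  candidates <- character(0)
  # If the first allele can come from the father, the second is maternal
  if (child[1] %in% father) candidates <- c(candidates, child[2])
  # If the second allele can come from the father, the first is maternal
  if (child[2] %in% father) candidates <- c(candidates, child[1])
  candidates <- unique(candidates)
  if (length(candidates) == 1) candidates else NA_character_
}

# Toy data (assumed): father fully genotyped, mother missing, three children
father   <- c("A", "A")
children <- list(c("A", "C"), c("A", "G"), c("A", "C"))

# Maternal alleles that are determined unambiguously in each child
transmitted <- vapply(children, maternal_allele, character(1), father = father)
transmitted <- unique(transmitted[!is.na(transmitted)])

# Two distinct transmitted alleles fix the mother's (unordered) genotype
mother <- if (length(transmitted) == 2) sort(transmitted) else NA_character_
mother
```

With fewer than two distinct transmitted alleles the mother's genotype stays undetermined in this sketch, which is one reason an imputation rate can fall below 1.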

Details

Package: alleHap
Type: Package
Version: 0.9.9
Date: 2017-08-19
Depends: abind, stats, tools, utils
License: GPL (>=2)

Author(s)

Nathan Medina-Rodriguez and Angelo Santana

Maintainer: Nathan Medina-Rodriguez <nathan.medina@ulpgc.es>

References

Medina-Rodriguez, N., Santana, A. et al. (2014) alleHap: an efficient algorithm to reconstruct zero-recombinant haplotypes from parent-offspring pedigrees. BMC Bioinformatics, 15, A6 (S-3).

Examples

## Generation of 10 simulated families with 2 children per family and 20 markers
dataset <- alleSimulator(10,2,20)  # List with simulated alleles and haplotypes
datasetAlls <- dataset[[1]]        # Dataset containing alleles
datasetHaps <- dataset[[2]]        # Dataset containing haplotypes

## Loading of a dataset in .ped format with alphabetical alleles (A,C,G,T)
example1 <- file.path(find.package("alleHap"), "examples", "example1.ped")
datasetAlls1 <- alleLoader(example1)

## Loading of a dataset in .ped format with numerical alleles
example2 <- file.path(find.package("alleHap"), "examples", "example2.ped")
datasetAlls2 <- alleLoader(example2)

## Allele imputation of families with parental missing data
datasetAlls <- alleSimulator(10,4,6,missParProb=0.2)[[1]]
famsImputed <- alleImputer(datasetAlls)

## Allele imputation of families with offspring missing data
datasetAlls <- alleSimulator(10,4,6,missOffProb=0.2)[[1]]
famsImputed <- alleImputer(datasetAlls)

## Haplotype reconstruction for 3 families without missing data.
simulatedFams <- alleSimulator(3,3,6)  
(famsAlls <- simulatedFams[[1]])      # Original data 
famsList <- alleHaplotyper(famsAlls)  # List containing families' alleles and haplotypes
famsList$reImputedAlls                # Re-imputed alleles
famsList$haplotypes                   # Reconstructed haplotypes

## Haplotype reconstruction from a PED file
pedFamPath <- file.path(find.package("alleHap"), "examples", "example3.ped") # PED file path
pedFamAlls <- alleLoader(pedFamPath,dataSummary=FALSE) 
pedFamList <- alleHaplotyper(pedFamAlls)
pedFamAlls                # Original data 
pedFamList$reImputedAlls  # Re-imputed alleles 
pedFamList$haplotypes     # Reconstructed haplotypes
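
For orientation, the column layout that appears in the loaded data (six identifier columns followed by two allele columns per marker) can be reproduced with base R alone. This is only a sketch: the toy file contents are invented, the column names are copied from the printed output, and it is an assumption that alleLoader would accept such a file unchanged.

```r
# Toy PED content (assumed): family, individual, father, mother, sex,
# phenotype, then two alleles per marker
ped_lines <- c(
  "1 1 0 0 1 1 A C G G",
  "1 2 0 0 2 1 A A G T",
  "1 3 1 2 1 2 A A G T"
)
ped_file <- tempfile(fileext = ".ped")
writeLines(ped_lines, ped_file)

# Read everything as character so that the allele "T" is not parsed as TRUE
ped <- read.table(ped_file, header = FALSE, colClasses = "character")

# Every marker occupies two columns after the six identifier columns
n_markers <- (ncol(ped) - 6) / 2
names(ped) <- c("famID", "indID", "patID", "matID", "sex", "phen",
                paste0("Mk", rep(seq_len(n_markers), each = 2), "_", 1:2))

# Founders are the individuals without recorded parents
founders <- ped[ped$patID == "0" & ped$matID == "0", ]
ped
```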

Example output

===========================================
===== alleHap package: version 0.9.9 ======
===========================================

Data have been successfully loaded from: 
/usr/local/lib/R/site-library/alleHap/examples/example1.ped

===== DATA COUNTING ======
Number of families: 50
Number of individuals: 227
Number of founders: 100
Number of children: 127
Number of males: 118
Number of females: 109
Number of markers: 12
===========================

======== DATA RANGES =========
Family IDs: [1,...,50]
Individual IDs: [1,...,8]
Paternal IDs: [0,1]
Maternal IDs: [0,2]
Sex values: [1,2]
Phenotype values: [1,2]
==============================

========= MISSING DATA =========
Missing founders: 0
Missing ID numbers: 0
Missing paternal IDs: 0
Missing maternal IDs: 0
Missing sex: 0
Missing phenotypes: 0
Missing alleles: 0
Markers with missing values: 0
================================
===========================================
===== alleHap package: version 0.9.9 ======
===========================================

Data have been successfully loaded from: 
/usr/local/lib/R/site-library/alleHap/examples/example2.ped

===== DATA COUNTING ======
Number of families: 11
Number of individuals: 50
Number of founders: 22
Number of children: 28
Number of males: 26
Number of females: 22
Number of markers: 3
===========================

======== DATA RANGES =========
Family IDs: [1036,...,1939]
Individual IDs: [1,...,7]
Paternal IDs: [0,1]
Maternal IDs: [0,2,99]
Sex values: [1,2]
Phenotype values: [1,2]
==============================

========= MISSING DATA =========
Missing founders: 0
Missing ID numbers: 0
Missing paternal IDs: 0
Missing maternal IDs: 0
Missing sex: 2
Missing phenotypes: 0
Missing alleles: 42
Markers with missing values: 3
================================
===========================================
===== alleHap package: version 0.9.9 ======
===========================================

Data have been successfully loaded from: 
/work/tmp

===== DATA COUNTING ======
Number of families: 10
Number of individuals: 60
Number of founders: 20
Number of children: 40
Number of males: 25
Number of females: 35
Number of markers: 6
===========================

======== DATA RANGES =========
Family IDs: [FAM01,...,FAM09]
Individual IDs: [1,...,6]
Paternal IDs: [0,1]
Maternal IDs: [0,2]
Sex values: [1,2]
Phenotype values: [1,2]
==============================

========= MISSING DATA =========
Missing founders: 0
Missing ID numbers: 0
Missing paternal IDs: 0
Missing maternal IDs: 0
Missing sex: 0
Missing phenotypes: 0
Missing alleles: 54
Markers with missing values: 6
================================

======= IMPUTATION SUMMARY =======
0 markers (0 alleles) have been 
turned into missing in 0 families
due to familial inconsistencies.
Alleles initially missing: 54
Number of imputed alleles: 33
Imputation rate: 0.61
Imputation time: 0.24
==================================
===========================================
===== alleHap package: version 0.9.9 ======
===========================================

Data have been successfully loaded from: 
/work/tmp

===== DATA COUNTING ======
Number of families: 10
Number of individuals: 60
Number of founders: 20
Number of children: 40
Number of males: 28
Number of females: 32
Number of markers: 6
===========================

======== DATA RANGES =========
Family IDs: [FAM01,...,FAM09]
Individual IDs: [1,...,6]
Paternal IDs: [0,1]
Maternal IDs: [0,2]
Sex values: [1,2]
Phenotype values: [1,2]
==============================

========= MISSING DATA =========
Missing founders: 0
Missing ID numbers: 0
Missing paternal IDs: 0
Missing maternal IDs: 0
Missing sex: 0
Missing phenotypes: 0
Missing alleles: 96
Markers with missing values: 6
================================

======= IMPUTATION SUMMARY =======
0 markers (0 alleles) have been 
turned into missing in 0 families
due to familial inconsistencies.
Alleles initially missing: 96
Number of imputed alleles: 54
Imputation rate: 0.56
Imputation time: 0.12
==================================
   famID indID patID matID sex phen Mk1_1 Mk1_2 Mk2_1 Mk2_2 Mk3_1 Mk3_2 Mk4_1
1  FAM01     1     0     0   1    1     A     A     C     G     C     T     C
2  FAM01     2     0     0   2    1     A     C     G     G     C     T     C
3  FAM01     3     1     2   2    1     A     C     C     G     T     T     C
4  FAM01     4     1     2   2    1     A     A     G     G     C     C     C
5  FAM01     5     1     2   1    1     A     C     G     G     C     T     C
6  FAM02     1     0     0   1    2     A     A     C     G     C     C     C
7  FAM02     2     0     0   2    1     A     A     C     C     C     T     C
8  FAM02     3     1     2   2    1     A     A     C     G     C     T     C
9  FAM02     4     1     2   1    1     A     A     C     G     C     T     C
10 FAM02     5     1     2   2    2     A     A     C     G     C     C     C
11 FAM03     1     0     0   1    1     A     A     G     G     C     T     T
12 FAM03     2     0     0   2    2     A     A     C     G     C     T     C
13 FAM03     3     1     2   1    1     A     A     C     G     C     T     C
14 FAM03     4     1     2   2    1     A     A     C     G     T     T     C
15 FAM03     5     1     2   2    1     A     A     G     G     C     C     T
   Mk4_2 Mk5_1 Mk5_2 Mk6_1 Mk6_2
1      T     C     G     T     T
2      C     C     G     T     T
3      C     C     C     T     T
4      T     G     G     T     T
5      T     C     G     T     T
6      T     C     C     C     C
7      C     C     C     C     T
8      C     C     C     C     C
9      C     C     C     C     C
10     C     C     C     C     T
11     T     C     C     C     T
12     T     C     C     C     T
13     T     C     C     C     T
14     T     C     C     T     T
15     T     C     C     C     C
===========================================
===== alleHap package: version 0.9.9 ======
===========================================

Data have been successfully loaded from: 
/work/tmp

===== DATA COUNTING ======
Number of families: 3
Number of individuals: 15
Number of founders: 6
Number of children: 9
Number of males: 6
Number of females: 9
Number of markers: 6
===========================

======== DATA RANGES =========
Family IDs: [FAM01,...,FAM03]
Individual IDs: [1,...,5]
Paternal IDs: [0,1]
Maternal IDs: [0,2]
Sex values: [1,2]
Phenotype values: [1,2]
==============================

========= MISSING DATA =========
Missing founders: 0
Missing ID numbers: 0
Missing paternal IDs: 0
Missing maternal IDs: 0
Missing sex: 0
Missing phenotypes: 0
Missing alleles: 0
Markers with missing values: 0
================================

======= IMPUTATION SUMMARY =======
0 markers (0 alleles) have been 
turned into missing in 0 families
due to familial inconsistencies.
Alleles initially missing: 0
Number of imputed alleles: 0
Imputation rate: 0
Imputation time: 0.04
==================================

========= HAPLOTYPING SUMMARY ==========
Re-imputation rate: 0
Proportion of phased alleles: 1
Proportion of non-phased alleles: 0
Proportion of missing haplotypes: 0
Proportion of partial haplotypes: 0
Proportion of full haplotypes: 1
Haplotyping time: 0.979
========================================
   famID indID patID matID sex phen Mk1_1 Mk1_2 Mk2_1 Mk2_2 Mk3_1 Mk3_2 Mk4_1
1  FAM01     1     0     0   1    1     A     A     C     G     T     C     C
2  FAM01     2     0     0   2    1     C     A     G     G     T     C     C
3  FAM01     3     1     2   2    1     A     C     C     G     T     T     C
4  FAM01     4     1     2   2    1     A     A     G     G     C     C     T
5  FAM01     5     1     2   1    1     A     C     G     G     C     T     T
6  FAM02     1     0     0   1    2     A     A     G     C     C     C     C
7  FAM02     2     0     0   2    1     A     A     C     C     T     C     C
8  FAM02     3     1     2   2    1     A     A     G     C     C     T     C
9  FAM02     4     1     2   1    1     A     A     G     C     C     T     C
10 FAM02     5     1     2   2    2     A     A     G     C     C     C     C
11 FAM03     1     0     0   1    1     A     A     G     G     T     C     T
12 FAM03     2     0     0   2    2     A     A     C     G     T     C     C
13 FAM03     3     1     2   1    1     A     A     G     C     C     T     T
14 FAM03     4     1     2   2    1     A     A     G     C     T     T     T
15 FAM03     5     1     2   2    1     A     A     G     G     C     C     T
   Mk4_2 Mk5_1 Mk5_2 Mk6_1 Mk6_2
1      T     C     G     T     T
2      C     C     G     T     T
3      C     C     C     T     T
4      C     G     G     T     T
5      C     G     C     T     T
6      T     C     C     C     C
7      C     C     C     C     T
8      C     C     C     C     C
9      C     C     C     C     C
10     C     C     C     C     T
11     T     C     C     T     C
12     T     C     C     T     C
13     C     C     C     C     T
14     C     C     C     T     T
15     T     C     C     C     C
   famID indID patID matID sex phen   hap1   hap2
1  FAM01     1     0     0   1    1 ACTCCT AGCTGT
2  FAM01     2     0     0   2    1 CGTCCT AGCCGT
3  FAM01     3     1     2   2    1 ACTCCT CGTCCT
4  FAM01     4     1     2   2    1 AGCTGT AGCCGT
5  FAM01     5     1     2   1    1 AGCTGT CGTCCT
6  FAM02     1     0     0   1    2 AGCCCC ACCTCC
7  FAM02     2     0     0   2    1 ACTCCC ACCCCT
8  FAM02     3     1     2   2    1 AGCCCC ACTCCC
9  FAM02     4     1     2   1    1 AGCCCC ACTCCC
10 FAM02     5     1     2   2    2 AGCCCC ACCCCT
11 FAM03     1     0     0   1    1 AGTTCT AGCTCC
12 FAM03     2     0     0   2    2 ACTCCT AGCTCC
13 FAM03     3     1     2   1    1 AGCTCC ACTCCT
14 FAM03     4     1     2   2    1 AGTTCT ACTCCT
15 FAM03     5     1     2   2    1 AGCTCC AGCTCC
===========================================
===== alleHap package: version 0.9.9 ======
===========================================

Data have been successfully loaded from: 
/work/tmp

===== DATA COUNTING ======
Number of families: 1
Number of individuals: 7
Number of founders: 2
Number of children: 5
Number of males: 4
Number of females: 3
Number of markers: 8
===========================

======== DATA RANGES =========
Family ID: 1
Individual IDs: [1,...,7]
Paternal IDs: [0,1]
Maternal IDs: [0,2]
Sex values: [1,2]
Phenotype values: [1,2]
==============================

========= MISSING DATA =========
Missing founders: 0
Missing ID numbers: 0
Missing paternal IDs: 0
Missing maternal IDs: 0
Missing sex: 0
Missing phenotypes: 0
Missing alleles: 40
Markers with missing values: 8
================================

======= IMPUTATION SUMMARY =======
0 markers (0 alleles) have been 
turned into missing in 0 families
due to familial inconsistencies.
Alleles initially missing: 40
Number of imputed alleles: 6
Imputation rate: 0.15
Imputation time: 0.02
==================================

========= HAPLOTYPING SUMMARY ==========
Re-imputation rate: 1
Proportion of phased alleles: 1
Proportion of non-phased alleles: 0
Proportion of missing haplotypes: 0
Proportion of partial haplotypes: 0
Proportion of full haplotypes: 1
Haplotyping time: 0.228
========================================
  famID indID patID matID sex phen Mk1_1 Mk1_2 Mk2_1 Mk2_2 Mk3_1 Mk3_2 Mk4_1
1     1     1     0     0   2    1     C     T  <NA>  <NA>  <NA>  <NA>  <NA>
2     1     2     0     0   2    1  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>
3     1     3     1     2   1    2     C     C     A     G     A     T     A
4     1     4     1     2   1    2     C     T     A     C  <NA>  <NA>     A
5     1     5     1     2   1    2     C     T     A     G     C     T  <NA>
6     1     6     1     2   1    2     C     T     A     G     C     T     A
7     1     7     1     2   2    1  <NA>  <NA>  <NA>  <NA>     C     G     A
  Mk4_2 Mk5_1 Mk5_2 Mk6_1 Mk6_2 Mk7_1 Mk7_2 Mk8_1 Mk8_2
1  <NA>  <NA>  <NA>     A     C  <NA>  <NA>  <NA>  <NA>
2  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>
3     A     G     T     A     C     A     C     A     G
4     T     C     G     C     C     C     T     C     G
5  <NA>     G     T     A     C     A     C     A     A
6     A     G     T     A     C     A     C     A     A
7     T     C     G  <NA>  <NA>     C     T  <NA>  <NA>
  famID indID patID matID sex phen Mk1_1 Mk1_2 Mk2_1 Mk2_2 Mk3_1 Mk3_2 Mk4_1
1     1     1     0     0   2    1     C     T     G     C     T     G     A
2     1     2     0     0   2    1     C     T     A     A     A     C     A
3     1     3     1     2   1    2     C     C     G     A     T     A     A
4     1     4     1     2   1    2     T     C     C     A     G     A     T
5     1     5     1     2   1    2     C     T     G     A     T     C     A
6     1     6     1     2   1    2     C     T     G     A     T     C     A
7     1     7     1     2   2    1     T     T     C     A     G     C     T
  Mk4_2 Mk5_1 Mk5_2 Mk6_1 Mk6_2 Mk7_1 Mk7_2 Mk8_1 Mk8_2
1     T     T     C     A     C     A     T     A     C
2     A     G     G     C     C     C     C     G     A
3     A     T     G     A     C     A     C     A     G
4     A     C     G     C     C     T     C     C     G
5     A     T     G     A     C     A     C     A     A
6     A     T     G     A     C     A     C     A     A
7     A     C     G     C     C     T     C     C     A
  famID indID patID matID sex phen     hap1     hap2
1     1     1     0     0   2    1 CGTATAAA TCGTCCTC
2     1     2     0     0   2    1 CAAAGCCG TACAGCCA
3     1     3     1     2   1    2 CGTATAAA CAAAGCCG
4     1     4     1     2   1    2 TCGTCCTC CAAAGCCG
5     1     5     1     2   1    2 CGTATAAA TACAGCCA
6     1     6     1     2   1    2 CGTATAAA TACAGCCA
7     1     7     1     2   2    1 TCGTCCTC TACAGCCA
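
The relation between the two haplotype strings and the re-imputed allele columns can be checked by hand. The sketch below uses the first individual of the table above and assumes single-character alleles, so that each position in a haplotype string corresponds to one marker; how the package encodes multi-character alleles is not shown here.

```r
# Haplotypes of individual 1 (taken from the output above)
hap1 <- "CGTATAAA"
hap2 <- "TCGTCCTC"

# One character per marker (assumption: single-character alleles)
a1 <- strsplit(hap1, "")[[1]]
a2 <- strsplit(hap2, "")[[1]]

# Interleave the two haplotypes to obtain the Mk*_1 / Mk*_2 allele columns
alleles <- as.vector(rbind(a1, a2))
names(alleles) <- paste0("Mk", rep(seq_along(a1), each = 2), "_", 1:2)
alleles

# In a zero-recombinant family each child haplotype equals a parental one
parent_haps <- c("CGTATAAA", "TCGTCCTC", "CAAAGCCG", "TACAGCCA")
child3_haps <- c("CGTATAAA", "CAAAGCCG")
all(child3_haps %in% parent_haps)
```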

alleHap documentation built on May 1, 2019, 8:08 p.m.