read.merlin.files: function to read input files in Merlin format


View source: R/read.merlin.files.R

Description

Reads the pedigree, data and allele frequency input files and, optionally, the IBD files. The data are reformatted for use by the function fat2Lpoly.withinR.

Usage

read.merlin.files(pedfilenames, datfilenames, freq.data, ibdfilenames = NULL)

Arguments

pedfilenames

vector of 1 or 2 (the number of loci involved in the design function) character strings giving the names of the pedigree files in Merlin format (see the Merlin website [1]). Give the full path of the files if they are not in the current working directory. If the phenotype is polytomous with 4 levels created by all combinations of two dichotomous phenotypic variables Y[1] and Y[2], then the sixth and seventh columns of each file contain Y[1] (e.g. the endophenotype) and Y[2] (e.g. the disease phenotype), respectively. If the phenotype is dichotomous, then the sixth column of each file contains it.
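To illustrate the column layout described above, the following sketch writes a toy pedigree file and reads it back with base R. The file contents and the object names (ped.lines, toy.ped.file, toy.ped) are invented for the example. The first five columns are assumed to follow the usual Merlin layout (family, subject, father, mother, sex). This is not the code used internally by read.merlin.files.

```r
# Toy pedigree file: family, subject, father, mother, sex, Y1, Y2,
# then one SNP stored as two allele columns. All values are invented.
ped.lines <- c(
  "1 1 0 0 1 1 1 1 2",
  "1 2 0 0 2 2 1 2 2",
  "1 3 1 2 1 2 2 1 2"
)
toy.ped.file <- tempfile(fileext = ".ped")
writeLines(ped.lines, toy.ped.file)

toy.ped <- read.table(toy.ped.file, header = FALSE)

# With a 4-level polytomous phenotype, column 6 holds Y[1]
# and column 7 holds Y[2].
y1 <- toy.ped[[6]]
y2 <- toy.ped[[7]]
print(table(y1, y2))
```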

datfilenames

vector of 1 or 2 (the number of loci involved in the design function) character strings giving the names of the Merlin format data files corresponding to the pedigree files.

freq.data

Either (1) a vector of 1 or 2 (the number of loci involved in the design function) character strings giving the names of the allele frequency files corresponding to the pedigree files (these files must be in Merlin Classic format), or (2) a list of length 1 or 2 (the number of loci involved in the design function), each element of which is a numeric vector whose length equals the number of SNPs in the corresponding file in datfilenames and which specifies each SNP's minor allele.
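As a sketch of the second form of freq.data, the list below describes a hypothetical two-locus design in which the first data file lists three SNPs and the second lists two. The SNP counts and minor allele numbers are invented; in practice each vector must have one entry per SNP in the corresponding data file.

```r
# Hypothetical minor allele numbers, one numeric vector per locus.
# Locus 1 is assumed to have 3 SNPs, locus 2 is assumed to have 2 SNPs.
freq.data.list <- list(
  c(1, 2, 1),  # minor allele of each SNP in the first data file
  c(2, 2)      # minor allele of each SNP in the second data file
)

# One list element per locus, one entry per SNP.
print(lengths(freq.data.list))
```

Such a list would be passed as the freq.data argument in place of the vector of allele frequency file names.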

ibdfilenames

vector of 1 or 2 (the number of loci involved in the design function) character strings giving the names of the Merlin format ibd files corresponding to the pedigree files. If NULL (the default), the reading of the IBD files is skipped.

Details

All subjects included in the pedigree files must also be found in the IBD files.

All fields in the pedigree files must be numeric. Letters are not allowed, even in family and subject IDs.
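One possible way to check the numeric-only requirement before calling read.merlin.files is sketched below. The helper all.fields.numeric is hypothetical and not part of fat2Lpoly. It assumes whitespace-separated fields and treats any field that as.numeric cannot convert as invalid.

```r
# Hypothetical helper (not part of fat2Lpoly): returns TRUE if every
# whitespace-separated field of the file can be converted to a number.
all.fields.numeric <- function(filename) {
  fields <- scan(filename, what = character(), quiet = TRUE)
  !any(is.na(suppressWarnings(as.numeric(fields))))
}

# Toy files with invented contents.
good.file <- tempfile()
writeLines(c("1 1 0 0 1 2 1", "1 2 0 0 2 1 1"), good.file)
bad.file <- tempfile()
writeLines(c("F1 1 0 0 1 2 1", "F1 2 0 0 2 1 1"), bad.file)

print(all.fields.numeric(good.file))  # TRUE
print(all.fields.numeric(bad.file))   # FALSE: family ID "F1" contains a letter
```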

Value

returns a list of seven objects:

ped

data frame with columns fam.id, subject.ids, endophenotype and phenotype (in the given order)

x.all

data frame of SNP genotypes in the format "(number of minor alleles)/2", for all SNPs listed in the file(s) in datfilenames. It contains only the SNP data, and its column names are the SNP names in datfilenames. The rows are in the same order as in ped.
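The "(number of minor alleles)/2" coding can be illustrated with a small worked example. The helper code.genotype is hypothetical and is not the package's implementation. It assumes numeric allele labels and that an allele coded 0 means a missing genotype.

```r
# Hypothetical helper (not part of fat2Lpoly): code a genotype as
# (number of copies of the minor allele) / 2.
# allele1, allele2: numeric allele labels; 0 is assumed to mean missing.
code.genotype <- function(allele1, allele2, minor.allele) {
  coded <- ((allele1 == minor.allele) + (allele2 == minor.allele)) / 2
  coded[allele1 == 0 | allele2 == 0] <- NA
  coded
}

# Genotypes 1/1, 1/2, 2/2 and a missing genotype, with minor allele 2.
coded <- code.genotype(c(1, 1, 2, 0), c(1, 2, 2, 0), minor.allele = 2)
print(coded)  # 0.0 0.5 1.0 NA
```

Under this coding each entry takes the value 0, 0.5 or 1 for zero, one or two copies of the minor allele.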

MA.table

data frame giving the minor allele numbers of all the SNPs. The first column consists of x.all's column names and the second column the minor allele numbers.

ibd.dat.list

list of one or two data frames containing the columns of the IBD data file(s) in ibdfilenames.

y1.name

affection name extracted from first line of the data file(s)

y2.name

affection name extracted from second line of the data file(s)

ibdfilenames

(same object as provided as argument) vector of 1 or 2 (the number of loci involved in the design function) character strings giving the names of the Merlin format ibd files corresponding to the pedigree files.

Author(s)

Alexandre Bureau and Jordie Croteau

References

1. http://www.sph.umich.edu/csg/abecasis/Merlin/tour/input_files.html

See Also

fat2Lpoly.withinR

Examples

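# Locate the example data in the extdata directory of the installed
# fat2Lpoly package, searching every library path.
path.data <- paste(.libPaths()[which(unlist(lapply(.libPaths(),
    function(x) length(grep("fat2Lpoly", dir(x))))) > 0)],
    "/fat2Lpoly/extdata/", sep = "")
# If the package is found in several libraries, keep the last one.
if (length(path.data) > 1) path.data <- path.data[length(path.data)]

# Two-locus design: one file of each type per locus.
input.data <- read.merlin.files(
    pedfilenames = paste(path.data, c("loc1.ped", "loc2.ped"), sep = ""),
    datfilenames = paste(path.data, c("loc1.dat", "loc2.dat"), sep = ""),
    freq.data    = paste(path.data, c("loc1.freq", "loc2.freq"), sep = ""),
    ibdfilenames = paste(path.data, c("loc1.ibd", "loc2.ibd"), sep = ""))

# Single-locus design: files for the second locus only.
input.data2 <- read.merlin.files(
    pedfilenames = paste(path.data, "loc2.ped", sep = ""),
    datfilenames = paste(path.data, "loc2.dat", sep = ""),
    freq.data    = paste(path.data, "loc2.freq", sep = ""),
    ibdfilenames = paste(path.data, "loc2.ibd", sep = ""))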

Example output

Y1 data extracted from input files:    endo 
Y2 data extracted from input files:    pheno 

Warning messages:
1: In FUN(X[[i]], ...) : NAs introduced by coercion
2: In FUN(X[[i]], ...) : NAs introduced by coercion
3: In FUN(X[[i]], ...) : NAs introduced by coercion

Y1 data extracted from input files:    endo 
Y2 data extracted from input files:    pheno 

fat2Lpoly documentation built on Jan. 4, 2022, 5:08 p.m.