AM: multiple-locus association mapping


Description

AM performs association mapping within a multiple-locus linear mixed model framework. AM finds the best set of marker loci in strongest association with a trait while simultaneously accounting for any fixed effects and the genetic background.

Usage

AM(trait = NULL, fformula = NULL, availmemGb = 8, geno = NULL,
  pheno = NULL, map = NULL, ncpu = detectCores(), ngpu = 0,
  quiet = TRUE, maxit = 20)

Arguments

trait

the name of the column in the phenotype data file that contains the trait data. The name is case sensitive and must match exactly the column name in the phenotype data file.

fformula

the right hand side formula for the fixed effects. See below for details. If not specified, only an overall mean will be fitted.

availmemGb

a numeric value. It specifies the amount of available memory (in Gigabytes). This should be set to the maximum practical value of available memory for the analysis.

geno

the R object obtained from running ReadMarker. This must be specified.

pheno

the R object obtained from running ReadPheno. This must be specified.

map

the R object obtained from running ReadMap. If not specified, a generic map will be assumed.

ncpu

an integer value for the number of CPUs that are available for distributed computing. The default is to determine the number of CPUs automatically.

ngpu

an integer value for the number of GPUs available for computation. The default is to assume that no GPUs are available. This option has not yet been implemented.

quiet

a logical value. If set to TRUE, additional runtime output is printed. This is useful for error checking and monitoring the progress of a large analysis.

maxit

an integer value for the maximum number of forward steps to be performed. This will rarely need adjusting.
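Because trait and fformula refer to column names that must match the phenotype data exactly, it can help to check the names before starting a long run. The following is a minimal sketch in base R, not part of 'Eagle': the data frame pheno_df and its values are hypothetical stand-ins for the phenotype data, and the check assumes that fformula is supplied as a character string as in the usage above.

```r
# Hypothetical phenotype table standing in for the phenotype data;
# the values are made up for illustration.
pheno_df <- data.frame(y    = c(1.2, 0.7, 1.9),
                       cov1 = c(0, 1, 0),
                       cov2 = c(3.1, 2.8, 3.5))

trait    <- "y"
fformula <- "cov1 + cov2"

# Build a one-sided formula from the string and list the variables it uses.
rhs_vars <- all.vars(as.formula(paste("~", fformula)))

# Names are case sensitive, so compare them exactly with the column names.
missing_cols <- setdiff(c(trait, rhs_vars), names(pheno_df))

if (length(missing_cols) > 0) {
  stop("Columns not found in phenotype data: ",
       paste(missing_cols, collapse = ", "))
}
```

If the check stops with an error, the listed names are the ones to correct in trait or fformula.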

Details

How to perform a basic AM analysis

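Suppose the marker data are in the plain text file geno.txt, with genotypes AA, AB, and BB coded as 0, 1, and 2 and missing genotypes coded as X, and the phenotype data are in the file pheno.txt, which contains a column named y holding the trait. No map file is available.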

To analyse these data, we would use the following three functions:

  geno_obj <-  ReadMarker(filename='geno.txt', AA=0, AB=1, BB=2, type="text", missing='X')
  
  pheno_obj <- ReadPheno(filename='pheno.txt')

  res <- AM(trait='y', geno=geno_obj, pheno=pheno_obj)

A table of results is printed to the screen and saved in the R object res.

How to perform a more complicated AM analysis

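Suppose the marker data are in the PLINK ped file /my/dir/geno.ped, the phenotype data are in /my/dir/pheno.txt with missing values coded as 99, and the map data are in /my/dir/map.txt. The trait of interest is in the column y2, the columns cov1, cov2, pc1, and pc2 are to be fitted as fixed effects, and 32 gigabytes of memory are available.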

To analyse these data, we would run the following:

  geno_obj <-  ReadMarker(filename='/my/dir/geno.ped', type='PLINK', availmemGb=32)
  
  pheno_obj <- ReadPheno(filename='/my/dir/pheno.txt', missing=99)

  map_obj   <- ReadMap(filename='/my/dir/map.txt')

  res <- AM(trait='y2', fformula=c('cov1 + cov2 + pc1 + pc2'), 
            geno=geno_obj, pheno=pheno_obj, map=map_obj, availmemGb=32)

A table of results is printed to the screen and saved in the R object res.

Dealing with missing marker data

AM can tolerate some missing marker data. Ideally, however, a specialized genotype imputation program such as 'BEAGLE', 'MACH', 'fastPHASE', or 'PHASE2' should be used to impute the missing marker data before they are read into 'Eagle'.

Dealing with missing trait data

AM deals automatically with individuals with missing trait data. These individuals are removed from the analysis and a warning message is generated.

Dealing with missing explanatory variable values

AM deals automatically with individuals with missing explanatory variable values. These individuals are removed from the analysis and a warning message is generated.
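To see in advance which individuals would be affected, the rows with missing trait or explanatory variable values can be listed in base R. This is only an illustrative sketch of the behaviour described above, not the code that AM runs: the data frame pheno_df and its values are hypothetical, and missing values are assumed to be coded as NA.

```r
# Hypothetical phenotype table; NA marks a missing value.
pheno_df <- data.frame(y    = c(1.2, NA, 1.9, 0.4),
                       cov1 = c(0, 1, NA, 1),
                       cov2 = c(3.1, 2.8, 3.5, 2.2))

# Columns used by the model: the trait plus the fixed effects.
model_cols <- c("y", "cov1", "cov2")

# TRUE for individuals with no missing value in any model column.
is_complete <- complete.cases(pheno_df[, model_cols])

# Row indexes of individuals that have at least one missing value.
rows_with_na <- which(!is_complete)
print(rows_with_na)

# Data that would remain once those individuals are dropped.
pheno_complete <- pheno_df[is_complete, ]
```

Here the second individual lacks a trait value and the third lacks cov1, so both would be dropped.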

Error Checking

Most errors occur when reading in the data. However, as an extra precaution, if quiet=TRUE, then additional output is printed during the running of AM. If AM is failing, then this output can be useful for diagnosing the problem.

Value

A list with the following components:

trait

column name of the trait being used by 'AM'.

fformula

right hand side formula of the fixed effects part of the linear mixed model.

indxNA

a vector containing the row indexes of those individuals whose trait or fixed effects data contain missing values and who have therefore been removed from the analysis.

Mrk

a vector with the names of the SNPs in strongest and significant association with the trait. If no loci are found to be significant, then this component is NA.

Chr

the chromosomes on which the identified SNPs lie.

Pos

the map positions of the identified SNPs.

Indx

the column indexes in the marker file of the identified SNPs.

ncpu

number of CPUs used for the calculations.

availmemGb

amount of RAM in gigabytes that has been set by the user.

quiet

boolean value of the parameter.

extBIC

numeric vector with the extended BIC values for the loci found to be in significant association with the trait.
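The components above can be gathered into a single table for reporting. The sketch below uses a hand-made list as a stand-in for the object returned by AM, so it runs without 'Eagle'. The component names follow the list above, but every value is made up for illustration, and the sketch assumes that Mrk, Chr, Pos, and Indx have one entry per identified SNP.

```r
# Hypothetical stand-in for the list returned by AM. Only the components
# used below are included, and all values are made up.
res <- list(trait    = "y",
            fformula = "cov1 + cov2",
            Mrk      = c("snpA", "snpB"),
            Chr      = c("1", "4"),
            Pos      = c(1200, 56000),
            Indx     = c(17L, 342L))

# Mrk is NA when no locus is significant, so check before tabulating.
if (length(res$Mrk) == 1 && is.na(res$Mrk)) {
  hits <- NULL
  message("No significant marker-trait associations found.")
} else {
  hits <- data.frame(marker     = res$Mrk,
                     chromosome = res$Chr,
                     position   = res$Pos,
                     column     = res$Indx,
                     stringsAsFactors = FALSE)
  print(hits)
}
```

Checking for NA first avoids building a one-row table of missing values when nothing is found.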

See Also

ReadMarker, ReadPheno, and ReadMap

Examples

  ## Not run:  
  # Since the following code takes longer than 5 seconds to run, it has been tagged as dontrun. 
  # However, the code can be run by the user. 
  #

  #-------------------------
  #  Example  
  #------------------------

  # read the map 
  #~~~~~~~~~~~~~~
  
  # File is a plain space separated text file with the first row 
  # the column headings
  complete.name <- system.file('extdata', 'map.txt', 
                                   package='Eagle')
  map_obj <- ReadMap(filename=complete.name) 

  # read marker data
  #~~~~~~~~~~~~~~~~~~~~
  # Reading in a PLINK ped file 
  # and setting the memory available for reading the data to 8 gigabytes
  complete.name <- system.file('extdata', 'geno.ped', 
                                     package='Eagle')
  geno_obj <- ReadMarker(filename=complete.name,  type='PLINK', availmemGb=8) 
 
  # read phenotype data
  #~~~~~~~~~~~~~~~~~~~~~~~

  # Read in a plain text file with data on a single trait and two covariates
  # The first row of the text file contains the column names y, cov1, and cov2. 
  complete.name <- system.file('extdata', 'pheno.txt', package='Eagle')
  
  pheno_obj <- ReadPheno(filename=complete.name)
           

  # Performing multiple-locus genome-wide association mapping with a model
  # that has cov1 and cov2 as fixed effects in addition to the intercept.
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  res <- AM(trait = 'y',
            fformula = c('cov1+cov2'),
            map = map_obj,
            pheno = pheno_obj,
            geno = geno_obj,
            availmemGb = 8)

## End(Not run)

Eagle documentation built on May 2, 2019, 5:31 p.m.