ReadPheno: Read phenotype file
In Eagle: Multiple Locus Association Mapping on a Genome-Wide Scale

Read in the phenotype data.

ReadPheno(filename = NULL, header = TRUE, csv = FALSE, missing = "NA", ...)

`filename`	contains the name of the phenotype file. The file name needs to be in quotes. If the file is not in the working directory, then the full path to the file is required.
`header`	a logical value. When `TRUE`, the first row of the file contains the names of the columns. Default is `TRUE`.
`csv`	a logical value. When `TRUE`, a csv file format is assumed. When `FALSE`, a space separated format is assumed. Default is `FALSE`.
`missing`	the number or character that denotes a missing phenotype value. Default is `"NA"`.
`...`	arguments to be passed to `read.table`, such as `skip` and `sep`. See `read.table` for the list of arguments.
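
As a rough illustration of what the `csv` and `missing` arguments control, the sketch below reads a comma separated file with base R's `read.table`. It is not ReadPheno's implementation; the file contents and the missing-value code `-9` are made up for the example.

```r
# Hypothetical comma separated phenotype file that codes missing values as -9.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("y1,age,sex",
             "112.02,26,M",
             "-9,45,F"), csv_path)

# Base R analogue of csv = TRUE with missing = "-9":
# sep = "," selects the csv format, na.strings names the missing-value code.
pheno_csv <- read.table(csv_path, header = TRUE, sep = ",", na.strings = "-9")

# Going by the documented arguments, the corresponding Eagle call would be
# ReadPheno(filename = csv_path, csv = TRUE, missing = "-9")
print(pheno_csv)
```

The value -9 in the second row is read as `NA` rather than as a phenotype measurement.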

ReadPheno reads in the phenotype data, which are the data measured on traits and any fixed effects (also called predictors, features, or explanatory variables). Each row in the file corresponds to an individual. The number of rows in the phenotype file must be the same as the number of rows in the marker data file, and the individuals must be in the same order in both files. By default, a space separated plain text file with column headings is assumed; this can be changed with the header and csv options.
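
A quick way to check the row-count requirement before running an analysis is sketched below in base R. The two files are hypothetical stand-ins written to temporary paths, and the sketch assumes that the phenotype header row is not counted and that the marker file has one line per individual with no header.

```r
# Hypothetical phenotype file (with header) and marker file (no header).
pheno_path  <- tempfile()
marker_path <- tempfile()
writeLines(c("y1 age", "112.02 26", "156.44 45", "10.3 28"), pheno_path)
writeLines(c("0 1 2 0", "2 2 0 1", "1 0 0 2"), marker_path)

# Count individuals in each file.
n_pheno  <- nrow(read.table(pheno_path, header = TRUE))
n_marker <- length(readLines(marker_path))

# Stop early if the two files describe different numbers of individuals.
if (n_pheno != n_marker) {
  stop("phenotype file has ", n_pheno, " individuals but marker file has ", n_marker)
}
```

This only compares counts; it cannot confirm that the individuals are in the same order, which still has to be checked against how the files were produced.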

The phenotype file may contain multiple traits and fixed effects variables.

Missing values are allowed. Set the missing parameter to the value that Eagle should treat as missing.

For example, suppose we have three individuals for which we have collected data on two quantitative traits (y1 and y2), and four explanatory variables (age, weight, height, and sex). The data looks like

y1	y2	age	weight	height	sex
112.02	-3.123	26	75	168.5	M
156.44	1.2	45	102	NA	NA
10.3	NA	28	98	189.4	F

where the first row has the column headings and the next three rows contain the observed data on three individuals.

To load these data, we would use the command

pheno_obj <- ReadPheno(filename='pheno.dat', missing='NA')

where pheno.dat is the name of the phenotype file, and pheno_obj is the R object that contains the results from reading in the phenotype data. The file is located in the working directory, so there is no need to specify the full path; the file name alone suffices.
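
To see roughly what such a data frame looks like without installing Eagle, the three-individual example can be written to a temporary file and read with base R's `read.table`. This is only an analogue of the ReadPheno call above, not its implementation, and the temporary path stands in for pheno.dat.

```r
# Write the example phenotype data to a temporary space separated file.
pheno_path <- tempfile(fileext = ".dat")
writeLines(c("y1 y2 age weight height sex",
             "112.02 -3.123 26 75 168.5 M",
             "156.44 1.2 45 102 NA NA",
             "10.3 NA 28 98 189.4 F"), pheno_path)

# Read it back, treating the string "NA" as the missing-value code.
pheno_df <- read.table(pheno_path, header = TRUE, na.strings = "NA")

# Inspect column types and the positions of the missing values.
str(pheno_df)
```

The second individual has missing height and sex, and the third has a missing value for y2.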

Dealing with missing trait data

AM deals automatically with individuals with missing trait data. These individuals are removed from the analysis and a warning message is generated.

Dealing with missing fixed effects values

AM deals automatically with individuals with missing fixed effects values. These individuals are removed from the analysis and a warning message is generated.
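
To preview which individuals have missing values in the columns of interest, something like the following base R sketch can be used. It does not reproduce AM's internal logic; the data frame and the choice of y2 as the trait with age and height as fixed effects are assumptions for illustration.

```r
# Small stand-in for a phenotype data frame.
pheno_df <- data.frame(
  y1     = c(112.02, 156.44, 10.3),
  y2     = c(-3.123, 1.2, NA),
  age    = c(26, 45, 28),
  height = c(168.5, NA, 189.4)
)

# Hypothetical analysis: trait y2 with fixed effects age and height.
used_cols <- c("y2", "age", "height")

# Rows with at least one missing value among the columns used.
incomplete <- which(!complete.cases(pheno_df[, used_cols]))
print(incomplete)
```

Here the second individual is flagged because of the missing height and the third because of the missing trait value.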

A data frame of the phenotype data is returned. If header is TRUE, the names of the columns will be as specified by the first row of the phenotype file. If header is FALSE, generic names are supplied by R in the form of V1, V2, etc. If no column headings are given, these generic names will need to be used in the trait and fformula parameters in AM. You can print out the column names of the data frame by using

names(pheno_obj)

The column names are also printed along with other summary information when ReadPheno is run.
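
The generic names mentioned above are the ones base R's `read.table` assigns when `header = FALSE`. The sketch below shows this with a small made-up file; it uses `read.table` directly rather than ReadPheno.

```r
# Hypothetical phenotype file without a header row.
path <- tempfile()
writeLines(c("112.02 26 M",
             "156.44 45 F"), path)

# With header = FALSE, R supplies the column names V1, V2, ...
no_header <- read.table(path, header = FALSE)
print(names(no_header))
```

In this case the first column would be referred to as V1 wherever a column name is required.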

ReadMarker for reading in marker data, AM for performing association mapping.

# Read in phenotype data from ./extdata/

# Find the full location of the phenotype data
complete.name <- system.file('extdata', 'pheno.txt', package='Eagle')

pheno_obj <- ReadPheno(filename=complete.name)

# Print the first few rows of the phenotype data
head(pheno_obj)