View source: R/makeGenotypes.R
Convert columns in a dataframe to genotypes or haplotypes.
makeGenotypes(data, convert, sep = "/", tol = 0.5, ..., method = as.genotype)
makeHaplotypes(data, convert, sep = "/", tol = 0.9, ...)
data | Dataframe containing columns to be converted
convert | Vector or list of pairs specifying which columns contain genotype/haplotype data. See below for details.
sep | Genotype separator
tol | Minimum fraction of records in a column that must contain sep for the column to be converted when convert is not provided. See below.
... | Optional arguments passed to the conversion function (as.genotype by default)
method | Function used to perform the conversion.
The functions makeGenotypes and makeHaplotypes allow the conversion of all of the genetic variables in a dataset to genotypes or haplotypes in a single step.
The parameter convert may be missing, a vector of column names, indexes or true/false indicators, or a list of column name or index pairs.
When the argument convert is not provided, the function will look for columns where at least tol*100% of the records contain the separator character sep ('/' by default). These columns will then be assumed to contain both of the genotype/haplotype alleles and will be converted in-place to genotype variables.
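The detection rule above can be sketched in base R. This is only an illustration of the rule as described, not the package's implementation: looks_like_genotype is a hypothetical helper, and treating missing values as records without the separator is an assumption.

```r
# Hypothetical helper (not part of the genetics package): returns TRUE when
# at least tol*100% of the records in x contain the separator sep.
looks_like_genotype <- function(x, sep = "/", tol = 0.5) {
  # Only text-like columns are candidates in this sketch.
  if (!(is.character(x) || is.factor(x))) return(FALSE)
  # grepl() returns FALSE for NA, so missing records count as "no separator"
  # (an assumption of this sketch).
  has_sep <- grepl(sep, as.character(x), fixed = TRUE)
  mean(has_sep) >= tol
}

toy <- data.frame(
  G1      = c("A/T", "T/T", NA, "T/A"),
  N1      = c(0.1, 0.2, 0.3, 0.4),
  comment = c("Possible Bad Data/Lab Error", "", "", ""),
  stringsAsFactors = FALSE
)

# One logical value per column: TRUE marks a column that would be converted.
detected <- vapply(toy, looks_like_genotype, logical(1))
detected
```

In this toy data only G1 passes the default threshold: the comment column contains the separator in one record out of four, which is below tol = 0.5.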
When the argument convert
is a vector of column names, indexes
or true/false indictators, the corresponding columns will be assumed
to contain both of the genotype/haplotype alleles and will be
converted in-place to genotype variables.
When the argument convert is a list containing column name or index pairs, the two elements of each pair will be assumed to contain the individual alleles of a genotype/haplotype. The first column specified in each pair will be replaced with the new genotype/haplotype variable named name1 + sep + name2. The second column will be removed.
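The column handling for a single pair can be sketched in base R as follows. combine_pair is a hypothetical helper written for illustration: it produces plain character strings rather than genotype/haplotype objects, and setting the result to NA when either allele is missing is an assumption.

```r
# Hypothetical helper (not part of the genetics package): merges the two
# allele columns named (or indexed) by `pair` into a single column.
combine_pair <- function(data, pair, sep = "/") {
  cols <- names(data[pair])                # works for names or indexes
  allele1 <- as.character(data[[cols[1]]])
  allele2 <- as.character(data[[cols[2]]])
  combined <- paste(allele1, allele2, sep = sep)
  # Assumption: a record with a missing allele becomes NA.
  combined[is.na(allele1) | is.na(allele2)] <- NA
  # Replace the first column and rename it to name1 + sep + name2.
  data[[cols[1]]] <- combined
  names(data)[names(data) == cols[1]] <- paste(cols[1], cols[2], sep = sep)
  # Remove the second column.
  data[[cols[2]]] <- NULL
  data
}

toy2 <- data.frame(
  Tmt  = c("Control", "Trt1", "Trt2"),
  G1.1 = c("A", "T", NA),
  G1.2 = c("T", "T", "A"),
  stringsAsFactors = FALSE
)
out <- combine_pair(toy2, c("G1.1", "G1.2"))
out
```

The result keeps Tmt unchanged and has one new column named "G1.1/G1.2" in the position of the first allele column.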
Note that the method argument may be used to supply a non-standard conversion function, such as as.genotype.allele.count, which converts allele counts in {0, 1, 2} to the genotypes 'A/A', 'A/B' and 'B/B' (or the specified allele names). See the example below.
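As a rough picture of what an allele-count conversion does, the base-R sketch below maps counts to genotype strings. count_to_genotype is a hypothetical helper, and reading the count as the number of copies of the second allele is an assumption; consult the documentation of as.genotype.allele.count for the coding it actually uses.

```r
# Hypothetical helper (not part of the genetics package).
# Assumption: `count` is the number of copies of the second allele.
count_to_genotype <- function(count, alleles = c("A", "B")) {
  stopifnot(all(count %in% c(0, 1, 2, NA)))
  lookup <- c(
    paste(alleles[1], alleles[1], sep = "/"),  # count 0
    paste(alleles[1], alleles[2], sep = "/"),  # count 1
    paste(alleles[2], alleles[2], sep = "/")   # count 2
  )
  # Indexing with NA yields NA, so missing counts stay missing.
  lookup[count + 1]
}

genos <- count_to_genotype(c(0, 1, 2, NA))
genos
```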
Dataframe containing converted genotype/haplotype variables. All other variables will be unchanged.
Gregory R. Warnes greg@warnes.net
## Not run:
# common case
data <- read.csv(file="genotype_data.csv")
data <- makeGenotypes(data)
## End(Not run)
# Create a test data set where there are several genotypes in columns
# of the form "A/T".
test1 <- data.frame(Tmt=sample(c("Control","Trt1","Trt2"),20, replace=TRUE),
G1=sample(c("A/T","T/T","T/A",NA),20, replace=TRUE),
N1=rnorm(20),
I1=sample(1:100,20,replace=TRUE),
G2=paste(sample(c("134","138","140","142","146"),20,
replace=TRUE),
sample(c("134","138","140","142","146"),20,
replace=TRUE),
sep=" / "),
G3=sample(c("A /T","T /T","T /A"),20, replace=TRUE),
comment=sample(c("Possible Bad Data/Lab Error",""),20,
replace=TRUE)
)
test1
# now automatically convert genotype columns
geno1 <- makeGenotypes(test1)
geno1
# Create a test data set where there are several haplotypes with alleles
# in adjacent columns.
test2 <- data.frame(Tmt=sample(c("Control","Trt1","Trt2"),20, replace=TRUE),
G1.1=sample(c("A","T",NA),20, replace=TRUE),
G1.2=sample(c("A","T",NA),20, replace=TRUE),
N1=rnorm(20),
I1=sample(1:100,20,replace=TRUE),
G2.1=sample(c("134","138","140","142","146"),20,
replace=TRUE),
G2.2=sample(c("134","138","140","142","146"),20,
replace=TRUE),
G3.1=sample(c("A ","T ","T "),20, replace=TRUE),
G3.2=sample(c("A ","T ","T "),20, replace=TRUE),
comment=sample(c("Possible Bad Data/Lab Error",""),20,
replace=TRUE)
)
test2
# specify the locations of the columns to be paired for haplotypes
makeHaplotypes(test2, convert=list(c("G1.1","G1.2"),6:7,8:9))
# Create a test data set where the data is coded as numeric allele
# counts (0-2).
test3 <- data.frame(Tmt=sample(c("Control","Trt1","Trt2"),20, replace=TRUE),
G1=sample(c(0:2,NA),20, replace=TRUE),
N1=rnorm(20),
I1=sample(1:100,20,replace=TRUE),
G2=sample(0:2,20, replace=TRUE),
comment=sample(c("Possible Bad Data/Lab Error",""),20,
replace=TRUE)
)
test3
# specify the locations of the columns, and a non-standard conversion
makeGenotypes(test3, convert=c('G1','G2'), method=as.genotype.allele.count)