compareProbes | R Documentation |
On Affymetrix Axiom arrays it is possible to have two probes interrogating the same SNP position. This function compares the dosage scores and checkF1 results of the two probes; if they are sufficiently similar, a new marker is generated that combines the results of the two probes. Two files can be written: a dosage file, in the same format as produced by writeDosagefile, with the data for the separate probes as well as the combined markers, and a file summarizing the comparison results.
compareProbes(chk, scores,
probe.suffix=c("P","Q","R"), fracdiff.threshold=0.04,
parent1, parent2, F1, ancestors=character(0), other=character(0),
polysomic=TRUE, disomic=FALSE, mixed=FALSE,
ploidy, ploidy2, qall_flavor="qall_mult", shiftParents,
compfile, combscorefile)
chk |
data frame as returned by checkF1, or a subset with at least
columns markername, parent1, parent2 (the consensus parental genotypes),
the columns for the samples specified by parameters parent1, parent2 and
ancestors, and bestParentfit, and containing only rows with selected markers.
If a column with a name as specified by qall_flavor (see below) is present
it will be written to file compfile, but it is not used: any selection of
markers based on qall (or other criteria) must have been made beforehand, and
the rows for the unwanted markers must have been deleted from the chk data frame. |
scores |
data frame as read from the scores file produced by function
fitMarkers of package fitPoly, with at least columns MarkerName,
SampleName, P0 .. P<ploidyF1> and geno (where <ploidyF1> is the ploidy of the
F1, i.e. the average of ploidy and ploidy2). |
probe.suffix |
a 3-item character vector specifying the suffixes of the marker names that distinguish the two probes. The first two items identify the two probes; the third item is used to indicate a new marker combining the data from both probes. The three items must be different and have the same number of characters; the default is c("P","Q","R") |
fracdiff.threshold |
if more than this fraction of F1 scores differs between the two probes, the probes are not combined |
parent1 |
character vector with the sample names of parent 1 |
parent2 |
character vector with the sample names of parent 2 |
F1 |
character vector with the sample names of the F1 individuals |
ancestors |
character vector with the sample names of any other ancestors |
other |
other samples that should be treated like the F1 |
polysomic |
TRUE or FALSE; should be the same as used by checkF1 to calculate the chk data frame |
disomic |
TRUE or FALSE; should be the same as used by checkF1 to calculate the chk data frame |
mixed |
TRUE or FALSE; should be the same as used by checkF1 to calculate the chk data frame |
ploidy |
the ploidy of parent 1 (must be even, 2 (diploid) or larger), and the same as used by checkF1 to calculate the chk data frame |
ploidy2 |
the ploidy of parent 2. If omitted it is assumed to be equal to ploidy. Should be the same as used by checkF1 to calculate the chk data frame |
qall_flavor |
which quality parameter column must be shown in compfile, default "qall_mult". If no quality data are wanted, specify "". |
shiftParents |
if there is a column shift in chk the F1 dosages will be
shifted. If shiftParents is TRUE the parents and ancestors will be shifted
together with the F1, if FALSE only the F1 will be shifted in that case. |
compfile |
filename for tab-separated text file summarizing the comparison results; if NA no file is written. For details of the contents see the return value, component compstat |
combscorefile |
filename for tab-separated text file with the dosages; if NA no file is written. For details of the contents see the return value, component combscores |
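The rules stated above for probe.suffix and for the columns of scores can be illustrated with a small base R sketch. The helpers valid_probe_suffix and split_probe_name, and the marker names, are hypothetical and made up for illustration; they are not part of the package, and the sketch assumes the probe suffix is the last part of the marker name.

```r
# Hypothetical helper (not part of the package): check a probe.suffix
# vector against the rules stated for that argument.
valid_probe_suffix <- function(probe.suffix) {
  is.character(probe.suffix) &&
    length(probe.suffix) == 3 &&
    !anyNA(probe.suffix) &&
    !anyDuplicated(probe.suffix) &&               # all three items different
    length(unique(nchar(probe.suffix))) == 1      # same number of characters
}

# Hypothetical helper: split marker names into the SNP name and the
# probe suffix, assuming the suffix forms the end of the marker name.
split_probe_name <- function(markername, probe.suffix = c("P", "Q", "R")) {
  stopifnot(valid_probe_suffix(probe.suffix))
  n <- nchar(probe.suffix[1])
  len <- nchar(markername)
  data.frame(
    snp = substr(markername, 1, len - n),
    suffix = substr(markername, len - n + 1, len),
    stringsAsFactors = FALSE
  )
}

# Made-up marker names, for illustration only
parts <- split_probe_name(c("snp001P", "snp001Q", "snp002P"))
# SNPs for which both probes are present
both <- intersect(parts$snp[parts$suffix == "P"],
                  parts$snp[parts$suffix == "Q"])

# Columns expected in scores for a 4x by 2x cross (F1 ploidy is the average)
ploidy <- 4
ploidy2 <- 2
ploidyF1 <- (ploidy + ploidy2) / 2
score_cols <- c("MarkerName", "SampleName", paste0("P", 0:ploidyF1), "geno")
```

In this sketch only snp001 has both a P and a Q probe, so it would be the only candidate for a combined R marker.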
A combined marker is made whenever a version of each of the two probe
markers is present and the two are sufficiently similar. This means
that they have been assigned the same bestParentfit segregation type by
checkF1, and that the frequency of conflicting scores over all samples is
not more than fracdiff.threshold. For each individual the combined marker gets
an NA score if both probe markers are missing, the one available score if
only one of the two probe markers is scored, the common score if both scores
are equal, and the score with the highest P-value if the scores of the two
probe markers are unequal.
Any single-probe markers in chk that do not have a bestParentfit segregation
type are ignored and will not affect or appear in the output.
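The combination rule described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the function combine_probe_scores and its inputs are hypothetical, each score is assumed to come with a single P-value, and the fraction of conflicts is computed here over all samples supplied.

```r
# Illustrative sketch only (hypothetical function, not from the package).
# genoP, genoQ: dosage scores of the two probes for the same samples (NA = missing)
# pP, pQ:       the P-value attached to each of those scores
combine_probe_scores <- function(genoP, genoQ, pP, pQ) {
  # Take the P score where available, otherwise the Q score
  # (this covers: both missing, one missing, both equal)
  combined <- ifelse(is.na(genoP), genoQ, genoP)
  # A conflict needs two non-missing, unequal scores
  conflict <- !is.na(genoP) & !is.na(genoQ) & genoP != genoQ
  # On conflict, switch to the Q score where it has the higher P-value
  takeQ <- which(conflict & pQ > pP)
  combined[takeQ] <- genoQ[takeQ]
  list(combined = combined, fracdiff = mean(conflict))
}

# Made-up scores for five samples
genoP <- c(0, 1, NA, 2, NA)
genoQ <- c(0, 2, 1, NA, NA)
pP <- c(0.99, 0.60, NA, 0.95, NA)
pQ <- c(0.98, 0.90, 0.97, NA, NA)

res <- combine_probe_scores(genoP, genoQ, pP, pQ)
fracdiff.threshold <- 0.04
do_combine <- res$fracdiff <= fracdiff.threshold
```

With these made-up data one of five samples conflicts, so the fraction of conflicts is well above the default threshold and the two probes would not be combined.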
A list with two components, compstat and combscores.
compstat is a data frame with columns:
MarkerName: name of the SNP marker. If a column shift is present in the chk data frame, the unshifted and shifted versions of a marker get an "n" or "s", respectively, appended to the MarkerName
segtypeP and segtypeQ: the segtype assigned by checkF1 to the first and second probe
qallP and qallQ: the quality scores specified by parameter qall_flavor, assigned by checkF1 to the two probes
countP and countQ: the number of versions of each of the probes (0, 1, or 2, depending on whether a shifted, unshifted or both versions were present)
countR: the number of combinations made of versions of the two probe markers (one for each combination of a version of each of the two probe markers, if they match well enough - see details)
If the chk data frame contains a column shift, there are separate columns for the non-shifted and shifted P and Q probe markers (suffix Pn, Ps, Qn, Qs), and four columns for the R markers (suffix Rnn, Rns, Rsn, Rss, where the first n/s indicates if the P probe was non-shifted or shifted and the second n/s does the same for the Q probe).
combscores is a data frame with columns:
MarkerName: the name of the marker. If the chk data frame contains a column shift, the P and Q marker names are suffixed with n or s, and the R marker names with nn, ns, sn, ss as described above
segtype: the segregation type
parental and ancestor samples: the dosages of those samples
parent1: the consensus dosage for parent1 as determined by checkF1
parent2: the consensus dosage for parent2 as determined by checkF1
F1 samples: the dosages for those samples
other samples: the dosages for those samples
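The suffixes used for shifted and non-shifted versions can be enumerated with a few lines of base R. This is only a sketch of the naming scheme described above; the SNP name is made up, and appending the n/s letters directly after the probe suffix is an assumption.

```r
probe.suffix <- c("P", "Q", "R")
versions <- c("n", "s")   # n = non-shifted, s = shifted

p_names <- paste0(probe.suffix[1], versions)   # versions of the P probe
q_names <- paste0(probe.suffix[2], versions)   # versions of the Q probe

# All combinations of a P version with a Q version; expand.grid varies
# its first column fastest, so q is listed first to get the order
# nn, ns, sn, ss (first letter for P, second letter for Q)
combos <- expand.grid(q = versions, p = versions, stringsAsFactors = FALSE)
r_names <- paste0(probe.suffix[3], combos$p, combos$q)

# Possible combined marker names for a made-up SNP
r_markers <- paste0("snp001", r_names)
```

At most four combined markers per SNP can result, one for each pairing of a P version with a Q version that match well enough.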