readFSA
reads and processes raw .fsa files into R.
readFSA(files = NULL, path = "./", dye, lad.channel = 105, pretrim = NA,
        posttrim = ".fsa", ladder = c(35, 50, 75, 100, 139, 150, 160, 200, 250,
        300, 340, 350, 400, 450, 490, 500), SNR = 6000, ladder.check = 250,
        sizing = "local", bin.width = 1, min.peak.height = 50,
        baseline.width = 51, verbose = TRUE, smoothing = 3, CORES = 1)
files |
A list of fsa files to read. If NULL (the default), all
.fsa files in the directory specified by |
path |
The directory to search for |
dye |
A vector of dyes to include when reading data. Valid values include: "FAM", "VIC", "NED", "PET". |
lad.channel |
Which .fsa data channel has the size standard ladder data. The default is 105, which is the value for our system. |
pretrim |
A regexp - text to trim off the front of the sample names. |
posttrim |
A regexp - text to trim off the end of the sample names. |
ladder |
A vector with the fragments present in the ladder, in order. The default is the standard GS500(-250)LIZ ladder. |
SNR |
This is a cut-off value, used to exclude the primer-dimer spike at the beginning of the run from being erroneously interpreted as a ladder fragment. This spike is usually > 6000 rfus, and the true ladder peaks are usually (always?) well below this cut-off. Not setting this value may lead to slower and poorer ladder fitting. |
ladder.check |
If not null, the size of a ladder fragment that is present but not used for sizing. The size of this fragment will be estimated, and the estimate reported during scanning. Otherwise, it will be ignored. See below. |
sizing |
Currently two options are supported, "local" and "cubic". "local" provides the local Southern method, identical to the one used in PeakScanner et al., and recommended. "cubic" uses a cubic spline function. |
bin.width |
The width in basepairs of each bin. Used to tune the
peak-finding algorithm of the internal function |
min.peak.height |
The minimum rfu value to consider a true peak,
passed to |
baseline.width |
The width of the window to use when 'correcting'
the rfu intensity. Each rfu value will be corrected by having the
running minimum from a window |
verbose |
Do you want to see all the details scroll by or not?
|
smoothing |
This is a tuning value. If smoothing is > 1, the rfu values will be converted to the running mean of the actual values, with a window width of 'smoothing'. 3 seems to work nicely and is the default. 1 may be fine too. Even numbers or non-integer values may break the time-space continuum (untested). |
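The running mean described for smoothing can be illustrated in base R. This is only a sketch: it assumes a centred window and uses stats::filter as a stand-in, which may not match what readFSA does internally. The rfu vector is made-up data.

```r
# Hypothetical rfu trace (synthetic values, not real data)
rfu <- c(10, 12, 60, 300, 80, 15, 11, 9)
smoothing <- 3

# Centred running mean with equal weights; assumption: the package may
# handle the window and the ends of the trace differently.
# The first and last values have no full window and come back as NA.
smoothed <- as.numeric(
  stats::filter(rfu, rep(1 / smoothing, smoothing), sides = 2)
)

smoothed
```

An odd width such as 3 can be centred on each time step, while an even width cannot, which is one reason to stay with odd integer values here.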
pretrim and posttrim are regexps, passed to grep. The substring at the front of each rowname matching pretrim (or at the end for posttrim) is removed. To cancel trimming, set these to NA.
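The effect of the two trimming patterns can be previewed before reading any files. The sketch below uses base R's sub() with anchored patterns; trim_names is a hypothetical helper, not part of the package, and the sample names are invented to resemble the patterns used in the examples.

```r
# Hypothetical sample names, shaped like the ones in the examples section
sample_names <- c("AFLP_run1_AFLP_sample07-5_Frag_A01.fsa",
                  "AFLP_run1_AFLP_sample12-5_Frag_B03.fsa")

# Hypothetical helper (assumption: readFSA may implement trimming differently)
trim_names <- function(x, pretrim = NA, posttrim = NA) {
  # Remove a match anchored at the front of the name
  if (!is.na(pretrim)) x <- sub(paste0("^", pretrim), "", x)
  # Remove a match anchored at the end of the name
  if (!is.na(posttrim)) x <- sub(paste0(posttrim, "$"), "", x)
  x
}

trimmed <- trim_names(sample_names,
                      pretrim = "AFLP_.*AFLP_",
                      posttrim = "-5_Frag.*")
trimmed
```

Leaving both arguments at NA returns the names unchanged, which mirrors the way trimming is cancelled in readFSA.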
ladder.check: In the standard ladder GS500, the 250bp fragment commonly migrates at an odd rate, making it inappropriate for use in sizing. Setting ladder.check = 250, which is the default, will exclude this fragment from the sizing process. Set ladder.check = NA if you want to use all the peaks in ladder in sizing the data.
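To make the idea of holding out the check fragment concrete, here is a rough sketch using a cubic spline, in the spirit of sizing = "cubic". It is not the package's code: stats::splinefun is a stand-in, the scan positions are synthetic, and the names scan_pos, use, size_of and check_estimate are hypothetical.

```r
# Default ladder from the usage above
ladder <- c(35, 50, 75, 100, 139, 150, 160, 200, 250,
            300, 340, 350, 400, 450, 490, 500)
ladder.check <- 250

# Hypothetical scan positions of the ladder peaks (synthetic, increasing)
scan_pos <- 1000 + 10 * ladder + 0.002 * ladder^2

# Use every fragment except the check fragment; with ladder.check = NA
# all fragments are used
use <- is.na(ladder.check) | ladder != ladder.check

# Interpolating cubic spline mapping scan position to size in bp
# (assumption: a stand-in for whatever readFSA fits internally)
size_of <- stats::splinefun(scan_pos[use], ladder[use])

# Estimate the size of the held-out fragment from its peak position
check_estimate <- size_of(scan_pos[!use])
check_estimate
```

In this synthetic case the held-out peak sits exactly on the smooth curve, so the estimate lands close to 250. With real data, a fragment that migrates oddly would show up as an estimate that departs from its nominal size.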
readFSA returns an object of class fsa. The elements include:
A list of electropherogram objects, each corresponding to one fsa file (see below.)
A list of the data channels (fluorescent dyes) read.
A data frame recording the total area under the curve used for each dye/sample combination, used for normalizing results.
If present, a vector of sample names for all samples that produced unsatisfactory sizing results. Most likely bad reactions that should be removed.
electropherogram objects have three components:
A data frame, the columns of which are the heights (in RFUs) of each dye, including the size standard, for each time step in the capillary run. The data is ordered, with the first reads at the beginning of the table. There is an additional column, ‘bp’, which stores the size, in base pairs, of each row in the table.
A list of vectors, each of which contains the position of the peaks for each dye in the electropherogram, in base pairs.
The original sample name for the fsa file.
Tyler Smith
fsaNormalize, fsa2PeakTab, plot.fsa, fsaRGbin, binSet, scanGel
## Not run:
## A set of fsa files is included in this package, which you can read
## with the following example. For your own data replace
## system.file(...) with the path to your fsa files.
## Read the raw files:
## pretrim and posttrim are optional, and serve only to remove
## extraneous components of the sample name added by the sequencing
## lab.
## Note that I've deliberately included a bad sample, which takes
## considerably longer to process than clean reads.
fsa.data <- readFSA(path = system.file("pp5", package = "binner"),
pretrim = "AFLP_.*AFLP_", posttrim = "-5_Frag.*",
dye = "FAM")
## The print function for fsa objects doesn't do much yet:
fsa.data
summary(fsa.data)
## Plot the second sample, which has a nice, clean ladder
plot(fsa.data, 2)
## Plot the bad sample, note the funky ladder
plot(fsa.data, fsa.data$errors[1])
## Kill it! KILL IT WITH FIRE!!
fsa.data <- fsaDrop(fsa = fsa.data, epn = fsa.data$errors[1])
fsa.data
summary(fsa.data)
## Normalize the electropherograms
fsa.norm <- fsaNormalize(fsa.data)
## Plot the second sample again, note the peak heights (y-axis) have
## changed, but otherwise this plot is identical to the first plot
## above.
plot(fsa.norm, 2)
## Convert the electropherograms into a peak table
peaktab <- fsa2PeakTab(fsa.norm, dye = "FAM")
head(peaktab)
## Binning:
bins <- fsaRGbin(peaktab)
## Review the bins:
scanGel(peaktab, bins)
aflp <- binSet(peaktab, bins, pref = "A")
## Extract the scoring data and proceed with analysis:
mydata <- aflp[, , "alleles"]
## See scanGel() for additional examples
## End(Not run)