ReScanCNVs: ReScanCNVs

Description

ReScanCNVs: Find Copy Number Variation (CNV) from SNP genotyping arrays.

Usage

ReScanCNVs(CNVs = CNVs,
  PathRawData = "/media/NeoScreen/NeSc_home/ILMN/iPSYCH/", MINNumSNPs = 5,
  Cores = 1, hg = "hg19", NumFiles = "All", Pattern = "*",
  MinLength = 10, SelectedFiles = NA, Skip = 10, LCR = FALSE,
  PFB = NULL, chr = NA, penalty = 60, Quantile = FALSE,
  QSpline = FALSE, sd = 0.18, recursive = FALSE, CPTmethod = "meanvar",
  CNVSignal = 0.1, penvalue = 10, OutputPath = NA, IndxPos = FALSE,
  ResPerSample = FALSE, Files = NA, OnlyCNVs = TRUE, SNPList = NULL)

Arguments

CNVs:

Data frame with hotspots to re-scan. Minimum information necessary is chromosome (Chr.), Start and Stop position.

PathRawData:

The path to the raw data files containing Log R Ratio (LRR) and B Allele Frequency (BAF) values.

MINNumSNPs:

Minimum number of SNPs per CNV, default = 5.

Cores:

Number of cores used; default = 1.

hg:

Human genome version, default = hg19.

NumFiles:

Number of files to be analyzed from PathRawData.

Pattern:

File pattern in the PathRawData. Example: "*.txt".

MinLength:

Minimum CNV length, default = 10.

SelectedFiles:

List of file names that should be analyzed from PathRawData.

Skip:

Integer, the number of lines of the data file to skip before beginning to read data, default = 10.

LCR:

Low copy repeat (LCR) regions: a list of SNPs that should be removed, default = FALSE.

PFB:

Vector of population frequencies (0 to 1), one for each SNP in the array.

chr:

Character, select a specific chromosome to be analyzed.

penalty:

The coefficient of the penalty for degrees of freedom in the GCV criterion, as in smooth.spline from the stats package, default = 60.

Quantile:

Logical, if quantile normalization should be applied or not, default = FALSE.

QSpline:

Logical, if a cubic smoothing spline should be used to normalize the data, default = FALSE.

sd:

Numeric, LRR standard deviation (sd) for the quantile normalization, default = 0.18.

recursive:

Logical, should the listing recurse into directories? Same as in list.files from the base package, default = FALSE.

CPTmethod:

Character, method used to find change points, from the changepoint package by Rebecca Killick: "meanvar" (default) or "mean".

CNVSignal:

Numeric, minimum signal (in absolute value) for a region to be considered a CNV, default = 0.1. Any region whose mean LRR satisfies abs(X) < 0.1 is ignored.

penvalue:

Same as pen.value in the function cpt.mean from the changepoint R package by Rebecca Killick, default = 10. "The theoretical type I error e.g. 0.05 when using the Asymptotic penalty. A vector of length 2 (min,max) if using the CROPS penalty. The value of the penalty when using the Manual penalty option - this can be a numeric value or text giving the formula to use. Available variables are, n=length of original data, null=null likelihood, alt=alternative likelihood, tau=proposed changepoint, diffparam=difference in number of alternative and null parameters".

OutputPath:

Character, path for output.

OutputFileName:

Character, output file name.

OnlyCNVs:

Logical, if TRUE only CNVs with copy number state 0, 1, 3 or 4 are returned. If FALSE, change point regions with CN = 2 are returned as well.

IndxPos:

If the index position for each hotspot is not included, the function calculates it. This step is time consuming.

ResPerSample:

Logical, if TRUE the results from each sample are saved.

Files:

A vector with the name and path of every sample file. With many samples, listing all files using recursive=TRUE can take a long time.

SNPList:

Takes Chr. and Position from a source other than the RawFile. The input should be the full path of the SNPList, with columns Name, Chr, and Position. Any positions from the RawFile will be erased. A PFB column is also allowed, but it will be overwritten by the PFB parameter or replaced with 0.5.
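
As a minimal sketch of the input described for the CNVs argument, the block below builds a hotspot data frame by hand and checks it before a re-scan. The column names Chr, Start and Stop are assumed from the argument description. The coordinates and the helper check_hotspots are made up for illustration and are not part of iPsychCNV.

```r
# Hypothetical helper (not part of iPsychCNV): check that a hotspot table
# has the minimum columns described for the CNVs argument.
check_hotspots <- function(df, required = c("Chr", "Start", "Stop")) {
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  if (any(df$Stop <= df$Start)) {
    stop("Every hotspot needs Stop > Start")
  }
  invisible(TRUE)
}

# Made-up coordinates, for illustration only.
hotspots <- data.frame(
  Chr   = c("1", "15", "22"),
  Start = c(145000000, 22800000, 18900000),
  Stop  = c(146500000, 23100000, 21500000),
  stringsAsFactors = FALSE
)

hotspots_ok <- check_hotspots(hotspots)
```

A table that passes this kind of check could then be supplied as the CNVs argument. Whether ReScanCNVs needs further columns in practice is not stated here.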

Details

Specifically designed to handle noisy data from amplified DNA on phenylketonuria (PKU) cards. The function is a pipeline using many subfunctions.

Value

Data frame with predicted CNVs.
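
The effect that the OnlyCNVs and CNVSignal arguments are described as having on the returned data frame can be illustrated with base R on a made-up table. The column names CN and MeanLRR are assumptions for this sketch, not confirmed output columns of ReScanCNVs, and the filtering mirrors the argument descriptions rather than the package's internal code.

```r
# Made-up segments; CN and MeanLRR are hypothetical column names.
segments <- data.frame(
  SampleID = c("S1", "S1", "S2", "S3"),
  CN       = c(1, 2, 3, 1),
  MeanLRR  = c(-0.45, 0.02, 0.28, -0.05),
  stringsAsFactors = FALSE
)

CNVSignal <- 0.1

# OnlyCNVs = TRUE: keep copy number states 0, 1, 3, 4 and drop CN = 2.
only_cnvs <- segments[segments$CN %in% c(0, 1, 3, 4), ]

# CNVSignal: ignore segments whose mean LRR is below 0.1 in absolute value.
strong <- only_cnvs[abs(only_cnvs$MeanLRR) >= CNVSignal, ]
```

In this toy table the CN = 2 segment is dropped by the first filter, and the S3 segment is dropped by the second because its mean LRR is too close to zero.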

Author(s)

Marcelo Bertalan; Louise K. Hoeffding.

Source

http://biopsych.dk/iPsychCNV

Examples

library(iPsychCNV)
# Create mock data
MockDataCNVs <- MockData(N=100, Type="PKU", Cores=20)
# First-pass CNV prediction on the mock samples
iPsych.Pred <- iPsychCNV(PathRawData=".", MINNumSNPs=20, Cores=20, Pattern="^MockSample", MinLength=10, Skip=0)
# Find hotspots among the predicted CNVs
iPsych.Pred.hotspots <- HotspotsCNV(iPsych.Pred, Freq=2, OverlapCutoff=0.9, Cores=1)
# Re-scan the hotspots
iPsych.Pred.Rescan <- ReScanCNVs(iPsych.Pred.hotspots, PathRawData="./Data", hg="hg19", Pattern="*", Cores=20, Skip=0)
iPsych.Pred.Rescan$ID <- iPsych.Pred.Rescan$SampleID
# Plot the re-scanned CNVs
PlotAllCNVs(iPsych.Pred.Rescan, Name="iPsych.Pred.Rescan.png", hg="hg19", Roi=MockDataCNVs.roi)

mbertalan/iPsychCNV documentation built on May 22, 2019, 12:19 p.m.