DupChecker: a package for checking high-throughput genomic data redundancy in meta-analysis
Version 1.14.0

Meta-analysis has become a popular approach for high-throughput genomic data analysis because it can often substantially increase the power to detect biological signals or patterns in datasets. However, when publicly available databases are used for meta-analysis, duplicated samples are a frequently encountered problem, especially for gene expression data. Leaving duplicates in place makes study results questionable. We developed DupChecker, a Bioconductor package that efficiently identifies duplicated samples by generating MD5 fingerprints for raw data files.
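
The idea of fingerprinting raw data files can be illustrated with base R alone. The following is a minimal sketch, not DupChecker's own code: `find_duplicate_files()` is a hypothetical helper introduced here for illustration, and it relies only on `tools::md5sum()`, which ships with R. Files with identical content produce identical MD5 digests, so any digest that appears more than once marks a group of duplicated files.

```r
# Sketch: flag files with byte-identical content by comparing MD5 digests.
# find_duplicate_files() is a hypothetical helper, not a DupChecker function.
find_duplicate_files <- function(paths) {
  # tools::md5sum() returns one hex digest per file (NA if a file is unreadable)
  digests <- unname(tools::md5sum(paths))
  readable <- !is.na(digests)
  paths <- paths[readable]
  digests <- digests[readable]
  # Keep every file whose digest occurs more than once
  dup_digests <- unique(digests[duplicated(digests)])
  is_dup <- digests %in% dup_digests
  data.frame(file = paths[is_dup], md5 = digests[is_dup],
             stringsAsFactors = FALSE)
}

# Toy demonstration: three temporary files, two of which share the same content.
demo_dir <- tempfile("dupdemo")
dir.create(demo_dir)
demo_files <- file.path(demo_dir, c("a.cel", "b.cel", "c.cel"))
writeLines("sample one", demo_files[1])
writeLines("sample two", demo_files[2])
writeLines("sample one", demo_files[3])

dups <- find_duplicate_files(demo_files)
print(basename(dups$file))  # a.cel and c.cel share a digest
```

Comparing digests rather than file names matters here because the same raw file can be deposited under different sample identifiers in different datasets.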

Author: Quanhu Sheng, Yu Shyr, Xi Chen
Bioconductor views: Preprocessing
Date of publication: None
Maintainer: "Quanhu SHENG" <shengqh@gmail.com>
License: GPL (>= 2)
Version: 1.14.0
Package repository: Bioconductor
Installation: Install the latest version of this package by entering the following in R:
source("https://bioconductor.org/biocLite.R")
biocLite("DupChecker")
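
The `biocLite()` script shown above matches the Bioconductor release this page was built against. Later Bioconductor releases replaced it with the `BiocManager` package, so on a newer R installation the equivalent would look roughly like the sketch below. This assumes DupChecker is still distributed for the Bioconductor release you are using, which this page does not confirm.

```r
# Sketch for newer R/Bioconductor setups; assumes DupChecker is available
# in the Bioconductor release that matches your R version.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("DupChecker")
```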

Man pages

arrayExpressDownload: arrayExpressDownload
buildFileTable: buildFileTable
geoDownload: geoDownload
validateFile: validateFile
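
The four documented functions suggest a download, index, validate workflow. The sketch below shows how they might be chained. It is an illustration under stated assumptions rather than the package's documented usage: the wrapper `check_geo_duplicates()` is hypothetical, and the argument names (`datasets`, `targetDir`, `rootDir`) and result fields (`hasdup`, `duptable`) are assumptions that should be checked against the man pages before use.

```r
# Sketch only: check_geo_duplicates() is a hypothetical wrapper. The argument
# names and the result fields used below are assumptions, not confirmed by
# this page; verify them against the DupChecker man pages.
check_geo_duplicates <- function(datasets, target_dir = getwd()) {
  if (!requireNamespace("DupChecker", quietly = TRUE)) {
    stop("DupChecker is not installed")
  }
  # Download the raw data for each GEO series into target_dir
  DupChecker::geoDownload(datasets = datasets, targetDir = target_dir)
  # Index the downloaded raw data files
  file_table <- DupChecker::buildFileTable(rootDir = target_dir)
  # Fingerprint the files and look for duplicates
  result <- DupChecker::validateFile(file_table)
  # Return the duplication table if any duplicates were found, otherwise NULL
  if (isTRUE(result$hasdup)) result$duptable else NULL
}
```

Nothing is downloaded until the function is called with a vector of dataset accessions. For ArrayExpress accessions, `arrayExpressDownload` would presumably take the place of `geoDownload`.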

Functions

arrayExpressDownload: man page, source code
buildFileTable: man page, source code
deleteFileAndMd5: source code
deleteFilesAndMd5: source code
geoDownload: man page, source code
getFtpFilenames: source code
lappend: source code
validateFile: man page, source code

Files

DESCRIPTION
NAMESPACE
R
R/DupChecker.R
README.md
build
build/vignette.rds
inst
inst/CITATION
inst/doc
inst/doc/DupChecker.R
inst/doc/DupChecker.Rnw
inst/doc/DupChecker.pdf
man
man/arrayExpressDownload.Rd
man/buildFileTable.Rd
man/geoDownload.Rd
man/validateFile.Rd
vignettes
vignettes/DupChecker.Rnw

DupChecker documentation built on May 20, 2017, 10:17 p.m.