renameSequenceFiles: renames sequence files in a directory


View source: R/Format_sequenceData_ENA.R

Description

This function renames the files in the given directory according to the information provided in name_df.

Usage

renameSequenceFiles(name_df, colNameOld="OldName", 
  colNameNew="NewName", file.dir=NULL, paired=TRUE, 
  seq.file.extension=".fastq.gz", ask.input=TRUE, 
  pairedEnd.extension=c("_1", "_2"))

Arguments

name_df

a data.frame. The data frame that lists the original file names (without file extension) alongside the new file names (also without file extension)

colNameOld

a character string. The name of the column with the original file names

colNameNew

a character string. The name of the column with the new file names

file.dir

a character string. The path to the directory where the sequence files are stored

paired

boolean. Whether the sequence files are paired-end (forward _1, reverse _2) or single-end

seq.file.extension

a character string. The file-extension of the sequence files

ask.input

boolean. If TRUE, a warning message is given before continuing.

pairedEnd.extension

a character vector of length 2. If the data is paired-end, specify the forward (first element of the vector) and reverse (second element) extension tags here. Default is c("_1", "_2")
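
Taken together, the arguments suggest that each file name is read as base name, then paired-end tag, then file extension. The sketch below illustrates that reading in base R by deriving the base names from a list of file names and pairing them with new names in a name_df-style table. The file names and the variables files, base and stripped are illustrative assumptions. This is not the package's own code, and it does not use FileNames.to.Table.

```r
# Hypothetical file names, as a sequencing facility might deliver them
files <- c("seq_sample1_1.fastq.gz", "seq_sample1_2.fastq.gz",
           "seq_sample2_1.fastq.gz", "seq_sample2_2.fastq.gz")
seq.file.extension <- ".fastq.gz"
pairedEnd.extension <- c("_1", "_2")

# Keep only the files that carry the expected extension, then strip it
files <- files[endsWith(files, seq.file.extension)]
base <- substr(files, 1, nchar(files) - nchar(seq.file.extension))

# Strip at most one paired-end tag from the end of each name
stripped <- base
for (tag in pairedEnd.extension) {
  has_tag <- endsWith(base, tag)
  stripped[has_tag] <- substr(base[has_tag], 1, nchar(base[has_tag]) - nchar(tag))
}

# One row per sample: old base name next to the desired new name
name_df <- data.frame(OldName = unique(stripped),
                      NewName = c("s1renamed", "s2renamed"),
                      stringsAsFactors = FALSE)
print(name_df)
```

Both name columns are stored without the paired-end tag and without the file extension, which matches the description of name_df above.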

Details

FASTQ files from sequencing facilities often come with long, complex file names that were generated automatically by the sequencer and no longer resemble the original sample names. This function is part of a set of tools that retrieve file names and convert them back to the original sample names. It relies on a table in which each sequence file name is linked to the new name desired by the user. This table can be generated with the FileNames.to.Table function.
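
The table-driven renaming described here can be approximated in base R as follows. This is a minimal sketch under stated assumptions, not the implementation of renameSequenceFiles: the helper rename_sketch is hypothetical, it has no ask.input prompt, and it silently skips files that are missing or whose target name already exists. The demo works on empty placeholder files in a temporary directory.

```r
# Hypothetical helper (not part of OmicsMetaData): renames files listed in name_df
rename_sketch <- function(name_df, file.dir, colNameOld = "OldName",
                          colNameNew = "NewName", paired = TRUE,
                          seq.file.extension = ".fastq.gz",
                          pairedEnd.extension = c("_1", "_2")) {
  # Single-end data has no tag between base name and extension
  tags <- if (paired) pairedEnd.extension else ""
  n_renamed <- 0
  for (i in seq_len(nrow(name_df))) {
    for (tag in tags) {
      old_path <- file.path(file.dir,
                            paste0(name_df[[colNameOld]][i], tag, seq.file.extension))
      new_path <- file.path(file.dir,
                            paste0(name_df[[colNameNew]][i], tag, seq.file.extension))
      # Only rename when the source exists and the target would not be overwritten
      if (file.exists(old_path) && !file.exists(new_path)) {
        if (file.rename(old_path, new_path)) n_renamed <- n_renamed + 1
      }
    }
  }
  n_renamed
}

# Demo on empty placeholder files in a fresh temporary directory
demo_dir <- tempfile("rename_sketch_")
dir.create(demo_dir)
invisible(file.create(file.path(demo_dir,
  c("seq_sample1_1.fastq.gz", "seq_sample1_2.fastq.gz",
    "seq_sample2_1.fastq.gz", "seq_sample2_2.fastq.gz"))))

name_df <- data.frame(OldName = c("seq_sample1", "seq_sample2"),
                      NewName = c("s1renamed", "s2renamed"),
                      stringsAsFactors = FALSE)
n <- rename_sketch(name_df, demo_dir)
print(n)
print(sort(list.files(demo_dir)))
```

Refusing to overwrite an existing target is a design choice of this sketch; whether the package function behaves the same way is not stated in this documentation.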

Value

the number of files changed.

Author(s)

Maxime Sweetlove

See Also

Other data archiving functions: FileNames.to.Table(), get.ENAName(), prep.metadata.ENA(), sync.metadata.sequenceFiles()

Examples

## Not run: 
# file_dir is assumed to hold the path to the directory with the sequence files
fileNamesTable_rename <- data.frame(NewName=c("s1renamed", "s2renamed"), 
                                    OldName=c("seq_sample1", "seq_sample2"))
renameSequenceFiles(name_df=fileNamesTable_rename, colNameOld="OldName", 
                    colNameNew="NewName", file.dir=file_dir, paired=TRUE, 
                    seq.file.extension=".fastq.gz", ask.input=TRUE, 
                    pairedEnd.extension=c("_1", "_2"))

## End(Not run)

biodiversity-aq/OmicsMetaData documentation built on Dec. 19, 2021, 9:44 a.m.