check_annotation_file_completeness: Check whether an annotation file contains outlier lines


View source: R/check_annotation_file_completeness.R

Description

Some annotation files contain lines that are longer than 65000 characters. Such lines cause problems when the file is imported into R using import. To work around this, the function screens a given annotation file for these outlier lines and, if requested, removes them so that import can handle the file.
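The screening step can be sketched in base R as follows. This is only an illustration, not the package's implementation: the helper name find_long_lines and its max_chars argument are hypothetical, and the threshold of 65000 is taken from the description above.

```r
# Hypothetical helper (not part of homologr): return the indices of all
# lines in a text file that are longer than 'max_chars' characters.
find_long_lines <- function(path, max_chars = 65000) {
  lines <- readLines(path, warn = FALSE)
  which(nchar(lines) > max_chars)
}

# Self-contained demonstration on a small temporary file in which
# the second line is deliberately too long.
tmp <- tempfile(fileext = ".gff")
writeLines(
  c("chr1\tsrc\tgene\t1\t100", strrep("A", 70000), "chr1\tsrc\texon\t1\t50"),
  tmp
)
outliers <- find_long_lines(tmp)
```

Here outliers holds the line numbers that exceed the threshold, which can then be inspected before deciding whether to remove them.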

Usage

check_annotation_file_completeness(
  annotation_file,
  remove_annotation_outliers = FALSE
)

Arguments

annotation_file

a file path to the annotation file.

remove_annotation_outliers

a logical value indicating whether outlier lines shall be removed from the input annotation_file. If TRUE, the original annotation_file is overwritten and the removed outlier lines are stored in tempdir() for further exploration. Default is FALSE.
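The overwrite-and-keep-a-copy behaviour described for remove_annotation_outliers might look roughly like the sketch below. It is an assumption-laden illustration rather than the function's actual code: the helper remove_long_lines, the max_chars argument, and the "_outliers.txt" file name are all hypothetical.

```r
# Hypothetical helper (not homologr's code): drop over-long lines from a
# file in place and keep the dropped lines in tempdir() for inspection.
remove_long_lines <- function(path, max_chars = 65000) {
  lines <- readLines(path, warn = FALSE)
  is_outlier <- nchar(lines) > max_chars
  # Assumed naming scheme for the file that stores the removed lines.
  removed_path <- file.path(tempdir(), paste0(basename(path), "_outliers.txt"))
  if (any(is_outlier)) {
    writeLines(lines[is_outlier], removed_path)  # keep a copy of the outliers
    writeLines(lines[!is_outlier], path)         # overwrite the original file
  }
  invisible(list(n_removed = sum(is_outlier), removed_path = removed_path))
}

# Demonstration on a temporary file, so no real annotation file is touched.
demo_file <- tempfile(fileext = ".gff")
writeLines(c("short line 1", strrep("N", 70000), "short line 2"), demo_file)
res <- remove_long_lines(demo_file)
```

Because the original file is overwritten, working on a copy first is a sensible precaution when trying this on real data.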

Author(s)

Hajk-Georg Drost

Examples

## Not run: 
# download an example annotation file from NCBI RefSeq
Ath_path <- biomartr::getGFF(organism = "Arabidopsis thaliana")
# run annotation file check on the downloaded file
check_annotation_file_completeness(Ath_path)
# if outlier lines were detected, re-run the function with
# 'remove_annotation_outliers = TRUE' to remove them
# and overwrite the file
check_annotation_file_completeness(Ath_path, remove_annotation_outliers = TRUE)

## End(Not run)

drostlab/homologr documentation built on Sept. 28, 2020, 12:44 a.m.