R/extract_chrIDs.R

#Extract Chromosome IDs

#' @title Extract Chromosome IDs
#' @description Extracts the chromosome IDs from the meta information contained in the VCF file.
#'
#' @param meta Meta information of the VCF file, i.e. the first element of the list returned by the readBSA_vcf() function.
#'
#' @return Character vector of chromosome IDs.
#'
#' @details The meta information from the VCF file (stored in the first element of the list generated by readBSA_vcf()) is taken as input.
#'
#' The meta lines that contain both an ID and a length field, which distinguishes them from all other meta lines,
#' are extracted into a character vector.
#'
#' For each element of that vector, the characters before and after the chromosome ID are removed, so that each element contains only the characters of one chromosome ID.
#' The final character vector contains all chromosome IDs, named and ordered as they are in the VCF file.
#'
#' @export extract_chrIDs
#' @examples
#' # vcf_list is the list returned by readBSA_vcf()
#' chromList <- extract_chrIDs(meta = vcf_list$meta)


extract_chrIDs <- function(meta) {
  
  # Keep only the meta lines that contain both a chromosome ID and its length
  lengthLines <- grep("<ID.*length=", meta, value = TRUE)
  # Remove everything after the chromosome ID (from ",length" onwards)
  chrList <- sub(",length.*", "", lengthLines)
  # Remove everything before the chromosome ID (up to and including "ID=")
  chrList <- sub(".*ID=", "", chrList) # Now holds only the chromosome IDs
  
  return(chrList)
}
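
The three steps of the function can be traced on a small input. The sketch below is illustrative only: the `meta` vector is made up for this example (it does not come from a real VCF file or from readBSA_vcf()), and the steps are written inline so that the snippet runs on its own.

```r
# Hypothetical VCF meta lines, invented for illustration
meta <- c(
  "##fileformat=VCFv4.2",
  "##contig=<ID=Chr1,length=30427671>",
  "##contig=<ID=Chr2,length=19698289>",
  "##INFO=<ID=DP,Number=1,Type=Integer,Description=\"Total depth\">"
)

# Step 1: keep only the lines that carry both an ID and a length field
lengthLines <- grep("<ID.*length=", meta, value = TRUE)

# Step 2: drop everything from ",length" onwards
chrList <- sub(",length.*", "", lengthLines)

# Step 3: drop everything up to and including "ID="
chrList <- sub(".*ID=", "", chrList)

print(chrList)
```

With this input, the `##INFO` line is skipped because it has no length field, and the result holds "Chr1" and "Chr2" in the order in which they appear in `meta`. Note that step 2 assumes the length field directly follows the ID; a line with another field in between would keep that extra field in the result.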
EG-lisy/BSAvis documentation built on Dec. 17, 2021, 5:38 p.m.