R/readWombatNormFile.R
In wrProteo: Proteomics Data Analysis Functions

Documented in readWombatNormFile

#' Read (Normalized) Quantitation Data Files Produced By Wombat At Protein Level
#'
#' Protein quantification results from \href{https://github.com/wombat-p}{Wombat-P} using the Bioconductor package Normalizer can be read using this function and relevant information extracted.
#' Input files compressed as .gz can be read as well.
#' The protein abundance values (XIC), peptide counting get extracted. Since protein annotation is not very extensive with this format of data, the function allows reading the
#' initial fasta files (from the directory above the quantitation-results) allowing to extract more protein-annotation (like species).
#' Sample-annotation (if available) can be extracted from  sdrf files, which are typically part of the Wombat output, too.
#' The protein abundance values may be normalized using multiple methods (median normalization as default), the determination of normalization factors can be restricted to specific proteins
#' (normalization to bait protein(s), or to invariable matrix of spike-in experiments).
#' The protein annotation data gets parsed to extract specific fields (ID, name, description, species ...).
#' Besides, a graphical display of the distribution of protein abundance values may be generated before and after normalization.
#'
#' @details
#' By standard workflow of Wombat-P writes the results of each analysis-method/quantification-algorithm as .csv files
#' Meta-data describing the proteins may be available from two sources :
#' a) The 1st column of the Wombat/normalizer output.
#' b) Form the .fasta file in the directory above the analysis/quantiication results of the Wombar-workflow
#'
#'
#' Meta-data describing the samples and experimental setup may be available from a sdrf-file (from the directory above the analysis/quantiication results)
#' If available, the meta-data will be examined for determining groups of replicates and
#' the results thereof can be found in $sampleSetup$levels.
#' Alternatively, a dataframe formatted like sdrf-files (ie for each sample a separate line, see also function \code{readSdrf}) may be given, too.
#'
#' This import-function has been developed using Wombat-P version 1.x.
#' The final output is a list containing these elements: \code{$raw}, \code{$quant}, \code{$annot}, \code{$counts}, \code{$sampleSetup}, \code{$quantNotes}, \code{$notes}, or (if \code{separateAnnot=FALSE}) data.frame
#'   with annotation- and main quantification-content. If \code{sdrf} information has been found, an add-tional list-element \code{setup}
#' will be added containg the entire meta-data as \code{setup$meta} and the suggested organization as \code{setup$lev}.
#'
#'
#' @param fileName (character) name of file to be read (default 'proteinGroups.txt' as typically generated by Compomics in txt folder). Gz-compressed files can be read, too.
#' @param path (character) path of file to be read
#' @param quantSoft (character) qunatification-software used inside Wombat-P
#' @param fasta (logical or character) if \code{TRUE} the (first) fasta from one direcory higher than \code{fileName} will be read as fasta-file to extract further protein annotation;
#'    if \code{character} a fasta-file at this location will be read/used/
#' @param isLog2 (logical) typically data read from Wombat are expected to be \code{isLog2=TRUE}
#' @param normalizeMeth (character) normalization method, defaults to \code{median}, for more details see \code{\link[wrMisc]{normalizeThis}})
#' @param quantCol (character or integer) exact col-names, or if length=1 content of \code{quantCol} will be used as pattern to search among column-names for $quant using \code{grep}
#' @param contamCol (character or integer, length=1) which columns should be used for contaminants
#' @param pepCountCol (character) pattern to search among column-names for count data (1st entry for 'Razor + unique peptides', 2nd fro 'Unique peptides', 3rd for 'MS.MS.count' (PSM))
#' @param read0asNA (logical) decide if initial quntifications at 0 should be transformed to NA (thus avoid -Inf in log2 results)
#' @param sampleNames (character) custom column-names for quantification data; this argument has priority over \code{suplAnnotFile}
#' @param extrColNames (character) column names to be read (1st position: prefix for LFQ quantitation, default 'LFQ.intensity'; 2nd: column name for protein-IDs, default 'Majority.protein.IDs'; 3rd: column names of fasta-headers, default 'Fasta.headers', 4th: column name for number of protein IDs matching, default 'Number.of.proteins')
#' @param specPref (character) prefix to identifiers allowing to separate i) recognize contamination database, ii) species of main identifications and iii) spike-in species
#' @param refLi (character or integer) custom specify which line of data should be used for normalization, ie which line is main species; if character (eg 'mainSpe'), the column 'SpecType' in $annot will be searched for exact match of the (single) term given
#' @param remRev (logical) option to remove all protein-identifications based on reverse-peptides
#' @param remConta (logical) option to remove all proteins identified as contaminants
#' @param separateAnnot (logical) if \code{TRUE} output will be organized as list with \code{$annot}, \code{$abund} for initial/raw abundance values and \code{$quant} with final normalized quantitations
#' @param gr (character or factor) custom defined pattern of replicate association, will override final grouping of replicates from \code{sdrf} and/or \code{suplAnnotFile} (if provided)   \code{}
#' @param sdrf (logical, character, list or data.frame) optional extraction and adding of experimenal meta-data:
#'   if \code{sdrf=TRUE} the 1st sdrf in the directory above \code{fileName} will be used
#'   if character, this may be the ID at ProteomeExchange,
#'   the second element may give futher indicatations for automatic organization of groups of replicates.
#'   Besides, the output from \code{readSdrf} or a list from \code{defineSamples} may be provided; if \code{gr} is provided, \code{gr} gets priority for grouping of replicates
#' @param suplAnnotFile (logical or character) optional reading of supplemental files produced by Compomics; if \code{gr} is provided, it gets priority for grouping of replicates
#'  if \code{TRUE} default to files 'summary.txt' (needed to match information of \code{sdrf}) and 'parameters.txt' which can be found in the same folder as the main quantitation results;
#'  if \code{character} the respective file-names (relative ro absolute path), 1st is expected to correspond to 'summary.txt' (tabulated text, the samples as given to Compomics) and 2nd to 'parameters.txt' (tabulated text, all parameters given to Compomics)
#' @param groupPref (list) additional parameters for interpreting meta-data to identify structure of groups (replicates), will be passed to \code{readSampleMetaData}.
#'   May contain \code{lowNumberOfGroups=FALSE} for automatically choosing a rather elevated number of groups if possible (defaults to low number of groups, ie higher number of samples per group)
#'   May contain \code{chUnit} (logical or character) to be passed to \code{readSampleMetaData()} for (optional) adjustig of unit-prefixes in meta-data group labels, in case multiple different unit-prefixes 
#'   are used (eg '100pMol' and '1nMol').
#' @param plotGraph (logical) optional plot vioplot of initial and normalized data (using \code{normalizeMeth}); alternatively the argument may contain numeric details that will be passed to \code{layout} when plotting
#' @param titGraph (character) custom title to plot of distribution of quantitation values
#' @param wex (numeric)  relative expansion factor of the violin in plot
#' @param silent (logical) suppress messages
#' @param debug (logical) additional messages for debugging
#' @param callFrom (character) allow easier tracking of messages produced
#' @return This function returns a list with  \code{$raw} (initial/raw abundance values), \code{$quant} with final normalized quantitations, \code{$annot} (columns ), \code{$counts} an array with 'PSM' and 'NoOfRazorPeptides',
#'   \code{$quantNotes}, \code{$notes} and optional \code{setup} for meta-data from \code{sdrf}; or a data.frame with quantitation and annotation if \code{separateAnnot=FALSE}
#' @seealso \code{\link[utils]{read.table}}, \code{\link[wrMisc]{normalizeThis}}) , \code{\link{readProteomeDiscovererFile}}; \code{\link{readProlineFile}} (and other import-functions), \code{\link{matrixNAinspect}}
#' @examples
#' path1 <- system.file("extdata", package="wrProteo")
#' # Here we'll load a short/trimmed example file (originating from Compomics)
#' fiNa <- "tinyWombCompo1.csv.gz"
#' dataWB <- readWombatNormFile(file=fiNa, path=path1, tit="tiny Wombat/Compomics, Normalized ")
#' summary(dataWB$quant)
#' @export
readWombatNormFile <- function(fileName, path=NULL, quantSoft="(quant software not specified)", fasta=NULL, isLog2=TRUE, normalizeMeth="none", quantCol="abundance_", contamCol=NULL,
  pepCountCol=c("number_of_peptides"), read0asNA=TRUE, refLi=NULL, sampleNames=NULL,
  extrColNames=c("protein_group"), specPref=NULL,
  remRev=TRUE, remConta=FALSE, separateAnnot=TRUE, gr=NULL, sdrf=NULL, suplAnnotFile=NULL, groupPref=list(lowNumberOfGroups=TRUE, chUnit=TRUE),
  titGraph=NULL, wex=1.6, plotGraph=TRUE, silent=FALSE, debug=FALSE, callFrom=NULL) {
  ## prepare
  fxNa <- wrMisc::.composeCallName(callFrom, newNa="readWombatNormFile")
  oparMar <- if(plotGraph) graphics::par("mar") else NULL       # only if figure might be drawn
  remStrainNo <- TRUE                   # if TRUE extract Species in very stringent pattern
  cleanDescription <- TRUE              # clean 'Description' for artifacts of truncated text (tailing ';' etc)
  fixSpeciesNames <- TRUE  
  oparMar <- graphics::par("mar")

  ## functions
  .cleanMQann <- function(x, sep="\\|", silent=FALSE, debug=FALSE, callFrom=NULL) {
    ## split multiple protein entries as with 1st column of MaxQuant data
    ## return matrix with
    ## example ann1 <- read.delim(file.path(system.file("extdata", package="wrProteo"), "tinyWombCompo1.csv.gz"), sep=",", stringsAsFactors=FALSE)[,1]
    ##   .cleanMQann(ann1)
    #  x=rWB4a$tmp[c(5,31:32,81:82,111:114),1]
    xIni <- x       # keep backup for recuperating bizzare nonparsed
    isCont <- grepl("CON__", x)
    mult <- nchar(x) - nchar(gsub(";", "", x))
    chMult <- mult >0
    if(any(chMult)) {
      spl1 <- strsplit(x[which(chMult)], ";")
      ## use entry with most separators (when multiple entries, eg 'sp|P00761|CON__TRYP_PIG;CON__P00761')
      spl1 <- sapply(spl1, function(y) { nSep <- nchar(y) - nchar(gsub("|","",y)); y[which.max(nSep)] })
      x[which(chMult)] <- spl1 }
    ## split separators
    chSpl <- function(y) {chID <- grepl("^[[:upper:]]{1,3}[[:digit:]]{2,}|^[[:upper:]]{1,3}[[:digit:]]+[[:upper:]]+[[:digit:]]*", y); chName <- grepl("[A-Z0-9]_[[:upper:]]",y);   # extract db, ID & prot-name
      c(dbIni= if((length(y) >1 && grepl("^[[:lower:]]{1,8}$", y[1])) || length(y) >2 && grepl("^[[:lower:]]{2}|[[:lower:]]{2}$",
        y[1])) y[1] else NA, IDini=if(any(chID)) y[which(chID)[1]] else NA, nameIni=if(any(chName)) y[which(chName)[1]] else NA) }
    x <- t(sapply(strsplit(x, sep), chSpl))
    nColIni <- ncol(x)
    cleanID <- function(y, useCol=c(db=1, ID=2, name=3)) {
      ext <- grepl("[[:lower:]]+$", y[,useCol[2]])    # look for extension like 'P08758ups'
      extNoDb <- which(ext & is.na(y[,useCol[1]]))
      if(any(ext)) { cleanID <- sub("[[:lower:]]+$","", y[which(ext), useCol[2]])
        if(length(extNoDb) >0) y[which(ext), useCol[1]] <- substring(y[which(ext), useCol[2]], nchar(cleanID) +1 )
        y[which(ext), useCol[2]] <- cleanID }
      prefi <- grepl("^[[:upper:]]+__[[:upper:]]", y[,useCol[3]])       # look for prefix like 'CON__FA5_BOVIN'
      if(any(prefi)) { ch2 <- grepl("[A-Z0-9]_[[:upper:]]",  y[which(prefi), useCol[3]]); if(any(ch2)) {
        y[which(prefi)[which(ch2)], useCol[1]] <- tolower(sub("__[[:upper:]].+","", y[which(prefi)[which(ch2)], useCol[3]]))
        y[which(prefi)[which(ch2)], useCol[3]] <- sub("^[[:upper:]]+__","", y[which(prefi)[which(ch2)], useCol[3]])}}
      colnames(y) <- c("db","ID","name")
      y }
    x <- cbind(x, cleanID(x, useCol=c(db=1, ID=2, name=3)))
    x <- cbind(x, conta=grepl("^con|^REV_", x[,"db"]) | grepl("__CON__",xIni))
    ## recuperate all (bizarre) non-parsed into ID
    isNa <- rowSums(is.na(x)) > nColIni -2
    if(any(isNa)) x[which(isNa),c(2+nColIni)] <- xIni[which(isNa)]
    cbind(x[,c((nColIni+1):ncol(x), 1:nColIni)], iniSoftAnn=xIni) }

  .cleanPLann <- function(anno, inclLowerCaseExt=TRUE) {
    ## parse annotation from Proline quantification sheet (2nd col 'protein_group')
    ## entries like c("sp|P06169|PDC1_YEAST","sp|P02994|EF1A_YEAST","P00549|KPYK1_YEAST","sp|P00925|ENO2_YEAST")
    ann2 <- matrix(ncol=3, nrow=length(anno), dimnames=list(NULL,c("database","uniqueIdentifier","proteinName")))
    ## use longest of multiple ID-entries (eg "Q14CN4-2|CON__K2C72_HUMAN; sp|Q14CN4-3|CON__K2C72_HUMAN; sp|Q14CN4|CON__K2C72_HUMAN")
    ## split concatenated multiple IDs, use longest
    chMult <- nchar(sub(":","", anno)) < nchar(anno)
    if(any(chMult)) { spl1 <- strsplit(as.character(anno[chMult]),";")
      anno[which(chMult)] <- sapply(spl1, function(x) x[which.max(nchar(x))]) }

    ## database identifier
    chDB <- grep("^[[:lower:]]+\\|.", anno)
    if(length(chDB) >0) { ann2[chDB,1] <- sub("\\|.+", "", anno[chDB])
      anno[chDB] <- sub("^[[:lower:]]+\\|", "", anno[chDB])}
    ## protein/uniProtID
    chPat0 <- "^[[:upper:]]+[[:digit:]]+[0-9A-Z]*\\|."
    chPat <-  "^[[:upper:]]+[[:digit:]]+[0-9A-Z]*(\\-[[:digit:]]{1,3}){0,1}\\|."                    # like 'P10636|', 'P10636-8|'
    chPat2 <- "^[[:upper:]]+[[:digit:]]+[0-9A-Z]*(\\-[[:digit:]]{1,3}){0,1}[[:lower:]]{0,12}\\|."    # like 'P10636|', 'P10636-8|', 'P10636-8ups|'
    chID <- grep(if(isTRUE(inclLowerCaseExt)) chPat2 else chPat, anno)
    if(length(chID) >0) { ann2[chID,2] <- sub("\\|.+", "", anno[chID])   # retreive entry (until next separator)
      anno[chID] <- sub(".+\\|", "", anno[chID])}
    ## last entry
    nch <- which(nchar(anno) >0)
    if(length(nch) >0)  ann2[nch,3] <- anno[nch]
    ann2 <- cbind(database=ann2[1], protein_group=sub("[[:lower:]]+$","", ann2[,2]), ann2[,-1], iniSoftAnn=anno)
    ann2 }

  .cleanSingleAnn <- function(anno, inclLowerCaseExt=TRUE) {
     ## parse annotation from single column (ie character vector) , eg Compomics quatification sheet (2nd col 'protein_group')
     ## return 3 columns ; ID, IDext and iniSoftAnn
     if(length(dim(anno)) >1) {warning(".cleanSingleann : expecting text vector and NOT matrix"); anno <- try(as.character(as.matrix(anno)))}
     iniSoftAnn <- WOext <- anno
     chPat3 <- "[[:upper:]][0-9A-Z]+(\\-[[:digit:]]{1,3}){0,1}[[:lower:]]*$"    # recognize IDs eg 'P00915ups', 'P00915-3ups',
     ## still need to split multiple and use longest  (eg from 'P02768,P02768ups')
     chMult <- grep(",", anno)
     if(length(chMult) >0) {
       spl <- strsplit(as.character(anno[chMult]), ",")
       anno[chMult] <- sapply(spl, function(x) x[which.max(nchar(x))])
     }
     ## split lower-case extenstions from IDs
     if(isTRUE(inclLowerCaseExt)) { chExt <- grep(chPat3, anno)
       if(length(chExt) >0) {
         WOext <- anno
         WOext[chExt] <- sub("[[:lower:]]+$", "", anno[chExt])
         ext <- rep(NA, length(anno))
         ext[chExt] <- substring(anno[chExt], nchar(WOext[chExt]) +1)
         anno <- cbind(iniSoftAnn=iniSoftAnn, ID=WOext, IDext=ext)
     } }
     anno }
  ## end functions




  ## init check
  reqPa <- c("utils","wrMisc")
  chPa <- sapply(reqPa, requireNamespace, quietly=TRUE)
  if(any(!chPa)) stop("Package(s) '",paste(reqPa[which(!chPa)], collapse="','"),"' not found ! Please install first from CRAN")
  if(!isTRUE(silent)) silent <- FALSE
  if(isTRUE(debug)) silent <- FALSE else debug <- FALSE
         excluCol <- "^Abundances.Count"   # exclude this from quantifications columns
  cleanDescription <- TRUE                 # clean 'Description' for artifacts of truncated text (tailing ';' etc)
  infoDat <- infoFi <- setupSd <- parametersD <- annot <- annotMQ <- annotPL <- NULL        # initialize

  ## check if path & file exist
  paFi <- wrMisc::checkFilePath(fileName, path, expectExt="csv", compressedOption=TRUE, stopIfNothing=TRUE, callFrom=fxNa, silent=silent,debug=debug)
  ## read (main) file
  ## future: look for fast reading of files
    #  read.delim("C:\\E\\projects\\MassSpec\\smallProj\\ElixirBenchmark\\deWombat\\deGit24may23\\PXD009815dev\\dev\\stand_prot_quant_mergedcompomics.csv", sep=",")
  tmp <- try(utils::read.delim(paFi, sep=",", stringsAsFactors=FALSE), silent=TRUE)

  if(length(tmp) <1 || inherits(tmp, "try-error") || length(dim(tmp)) <2) {
    if(inherits(tmp, "try-error")) warning("Unable to read input file ('",paFi,"')!  (check format or if rights to read)") else {
      if(!silent) message(fxNa,"Content of  file '",paFi,"' seeps empty or non-conform !  Returning NULL; check if this is really a Compomics-file") }
    tmp <- NULL
    return(NULL)
  } else {
    ## start checking format of initial read of Normalizer-output
    if(debug) { message(fxNa,"rWB1 .. dims of initial data : ", nrow(tmp)," li and ",ncol(tmp)," col "); rWB1 <- list(fileName=fileName,path=path,paFi=paFi,tmp=tmp,normalizeMeth=normalizeMeth,read0asNA=read0asNA,quantCol=quantCol,
      refLi=refLi,separateAnnot=separateAnnot   )}  # annotCol=annotCol,FDRCol=FDRCol
    ## check which columns can be extracted (for annotation)

    if(length(extrColNames) <1) { extrColNames <- colnames(tmp)[1]        # default : use only 1st col for annotation
    } else {
      if(is.integer(contamCol) && length(contamCol) >0) contamCol <- colnames(tmp)[contamCol]
      extrColNames <- union(extrColNames, contamCol)                     # add contamCol if not included in extrColNames
      chCol <- extrColNames %in% colnames(tmp)
      if(!any(chCol, na.rm=TRUE)) { extrColNames <- gsub("\\."," ",extrColNames)
        chCol <- extrColNames %in% colnames(tmp) }
      if(all(!chCol, na.rm=TRUE)) stop("Problem locating annotation columns (",wrMisc::pasteC(extrColNames, quoteC="''"),")")
      if(any(!chCol, na.rm=TRUE)) {
        if(!silent) message(fxNa,"Note: Can't find columns ",wrMisc::pasteC(extrColNames[!chCol], quoteC="'")," !")}
    }
    if(debug) {message(fxNa,"rWB1c"); rWB1c <- list(tmp=tmp,paFi=paFi,fasta=fasta,quantCol=quantCol,extrColNames==extrColNames)}
  }

  if(length(tmp) >0) {
    ## check for lines with absent IDs  => eliminate
    chNa <- which(if("protein_group" %in% colnames(tmp)) is.na(tmp[,"protein_group"]) else rowSums(is.na(tmp))==ncol(tmp))
    if(length(chNa) >0) {
      if(!silent) message(fxNa,"Removing ",length(chNa)," lines since absent ID or all NA  (won't be able to do anything lateron withour ID ..)")
      tmp <- tmp[-chNa,]
    }
    if(debug) {message(fxNa,"rWB1d"); rWB1d <- list(tmp=tmp,paFi=paFi,fasta=fasta,quantCol=quantCol,extrColNames==extrColNames)}

    ## further extracting : quantitation
    useDCol <- grep(paste0("^",quantCol), colnames(tmp))
    if(length(useDCol) <1) stop("NO columns matching term ",wrMisc::pasteC(quantCol, quoteC="'")," from argument 'quantCol' found !")
    abund <- as.matrix(tmp[,useDCol])               # normalized log2 abundances
    if(debug) {message(fxNa,"rWB1e"); rWB1e <- list(tmp=tmp,paFi=paFi,fasta=fasta,abund=abund)}
       #iniAbundColNa <- colnames(abund)

    chNum <- try(is.numeric(abund), silent=TRUE)
    if(!chNum) {abund <- try(apply(tmp[,quantCol], 2, wrMisc::convToNum, convert="allChar", silent=silent, callFrom=fxNa), silent=TRUE)
      if(inherits(abund, "try-error")) {datOK <- FALSE; warning(fxNa,"CANNOT transform 'abund' to numeric data !'")} }
    if(length(dim(abund)) <2 && !is.numeric(abund)) abund <- matrix(as.numeric(abund), ncol=ncol(abund), dimnames=dimnames(abund))
    ch1 <- grepl("^abundance_CT\\.mixture\\.QY\\.", colnames(abund))
    if(all(ch1)) colnames(abund) <- sub("abundance_CT\\.mixture\\.QY\\.", "", colnames(abund))
    ch1 <- grepl("^abundance_", colnames(abund))
    if(all(ch1)) colnames(abund) <- sub("abundance_", "", colnames(abund))
    ch1 <- grepl("\\.CV\\.Standards\\.Research\\.Group", colnames(abund))
    if(all(ch1)) colnames(abund) <- sub("\\.CV\\.Standards\\.Research\\.Group", "", colnames(abund))

    trimColNames <- FALSE      ## further trim quantitation colnames
    if(trimColNames) {  ## further trim
      colnames(abund) <- wrMisc::.trimLeft(wrMisc::.trimRight( sub(paste0("^",quantCol),"", colnames(abund))))
      ## no trim needed for Wombat ?
    }
    if(debug) {message(fxNa,"rWB3"); rWB3 <- list(abund=abund,paFi=paFi,path=path,chPa=chPa,tmp=tmp,extrColNames=extrColNames,chCol=chCol,remConta=remConta,pepCountCol=pepCountCol,fasta=fasta)}

    ## convert 0 to NA
    ## note tpp abundance values are all neg !!
    chNa <- !all(is.na(abund))
    if(chNa) {
      chRa <- range(abund, na.rm=TRUE)
      if(all(chRa <0) & isTRUE(read0asNA)) {read0asNA <- FALSE
      if(debug) message(fxNa,"All abundance values are <0 ! (omit transforming neg values to NA)") } }
    if(!isFALSE(read0asNA)) {
      ch1 <- abund <= 0
      if(any(ch1, na.rm=TRUE)) { abund[which(ch1)] <- NA
        if(!silent) message(fxNa,"Transform ",sum(ch1),"(",100*round(sum(ch1)/length(ch1),3),"%) initial '0' values to 'NA'")}}

    ## further extracting : prepare for countig data
    ch1 <- grep(pepCountCol[1], colnames(tmp))
    if(length(ch1) ==ncol(abund)) {
      counts <- array(dim=c(nrow(tmp), ncol(abund), 1), dimnames=list(NULL, colnames(abund), pepCountCol))
      counts[,,1] <- suppressWarnings(as.numeric(as.matrix(tmp[,ch1])))
    } else {
      counts <- NULL
      if(!silent) message(fxNa,"Could not find column(s) with peptide per protein counts (argument 'pepCountCol') matching to '",pepCountCol,"'") }
    if(debug) {message(fxNa,"rWB4"); rWB4 <- list(abund=abund,counts=counts,annot=annot,paFi=paFi,path=path,chPa=chPa,tmp=tmp,pepCountCol=pepCountCol,extrColNames=extrColNames,chCol=chCol,remConta=remConta,quantSoft=quantSoft,fasta=fasta,annotMQ=annotMQ)}

    ## add more measure-types if avail
    if(length(pepCountCol) >1) {
      for(i in 2:length(pepCountCol)) {
        ch1 <- grep(pepCountCol[i], colnames(tmp))
        if(length(ch1) ==ncol(abund)) {
          counts[,,i] <- suppressWarnings(as.numeric(as.matrix(tmp[,ch1]))) }
      }
    }
    if(debug) {message(fxNa,"rWB4a"); rWB4a <- list(abund=abund,counts=counts,annot=annot,paFi=paFi,path=path,chPa=chPa,tmp=tmp,pepCountCol=pepCountCol,extrColNames=extrColNames,chCol=chCol,remConta=remConta,quantSoft=quantSoft,fasta=fasta,annotMQ=annotMQ)}

    ## Annotation
    if(any(c("CP","Compomics","compomics") %in% quantSoft) && extrColNames[1] %in% colnames(tmp)) {    # special case PL: parse annot from column 'protein_group'
      annotPL <- .cleanSingleAnn(tmp[,extrColNames[1]])    #"protein_group"
      colnames(annotPL)[2] <- "protein_group"
    }
    if(any(c("MQ","MaxQuant","maxquant") %in% quantSoft) && extrColNames[1] %in% colnames(tmp)) {    # special case MQ: parse annot from column 'protein_group'
      annotMQ <- .cleanMQann(tmp[,extrColNames[1]])    #"protein_group"
    }

    if(any(c("PL","Proline","proline") %in% quantSoft) && extrColNames[1] %in% colnames(tmp)) {    # special case PL: parse annot from column 'protein_group'
      annotPL <- .cleanPLann(tmp[,extrColNames[1]])    #"protein_group"

    }
    if(any(c("TPP","tpp") %in% quantSoft) && extrColNames[1] %in% colnames(tmp)) {    # special case CP: remove lower-case extensions from 'uniqueIdentifier'
      annotPL <- .cleanPLann(tmp[,extrColNames[1]])    #"protein_group"
    }

    if(debug) {message(fxNa,"rWB4aa"); rWB4aa <- list(annotMQ=annotMQ,annotPL=annotPL,abund=abund,counts=counts,annot=annot,paFi=paFi,path=path,chPa=chPa,tmp=tmp,pepCountCol=pepCountCol,extrColNames=extrColNames,chCol=chCol,remConta=remConta,quantSoft=quantSoft,fasta=fasta,annotMQ=annotMQ)}

    ## read fasta from higher dir (specific to Wombat)
    if(length(fasta) >0) {fasta <- fasta[1]; if(isFALSE(fasta) || is.na(fasta)) fasta <- NULL}
    if(isTRUE(fasta)) {
      hiDir <- dir(file.path(dirname(paFi),".."))
      chFa <- grep("\\.fasta$", hiDir)
      faFi <- if(length(chFa) >0) file.path(dirname(paFi),"..",hiDir[chFa[1]]) else NULL
    } else faFi <- fasta
    if(length(faFi) >0) {     # has fasta for recuperating annotation
      fasta <- try(readFasta2(filename=faFi, tableOut=TRUE, silent=silent,debug=debug,callFrom=fxNa), silent=TRUE)
      ## Potential problem with inconsistent format of fasta
      if(inherits(fasta, "try-error")) { fasta <- NULL
        if(!silent) message(fxNa,"Unable to read/open fasta file '",faFi,"'  (check rights to read ?)")
      } else {
        tmpAnn <- if(length(annotMQ) >0) annotMQ[,2] else {if(length(annotPL) >0) annotPL[,2] else tmp[,extrColNames[1]]}      # 'P02768' still missing
        tm2 <- wrMisc::concatMatch(tmpAnn, fasta[,2], sepPattern=NULL, globalPat="digitExtension", silent=silent, debug=debug, callFrom=fxNa)   # clean protein-names (eg digit extensions, concateneated IDs) & match to data
        iniAnn <- if(length(annotMQ) >0) annotMQ else {if(length(annotPL) >0) annotPL else cbind(iniSoftAnn=tmp[,extrColNames[1]])}
        colnames(iniAnn) <- c("iniSoftAnn", if(ncol(iniAnn) >1) paste0(colnames(iniAnn)[-1],".",quantSoft))

        useFaCol <- match(c("uniqueIdentifier","entryName","proteinName","OS","OX","GN","database"), colnames(fasta))             # do not export full 'sequence'
        annot <- cbind(trimIdentifier=names(tm2), fasta[tm2, useFaCol], iniAnn=tmpAnn)
        if(debug) {message(fxNa,"rWB4ab"); rWB4ab <- list(abund=abund,counts=counts,annot=annot,fasta=fasta,tmp=tmp,chNa=chNa,annotPL=annotPL,annotMQ=annotMQ,extrColNames=extrColNames,extrColNames=extrColNames) }

        foundFastaCol <- !is.na(useFaCol)
        if(any(foundFastaCol)) colnames(annot)[1:sum(foundFastaCol)] <- c("Accession","AccessionFull","Description","EntryName","Species","OX","GeneName","Database")[which(foundFastaCol)]
        ##  strip species details
        if("Species" %in% annot) annot[,"Species"] <- sub(" \\(.+", "", annot[,"Species"])
      }
    } else {
      if(debug) message(fxNa,"NO fasta available !")
      annot <- if(length(annotMQ) >0) annotMQ else {if(length(annotPL) >0) annotPL else tmp[,extrColNames[1]]}
        if(length(annotMQ) <1 && length(annotPL) <1) {
          sep <- ","
          if(debug) message(fxNa,"rWB4ab")
          ## if no other annot : pick 1st of multiple, clean from lowerCase extension, clean from -digit extension
          ## need to test further ?
          chMult <- nchar(tmp[,extrColNames[1]]) - nchar(sub(sep, "", annot))
          if(any(chMult) >0) annot <- sapply(strsplit(annot, sep), function(x) {nCha <- nchar(x); if(length(x) >1) x[which.max(nCha)] else x  })
          annot <- cbind(Accession=sub("\\-[[:digit:]]+$","", sub("[[:lower:]]+$","", annot)), IniAccession=tmp[,extrColNames[1]])
        } else {
          ## need to arrange columns to proper order & names
          warning(fxNa,"NOTE : Supplemental arranging of columns to proper order & names not yet implemented")


        }
      if(debug) message(fxNa,"NOTE : No fasta-file found in main directory ...")
    }
    if(debug) {message(fxNa,"dim annot  ",nrow(annot)," ",ncol(annot),"  rWB4b"); rWB4b <- list(annot=annot,faFi=faFi,abund=abund,tmp=tmp,fasta=fasta,annotMQ=annotMQ)}

    ## check ID col of annot
    chID <- match(c("Accession","protein_group","uniqueIdentifier"), colnames(annot))
    if(all(is.na(chID))) { chID <- wrMisc::naOmit(match(c("protein_group","ID"), colnames(annot)))
      if(length(chID) < 1) warning("PROBLEM : UNEXPECTED colnames in annot") #else colnames(annot)[chID][1] <- "Accession"
    }
    if(!identical(wrMisc::naOmit(chID), 1) || length(wrMisc::naOmit(chID)) >0) {
      annot <- annot[,c(wrMisc::naOmit(chID)[1], (1:ncol(annot))[-wrMisc::naOmit(chID)[1]] )]     # adjust order to have ID in 1st column
      colnames(annot)[1] <- "Accession"
      if(!silent) message(fxNa,"Adjusting order of  annot to have ID in 1st column") }

    ## remove lines wo IDs
    chNa <- is.na(annot[,1])
    if(any(chNa)) {
      if(!silent) message(fxNa,"Removing ",sum(chNa)," out of ",nrow(abund)," lines wo ID")
      rmLi <- which(chNa)
      tmp <- tmp[-rmLi,]
      annot <- annot[-rmLi,]
      if(length(dim(annot)) <2) annot <- matrix(annot, ncol=1, dimnames=list(NULL,colnames(tmp)[1]))
      abund <- abund[-rmLi,]
      if(length(counts) >0) counts <- if(length(dim(counts))==3) counts[-rmLi,,] else counts[-rmLi,]
    }
    if(debug) {message(fxNa,"dim annot",nrow(annot)," ",ncol(annot),"  rWB4d"); rWB4d <- list(annot=annot,faFi=faFi,abund=abund,tmp=tmp)}

    ## unique ID
    chD <- duplicated(annot[,1])
    uniqueID <- if(any(chD, na.rm=TRUE)) wrMisc::correctToUnique(annot[,1], silent=silent, callFrom=fxNa) else annot[,1]  # extrColNames[1]
    rownames(annot) <- rownames(abund) <- uniqueID
    if(length(counts) >0) rownames(counts) <- uniqueID
    if(debug) {message(fxNa,"rWB4e"); rWB4e <- list(paFi=paFi,path=path,chPa=chPa,tmp=tmp,extrColNames=extrColNames,chCol=chCol,counts=counts,
      quantCol=quantCol,abund=abund,chNum=chNum,annot=annot,remConta=remConta,specPref=specPref)}

    ## remove Wombat contaminants
   #useColumn <- wrMisc::naOmit(match(c("Accession","protein_group"), colnames(annot)))
    conLi <- grep("CON__[[:alnum:]]", annot[, if(ncol(annot) >1) wrMisc::naOmit(match(c("Accession","protein_group"), colnames(annot)))[1] else 1])
    if(remConta) {
      if(length(conLi) >0) {
        iniLi <- nrow(annot)
        annot <- annot[-conLi,]
        abund <- abund[-conLi,]
        counts <- if(length(dim(counts))==3) counts[-conLi,,] else counts[-conLi,,]
        if(debug) message(fxNa,"Removing ",length(conLi)," instances of contaminants to final ",nrow(annot)," lines/IDs")}
    }

    ## split Annotation
    if(debug) {message(fxNa,"rWB4f"); rWB4f <- list(path=path,chPa=chPa,tmp=tmp,extrColNames=extrColNames,chCol=chCol,counts=counts,
      quantCol=quantCol,abund=abund,chNum=chNum,annot=annot,remConta=remConta,specPref=specPref)}

    ## finalize annotation
    chCols <- c("EntryName","GeneName","Species","Contam","Description")
    chCol2 <- chCols %in% colnames(annot)
    if(any(!chCol2)) annot <- cbind(annot, matrix(NA, nrow=nrow(annot), ncol=sum(!chCol2), dimnames=list(NULL, chCols[which(!chCol2)]))) # add columns so far not present
    if(!remConta && length(conLi) >0) annot[conLi, "Contam"] <- "TRUE"

    if(debug) {message(fxNa,"rWB5"); rWB5 <- list(path=path,chPa=chPa,tmp=tmp,extrColNames=extrColNames,chCol=chCol,counts=counts,
      quantCol=quantCol,abund=abund,chNum=chNum,annot=annot,remConta=remConta,remStrainNo=remStrainNo, specPref=specPref)}

    ## extract species according to custom search parameters 'specPref'
    if(remStrainNo && any(!is.na(annot[,"Species"]))) {
      annot[,"Species"] <- sub(" \\(strain [[:alnum:]].+","", annot[,"Species"])
    }
    ## complete species annot   by info extracted from fasta : ' OS='
    .completeSpeciesAnnot <- function(spe=c("Homo sapiens", "_HUMAN"), anno=annot, exCoNa=c("Species", "EntryName","name","proteinName")) {    # re-written 12jun23
      ## complete species if missing in anno[,exCoNa[2]] but found in anno[,exCoNa[1]]; return corrected anno
      chNa <- is.na(anno[,exCoNa[1]]) | nchar(anno[,exCoNa[1]]) <1             # missing (species) annotation
      if(any(chNa, na.rm=TRUE)) {        # suppose that all 'exCoNa' are present as colnames in 'annot'
        useColumn <- if(all(is.na(anno[,exCoNa[2]]))) wrMisc::naOmit(match(exCoNa[3:length(exCoNa)], colnames(anno))) else exCoNa[2]
        if(length(useColumn) >1) useColumn <- useColumn[1]
        chS <- grep(spe[1], anno[,useColumn])
        if(length(chS) >0) anno[chS, exCoNa[1]] <- spe[2]
      }
      anno }
    if(isTRUE(fixSpeciesNames)) {          # try to recuperate/fix non-given/bad formatted species
      chNa <- is.na(annot[,"Species"])
      if(any(chNa)) {
        commonSpec <- .commonSpecies()
        for(i in 1:nrow(commonSpec)) annot[which(chNa),] <- .completeSpeciesAnnot(commonSpec[i,], annot[which(chNa),], exCoNa=c("Species","EntryName","name","proteinName")) }
      if(debug) {message(fxNa,"rWB6"); rWB6 <- list(path=path,chPa=chPa,tmp=tmp,extrColNames=extrColNames,chCol=chCol,counts=counts,
        quantCol=quantCol,abund=abund,chNum=chNum,annot=annot,remConta=remConta,remStrainNo=remStrainNo, specPref=specPref)}

      ## check/complete for truncated species names (ie names found inside other ones)
      chSpe <- which(!is.na(annot[,"Species"]) & nchar(annot[,"Species"]) >0)
      if(length(chSpe) >0) {
        OS <- gsub(";{1,5}$", "", annot[chSpe,"Species"])              # remove tailing separators
        OSna <- unique(OS)
        ch1 <- nchar(OSna) <1
        if(debug) {message(fxNa,"rWB6b")}
        if(any(ch1, na.rm=TRUE)) OSna <- OSna[which(nchar(OSna) >0)]     # (just in case) remove empty tags
        ch2 <- lapply(OSna, grep, OSna)
        chTr <- sapply(ch2, length) >1
        if(any(chTr, na.rm=TRUE)) { if(!silent) message(fxNa,"Found ",sum(chTr)," species name(s) appearing inside other ones, assume as truncated (eg  ",OSna[which(chTr)[1]],")")
          for(i in which(chTr)) OS[which(OS==OSna[i])] <- OSna[ch2[[i]][1]]
        }
        annot[chSpe,"Species"] <- OS }
    }
    ## in case "Accession" is avail not "EntryName" is not

    if(debug) {message(fxNa,"rWB7"); rWB7 <- list(path=path,chPa=chPa,tmp=tmp,extrColNames=extrColNames,chCol=chCol,quantCol=quantCol,remStrainNo=remStrainNo,
      abund=abund,chNum=chNum,specPref=specPref, annot=annot,remConta=remConta,counts=counts)}

    ## look for tags from  specPref
    if(length(specPref) >0) {
      ## set annot[,"specPref"] according to specPref
      annot <- .extrSpecPref(specPref, annot, useColumn=c("Description","Species","EntryName","GeneName"), silent=silent, debug=debug, callFrom=fxNa)
    } else if(debug) message(fxNa,"Note: Argument 'specPref' not specifed (empty)")
    if(debug) {message(fxNa,"rWB7b") }

    if(!silent) { chSp <- sum(is.na(annot[,"Species"]))
    if(chSp >0) message(fxNa,"Note: ",chSp," proteins with unknown species")

    tab <- table(annot[,"Species"])
    if(length(tab) >0) {
      tab <- rbind(names(tab), paste0(": ",tab,",  "))
      if(!silent) message("     data by species : ", apply(tab, 2, paste)) } }              # all lines assigned

    if(debug) {message(fxNa,"rWB8"); rWB8 <- list(path=path,chPa=chPa,tmp=tmp,extrColNames=extrColNames,chCol=chCol,quantCol=quantCol,remStrainNo=remStrainNo,
      abund=abund,chNum=chNum, annot=annot,remConta=remConta,counts=counts) }

    ## look for unique col from $annot to use as rownames
    if(nrow(annot) <1) warning("annot is empty (NO lines)")
     ## maybe annot is empty ?

    chAn <- colSums(apply(annot[,c(1:min(ncol(annot),7))], 2, duplicated), na.rm=TRUE)          # look at first 6 cols : how many elements per column duplicated
    if(!silent) message(fxNa,"Use column '",colnames(annot)[which.min(chAn)],"' as identifyer (has fewest, ie ",chAn[which.min(chAn)]," duplicated entries) as rownames")
    rownames(abund) <- rownames(annot) <- if(any(chAn==0)) annot[,which(chAn==0)[1]] else wrMisc::correctToUnique(annot[,which.min(chAn)], callFrom=fxNa)
    if(length(counts) >0) rownames(counts) <- rownames(annot)
    if(debug) {message(fxNa,"rWB9"); rWB9 <- list(path=path,chPa=chPa,tmp=tmp,extrColNames=extrColNames,chCol=chCol,quantCol=quantCol,abund=abund,chNum=chNum,
      annot=annot,refLi=refLi,remConta=remConta)}

    ## check for reference for normalization
    refLiIni <- refLi
    if(is.character(refLi) && length(refLi)==1) {
       refLi <- which(annot[,"SpecType"]==refLi)
      if(length(refLi) <1 ) { refLi <- 1:nrow(abund)
        if(!silent) message(fxNa,"Could not find any proteins matching argument 'refLi=",refLiIni,"', ignoring ...")
      } else {
        if(!silent) message(fxNa,"Normalize using (custom) subset of ",length(refLi)," lines specified as '",refLiIni,"'")}}    # may be "mainSpe"

    ## take log2 & normalize
    quant <- try(wrMisc::normalizeThis(if(isLog2) abund else log2(abund), method=normalizeMeth, mode="additive", refLines=refLi, silent=silent, debug=debug, callFrom=fxNa), silent=TRUE)
    if(inherits(quant, "try-error")) { warning(fxNa,"PROBLEMS ahead : Unable to normalize as log2-data !!") }

    if(debug) {message(fxNa,"rWB10"); rWB10 <- list(path=path,chPa=chPa,tmp=tmp,extrColNames=extrColNames,chCol=chCol,quantCol=quantCol,abund=abund,chNum=chNum,
      quant=quant,annot=annot,remConta=remConta,groupPref=groupPref,quantSoft=quantSoft,suplAnnotFile=suplAnnotFile, sdrf=sdrf,paFi=paFi )}

    ### GROUPING OF REPLICATES AND SAMPLE META-DATA
    ## prepare for sdrf (search in directory above)
    if(isTRUE(sdrf)) {
      hiDir <- dir(file.path(dirname(paFi),".."))
      chFa <- grep("^sdrf.+\\.tsv$", hiDir)
      if(length(chFa) >0) sdrf <- file.path(dirname(paFi),"..",hiDir[chFa[1]]) else {sdrf <- NULL
        if(!silent) message(fxNa,"NO sdrf file found in directory above main data !")}
    }

    if(length(suplAnnotFile) >0 || length(sdrf) >0) {
      headAbund <- utils::head(quant)
      chX <- grepl("^X[[:digit:]]",colnames(quant))                     #check for heading X in all colnames
      if(any(chX)) colnames(headAbund)[which(chX)] <- sub("^X", "", colnames(headAbund)[which(chX)])
      ## check for matching : (as done within readSampleMetaData) - can't , sdrf not read yet ...
      if(length(sampleNames) %in% c(1, ncol(abund))) groupPref$sampleNames <- sampleNames
      if(length(gr) %in% c(1, ncol(abund))) groupPref$gr <- gr
      setupSd <- readSampleMetaData(sdrf=sdrf, suplAnnotFile=suplAnnotFile, quantMeth="WB", path=path, abund=utils::head(quant), chUnit=isTRUE(groupPref$chUnit), groupPref=groupPref, silent=silent, debug=debug, callFrom=fxNa)
    }
    if(debug) {message(fxNa,"rWB13 .."); rWB13 <- list(sdrf=sdrf,gr=gr,suplAnnotFile=suplAnnotFile,abund=abund, quant=quant,refLi=refLi,annot=annot,setupSd=setupSd,sampleNames=sampleNames)}

    ## finish groups of replicates & annotation setupSd
    setupSd <- .checkSetupGroups(abund=abund, setupSd=setupSd, gr=gr, sampleNames=sampleNames, quantMeth="WB", silent=silent, debug=debug, callFrom=fxNa)

    ## harmonize sample-names/1
    colNa <- if(length(setupSd$sampleNames)==ncol(abund)) setupSd$sampleNames else setupSd$groups

    ## option : set order of samples as sdrf
    if("sdrfOrder" %in% names(sdrf) && isTRUE(as.logical(sdrf["sdrfOrder"])) && length(setupSd$iniSdrfOrder)==ncol(abund) && ncol(abund) >1) {  # set order according to sdrf (only if >1 samples)
      nOrd <- order(setupSd$iniSdrfOrder)
      ## rename columns according to sdrf and set order of quant and abund ..
      abund <- abund[,nOrd]
      if(length(quant) >0) quant <- quant[,nOrd]
      if(length(setupSd$sampleNames)==ncol(quant)) {
        colNa <- colnames(abund) <- setupSd$sampleNames <- setupSd$sampleNaSdrf[nOrd]  #old# setupSd$sampleNames[nOrd]         ## take sample names from sdrf via  setupSd$sampleNaSdrf
        if(length(quant) >0) colnames(quant) <- setupSd$sampleNaSdrf[nOrd]  #old# setupSd$sampleNames[nOrd]
      } else colNa <- colnames(abund)
      ## now adapt order of setupSd, incl init Sdrf
      if(length(setupSd) >0) { 
        is2dim <- sapply(setupSd, function(x,le) length(dim(x))==2 && nrow(x)==le, le=length(nOrd))    # look for matr or df to change order of lines
        if(any(is2dim) >0) for(i in which(is2dim)) setupSd[[i]] <- setupSd[[i]][nOrd,]
        isVe <- sapply(setupSd, function(x,le) length(x)==le && length(dim(x)) <1, le=length(nOrd))    # look for vector to change order in setupSd
        if(any(isVe) >0) for(i in which(isVe)) setupSd[[i]] <- setupSd[[i]][nOrd] }
      gr <- gr[nOrd]

      if(length(counts) >0 && length(dim(counts))==3) counts <- array(counts[,nOrd,], dim=c(nrow(counts), length(nOrd), dim(counts)[3]), 
        dimnames=list(rownames(counts), colnames(counts)[nOrd], dimnames(counts)[[3]]))
      ## try re-adjusting levels
      tm1 <- sub("^[[:alpha:]]+( |_|-|\\.)+[[:alpha:]]+","", colnames(abund))  # remove heading text
      if(all(grepl("^[[:digit:]]", tm1))) {
        tm1 <- try(as.numeric(sub("( |_|-|\\.)*[[:alpha:]].*","", tm1)), silent=TRUE)   # remove tailing text and try converting to numeric
        if(!inherits(tm1, "try-error")) {
          setupSd$level <- match(tm1, sort(unique(tm1)))
          names(setupSd$level) <- tm1
          if(!silent) message(fxNa,"Sucessfully re-adjusted levels after bringing in order of Sdrf")}
      }     
    } else {

      ## harmonize sample-names/2
      colNa <- colnames(abund)
      chGr <- grepl("^X[[:digit:]]", colNa)                                                # check & remove heading 'X' from initial column-names starting with digits
      if(any(chGr)) colNa[which(chGr)] <- sub("^X","", colNa[which(chGr)])                 #
      colnames(quant) <- colNa
      if(length(abund) >0) colnames(abund) <- colNa  
    }  
    if(length(setupSd$sampleNames)==ncol(abund)) setupSd$sampleNames <- colNa #no#else setupSd$groups <- colNa
    if(length(dim(counts)) >1 && length(counts) >0) colnames(counts) <- colNa

    if(debug) {message(fxNa,"Read sample-meta data, rWB14"); rWB14 <- list(sdrf=sdrf,suplAnnotFile=suplAnnotFile,abund=abund, quant=quant,refLi=refLi,annot=annot,setupSd=setupSd,plotGraph=plotGraph,normalizeMeth=normalizeMeth,isLog2=isLog2)}

    ## main plotting of distribution of intensities
    custLay <- NULL
    if(is.numeric(plotGraph) && length(plotGraph) >0) {custLay <- as.integer(plotGraph); plotGraph <- TRUE} else {
        if(!isTRUE(plotGraph)) plotGraph <- FALSE}
    if(plotGraph) .plotQuantDistr(abund=if(isFALSE(isLog2) || "none" %in% normalizeMeth) NULL else abund, quant=quant, custLay=custLay, normalizeMeth=normalizeMeth, softNa=paste("Wombat-P",quantSoft),
      refLi=refLi, refLiIni=refLiIni, tit=titGraph, silent=silent, callFrom=fxNa, debug=debug)
    
    ## meta-data
    notes <- c(inpFile=paFi, qmethod=paste("Wombat-P",quantSoft), qMethVersion=if(length(infoDat) >0) unique(infoDat$Software.Revision) else NA,
    	rawFilePath= if(length(infoDat) >0) infoDat$File.Name[1] else NA, normalizeMeth=normalizeMeth, call=deparse(match.call()),
      created=as.character(Sys.time()), wrProteo.version=utils::packageVersion("wrProteo"), machine=Sys.info()["nodename"])
    ## final output
    if(isTRUE(separateAnnot)) list(raw=abund, quant=quant, annot=annot, counts=counts, sampleSetup=setupSd, quantNotes=parametersD, notes=notes) else data.frame(quant,annot) }
}