R/htrx_max.R

Defines functions htrx_max

Documented in htrx_max

#' @title Maximum independent features for HTRX
#' @description The maximum number of independent features in principle from
#' haplotypes (i.e. interactions between SNPs) generated by Haplotype Trend Regression with eXtra flexibility (HTRX).
#' @name htrx_max
#' @param nsnp a positive integer giving the number of single nucleotide polymorphisms (SNPs) included in the haplotypes.
#' @param n_haps a positive integer giving the number of haplotypes,
#' which is also the number of columns of the HTRX or HTR matrix.
#' @param cap a positive integer which manually sets the maximum number of independent features.
#' By default, \code{cap=40}.
#' @param max_int a positive integer which specifies the maximum number of SNPs that can interact.
#' If no value is given (by default), interactions between all the SNPs will be considered.
#' @param htr logical. If \code{htr=TRUE}, the functions returns the maximum number of independent
#' features for HTR. By default, \code{htr=FALSE}.
#'
#' @details The maximum number of independent features in principle is \ifelse{html}{\out{2<sup>nsnp</sup>}}{\eqn{2^nsnp}}-1
#' for haplotypes containing interactions between all different numbers of SNPs.
#' However, if \code{max_int < nsnp}, i.e. only the interactions between at most \code{max_int} SNPs are investigated,
#' there will be fewer maximum number of independent features.
#' You can also manually set the upper limit of independent features (by setting \code{cap}) that can be included in the final HTRX or HTR model.
#'
#' @return \code{htrx_max} returns a positive integer giving the maximum
#' number of independent features to be included in the analysis.
#' @examples
#' ## the maximum number of independent haplotypes consisted of 4 SNPs from HTRX
#' htrx_max(nsnp=4,n_haps=(3^4-1))
NULL

#' @rdname htrx_max
#' @export
htrx_max<-function(nsnp,n_haps=NULL,cap=40,max_int=NULL,htr=FALSE){
  if(htr){
    n_hap=2^nsnp-1
    if(n_hap>cap) n_hap=cap;
    return(n_hap)
  }else{
    if(is.null(max_int)) max_int=nsnp
    ## This is based on the theoretical number of linearly independent features:
    ## If we have N snps for all the interactions, there are at most 2^n-1 independent haplotypes
    if(nsnp==1) n_hapx=1
    if(nsnp==2) n_hapx=c(2,3)[max_int]
    if(nsnp==3) n_hapx=c(3,6,7)[max_int]
    if(nsnp==4) n_hapx=c(4,10,14,15)[max_int]
    if(nsnp==5) n_hapx=c(5,15,25,30,31)[max_int]
    if(nsnp==6) n_hapx=c(6,21,41,56,62,63)[max_int]
    if(nsnp==7) n_hapx=c(7,28,63,98,119,126,127)[max_int]
    if(nsnp==8){
      if(max_int<=3) n_hapx=c(8,36,92)[max_int]
      if(max_int>3){
        n_hapx=c(6,21,41,56,62,63)[min(max_int,6)]
        warning('Please use cumulative HTRX, the cap is suggested to be smaller than the returned value')
      }
    }
    if(nsnp==9){
      if(max_int<=3) n_hapx=c(9,45,129)[max_int]
      if(max_int>3){
        n_hapx=c(6,21,41,56,62,63)[min(max_int,6)]
        warning('Please use cumulative HTRX, the cap is suggested to be smaller than the returned value')
      }
    }
    if(nsnp==10){
      if(max_int<=2) n_hapx=c(9,55)[max_int]
      if(max_int>2){
        n_hapx=c(6,21,41,56,62,63)[min(max_int,6)]
        warning('Please use cumulative HTRX, the cap is suggested to be smaller than the returned value')
      }
    }
    if(nsnp>10){
      n_hapx=c(6,21,41,56,62,63)[min(max_int,6)]
      warning('Please use cumulative HTRX, the cap is suggested to be smaller than the returned value')
    }


    ## count the total number of haplotypes if n_haps are not given
    if(is.null(n_haps)){
      if(max_int==nsnp){
        n_haps=3^nsnp-1
      }else{
        n_haps=Reduce(sum,lapply(1:max_int,function(s) choose(nsnp,s)*(2^s)))
      }
    }
    if(n_hapx>n_haps) n_hapx=n_haps;
    if(n_hapx>cap) n_hapx=cap;
    return(n_hapx)
  }

}

Try the HTRX package in your browser

Any scripts or data that you put into this service are public.

HTRX documentation built on May 29, 2024, 5:53 a.m.