R/ParameterEstimation.R

Defines functions ParameterEstimation

Documented in ParameterEstimation

################################################################################################
###################################     Library functions    ###################################
################################################################################################

#' @title "ParameterEstimation"
#' @description Estimated the parameters that represent the probabilities of three active DNA methylation change types during n cell cycle(s).
#' @param original_methyl The original methylation level of each CpG/gene/region.
#' @param terminal_methyl The paired terminal methylation level of each CpG/gene/region.
#' @param iter The iteration times of the parameter estimation using the Newton-Raphson method with different initial guesses.
#' @param cell_cycle The cell cycle times from the original state to the terminal state.
#' @return estimated_parameters The estimated parameters using the maximum likelihood estimation and the Newton-Raphson method.
#' @return predicted_matrix The calculated transition matrix using the estimated parameters.
#' @details The transition matrix of this model describes the changes of DNA methylation during one cell cycle in three steps:
#' passive demethylation by DNA replication, active DNA methylation changes affected by DNA methylation-modifying enzymes
#' and DNA methylation combinations during homologous recombination.
#' For each CpG site in a chromsome, the methylation states are one of these four types: 0-0, 0-1, 1-0, 1-1.
#' The transition matrix after DNA replication would be :
#' 	\tabular{cccccc}{
#' 		\tab original(0-0) \tab original(0-1) \tab original(1-0) \tab original(1-1) \cr
#' 		after_replication(0-0) \tab \eqn{1} \tab \eqn{a} \tab \eqn{1-a} \tab \eqn{0} \cr
#' 		after_replication(0-1) \tab \eqn{0} \tab \eqn{1-a} \tab \eqn{0} \tab \eqn{1-a} \cr
#' 		after_replication(1-0) \tab \eqn{0} \tab \eqn{0} \tab \eqn{0} \tab \eqn{a} \cr
#' 		after_replication(1-1) \tab \eqn{0} \tab \eqn{0} \tab \eqn{0} \tab \eqn{0} \cr
#' 	}
#' 	among this matrix ,\eqn{a} is the methylation change probability with DNA replication and equal to 0.5.
#' 	Then the transition matrix after active DNA methylation changes would be:
#' 	\tabular{cccccc}{
#' 		\tab after_replication(0-0) \tab after_replication(0-1) \tab after_replication(1-0) \tab after_replication(1-1) \cr
#' 		after_enzymemodifying(0-0) \tab \eqn{(1-u) \times (1-u)} \tab \eqn{(1-u-p+u \times p) \times d} \tab \eqn{d \times (1-u-p+u \times p)} \tab \eqn{0} \cr
#' 		after_enzymemodifying(0-1) \tab \eqn{u \times (1-u)} \tab \eqn{(1-u-p+u \times p) \times (1-d)} \tab \eqn{d \times (u+p-u \times p)} \tab \eqn{0} \cr
#' 		after_enzymemodifying(1-0) \tab \eqn{u \times (1-u)} \tab \eqn{(u+p-u \times p) \times d} \tab \eqn{(1-d) \times (1-u-p+u \times p)} \tab \eqn{0} \cr
#' 		after_enzymemodifying(1-1) \tab \eqn{u \times u} \tab \eqn{(u+p-u \times p) \times (1-d)} \tab \eqn{(1-d) \times (u+p-u \times p)} \tab \eqn{1} \cr
#' 	}
#' 	The paramater \eqn{u} described the methylation probablity on CpG site.
#' 	The paramater \eqn{d} described the de-methylation probablity on 5mCpG site.
#' 	The paramater \eqn{p} described the methylation probablity on semi-CpG site.
#' 	Thus the mathlation state change is \eqn{P = Penzymemodifying \cdot Preplication \cdot original_classes}
#' 	We using \eqn{t_{i,j}} represent the \eqn{i} and \eqn{j} vector of the matrix \eqn{Penzymemodifying \cdot Preplication}
#' 	Then two chromsomes are combined during homologous recombination. The observed DNA methylation of each CpG site is the combination of the methylation types in both chromsomes.
#'  The observed transition matrix would be :
#' 	\tabular{cccccc}{
#' 		\tab original_class1(0) \tab original_class2(1/4) \tab original_class3(1/2) \tab original_class4(3/4) \tab original_class5(1)\cr
#' 		terminal_class1(0) \tab \eqn{x_{1,1}} \tab \eqn{x_{1,2}} \tab \eqn{x_{1,3}} \tab \eqn{x_{1,4}} \tab \eqn{x_{1,5}}\cr
#' 		terminal_class2(1/4) \tab \eqn{x_{2,1}} \tab \eqn{x_{2,2}} \tab \eqn{x_{2,3}} \tab \eqn{x_{2,4}} \tab \eqn{x_{2,5}}\cr
#' 		terminal_class3(1/2) \tab \eqn{x_{3,1}} \tab \eqn{x_{3,2}} \tab \eqn{x_{3,3}} \tab \eqn{x_{3,4}} \tab \eqn{x_{3,5}}\cr
#' 		terminal_class4(3/4) \tab \eqn{x_{4,1}} \tab \eqn{x_{4,2}} \tab \eqn{x_{4,3}} \tab \eqn{x_{4,4}} \tab \eqn{x_{4,5}}\cr
#' 		terminal_class5(1) \tab \eqn{x_{5,1}} \tab \eqn{x_{5,2}} \tab \eqn{x_{5,3}} \tab \eqn{x_{5,4}} \tab \eqn{x_{5,5}}\cr
#' 	}
#' 	and \deqn{x_{1,1}=t_{1,1} \times t_{1,1}}
#' \deqn{x_{1,1}=t_{{,1},1} \times t_{{,1},1}}
#' \deqn{x_{1,2}=1/4 \times (t_{1,1} \times t_{1,2}+t_{1,1} \times t_{1,3}+t_{1,2} \times t_{1,1}+t_{1,3} \times t_{1,1})}
#' \deqn{x_{1,3}=1/6 \times (t_{1,1} \times t_{1,4}+t_{1,2} \times t_{1,2}+t_{1,2} \times t_{1,3}+t_{1,3} \times t_{1,2}+t_{1,3} \times t_{1,3}+t_{1,4} \times t_{1,1})}
#' \deqn{x_{1,4}=1/4 \times (t_{1,2} \times t_{1,4}+t_{1,3} \times t_{1,4}+t_{1,4} \times t_{1,2}+t_{1,4} \times t_{1,3})}
#' \deqn{x_{1,5}=t_{1,4} \times t_{1,4}}
#' \deqn{x_{2,1}=t_{1,1} \times t_{2,1}+t_{1,1} \times t_{3,1}+t_{2,1} \times t_{1,1}+t_{3,1} \times t_{1,1}}
#' \deqn{x_{2,2}=1/4 \times (t_{1,1} \times t_{2,2}+t_{1,1} \times t_{2,3}+t_{1,2} \times t_{2,1}+t_{1,3} \times t_{2,1}+t_{1,1} \times t_{3,2}+t_{1,1} \times t_{3,3}+t_{1,2} \times t_{3,1}+t_{1,3} \times t_{3,1}+t_{2,1} \times t_{1,2}+t_{2,1} \times t_{1,3}+t_{2,2} \times t_{1,1}+t_{2,3} \times t_{1,1}+t_{3,1} \times t_{1,2}+t_{3,1} \times t_{1,3}+t_{3,2} \times t_{1,1}+t_{3,3} \times t_{1,1})}
#' \deqn{x_{2,3}=1/6 \times (t_{1,1} \times t_{2,4}+t_{1,2} \times t_{2,2}+t_{1,2} \times t_{2,3}+t_{1,3} \times t_{2,2}+t_{1,3} \times t_{2,3}+t_{1,4} \times t_{2,1}+t_{1,1} \times t_{3,4}+t_{1,2} \times t_{3,2}+t_{1,2} \times t_{3,3}+t_{1,3} \times t_{3,2}+t_{1,3} \times t_{3,3}+t_{1,4} \times t_{3,1}+t_{2,1} \times t_{1,4}+t_{2,2} \times t_{1,2}+t_{2,2} \times t_{1,3}+t_{2,3} \times t_{1,2}+t_{2,3} \times t_{1,3}+t_{2,4} \times t_{1,1}+t_{3,1} \times t_{1,4}+t_{3,2} \times t_{1,2}+t_{3,2} \times t_{1,3}+t_{3,3} \times t_{1,2}+t_{3,3} \times t_{1,3}+t_{3,4} \times t_{1,1})}
#' \deqn{x_{2,4}=1/4 \times (t_{1,2} \times t_{2,4}+t_{1,3} \times t_{2,4}+t_{1,4} \times t_{2,2}+t_{1,4} \times t_{2,3}+t_{1,2} \times t_{3,4}+t_{1,3} \times t_{3,4}+t_{1,4} \times t_{3,2}+t_{1,4} \times t_{3,3}+t_{2,2} \times t_{1,4}+t_{2,3} \times t_{1,4}+t_{2,4} \times t_{1,2}+t_{2,4} \times t_{1,3}+t_{3,2} \times t_{1,4}+t_{3,3} \times t_{1,4}+t_{3,4} \times t_{1,2}+t_{3,4} \times t_{1,3})}
#' \deqn{x_{2,5}=t_{1,4} \times t_{2,4}+t_{1,4} \times t_{3,4}+t_{2,4} \times t_{1,4}+t_{3,4} \times t_{1,4}}
#' \deqn{x_{3,1}=t_{1,1} \times t_{4,1}+t_{2,1} \times t_{2,1}+t_{2,1} \times t_{3,1}+t_{3,1} \times t_{2,1}+t_{3,1} \times t_{3,1}+t_{4,1} \times t_{1,1}}
#' \deqn{x_{3,2}=1/4 \times (t_{1,1} \times t_{4,2}+t_{1,1} \times t_{4,3}+t_{1,2} \times t_{4,1}+t_{1,3} \times t_{4,1}+t_{2,1} \times t_{2,2}+t_{2,1} \times t_{2,3}+t_{2,2} \times t_{2,1}+t_{2,3} \times t_{2,1}+t_{2,1} \times t_{3,2}+t_{2,1} \times t_{3,3}+t_{2,2} \times t_{3,1}+t_{2,3} \times t_{3,1}+t_{3,1} \times t_{2,2}+t_{3,1} \times t_{2,3}+t_{3,2} \times t_{2,1}+t_{3,3} \times t_{2,1}+t_{3,1} \times t_{3,2}+t_{3,1} \times t_{3,3}+t_{3,2} \times t_{3,1}+t_{3,3} \times t_{3,1}+t_{4,1} \times t_{1,2}+t_{4,1} \times t_{1,3}+t_{4,2} \times t_{1,1}+t_{4,3} \times t_{1,1})}
#' \deqn{x_{3,3}=1/6 \times (t_{1,1} \times t_{4,4}+t_{1,2} \times t_{4,2}+t_{1,2} \times t_{4,3}+t_{1,3} \times t_{4,2}+t_{1,3} \times t_{4,3}+t_{1,4} \times t_{4,1}+t_{2,1} \times t_{2,4}+t_{2,2} \times t_{2,2}+t_{2,2} \times t_{2,3}+t_{2,3} \times t_{2,2}+t_{2,3} \times t_{2,3}+t_{2,4} \times t_{2,1}+t_{2,1} \times t_{3,4}+t_{2,2} \times t_{3,2}+t_{2,2} \times t_{3,3}+t_{2,3} \times t_{3,2}+t_{2,3} \times t_{3,3}+t_{2,4} \times t_{3,1}+t_{3,1} \times t_{2,4}+t_{3,2} \times t_{2,2}+t_{3,2} \times t_{2,3}+t_{3,3} \times t_{2,2}+t_{3,3} \times t_{2,3}+t_{3,4} \times t_{2,1}+t_{3,1} \times t_{3,4}+t_{3,2} \times t_{3,2}+t_{3,2} \times t_{3,3}+t_{3,3} \times t_{3,2}+t_{3,3} \times t_{3,3}+t_{3,4} \times t_{3,1}+t_{4,1} \times t_{1,4}+t_{4,2} \times t_{1,2}+t_{4,2} \times t_{1,3}+t_{4,3} \times t_{1,2}+t_{4,3} \times t_{1,3}+t_{4,4} \times t_{1,1})}
#' \deqn{x_{3,4}=1/4 \times (t_{1,2} \times t_{4,4}+t_{1,3} \times t_{4,4}+t_{1,4} \times t_{4,2}+t_{1,4} \times t_{4,3}+t_{2,2} \times t_{2,4}+t_{2,3} \times t_{2,4}+t_{2,4} \times t_{2,2}+t_{2,4} \times t_{2,3}+t_{2,2} \times t_{3,4}+t_{2,3} \times t_{3,4}+t_{2,4} \times t_{3,2}+t_{2,4} \times t_{3,3}+t_{3,2} \times t_{2,4}+t_{3,3} \times t_{2,4}+t_{3,4} \times t_{2,2}+t_{3,4} \times t_{2,3}+t_{3,2} \times t_{3,4}+t_{3,3} \times t_{3,4}+t_{3,4} \times t_{3,2}+t_{3,4} \times t_{3,3}+t_{4,2} \times t_{1,4}+t_{4,3} \times t_{1,4}+t_{4,4} \times t_{1,2}+t_{4,4} \times t_{1,3})}
#' \deqn{x_{3,5}=t_{1,4} \times t_{4,4}+t_{2,4} \times t_{2,4}+t_{2,4} \times t_{3,4}+t_{3,4} \times t_{2,4}+t_{3,4} \times t_{3,4}+t_{4,4} \times t_{1,4}}
#' \deqn{x_{4,1}=t_{2,1} \times t_{4,1}+t_{3,1} \times t_{4,1}+t_{4,1} \times t_{2,1}+t_{4,1} \times t_{3,1}}
#' \deqn{x_{4,2}=1/4 \times (t_{2,1} \times t_{4,2}+t_{2,1} \times t_{4,3}+t_{2,2} \times t_{4,1}+t_{2,3} \times t_{4,1}+t_{3,1} \times t_{4,2}+t_{3,1} \times t_{4,3}+t_{3,2} \times t_{4,1}+t_{3,3} \times t_{4,1}+t_{4,1} \times t_{2,2}+t_{4,1} \times t_{2,3}+t_{4,2} \times t_{2,1}+t_{4,3} \times t_{2,1}+t_{4,1} \times t_{3,2}+t_{4,1} \times t_{3,3}+t_{4,2} \times t_{3,1}+t_{4,3} \times t_{3,1})}
#' \deqn{x_{4,3}=1/6 \times (t_{2,1} \times t_{4,4}+t_{2,2} \times t_{4,2}+t_{2,2} \times t_{4,3}+t_{2,3} \times t_{4,2}+t_{2,3} \times t_{4,3}+t_{2,4} \times t_{4,1}+t_{3,1} \times t_{4,4}+t_{3,2} \times t_{4,2}+t_{3,2} \times t_{4,3}+t_{3,3} \times t_{4,2}+t_{3,3} \times t_{4,3}+t_{3,4} \times t_{4,1}+t_{4,1} \times t_{2,4}+t_{4,2} \times t_{2,2}+t_{4,2} \times t_{2,3}+t_{4,3} \times t_{2,2}+t_{4,3} \times t_{2,3}+t_{4,4} \times t_{2,1}+t_{4,1} \times t_{3,4}+t_{4,2} \times t_{3,2}+t_{4,2} \times t_{3,3}+t_{4,3} \times t_{3,2}+t_{4,3} \times t_{3,3}+t_{4,4} \times t_{3,1})}
#' \deqn{x_{4,4}=1/4 \times (t_{2,2} \times t_{4,4}+t_{2,3} \times t_{4,4}+t_{2,4} \times t_{4,2}+t_{2,4} \times t_{4,3}+t_{3,2} \times t_{4,4}+t_{3,3} \times t_{4,4}+t_{3,4} \times t_{4,2}+t_{3,4} \times t_{4,3}+t_{4,2} \times t_{2,4}+t_{4,3} \times t_{2,4}+t_{4,4} \times t_{2,2}+t_{4,4} \times t_{2,3}+t_{4,2} \times t_{3,4}+t_{4,3} \times t_{3,4}+t_{4,4} \times t_{3,2}+t_{4,4} \times t_{3,3})}
#' \deqn{x_{4,5}=t_{2,4} \times t_{4,4}+t_{3,4} \times t_{4,4}+t_{4,4} \times t_{2,4}+t_{4,4} \times t_{3,4}}
#' \deqn{x_{5,1}=t_{4,1} \times t_{4,1}}
#' \deqn{x_{5,2}=1/4 \times (t_{4,1} \times t_{4,2}+t_{4,1} \times t_{4,3}+t_{4,2} \times t_{4,1}+t_{4,3} \times t_{4,1})}
#' \deqn{x_{5,3}=1/6 \times (t_{4,1} \times t_{4,4}+t_{4,2} \times t_{4,2}+t_{4,2} \times t_{4,3}+t_{4,3} \times t_{4,2}+t_{4,3} \times t_{4,3}+t_{4,4} \times t_{4,1})}
#' \deqn{x_{5,4}=1/4 \times (t_{4,2} \times t_{4,4}+t_{4,3} \times t_{4,4}+t_{4,4} \times t_{4,2}+t_{4,4} \times t_{4,3})}
#' \deqn{x_{5,5}=t_{4,4} \times t_{4,4}}
#' The cost function was defined by \deqn{f_{cost}=\sum_{i=1,j=1}^{n=5} (o_{i,j}-x_{i,j})} and minimized using the Newton-Raphson method.
#' @references \cite{Zhao, C. et.al.(2018). A DNA methylation state transition model reveals the programmed epigenetic heterogeneity in pre-implantation embryos. Under revision.}
#' @examples # Let's simulate the original DNA methylation level vector and the terminal one.
#' set.seed(0)
#' original_methyl <- runif(10000, min = 0, max = 1)
#' set.seed(1)
#' terminal_methyl <- runif(10000, min = 0, max = 1)
#' # Then the parameters that represent the probabilities of three active DNA methylation change types can be calculated:
#' # By using the default para cell_cycle=1, the DNA methylation states change from original state to the terminal state after 1 cell cycle.
#' #' ParameterEstimation(original_methyl,terminal_methyl)
#' # The DNA methylation states change from original state to the terminal state after 2 cell cycle.
#' # ParameterEstimation(original_methyl,terminal_methyl,iter=1,cell_cycle=2)
#' # if this function was not successful to estimated the function, you may try more iterations with different initial guesses
#' ParameterEstimation(original_methyl,terminal_methyl,iter=50,cell_cycle=2)
#' # The parameters would be gradually stable after enough cell cycle times:
#' # The DNA methylation states change from original state to the terminal state after 20 cell cycle.
#' ParameterEstimation(original_methyl,terminal_methyl,iter=50,cell_cycle=20)
#' # The DNA methylation states change from original state to the terminal state after 30 cell cycle.
#' ParameterEstimation(original_methyl,terminal_methyl,iter=50,cell_cycle=30)
#' # The DNA methylation states change from original state to the terminal state after 40 cell cycle.
#' ParameterEstimation(original_methyl,terminal_methyl,iter=50,cell_cycle=40)

#' @concept MethylTransition
#' @export ParameterEstimation

ParameterEstimation <- function(original_methyl,terminal_methyl,iter=50,cell_cycle=1){
	cat("\n")
	cat("\tParameter Estimation\n")
	trans_matrix <- TransitionMatrixGeneration(original_methyl,terminal_methyl)
	print(trans_matrix)
	para_list <- TMParameterEstimation(trans_matrix,iter,cell_cycle)
	return(para_list)
}
ChengchenZhao/MethylTransition documentation built on Nov. 30, 2020, 11:09 a.m.