york: Fitting Linear Models With York's Method.
In JENScoding/York: York's regression of X and Y-variables with correlated errors

View source: R/Algorithm.R

The function york fits a straight line when both the x and y variables are subject to measurement error. It returns an object of class "york", which holds the fit of a York regression. The model can also account for correlation between the x and y errors.

york(x, y, weights_x = NULL, weights_y = NULL, r_xy_errors = NULL,
  tolerance = 1e-05, max_iterations = 50, sd_x = NULL, sd_y = NULL,
  mult_samples = FALSE, approx_solution = FALSE)

`x`	a 1 × n vector of the `x`-variable or, in the case of multiple samples, a data frame in which the first column holds the first sample.
`y`	a 1 × n vector of the `y`-variable or, in the case of multiple samples, a data frame in which the first column holds the first sample.
`weights_x`	the prespecified 1 × n weighting vector for the `x`-values.
`weights_y`	the prespecified 1 × n weighting vector for the `y`-values.
`r_xy_errors`	the prespecified correlation coefficient between the errors in `x` and `y`. Either a 1 × n vector or a single value.
`tolerance`	the tolerance for convergence of the slope coefficient. The default is `1e-5`.
`max_iterations`	the maximum number of iterations for convergence. The default is `50`.
`sd_x`	the standard error of the `x`-values. Specify this if the true errors in x are known.
`sd_y`	the standard error of the `y`-values. Specify this if the true errors in y are known.
`mult_samples`	`logical`. Default is `FALSE`. Set to `TRUE` if the input is a data frame with multiple samples of both x and y.
`approx_solution`	`logical`. Default is `FALSE`. Set to `TRUE` for an approximate solution of the slope coefficient, which needs no iteration. `TRUE` is not recommended when the slope coefficient must be accurate to more than one decimal place.
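
The weight and standard-error arguments carry the same kind of information in two forms. The following is a minimal sketch, assuming the usual York convention that a weight is the inverse variance of the measurement error (ω = 1/sd²). Whether york applies exactly this conversion internally is an assumption, and `sd_to_weight` is a hypothetical helper that is not part of the package.

```r
# Hypothetical helper (not part of the york package): convert a standard
# error into a York-style weight, assuming weight = 1 / variance.
sd_to_weight <- function(sd) 1 / sd^2

n    <- 10    # number of observations
sd_x <- 0.2   # standard error of every x-value
sd_y <- 0.4   # standard error of every y-value

# A single standard error is expanded to one weight per observation.
weights_x <- rep(sd_to_weight(sd_x), n)
weights_y <- rep(sd_to_weight(sd_y), n)
```

Under this convention a smaller standard error gives a larger weight, so precisely measured points pull the fitted line harder than noisy ones.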

york implements the algorithm for the problem of the best-fit straight line to independent points with errors in both the x and y variables. It provides the general York (1969) solution, following the algorithm of Wehr & Saleska (2017).

Given n pairs (x_i, y_i), i = 1, …, n, together with either their weights (ω(x_i), ω(y_i)) or their standard errors sd(x_i) and sd(y_i), the york function finds the best-fit straight line using the algorithm of York (1966) and York (1969) as presented in Wehr & Saleska (2017). In addition, the function returns a range of statistics, parameters and goodness-of-fit criteria. If the data contain NA values, they are omitted.
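
The iteration described above can be sketched in base R. This is an illustration only, written from the unified equations in York et al. (2004), and is not the package's own source: `york_sketch` and its local variable names are hypothetical, and starting the iteration from the OLS slope is an assumption.

```r
# Pearson's data with York's weights, as used in the examples of this page.
x  <- c(0.0, 0.9, 1.8, 2.6, 3.3, 4.4, 5.2, 6.1, 6.5, 7.4)
y  <- c(5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5)
wx <- c(1e+3, 1e+3, 5e+2, 8e+2, 2e+2, 8e+1, 6e+1, 2e+1, 1.8, 1)
wy <- c(1, 1.8, 4, 8, 20, 20, 70, 70, 1e+2, 5e+2)

# Hypothetical stand-alone sketch, not the package's internal code.
york_sketch <- function(x, y, wx, wy, r = 0,
                        tolerance = 1e-5, max_iterations = 50) {
  alpha <- sqrt(wx * wy)
  # Combined weight of each point for a given slope b.
  point_weights <- function(b) wx * wy / (wx + b^2 * wy - 2 * b * r * alpha)

  b <- unname(coef(lm(y ~ x))[2])   # assumed starting value: OLS slope
  n_iterations <- 0
  for (i in seq_len(max_iterations)) {
    n_iterations <- i
    W     <- point_weights(b)
    U     <- x - sum(W * x) / sum(W)   # deviations from the weighted mean of x
    V     <- y - sum(W * y) / sum(W)   # deviations from the weighted mean of y
    beta  <- W * (U / wy + b * V / wx - (b * U + V) * r / alpha)
    b_new <- sum(W * beta * V) / sum(W * beta * U)
    converged <- abs(b_new - b) < tolerance
    b <- b_new
    if (converged) break
  }

  # Intercept from the weighted means evaluated at the final slope.
  W <- point_weights(b)
  a <- sum(W * y) / sum(W) - b * sum(W * x) / sum(W)
  list(intercept = a, slope = b, n_iterations = n_iterations)
}

fit <- york_sketch(x, y, wx, wy, r = 0)
fit$intercept
fit$slope
```

For this data set the sketch should land close to the values reported in York et al. (2004), a slope of about -0.4805 and an intercept of about 5.4799.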

york returns an object of class "york", which is a list containing the following components:

coefficients: a matrix which contains the York estimates for intercept and slope of the best-fit straight line with their respective standard errors.
x_residuals: a vector of the York x-residuals.
y_residuals: a vector of the York y-residuals.
fitted_y: a vector of the fitted York y-values.
weights: a matrix representation of the prespecified or calculated weights for the x- and y-observations.
data: a data matrix whose columns contain the observed x- and y-values, the errors sd_x and sd_y, and the correlation of the errors (error_correlation). If the input consists of multiple samples, the data element is instead a list containing the observed x- and y-values, the errors sd_x and sd_y, the correlation of the errors (error_correlation), the errors in x and y (x_errors / y_errors) and the mean of each observation i for variable x and y, respectively (mean_x_i / mean).
reduced_chisq: the reduced chi-squared statistic (see https://en.wikipedia.org/wiki/Reduced_chi-squared_statistic), i.e. the goodness-of-fit measure of York's regression.
se_chisq: the standard error of the chi-squared statistic.
goodness_of_fit: a list with the test results of a chi-squared-test, containing the test-statistic, the degrees of freedom, the p-value and a string saying whether H0 (the assumption of a good fit) can be rejected or not for α = 0.01.
n_iterations: the total number of iterations.
slope_per_iteration: the York slope after each iteration.
weighted_mean_x: the weighted.mean of x.
weighted_mean_y: the weighted.mean of y.
ols_summary: a list containing OLS statistics.
york_arguments: a list containing the specified input arguments.
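
To make the goodness-of-fit components concrete, here is a minimal base R sketch of how a reduced chi-squared statistic and the corresponding test at α = 0.01 can be computed for a York fit. It assumes the usual definitions: a weighted sum of squared residuals S, n - 2 degrees of freedom, and a standard error of sqrt(2 / (n - 2)) for the reduced statistic. The package's exact formulas may differ, the intercept and slope are the published values for this data set rather than output of york, and all variable names are illustrative.

```r
# Pearson's data with York's weights, as used in the examples of this page.
x  <- c(0.0, 0.9, 1.8, 2.6, 3.3, 4.4, 5.2, 6.1, 6.5, 7.4)
y  <- c(5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5)
wx <- c(1e+3, 1e+3, 5e+2, 8e+2, 2e+2, 8e+1, 6e+1, 2e+1, 1.8, 1)
wy <- c(1, 1.8, 4, 8, 20, 20, 70, 70, 1e+2, 5e+2)
r  <- 0   # correlation between the x and y errors

# Published York estimates for this data set, taken as given (assumption).
a <- 5.4799    # intercept
b <- -0.4805   # slope

# Combined weight of each point at the fitted slope.
W <- wx * wy / (wx + b^2 * wy - 2 * b * r * sqrt(wx * wy))

S             <- sum(W * (y - a - b * x)^2)   # weighted sum of squared residuals
df            <- length(x) - 2                # two estimated parameters
reduced_chisq <- S / df
se_chisq      <- sqrt(2 / df)                 # assumed definition

# Chi-squared test of H0 "the fit is good" at alpha = 0.01.
p_value   <- pchisq(S, df = df, lower.tail = FALSE)
reject_h0 <- p_value < 0.01
```

A reduced chi-squared value near 1 indicates that the scatter around the line is consistent with the stated measurement errors. Values far above 1 suggest underestimated errors or a poor linear model.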

Wehr, Richard, and Scott R. Saleska. "The long-solved problem of the best-fit straight line: Application to isotopic mixing lines." Biogeosciences 14.1 (2017). pp. 17-29.

York, Derek. "Least squares fitting of a straight line with correlated errors." Earth and Planetary Science Letters 5 (1968), pp. 320-324.

York, Derek. "Least-squares fitting of a straight line." Canadian Journal of Physics 44.5 (1966), pp. 1079-1086.

York, Derek, et al. "Unified equations for the slope, intercept, and standard errors of the best straight line." American Journal of Physics 72.3 (2004), pp. 367-375.

 # Example: York's regression with weight data taken from Pearson (1901):
 x <- c(0.0, 0.9, 1.8, 2.6, 3.3, 4.4, 5.2, 6.1, 6.5, 7.4)
 y <- c(5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5)
 weights_x <- c(1e+3, 1e+3, 5e+2, 8e+2, 2e+2, 8e+1, 6e+1, 2e+1, 1.8, 1)
 weights_y <- c(1, 1.8, 4, 8, 20, 20, 70, 70, 1e+2, 5e+2)
 r_xy_errors <- 0
 york(x, y, weights_x, weights_y, r_xy_errors)

 # Example: York's regression with arbitrary values for sd_x and sd_y:
 x <- c(0.0, 0.9, 1.8, 2.6, 3.3, 4.4, 5.2, 6.1, 6.5, 7.4)
 y <- c(5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5)
 sd_x <- 0.2
 sd_y <- 0.4
 r_xy_errors <- 0.3

 # fit york model
 york(x, y, sd_x = sd_x, sd_y = sd_y, r_xy_errors = r_xy_errors)

## Not run: 
 # Example: No standard errors or weights specified
 york(x, y, r_xy_errors = 0)

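 # Example: You can't specify weights and standard errors at the same time
 # (arguments are named so that each value reaches the intended parameter)
 york(x, y, weights_x = weights_x, weights_y = weights_y,
   sd_x = sd_x, sd_y = sd_y, r_xy_errors = 0)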

## End(Not run)