datacheck: Two Criteria Database Check

datacheck R Documentation

Two Criteria Database Check

Description

Generates a list of records that probably have errors in the concentrations of chemical components, based on two criteria: the correlation between the concentrations of chemical components and total soluble solids, and the correlation between the concentrations of ionic chemical components and conductivity.

Usage

datacheck(dataICHS, dataTSSS, conflevel = 0.95, pchdata = 19, coldata = "green", 
cexdata = 0.5, pchsample = 19, colsample = "red", cexsample = 3, xaxis = xaxis, 
yaxis = yaxis, title = title, linetyprediction = 2, linewidthprediction = 1, 
linecolorprediction = 5)

Arguments

dataICHS

Records of a database with concentrations of chemical components of water, including the concentrations of ionic chemical components and conductivity.

dataTSSS

Records of a database with concentrations of chemical components of water, including the concentrations of chemical components and total soluble solids.

conflevel

Confidence level used in the predict function.

pchdata

Symbol used to graph all the data in the data.frame.

coldata

Color of the symbols of all the data in the data.frame.

cexdata

Symbol size of all data in the data frame.

pchsample

Symbol used for the sample whose measurement quality is being checked.

colsample

Color used for the sample whose measurement quality is being checked.

cexsample

Symbol size used for the sample whose measurement quality is being checked.

xaxis

X axis label.

yaxis

Y axis label.

title

Title of the graph including the code of the chosen sample.

linetyprediction

Linear model prediction line type.

linewidthprediction

Linear model prediction line thickness.

linecolorprediction

Linear model prediction line color.

Details

The datacheck() function performs two linear regressions using the functions TSSS() and ICHS() of this package.

The TSSS() function fits a linear model using column 2 (total soluble solids) as the dependent variable and the other components of water (columns 3 onwards) as independent variables. From the linear model, a prediction interval is obtained at the given confidence level, and the samples that fall outside the prediction interval are displayed as red points.

The ICHS() function fits a linear model using column 2 (conductivity) as the independent variable and the other components of water (columns 3 onwards) as dependent variables. From the linear model, a prediction interval is obtained at the given confidence level, and the samples that fall outside the prediction interval are plotted as red points.

The datacheck() function selects the samples of the database that are outside both prediction intervals. A sample that is outside both prediction intervals probably has a significant error and should be reviewed.
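The two-criteria selection described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the data are simulated, the helper outside_interval() and all variable names are hypothetical, and each criterion is reduced to a simple regression of a summed quantity on conductivity or on total soluble solids.

```r
# Illustrative sketch with simulated data; not the AQuality implementation.
set.seed(1)
n <- 50
conductivity <- runif(n, 100, 1000)   # hypothetical conductivity values
tss          <- runif(n, 50, 600)     # hypothetical total soluble solids
charge_sum   <- 0.01 * conductivity + rnorm(n, sd = 0.2)
mass_sum     <- 0.9 * tss + rnorm(n, sd = 10)

# Introduce a gross error in sample 7 so that it violates both criteria.
charge_sum[7] <- charge_sum[7] + 5
mass_sum[7]   <- mass_sum[7] + 300

# Hypothetical helper: TRUE for samples outside the prediction interval
# of the simple linear model y ~ x at the given confidence level.
outside_interval <- function(y, x, level = 0.95) {
  d <- data.frame(x = x, y = y)
  fit <- lm(y ~ x, data = d)
  pred <- predict(fit, newdata = d, interval = "prediction", level = level)
  unname(d$y < pred[, "lwr"] | d$y > pred[, "upr"])
}

out_conductivity <- outside_interval(charge_sum, conductivity)
out_tss          <- outside_interval(mass_sum, tss)

# A sample is reported only if it fails both criteria.
flagged <- which(out_conductivity & out_tss)
print(flagged)
```

Requiring a sample to fall outside both intervals makes the check more conservative than either criterion alone, which is why the simulated error was added to both quantities of sample 7.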

Value

The datacheck() function returns a graph with two plots. The first plot displays the linear regression of charge summation as a function of conductivity, and the second displays the linear regression of mass summation as a function of total soluble solids. Both plots show the prediction interval, and the samples that are outside it, which probably have a problem of accuracy or precision, are displayed as red dots. The identification codes of the samples that are outside both prediction intervals are displayed as a list.
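A plot of the kind described above can be approximated with base graphics. The sketch below is an assumption-laden illustration: plot_check() is a hypothetical helper, its argument names are borrowed from Usage only for readability, the data are simulated, and the actual appearance of the plots produced by datacheck() may differ.

```r
# Hypothetical helper; mirrors some argument names from Usage,
# but the body is an assumption, not the package's code.
plot_check <- function(x, y, conflevel = 0.95,
                       pchdata = 19, coldata = "green", cexdata = 0.5,
                       colsample = "red",
                       linetyprediction = 2, linewidthprediction = 1,
                       linecolorprediction = 5,
                       xaxis = "x", yaxis = "y", title = "") {
  d <- data.frame(x = x, y = y)
  fit <- lm(y ~ x, data = d)
  pred <- predict(fit, newdata = d, interval = "prediction", level = conflevel)
  out <- unname(d$y < pred[, "lwr"] | d$y > pred[, "upr"])
  ord <- order(d$x)

  # All samples, then the fitted line and the prediction interval limits.
  plot(d$x, d$y, pch = pchdata, col = coldata, cex = cexdata,
       xlab = xaxis, ylab = yaxis, main = title)
  abline(fit)
  for (limit in c("lwr", "upr")) {
    lines(d$x[ord], pred[ord, limit], lty = linetyprediction,
          lwd = linewidthprediction, col = linecolorprediction)
  }

  # Samples outside the prediction interval are overdrawn in red.
  points(d$x[out], d$y[out], pch = pchdata, col = colsample, cex = cexdata)
  invisible(which(out))
}

# Simulated data with one gross error in sample 3.
set.seed(42)
x <- runif(40, 100, 1000)
y <- 0.01 * x + rnorm(40, sd = 0.2)
y[3] <- y[3] + 5

outside <- plot_check(x, y, xaxis = "Conductivity", yaxis = "Charge summation",
                      title = "Simulated example")
```

Calling the helper twice after par(mfrow = c(1, 2)) would place two such plots side by side, similar to the two-plot graph described in this section.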

Author(s)

Maela Lupo, Andrea Porpatto, Alfredo Rigalli


AQuality documentation built on April 13, 2025, 5:09 p.m.