clean: Clean faulty values

View source: R/clean.R

clean	R Documentation

Clean faulty values

Description

Cleans faulty values of a met mast object, a single dataset (set) or a specified set of a met mast. Faulty values are replaced by NA.

Usage

clean(mast, set, v.avg.min=0.4, v.avg.max=50, dir.clean=TRUE, 
	turb.clean=4, icing=FALSE, rep=NULL, n.rep=5, ts=FALSE)
cln(mast, set, v.avg.min=0.4, v.avg.max=50, dir.clean=TRUE, 
	turb.clean=4, icing=FALSE, rep=NULL, n.rep=5, ts=FALSE)

Arguments

mast

Met mast object created by mast. Omit if a single dataset is to be cleaned.

set

Set object created by set (if no mast is given), or a set of the met mast specified by set number or set name. Omit to clean all datasets of mast.

v.avg.min

Lower limit for wind speeds as numeric value. Default is 0.4 m/s. Set to NULL to omit minimum wind speed.

v.avg.max

Upper limit for wind speeds as numeric value. Default is 50 m/s. Set to NULL to omit maximum wind speed.

dir.clean

If TRUE (default), faulty wind direction values are excluded. A wind direction value is considered faulty if dir.avg<0, if dir.avg>360, or if the corresponding wind speed is lower than the specified v.avg.min.

turb.clean

Wind speed limit for turbulence intensity as numeric value. Turbulence intensity values are excluded for wind speeds lower than this limit. Default is 4 m/s.

icing

If TRUE, wind direction values are excluded, where standard deviation of wind direction is 0, assuming icing. Default is FALSE.

rep

Signal (or a vector of signals), for which repetitions shall be cleaned – default is NULL.

n.rep

Minimum number of repetitions that shall be cleaned, as integer value – default is 5. Only used if rep is not NULL.

ts

If TRUE, uneven time intervals are corrected. Can only be used with mast objects. Default is FALSE.

Details

Turbulence can be ignored for low wind speeds. Use turb.clean to clean the respective turbulence intensity values. See turbulence for more details.
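
The effect of turb.clean can be illustrated in base R. This is a minimal sketch with made-up values, not the code used by clean; it assumes the common definition of turbulence intensity as standard deviation divided by mean wind speed.

```r
# Made-up sample data (assumption, not taken from winddata)
v.avg <- c(1.5, 3.9, 4.0, 8.0)   # mean wind speed in m/s
v.std <- c(0.6, 0.8, 0.6, 1.0)   # standard deviation of wind speed in m/s
turb.clean <- 4                  # wind speed limit in m/s

# turbulence intensity, assuming the definition v.std / v.avg
turb.int <- v.std / v.avg

# exclude turbulence intensity where wind speed is below the limit
turb.int[v.avg < turb.clean] <- NA
turb.int   # first two values are NA (wind speed below 4 m/s)
```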

If icing is assumed via the icing argument, the time stamps of the affected samples should be checked to rule out implausible assumptions, e.g. icing in summer.
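
Such a plausibility check might look as follows. This is a sketch with hypothetical time stamps; treating June to August as summer is an assumption that only holds for the northern hemisphere.

```r
# Hypothetical time stamps of samples flagged as icing
stamps <- as.POSIXct(c("2010-01-15 06:00:00", "2010-07-20 14:00:00",
  "2010-12-03 22:10:00"), tz="UTC")

# month of each flagged sample
month <- as.integer(format(stamps, "%m"))

# assumption: June to August is summer (northern hemisphere)
summer <- month %in% 6:8
stamps[summer]   # flagged samples for which icing is implausible
```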

Repetitions are often generated by a corrupted data stream between sensor and data logger. Some sensors also repeat their last captured value during calm conditions. Although repetitions are unlikely for averaged time intervals, they are not treated as faulty values by default. Only clean repetitions if you know your data. Note: n.rep repetitions means that n.rep+1 consecutive values in a dataset are identical. The default of 5 repetitions corresponds to one hour in case of a ten-minute interval.
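
The counting rule from the note can be sketched in base R with rle. The helper flag_repetitions is hypothetical and not part of bReeze; which values of a run clean actually replaces is not shown here, the sketch only marks all members of a sufficiently long run.

```r
# Hypothetical helper: flag values that belong to a run of at least
# n.rep + 1 identical consecutive values (i.e. n.rep repetitions)
flag_repetitions <- function(x, n.rep=5) {
  runs <- rle(x)                        # run lengths of identical values
  long <- runs$lengths >= n.rep + 1     # runs with enough repetitions
  rep(long, times=runs$lengths)         # expand back to the length of x
}

# made-up wind speeds: six identical values = five repetitions
v <- c(3.1, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 4.2, 4.2, 5.0)
flagged <- flag_repetitions(v, n.rep=5)
which(flagged)   # positions 2 to 7
```

With n.rep=6 the same run would not be flagged, since seven identical values would be required.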

Uneven time intervals might come up due to rounding errors. ts corrects them to the median time interval.
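
One way to picture this correction in base R is shown below. It is a sketch only: the time stamps are made up, and snapping each time stamp to the nearest multiple of the median interval (counted from the first time stamp) is an assumption about the approach, not necessarily what clean does internally.

```r
# Made-up time stamps with a rounding error in the third value
stamps <- as.POSIXct(c("2010-01-01 00:00:00", "2010-01-01 00:10:00",
  "2010-01-01 00:19:59", "2010-01-01 00:30:00"), tz="UTC")

# intervals in seconds and their median
steps <- as.numeric(diff(stamps), units="secs")
med <- median(steps)

# assumption: snap every time stamp to the nearest multiple of the median
offset <- as.numeric(difftime(stamps, stamps[1], units="secs"))
corrected <- stamps[1] + round(offset / med) * med
diff(corrected)   # all intervals now equal the median interval
```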

Value

Returns the input met mast or dataset object with cleaned data.

Author(s)

Christian Graul

See Also

mast, set

Examples

## Not run: 
# load and prepare data
data("winddata", package="bReeze")
set40 <- set(height=40, v.avg=winddata[,2], v.std=winddata[,5],
  dir.avg=winddata[,14])
set30 <- set(height=30, v.avg=winddata[,6], v.std=winddata[,9],
  dir.avg=winddata[,16])
set20 <- set(height=20, v.avg=winddata[,10], v.std=winddata[,13])
ts <- timestamp(timestamp=winddata[,1])
neubuerg <- mast(timestamp=ts, set40=set40, set30=set30, 
  set20=set20)

# clean faulty values of a met mast
neubuerg.clean <- clean(mast=neubuerg)

# compare a subset of the original and cleaned data
neubuerg$sets$set40$data$v.avg[660:670]
neubuerg.clean$sets$set40$data$v.avg[660:670]


# clean faulty values of a dataset
set40.clean <- clean(set=set40)
  
# clean just one dataset of a met mast
neubuerg.clean.2 <- clean(mast=neubuerg, set=1)
neubuerg.clean.2 <- clean(mast=neubuerg, set="set40")	# same as above

# change lower wind speed limit 
neubuerg.clean.3 <- clean(mast=neubuerg, v.avg.min=0.3)

# compare number of samples set to 'NA', due to lowered limit
sum(is.na(neubuerg.clean$sets$set40$data$v.avg))
sum(is.na(neubuerg.clean.3$sets$set40$data$v.avg))


# change wind speed limit for cleaning of turbulence intensity
neubuerg.clean.4 <- clean(mast=neubuerg, turb.clean=3)

# compare a subset of turbulence intensity values, due to turb.clean
neubuerg.clean$sets$set40$data$turb.int[75:100]
neubuerg.clean.4$sets$set40$data$turb.int[75:100]


# check whether icing is assumed for any samples
neubuerg.clean.5 <- clean(mast=neubuerg, set=1, v.avg.min=0, 
  v.avg.max=100, dir.clean=FALSE, turb.clean=0, icing=TRUE)
not.cleaned <- which(is.na(neubuerg$sets$set40$data$dir.avg))
cleaned <- which(is.na(neubuerg.clean.5$sets$set40$data$dir.avg))
length(cleaned)-length(not.cleaned)	# no icing here

# check time stamp to exclude implausible icing assumptions (e.g. in summer)
# (which makes no sense here, since no additional samples were cleaned)
neubuerg.clean.5$timestamp[cleaned]


# clean repetitions
neubuerg.clean.6 <- clean(mast=neubuerg, rep=c("v.avg", "dir.avg"))
neubuerg.clean.7 <- clean(mast=neubuerg, rep="v.avg", n.rep=3)

## End(Not run)

chgrl/bReeze documentation built on Feb. 10, 2024, 2:27 a.m.