Man pages for specleanr
Detecting Environmental Outliers in Data Analysis Pipelines

abdata                   Alburnoides bipunctatus species data from GBIF and...
adjustboxplots           Adjust the boxplot bounding fences using the medcouple to flag...
bestmethod               Identifies the best method for outlier detection for a single...
boots                    To implement bootstrapping procedures. Sampling with...
broad_classify           Outlier detection method broad classification.
check.exclude            Indicate excluded columns.
check_names              Check species names for inconsistencies
check_packages           Check for packages to install and respond to user
checks                   Post checks for PCA and bootstrapping
classify_data            Extract final clean data using either absolute or best method...
cosine                   Cosine similarity index based on (Gautam & Kulkarni 2014; Joy...
datacleaner-class        Outlier detection class for multiple methods
distboxplot              Distribution boxplot
ecological_ranges        Check for environmental outliers using species optimal...
efidata                  EFIPLUS data used to develop ecological sensitivity...
eif                      Computes the empirical influence function for each value in...
extentvalues             To check for a bounding box
extract_clean_data       Extract final clean data using either absolute or best method...
extractMethods           List of outlier detection methods implemented in this...
extractoutliers          Extract outliers for one species
geo_ranges               Checks for geographic ranges from FishBase
getdata                  Download species records from online database.
getdiff                  Get dataframe from the large dataframe.
ggenvironmentalspace     Plotting to show the quality-controlled data in...
ggoutlieraccum           Identify if enough methods are selected for the outlier...
ggoutliers               Visualize the outliers identified by each method
hamming                  Identify best outlier detection method using Hamming...
hampel                   Flag suspicious outliers based on the Hampel filter method.
handle_true_errors       Catch errors during methods implementation.
interquartile            Computes interquartile range to flag environmental outliers
isoforest                Identify outliers using isolation forest model.
jaccard                  Identifies the best outlier detection method using Jaccard...
jdsdata                  Joint Danube Survey Data
jknife                   Identifies outliers using Reverse Jackknifing method based on...
kdat                     Sequential fences constants
logboxplot               Log boxplot for outlier detection.
mahal                    Flags outliers based on Mahalanobis distance matrix for all...
match.argc               Customized match function
match_datasets           Data harmonizing for offline data based on Darwin Core terms...
medianrule               Median rule method
mixediqr                 Mixed interquartile range and semi-interquartile range 'Walker...
mth                      mth datasets with constant at each confidence interval...
multiabsolute            Identifies absolute outliers for multiple species.
multibestmethod          Identify best method for outlier removal for multiple species...
multidetect              Ensemble multiple outlier detection methods.
ocindex                  Identifies absolute outliers and their proportions for a...
onesvm                   Identify outliers using One Class Support Vector Machines
optimal_threshold        Optimize threshold for clean data extraction.
overlap                  Identifies best outlier detection method using Overlap...
pca                      Implement principal component analysis for dimension...
pcboot                   To package both principal component analysis and...
pred_extract             Preliminary data cleaning including removing duplicates,...
search_threshold         Determine the threshold using Locally estimated or weighted...
semiIQR                  Computes semi-interquartile range to flag suspicious outliers
seqfences                Sequential fences method
show-datacleaner-method  Set method for displaying output details after outlier...
smc                      Identify best outlier detection method using simple matching...
sorensen                 Identifies best outlier detection method using Sorensen...
thermal_ranges           Collates minimum, maximum, and preferable temperatures from...
ttdata                   Thymallus thymallus species data from GBIF and iNaturalist
xglosh                   Global-Local Outlier Score from Hierarchies
xkmeans                  Flags outliers using kmeans clustering method
xknn                     k-nearest neighbors for outlier detection
xlof                     Flags suspicious outliers using the local outlier factor or...
zscore                   Computes z-scores to flag environmental outliers.
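
Several entries in this index name univariate rules (z-scores, the interquartile range, the Hampel filter) and an ensemble step that combines methods. The sketch below illustrates those general techniques in base R only. It does not call specleanr, and the package's own functions may use different arguments, defaults, and return values. The helper names (flag_zscore, flag_iqr, flag_hampel), the cutoffs, and the temperature values are assumptions made for illustration.

```r
# Hypothetical water temperatures (degrees C); the last value is
# deliberately extreme so that each rule has something to flag.
temp <- c(10.2, 11.1, 9.8, 10.5, 10.9, 11.4, 10.0, 10.7, 9.9, 35.0)

# z-score rule: flag values more than `cutoff` standard deviations
# from the mean. The cutoff of 2.5 is an illustrative choice.
flag_zscore <- function(x, cutoff = 2.5) {
  abs(x - mean(x)) / sd(x) > cutoff
}

# Interquartile range rule: flag values more than k * IQR below the
# first quartile or above the third quartile.
flag_iqr <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  x < q[1] - k * iqr | x > q[2] + k * iqr
}

# Hampel rule: flag values more than `cutoff` scaled median absolute
# deviations from the median (mad() applies the 1.4826 scaling).
flag_hampel <- function(x, cutoff = 3) {
  abs(x - median(x)) > cutoff * mad(x)
}

# Simple ensemble: one logical column per rule, then the share of
# rules that flag each record.
flags <- cbind(
  zscore = flag_zscore(temp),
  iqr    = flag_iqr(temp),
  hampel = flag_hampel(temp)
)
share <- rowMeans(flags)

# Records flagged by at least half of the rules (illustrative threshold).
suspect <- which(share >= 0.5)
print(suspect)
```

Counting votes across rules, rather than trusting a single rule, is the general idea behind ensemble outlier detection: a record flagged by most rules is a stronger candidate for removal than one flagged by a single rule.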
specleanr documentation built on Nov. 26, 2025, 1:07 a.m.