Original AutoBeale results for the same data file

For reference, here are the summarized results generated by Peter Richard's original AutoBeale code:


Stratification applied:

block  stratum   start      end

1       1     19970101 19970420
2       2     19970421 19970526
3       3     19970527 19970817
4       4     19970818 19971201
5       5     19971202 19971231


Summary over 5 strata:

Mean daily loading:      
                         459187.6  g/day
                      167603487.   g/year
                         167603.4  kg/year

Mean squared error:   
                     893200710.529 ( g/day)**2
                     118996664.660 (kg/year)**2

Root mean squared error:

                     29886.46 ( g/day)
                     10908.56 (kg/year)

based on     9.348 degrees of freedom
The 95% confidence interval half-width is       24539.3375 kg/year

Detailed AutoBeale program output for each stratum

Stratum  1
----------

This stratum includes dates from 19970101 to 19970420

Biased estimate:       656.9  kg/day
Unbiased estimate:     658.2  kg/day
Bias correction:         1.3  kg/day

RMSE:                   42.9 ( kg/day)
based on   14. degrees of freedom

over 110 day period:     62502 kg

Stratum  2
----------

This stratum includes dates from 19970421 to 19970526

Biased estimate:          937.5  kg/day
Unbiased estimate:       1005.7  kg/day
Bias correction:           68.1  kg/day

RMSE:                     257.2 ( kg/day)
based on    5. degrees of freedom

over 36 day period:      36205.2 kg

Stratum  3
----------

This stratum includes dates from 19970527 to 19970817

Biased estimate:             536.7 kg/day
Unbiased estimate:           535.3 kg/day
Bias correction:              -1.5 kg/day

RMSE:                         34.1 ( kg/day)
based on   10. degrees of freedom

over 83 day period:        44429.9 kg

Stratum  4
----------

Biased estimate:             28.1  kg/day
Unbiased estimate:           28.3  kg/day
Bias correction:              0.2  kg/day

RMSE:                        10.3 ( kg/day)
based on   15. degrees of freedom

over 106 day period:        2999.8 kg

Stratum  5
----------

This stratum includes dates from 19971202 to 19971231

Biased estimate:            379.3  kg/day
Unbiased estimate:          385.4  kg/day
Bias correction:              6.1  kg/day

RMSE:                        43.5 ( kg/day)
based on    3. degrees of freedom

over 30 day period:          11562 kg

Current R package results when partitioning is the same as original AutoBeale

library(lubridate)
library(autobeale)

my_partition_dates <- lubridate::ymd( c( "1997-04-20", "1997-05-26", "1997-08-17", "1997-12-01" ) )

number_of_stratums <- 5

mystrata <- strata$new( number_of_stratums, qcdata )

mystrata$rearrange_stratums( my_partition_dates )

all_stratums_valid <- mystrata$all_valid()

if ( all_stratums_valid ) {
  mystrata$calc_loads()
  myresults_df <- mystrata$summarize_results()
  # round values to produce a more readable table
  myresults_df <- data.frame(lapply(myresults_df, function(y) if(is.numeric(y)) round(y, 2) else y)) 
  knitr::kable( myresults_df, digits=2 )
}

Detailed comparison

We'll start by verifying that we're calculating the daily loads correctly.

library(autobeale)
daily_loads_df <- data.frame( date=daily_loads$date, stratum=daily_loads$stratum,
                              orig_autobeale=daily_loads$load_kg_day )

new_autobeale_loads_df <- data.frame( date=mystrata$stratums[[1]]$c_date, 
                                      new_ab_load=mystrata$stratums[[1]]$sl$load_daily )

for ( indx in seq(2,5) ) {
  new_autobeale_loads_df <- rbind( new_autobeale_loads_df, 
                                   data.frame( date=mystrata$stratums[[indx]]$c_date, 
                                               new_ab_load=mystrata$stratums[[indx]]$sl$load_daily ) )
}

daily_loads_df <- dplyr::left_join( daily_loads_df, new_autobeale_loads_df, by=c( "date" = "date") )

print( sum( daily_loads_df$new_ab_load - daily_loads_df$orig_autobeale, na.rm=TRUE ) )

OK, the daily partitioning and loads look alright. Now lets compare the specifics of each load calculation.

n <- nrow( myresults_df )

comparison_df <- data.frame( start_date=ab$start_date, end_date=ab$end_date,
                             unbiased_daily_ab=ab$unbiased_daily_kg_day,
                             unbiased_daily_R=myresults_df$load_daily_corrected[ 1:n-1 ],
                             biased_daily_ab=ab$load_biased_daily_kg,
                             biased_daily_R=myresults_df$load_daily_biased[ 1:n-1 ] )


smwesten-usgs/autobeale-R documentation built on May 30, 2019, 5:04 a.m.