Title: | Weighted Regression for Water Quality Evaluation in Tidal Waters |
---|---|
Description: | An adaptation for estuaries (tidal waters) of weighted regression on time, discharge, and season to evaluate trends in water quality time series. Please see Beck and Hagy (2015) <doi:10.1007/s10666-015-9452-8> for details. |
Authors: | Marcus W. Beck [aut, cre] |
Maintainer: | Marcus W. Beck <[email protected]> |
License: | CC0 |
Version: | 1.1.4 |
Built: | 2024-10-14 03:03:39 UTC |
Source: | https://github.com/fawda123/WRTDStidal |
Get AIC values for a single weighted quantile regression as used in WRTDS models
aiccrq(mod_in, tau = 0.5)
aiccrq(mod_in, tau = 0.5)
mod_in |
input crq model |
tau |
numeric indicating quantile to evaluate |
The AIC value is based on the log-likelihood estimate of the model that accounts for the specific quantile, the minimum of the objective function (rho), and the number of model parameters. The residuals are specific to the WRTDS model such that this function cannot be applied to arbitrary crq models.
AIC estimate
# get wts for a model centered on the first observation ref_in <- tidobj[1, ] ref_wts <- getwts(tidobj, ref_in) # get the model mod <- quantreg::crq( survival::Surv(res, not_cens, type = "left") ~ dec_time + flo + sin(2*pi*dec_time) + cos(2*pi*dec_time), weights = ref_wts, data = tidobj, method = "Portnoy" ) aiccrq(mod)
# get wts for a model centered on the first observation ref_in <- tidobj[1, ] ref_wts <- getwts(tidobj, ref_in) # get the model mod <- quantreg::crq( survival::Surv(res, not_cens, type = "left") ~ dec_time + flo + sin(2*pi*dec_time) + cos(2*pi*dec_time), weights = ref_wts, data = tidobj, method = "Portnoy" ) aiccrq(mod)
Simulate a response variable time series using all simulation functions
all_sims(dat_in, ...)
all_sims(dat_in, ...)
dat_in |
input |
... |
additional arguments passed to |
This is a convenience function that combines lnQ_sim
, lnres_err
, and lnres_sim
. See the help documentation function for more details on each.
Original data frame with additional columns for simulated discharge (lnQ_sim
), the random errors of the response variable (errs
), the standard error estimates for each residual (scls
), a flow-independent response variable time series (lnres_noQ
), and a flow-dependent time series (lnres_Q
).
daydat
for example data, lnQ_sim
for simulating discharge, lnres_err
for estimating the error structure of the response variable, and lnres_sim
for simulating the response variable
## Not run: ## example data data(daydat) ## simulate tmp <- all_sims(daydat) ## End(Not run)
## Not run: ## example data data(daydat) ## simulate tmp <- all_sims(daydat) ## End(Not run)
Create annual aggregations of WRTDS output
annual_agg(dat_in, ...) ## Default S3 method: annual_agg(dat_in, mo_strt = 10, min_mo = 9, logspace = TRUE, ...)
annual_agg(dat_in, ...) ## Default S3 method: annual_agg(dat_in, mo_strt = 10, min_mo = 9, logspace = TRUE, ...)
dat_in |
input tidal or tidalmean object |
... |
arguments passed to or from other methods |
mo_strt |
numeric indicating month to start aggregation years, defaults to October for USGS water year from October to September |
min_mo |
numeric value from one to twelve indicating the minimum number of months with observations for averaging by years |
logspace |
logical indicating if aggregated data are to be shown in log-space or not |
WRTDS output is averaged by year for both predictions and flow-normalized predictions. Years are averaged only if one observation is contained in each of the minimum number of months specified by min_mo
averaging, otherwise results are not returned for the given year. Note that setting min_mo
to values smaller than the default can produce inaccurate trends for years with very few results.
The function is used internally within prdnrmplot
and fitplot
.
An aggregated data object for plotting, returns only model output and response variable.
## tidal object annual_agg(tidfit) ## tidalmean object annual_agg(tidfitmean)
## tidal object annual_agg(tidfit) ## tidalmean object annual_agg(tidfitmean)
Monthly chlorophyll time series for the Hillsborough Bay segment of Tampa Bay. Data are the median values of monthly observations across all water quality stations in Hillsborough Bay. Date ranges are from January 1974 to December 2012 (452 observations). Variables are date, chlorophyll-a (in log-space, labelled res
for response), salinity as fraction of freshwater (i.e., 0 - 1, with higher values indicating more freshwater, label flo
for flow), and the detection limit for all stations for the respective date.
chldat
chldat
A data frame with 452 rows and 4 variables:
date
Date
res
numeric
flo
numeric
lim
numeric
Get the chlorophyll axis label for observed or log-space, including units
chllab(logspace = TRUE)
chllab(logspace = TRUE)
logspace |
logical indicating if chlorophyll is in log space, otherwise observed |
An expression indicating (log)-chlorophyll in the appropriate units
## default chllab()
## default chllab()
Get trend for a single time period
chngest(x, y, single = F)
chngest(x, y, single = F)
x |
numeric of year values |
y |
norm numeric of annually averaged response variable |
single |
logical if trends are based on only first and last year of aggregated, i.e., no averaging of first/last three |
Estimates trends using methods described in wrtdstrnd
. Used internally and is not to be called by the user. Function runs on individual time period groups defined by arguments in wrtdstrnd
.
Create a grid of all unique combinations of half-window widths to evaluate. The result can be passed to winsrch_grid
.
createsrch( mos = c(seq(0.5, 1, by = 0.25), 2, 10), yrs = c(seq(5, 15, by = 3), 50), flo = c(seq(0.5, 1, by = 0.1), 5) )
createsrch( mos = c(seq(0.5, 1, by = 0.25), 2, 10), yrs = c(seq(5, 15, by = 3), 50), flo = c(seq(0.5, 1, by = 0.1), 5) )
mos |
numeric vector of half-window widths for months, a value of one indicates twelve months |
yrs |
numeric vector of half-window widths for years, a value of one indicates one-year |
flo |
numeric vector of half-window widths for salinity or flow, a value of one indicates the full range of values (100 percent) |
The weighting function uses a tri-cube weighting scheme such that weights diminish with distance from the center of the window. For example, a value of one for the month window does not mean that all months are weighted equally even though the window covers an entire calendar year.
A matrix with number of rows equal to the product of the lengths of each input vector, where each row is a unique combination for the selected half-window widths.
createsrch() createsrch(1, 1, 1)
createsrch() createsrch(1, 1, 1)
Combined daily flow observations from the USGS stream gage station 01594440 near Bowie, Maryland and daily chlorophyll and salinity records from the Jug Bay station of the Chesapeake Bay Maryland National Estuarine Research Reserve. Daily chlorophyll concentrations were estimated from fluorescence values that did not include blue-green algae blooms. These date are provided to illustrate examples with time series simulation.
daydat
daydat
A data frame with 3506 rows and 9 variables, 1985 to 2014:
date
Date as daily time step
flo
numeric for salinity, ppt
lnres
numeric for chlorophyll fluorescence as ln-transformed ug/l
Q
numeric for daily discharge m3/s
lnQ
numeric for daily discharge ln-transformed m3/s
jday
numeric for julian day
year
numeric for year
day
numeric for day from 1-31
dec_time
numeric for decimal time on the annual period
Create decimal time on an annual scale from an input time vector
dec_time(date_in) ## S3 method for class 'Date' dec_time(date_in)
dec_time(date_in) ## S3 method for class 'Date' dec_time(date_in)
date_in |
input time vector, usually a |
Function is used internally within the package.
A named list of four numeric vectors including day_num
(decimal day on an annual scale), month
(month of the year as integer), year
, and dec_time
(decimal time as sum of year
and day_num
)
dt <- Sys.Date() dts <- seq.Date(dt - 365, dt, by = 'day') dec_time(dts)
dt <- Sys.Date() dts <- seq.Date(dt - 365, dt, by = 'day') dec_time(dts)
Plot the relationship between the modelled response and salinity/flow across the time series using line plots for each month. Each line corresponds to a unique year. This can be used to evaluate temporal variation between the two.
dynaplot(dat_in, ...) ## S3 method for class 'tidal' dynaplot( dat_in, month = c(1:12), tau = NULL, years = NULL, col_vec = NULL, alpha = 1, size = 1, logspace = TRUE, floscl = TRUE, allflo = FALSE, ncol = NULL, grids = TRUE, scales = NULL, pretty = TRUE, use_bw = TRUE, fac_nms = NULL, ... ) ## S3 method for class 'tidalmean' dynaplot( dat_in, month = c(1:12), years = NULL, col_vec = NULL, alpha = 1, size = 1, logspace = TRUE, floscl = TRUE, allflo = FALSE, ncol = NULL, grids = TRUE, scales = NULL, pretty = TRUE, use_bw = TRUE, fac_nms = NULL, ... )
dynaplot(dat_in, ...) ## S3 method for class 'tidal' dynaplot( dat_in, month = c(1:12), tau = NULL, years = NULL, col_vec = NULL, alpha = 1, size = 1, logspace = TRUE, floscl = TRUE, allflo = FALSE, ncol = NULL, grids = TRUE, scales = NULL, pretty = TRUE, use_bw = TRUE, fac_nms = NULL, ... ) ## S3 method for class 'tidalmean' dynaplot( dat_in, month = c(1:12), years = NULL, col_vec = NULL, alpha = 1, size = 1, logspace = TRUE, floscl = TRUE, allflo = FALSE, ncol = NULL, grids = TRUE, scales = NULL, pretty = TRUE, use_bw = TRUE, fac_nms = NULL, ... )
dat_in |
input tidal or tidalmean object |
... |
arguments passed to other methods |
month |
numeric input from 1 to 12 indicating the monthly predictions to plot |
tau |
numeric vector of quantile to plot. The function will plot the 'middle' quantile if none is specified, e.g., if 0.2, 0.3, and 0.4 are present in the fitted model object then 0.3 will be plotted. |
years |
numeric vector of years to plot, one to many, defaults to all |
col_vec |
chr string of plot colors to use, passed to |
alpha |
numeric value from zero to one indicating line transparency |
size |
numeric value for line size |
logspace |
logical indicating if plots are in log space |
floscl |
logical indicating if salinity/flow on x-axis is standardized (default) or in original scale |
allflo |
logical indicating if the salinity or flow values for plotting are limited to the fifth and ninety-fifth percentile of observed values for the month of interest |
ncol |
numeric argument passed to |
grids |
logical indicating if grid lines are present |
scales |
chr string passed to ggplot to change x/y axis scaling on facets, acceptable values are |
pretty |
logical indicating if my subjective idea of plot aesthetics is applied, otherwise the |
use_bw |
logical indicating if the |
fac_nms |
optional chr string for facet labels, which must be equal in length to |
These plots can be used to examine how the relationship between the response variable and flow varies throughout the time series. It is essentially identical to the plot produced by gridplot
, except lines plots are returned that show the relationship of the response variable with salinity/flow using different lines for each year. The interpolation grid that is stored as an attribute in a fitted tidal object is used to create the plot. Each plot is limited to the same month throughout the time series to limit seasonal variation. Plots are also constrained to the fifth and ninety-fifth percentile of observed salinity/flow values during the month of interest to limit the predictions within the data domain. This behavior can be suppressed by changing the allflo
argument.
Note that the year variable used for color mapping is treated as a continuous variable although it is an integer by definition.
A ggplot
object that can be further modified
# load a fitted tidal object data(tidfit) # plot using defaults, # defaults to the fiftieth quantile for all years dynaplot(tidfit) ## Not run: # change the defaults dynaplot(tidfit, tau = 0.9, month = 2, years = seq(1980, 1990), col_vec = rainbow(7), alpha = 0.5, size = 3) # plot a tidalmean object data(tidfitmean) dynaplot(tidfitmean) ## End(Not run)
# load a fitted tidal object data(tidfit) # plot using defaults, # defaults to the fiftieth quantile for all years dynaplot(tidfit) ## Not run: # change the defaults dynaplot(tidfit, tau = 0.9, month = 2, years = seq(1980, 1990), col_vec = rainbow(7), alpha = 0.5, size = 3) # plot a tidalmean object data(tidfitmean) dynaplot(tidfitmean) ## End(Not run)
Add date, year, month, and day columns to the interpolation grids using dates from dat_in
. The interp.surface
function is used after splitting the interpolation matrix by month to fill missing values. Function is used in wrtds
.
fill_grd(grd_in, dat_in, interp = FALSE)
fill_grd(grd_in, dat_in, interp = FALSE)
grd_in |
interpolation grid to fill, either a single mean grid or an individual quantile grid created within |
dat_in |
|
interp |
logical for interpolation |
predonobs
attributeFill the predonobs
attribute
fillpo(dat_in, ...) ## S3 method for class 'tidal' fillpo(dat_in, ...) ## S3 method for class 'tidalmean' fillpo(dat_in, ...)
fillpo(dat_in, ...) ## S3 method for class 'tidal' fillpo(dat_in, ...) ## S3 method for class 'tidalmean' fillpo(dat_in, ...)
dat_in |
input tidal or tidalmean object |
... |
arguments passed to or from other methods |
Used in respred
to fill the predonobs
attribute. This attribute is used to estimate performance metrics with wrtdsperf
.
The input tidal or tidalmean object with the filled predonobs
attribute as predictions for the observed data as a data frame.
Plot a tidal object to view the response variable observations, predictions, and normalized results separately for each month.
fitmoplot(dat_in, ...) ## S3 method for class 'tidal' fitmoplot( dat_in, month = c(1:12), tau = NULL, predicted = TRUE, logspace = TRUE, dt_rng = NULL, ncol = NULL, col_vec = NULL, grids = TRUE, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... ) ## S3 method for class 'tidalmean' fitmoplot( dat_in, month = c(1:12), predicted = TRUE, logspace = TRUE, dt_rng = NULL, ncol = NULL, col_vec = NULL, grids = TRUE, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... )
fitmoplot(dat_in, ...) ## S3 method for class 'tidal' fitmoplot( dat_in, month = c(1:12), tau = NULL, predicted = TRUE, logspace = TRUE, dt_rng = NULL, ncol = NULL, col_vec = NULL, grids = TRUE, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... ) ## S3 method for class 'tidalmean' fitmoplot( dat_in, month = c(1:12), predicted = TRUE, logspace = TRUE, dt_rng = NULL, ncol = NULL, col_vec = NULL, grids = TRUE, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... )
dat_in |
input tidal or tidalmean object |
... |
arguments passed to other methods |
month |
numeric indicating months to plot |
tau |
numeric vector of quantiles to plot, defaults to all in object if not supplied |
predicted |
logical indicating if standard predicted values are plotted, default |
logspace |
logical indicating if plots are in log space |
dt_rng |
Optional chr string indicating the date range of the plot. Must be two values in the format 'YYYY-mm-dd' which is passed to |
ncol |
numeric argument passed to |
col_vec |
chr string of plot colors to use, passed to |
grids |
logical indicating if grid lines are present |
pretty |
logical indicating if my subjective idea of plot aesthetics is applied, otherwise the |
lwd |
numeric value indicating width of lines |
size |
numeric value indicating size of points |
alpha |
numeric value indicating transparency of points or lines |
The plots are similar to those produced by fitplot
except the values are faceted by month. This allows an evaluation of trends over time independent of seasonal variation. Multiple observations within each month for each year are averaged for a smoother plot.
A ggplot
object that can be further modified
fitplot
, prdnrmplot
, sliceplot
## load a fitted tidal object data(tidfit) # plot using defaults fitmoplot(tidfit) ## Not run: # get the same plot but use default ggplot settings fitmoplot(tidfit, pretty = FALSE) # plot specific quantiles fitmoplot(tidfit, tau = c(0.1, 0.9)) # plot the normalized predictions fitmoplot(tidfit, predicted = FALSE) # modify the plot as needed using ggplot scales, etc. library(ggplot2) fitmoplot(tidfit, pretty = FALSE, linetype = 'dashed') + theme_classic() + scale_y_continuous( 'Chlorophyll', limits = c(0, 5) ) + scale_colour_manual( 'Predictions', labels = c('lo', 'md', 'hi'), values = c('red', 'green', 'blue'), guide = guide_legend(reverse = TRUE) ) # plot a tidalmean object data(tidfitmean) fitmoplot(tidfitmean) ## End(Not run)
## load a fitted tidal object data(tidfit) # plot using defaults fitmoplot(tidfit) ## Not run: # get the same plot but use default ggplot settings fitmoplot(tidfit, pretty = FALSE) # plot specific quantiles fitmoplot(tidfit, tau = c(0.1, 0.9)) # plot the normalized predictions fitmoplot(tidfit, predicted = FALSE) # modify the plot as needed using ggplot scales, etc. library(ggplot2) fitmoplot(tidfit, pretty = FALSE, linetype = 'dashed') + theme_classic() + scale_y_continuous( 'Chlorophyll', limits = c(0, 5) ) + scale_colour_manual( 'Predictions', labels = c('lo', 'md', 'hi'), values = c('red', 'green', 'blue'), guide = guide_legend(reverse = TRUE) ) # plot a tidalmean object data(tidfitmean) fitmoplot(tidfitmean) ## End(Not run)
Plot a tidal object to view response variable observations, predictions, and normalized results.
fitplot(dat_in, ...) ## S3 method for class 'tidal' fitplot( dat_in, tau = NULL, predicted = TRUE, annuals = TRUE, logspace = TRUE, dt_rng = NULL, col_vec = NULL, grids = TRUE, min_mo = 9, mo_strt = 10, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... ) ## S3 method for class 'tidalmean' fitplot( dat_in, predicted = TRUE, annuals = TRUE, logspace = TRUE, dt_rng = NULL, col_vec = NULL, grids = TRUE, min_mo = 9, mo_strt = 10, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... )
fitplot(dat_in, ...) ## S3 method for class 'tidal' fitplot( dat_in, tau = NULL, predicted = TRUE, annuals = TRUE, logspace = TRUE, dt_rng = NULL, col_vec = NULL, grids = TRUE, min_mo = 9, mo_strt = 10, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... ) ## S3 method for class 'tidalmean' fitplot( dat_in, predicted = TRUE, annuals = TRUE, logspace = TRUE, dt_rng = NULL, col_vec = NULL, grids = TRUE, min_mo = 9, mo_strt = 10, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... )
dat_in |
input tidal or tidalmean object |
... |
arguments passed to other methods |
tau |
numeric vector of quantiles to plot, defaults to all in object if not supplied |
predicted |
logical indicating if standard predicted values are plotted, default |
annuals |
logical indicating if plots are annual aggregations of results |
logspace |
logical indicating if plots are in log space |
dt_rng |
Optional chr string indicating the date range of the plot. Must be two values in the format 'YYYY-mm-dd' which is passed to |
col_vec |
chr string of plot colors to use, passed to |
grids |
logical indicating if grid lines are present |
min_mo |
numeric value from one to twelve indicating the minimum number of months with observations for averaging by years, applies only if |
mo_strt |
numeric indicating month to start aggregation years, defaults to October for USGS water year from October to September, applies only if |
pretty |
logical indicating if my subjective idea of plot aesthetics is applied, otherwise the |
lwd |
numeric value indicating width of lines |
size |
numeric value indicating size of points |
alpha |
numeric value indicating transparency of points or lines |
A ggplot
object that can be further modified
fitmoplot
, prdnrmplot
, sliceplot
## load a fitted tidal object data(tidfit) # plot using defaults fitplot(tidfit) # get the same plot but use default ggplot settings fitplot(tidfit, pretty = FALSE) # plot in log space fitplot(tidfit, logspace = TRUE) # plot specific quantiles fitplot(tidfit, tau = c(0.1, 0.9)) # plot the normalized predictions fitplot(tidfit, predicted = FALSE) # plot as monthly values fitplot(tidfit, annuals = FALSE) # format the x-axis is using annual aggregations library(ggplot2) fitplot(tidfit, annual = TRUE) + scale_x_date(limits = as.Date(c('2000-01-01', '2012-01-01'))) # modify the plot as needed using ggplot scales, etc. fitplot(tidfit, pretty = FALSE, linetype = 'dashed') + theme_classic() + scale_y_continuous( 'Chlorophyll', limits = c(0, 50) ) + scale_colour_manual( 'Predictions', labels = c('lo', 'md', 'hi'), values = c('red', 'green', 'blue'), guide = guide_legend(reverse = TRUE) ) # plot a tidalmean object data(tidfitmean) fitplot(tidfitmean)
## load a fitted tidal object data(tidfit) # plot using defaults fitplot(tidfit) # get the same plot but use default ggplot settings fitplot(tidfit, pretty = FALSE) # plot in log space fitplot(tidfit, logspace = TRUE) # plot specific quantiles fitplot(tidfit, tau = c(0.1, 0.9)) # plot the normalized predictions fitplot(tidfit, predicted = FALSE) # plot as monthly values fitplot(tidfit, annuals = FALSE) # format the x-axis is using annual aggregations library(ggplot2) fitplot(tidfit, annual = TRUE) + scale_x_date(limits = as.Date(c('2000-01-01', '2012-01-01'))) # modify the plot as needed using ggplot scales, etc. fitplot(tidfit, pretty = FALSE, linetype = 'dashed') + theme_classic() + scale_y_continuous( 'Chlorophyll', limits = c(0, 50) ) + scale_colour_manual( 'Predictions', labels = c('lo', 'md', 'hi'), values = c('red', 'green', 'blue'), guide = guide_legend(reverse = TRUE) ) # plot a tidalmean object data(tidfitmean) fitplot(tidfitmean)
Get weights for WRTDS for a single observation using a tri-cubic weighting function
getwts(dat_in, ...) ## Default S3 method: getwts( dat_in, ref_in, wt_vars = c("day_num", "dec_time", "flo"), wins = list(0.5, 10, NULL), all = FALSE, slice = TRUE, ngrzero = FALSE, wins_only = FALSE, min_obs = 100, ... )
getwts(dat_in, ...) ## Default S3 method: getwts( dat_in, ref_in, wt_vars = c("day_num", "dec_time", "flo"), wins = list(0.5, 10, NULL), all = FALSE, slice = TRUE, ngrzero = FALSE, wins_only = FALSE, min_obs = 100, ... )
dat_in |
input tidal object |
... |
arguments passed to or from other methods |
ref_in |
row of tidal object as reference for weights |
wt_vars |
chr string of three elements indicating names of columns in tidal object that are used for reference row weights |
wins |
list of half-window widths for time, year, and flow |
all |
logical to return individual weights rather than the product of all three, default |
slice |
logical indicating if data are subset by observations within the maximum window width for faster calculations |
ngrzero |
logical indicating if count of observations with weights greater than zero is returned |
wins_only |
logical if the half-window widths should be returned as a list |
min_obs |
numeric vector for window widening if the number of observations with non-zero weights is less than the specified value, use |
The default half-window widths for day_num
, year
, and flow
are half a day (12 hours), 10 years, and half the range of salinity/flow in the input data. The half-window widths are expanded by 10% until at least 100 observations have weights greater than zero. This behavior can be suppressed by setting min_obs = NULL
.
A vector of weights with length equal to the number of observations (rows) in the tidal object. Vectors for all three weighting variables are returned if all = TRUE
.
## data(tidobj) # get weights for first row first <- tidobj[1, ] wts <- getwts(tidobj, first) plot(wts, type = 'l') ## Not run: # get count of observations with grzero weights sapply(1:nrow(tidobj), function(row) getwts(tidobj, tidobj[row, ], ngrzero = TRUE)) ## End(Not run)
## data(tidobj) # get weights for first row first <- tidobj[1, ] wts <- getwts(tidobj, first) plot(wts, type = 'l') ## Not run: # get count of observations with grzero weights sapply(1:nrow(tidobj), function(row) getwts(tidobj, tidobj[row, ], ngrzero = TRUE)) ## End(Not run)
Calculate quantile regression goodness of fit using residuals and non-conditional residuals
goodfit(resid, resid_nl, tau)
goodfit(resid, resid_nl, tau)
resid |
numeric vector of residuals from the conditional quantile model |
resid_nl |
numeric vector of residuals from the non-conditional (null) quantile model |
tau |
numeric value from zero to one for the estimated quantile |
The goodness of fit measure for quantile regression is estimated as 1 minus the ratio between the sum of absolute deviations in the fully parameterized models and the sum of absolute deviations in the null (non-conditional) quantile model. The values are useful for comparisons between quantile models, but they are not comparable to standard coefficients of determination. The latter is based on the variance of squared deviations, whereas goodness of fit values for quantile regression are based on absolute deviations. Goodness of fit values will always be smaller than R2 values.
A numeric value from 0 to 1 indicating goodness of fit
Koenker, R., Machado, J.A.F. 1999. Goodness of fit and related inference processes for quantile regression. Journal of the American Statistical Association. 94(448):1296-1310.
wrtdsrsd
for residuals
library(quantreg) ## random variables x <- runif(100, 0, 10) y <- x + rnorm(100) ## quantile model mod <- rq(y ~ x, tau = 0.5) res <- resid(mod) ## non-conditional quantile model mod_nl <- rq(y ~ 1, tau = 0.5) rsd_nl <- resid(mod_nl) goodfit(res, rsd_nl, 0.5) ## r2 of mean model for comparison mod_lm <- lm(y ~ x) summary(mod_lm)$r.squared
library(quantreg) ## random variables x <- runif(100, 0, 10) y <- x + rnorm(100) ## quantile model mod <- rq(y ~ x, tau = 0.5) res <- resid(mod) ## non-conditional quantile model mod_nl <- rq(y ~ 1, tau = 0.5) rsd_nl <- resid(mod_nl) goodfit(res, rsd_nl, 0.5) ## r2 of mean model for comparison mod_lm <- lm(y ~ x) summary(mod_lm)$r.squared
Gets colors used for WRTDS plots
gradcols(col_vec = NULL)
gradcols(col_vec = NULL)
col_vec |
chr string of plot colors to use, typically passed to |
This is a convenience function for retrieving a color palette that is used by most of the plotting functions. Palettes from RColorBrewer will use the maximum number of colors. The default palette is 'Spectral'.
A character vector of colors in hexadecimal notation.
## defaults gradcols() ## another RColorBrewer palette gradcols('Pastel2') ## a silly example gradcols(rainbow(7))
## defaults gradcols() ## another RColorBrewer palette gradcols('Pastel2') ## a silly example gradcols(rainbow(7))
Plot the relationship between the response variable and salinity/flow across the time series using a gridded surface for all months. The response is shaded by relative values across all dates for comparison.
gridplot(dat_in, ...) ## S3 method for class 'tidal' gridplot( dat_in, month = c(1:12), tau = NULL, years = NULL, col_vec = NULL, col_lim = NULL, logspace = TRUE, floscl = TRUE, allflo = FALSE, flo_fac = 3, yr_fac = 3, ncol = NULL, grids = FALSE, pretty = TRUE, ... ) ## S3 method for class 'tidalmean' gridplot( dat_in, month = c(1:12), years = NULL, col_vec = NULL, col_lim = NULL, logspace = TRUE, floscl = TRUE, allflo = FALSE, flo_fac = 3, yr_fac = 3, ncol = NULL, grids = FALSE, pretty = TRUE, ... )
gridplot(dat_in, ...) ## S3 method for class 'tidal' gridplot( dat_in, month = c(1:12), tau = NULL, years = NULL, col_vec = NULL, col_lim = NULL, logspace = TRUE, floscl = TRUE, allflo = FALSE, flo_fac = 3, yr_fac = 3, ncol = NULL, grids = FALSE, pretty = TRUE, ... ) ## S3 method for class 'tidalmean' gridplot( dat_in, month = c(1:12), years = NULL, col_vec = NULL, col_lim = NULL, logspace = TRUE, floscl = TRUE, allflo = FALSE, flo_fac = 3, yr_fac = 3, ncol = NULL, grids = FALSE, pretty = TRUE, ... )
dat_in |
input tidal or tidalmean object |
... |
arguments passed to other methods |
month |
numeric indicating months to plot or chr string 'all' to indicate all months with no plot facets |
tau |
numeric vector of quantile to plot. The function will plot the 'middle' quantile if none is specified, e.g., if 0.2, 0.3, and 0.4 are present in the fitted model object then 0.3 will be plotted. |
years |
numeric vector for range of years to plot |
col_vec |
chr string of plot colors to use, passed to |
col_lim |
numeric vector of length two that defines the range of the color ramp in the legend, passed to |
logspace |
logical indicating if plots are in log space |
floscl |
logical indicating if salinity/flow on x-axis is standardized (default) or in original scale |
allflo |
logical indicating if the salinity/flow values for plotting are limited to the fifth and ninety-fifth percentile of observed values for the month of interest |
flo_fac |
numeric value indicating the factor for smoothing the response variable across salinity/flow values. Increasing the value creates more smoothing and setting the value to 1 removes all smoothing. |
yr_fac |
numeric value indicating the factor for smoothing the response variable across integer years. Increasing the value creates more smoothing and setting the value to 1 removes all smoothing. |
ncol |
numeric argument passed to |
grids |
logical indicating if grid lines are present |
pretty |
logical indicating if my subjective idea of plot aesthetics is applied, otherwise the |
These plots can be used to examine how the relationship between the response variable and salinity/flow varies throughout the time series for multiple months. The plot is similar to that returned by dynaplot
except changes in the response are shown on a gridded surface of salinity/flow versus time. Multiple months can also be viewed. Color shading is in proportion to the value of the response variable and is relative across the plotted months. The interpolation grid that is stored as an attribute in a fitted tidal object is used to create the plot. By default, the plots are constrained to the fifth and ninety-fifth percentile of observed salinity/flow values during each month to limit the predictions within the data domain. This behavior can be suppressed by changing the allflo
argument, although the predicted values of the response variable that are outside of the salinity/flow range for the plotted month are typically unrealistic.
A ggplot
object that can be further modified
dynaplot
, fitplot
, gridplot
, prdnrmplot
## Not run: ## load a fitted tidal object data(tidfit) ## defaults to the fiftieth quantile gridplot(tidfit) ## no facets, all months gridplot(tidfit, month = 'all') ## change the defaults gridplot(tidfit, tau = c(0.1), month = c(3, 6, 9, 12), col_vec = c('red', 'blue', 'green'), flo_fac = 1) ## plot a tidalmean object data(tidfitmean) gridplot(tidfitmean) ## End(Not run)
## Not run: ## load a fitted tidal object data(tidfit) ## defaults to the fiftieth quantile gridplot(tidfit) ## no facets, all months gridplot(tidfit, month = 'all') ## change the defaults gridplot(tidfit, tau = c(0.1), month = c(3, 6, 9, 12), col_vec = c('red', 'blue', 'green'), flo_fac = 1) ## plot a tidalmean object data(tidfitmean) gridplot(tidfitmean) ## End(Not run)
Nonparametric test for monotonic trend Within each season based on Kendall's Tau statistic
kendallSeasonalTrendTest(y, ...) ## Default S3 method: kendallSeasonalTrendTest( y, season, year, alternative = "two.sided", correct = TRUE, ci.slope = TRUE, conf.level = 0.95, independent.obs = TRUE, data.name = NULL, season.name = NULL, year.name = NULL, parent.of.data = NULL, subset.expression = NULL, ... ) ## S3 method for class 'data.frame' kendallSeasonalTrendTest(y, ...) ## S3 method for class 'formula' kendallSeasonalTrendTest(y, data = NULL, subset, na.action = na.pass, ...) ## S3 method for class 'matrix' kendallSeasonalTrendTest(y, ...)
kendallSeasonalTrendTest(y, ...) ## Default S3 method: kendallSeasonalTrendTest( y, season, year, alternative = "two.sided", correct = TRUE, ci.slope = TRUE, conf.level = 0.95, independent.obs = TRUE, data.name = NULL, season.name = NULL, year.name = NULL, parent.of.data = NULL, subset.expression = NULL, ... ) ## S3 method for class 'data.frame' kendallSeasonalTrendTest(y, ...) ## S3 method for class 'formula' kendallSeasonalTrendTest(y, data = NULL, subset, na.action = na.pass, ...) ## S3 method for class 'matrix' kendallSeasonalTrendTest(y, ...)
y |
an object containing data for the trend test. In the default method, the argument |
... |
methods passed to or from other methods |
season |
numeric or character vector or a factor indicating the seasons in which the observations in y were taken. The length of |
year |
numeric vector indicating the years in which the observations in |
alternative |
character string indicating the kind of alternative hypothesis. The possible values are |
correct |
logical scalar indicating whether to use the correction for continuity in computing the z-statistic that is based on the test statistic S'. The default value is |
ci.slope |
logical scalar indicating whether to compute a confidence interval for the slope. The default value is |
conf.level |
numeric scalar between 0 and 1 indicating the confidence level associated with the confidence interval for the slope. The default value is |
independent.obs |
logical scalar indicating whether to assume the observations in y are seially independent. The default value is |
data.name |
character string indicating the name of the data used for the trend test. The default value is |
season.name |
character string indicating the name of the data used for the season. The default value is |
year.name |
character string indicating the name of the data used for the year. The default value is |
parent.of.data |
character string indicating the source of the data used for the trend test. |
subset.expression |
character string indicating the expression used to subset the data. |
data |
specifies an optional data frame, list or environment (or object coercible by |
subset |
specifies an optional vector specifying a subset of observations to be used. |
na.action |
specifies a function which indicates what should happen when the data contain |
Perform a nonparametric test for a monotonic trend within each season based on Kendall's tau statistic, and optionally compute a confidence interval for the slope across all seasons.
A list object with elements for results of the test
Hirsch, R.M., Slack, J.R., Smith, R.A. 1982. Techniques of trend analysis for monthly water quality data. Water Resources Research, 18:107-121.
Millard, S. P. 2013. EnvStats: An R Package for Environmental Statistics. Springer, New York.
kendallSeasonalTrendTest(res ~ month + year, tidfitmean)
kendallSeasonalTrendTest(res ~ month + year, tidfitmean)
Nonparametric test for monotonic trend based on Kendall's Tau statistic
kendallTrendTest(y, ...) ## Default S3 method: kendallTrendTest( y, x = seq(along = y), alternative = "two.sided", correct = TRUE, ci.slope = TRUE, conf.level = 0.95, warn = TRUE, data.name = NULL, data.name.x = NULL, parent.of.data = NULL, subset.expression = NULL, ... ) ## S3 method for class 'formula' kendallTrendTest(y, data = NULL, subset, na.action = na.pass, ...)
kendallTrendTest(y, ...) ## Default S3 method: kendallTrendTest( y, x = seq(along = y), alternative = "two.sided", correct = TRUE, ci.slope = TRUE, conf.level = 0.95, warn = TRUE, data.name = NULL, data.name.x = NULL, parent.of.data = NULL, subset.expression = NULL, ... ) ## S3 method for class 'formula' kendallTrendTest(y, data = NULL, subset, na.action = na.pass, ...)
y |
an object containing data for the trend test. In the default method, the argument |
... |
methods passed to or from other methods |
x |
numeric vector of "predictor" values. The length of |
alternative |
character string indicating the kind of alternative hypothesis. The possible values are |
correct |
logical scalar indicating whether to use the correction for continuity in computing the z-statistic that is based on the test statistic S'. The default value is |
ci.slope |
logical scalar indicating whether to compute a confidence interval for the slope. The default value is |
conf.level |
numeric scalar between 0 and 1 indicating the confidence level associated with the confidence interval for the slope. The default value is |
warn |
logical scalar indicating whether to print a warning message when |
data.name |
character string indicating the name of the data used for the trend test. The default value is |
data.name.x |
character string indicating the name of the data used for the predictor variable |
parent.of.data |
character string indicating the source of the data used for the trend test. |
subset.expression |
character string indicating the expression used to subset the data. |
data |
specifies an optional data frame, list or environment (or object coercible by |
subset |
specifies an optional vector specifying a subset of observations to be used. |
na.action |
specifies a function which indicates what should happen when the data contain |
kendallTrendTest
performs Kendall's nonparametric test for a monotonic trend, which is a special case of the test for independence based on Kendall's tau statistic (see cor.test
). The slope is estimated using the method of Theil (1950) and Sen (1968). When ci.slope=TRUE
, the confidence interval for the slope is computed using Gilbert's (1987) Modification of the Theil/Sen Method.
Kendall's test for a monotonic trend is a special case of the test for independence based on Kendall's tau statistic. The first section below explains the general case of testing for independence. The second section explains the special case of testing for monotonic trend. The last section explains how a simple linear regression model is a special case of a monotonic trend and how the slope may be estimated.
A list object with elements for results of the test
Hirsch, R.M., Slack, J.R., Smith, R.A. 1982. Techniques of trend analysis for monthly water quality data. Water Resources Research, 18:107-121.
Millard, S. P. 2013. EnvStats: An R Package for Environmental Statistics. Springer, New York.
kendallTrendTest(res ~ dec_time, tidfitmean)
kendallTrendTest(res ~ dec_time, tidfitmean)
Simulate a discharge time series by modelling the statistical properties of an existing daily time series
lnQ_sim(dat_in, comps = FALSE, seed = NULL)
lnQ_sim(dat_in, comps = FALSE, seed = NULL)
dat_in |
input |
comps |
logical indicating if components of the simulated time series are returned, see value. |
seed |
optional numeric value for random generation seed |
Daily flow data are simulated as the additive combination of a stationary seasonal component and serially-correlated errors estimated from the observed data. The stationary seasonal component is based on a seasonal regression of discharge over time. The residuals from this regression are used to estimate the error distribution using an ARIMA model. Parameters of the ARIMA model are chosen using stepwise estimation for nonseasonal univariate time series with the auto.arima
function. Random errors from a standard normal distribution for the length of the original time series are generated using the model estimates with the arima.sim
function. Finally, the errors are multiplied by the standard deviation of the original residuals and added to the seasonal component to create a simulated, daily log-flow time series.
The original data frame with an additional column of simulated data named lnQ_sim
if comps = FALSE
. Otherwise, a two-element list is returned where the first element is 1) a list with the linear seasonal model fit to the observed time series and ARIMA model fit to the seasonal residuals, and 2) a data frame with the original data, the fit from the seasonal linear model (seas_fit
), residuals from observed flow and seasonal fit (seas_res
), standard deviation of seasonal residuals (sd_seas
), simulated errors from the ARIMA model (errs
), and simulated discharge time series (sim_out
). Note that sim_out
vector is converted to the same range as the input flow record.
## example data data(daydat) ## simulate lnQ_sim(daydat)
## example data data(daydat) ## simulate lnQ_sim(daydat)
Simulate random errors of a water quality time series by modelling the statistical properties of an observed dataset
lnres_err(dat_in, yr = NULL, comps = FALSE, seed = NULL)
lnres_err(dat_in, yr = NULL, comps = FALSE, seed = NULL)
dat_in |
input |
yr |
numeric year value to use for the stationary model, defaults to the median year |
comps |
logical indicating if the WRTDS model used to get response error measures is also returned, see value. |
seed |
optional numeric value for random generation seed |
Random errors for a stationary seasonal water quality time series on a daily time step are generated by modelling residuals from an observed dataset. First, a stationary seasonal model is created by fitting a wrtds
model and estimating an error distribution the residuals using the auto.arima
function. Accumulated standard errors from the regression are also retained for each residual. Random errors using the estimated auto-regressive structures are simulated using arima.sim
for the entire year and multiplied by the corresponding standard error estimate from the regression. The entire year is then repeated for every year in the observed time series. The final simulated errors are rescaled to the range of the original residuals that were used to estimate the distribution.
The original data frame with additional columns for the random errors (errs
) and the standard error estimates for each residual (scls
). If comps = TRUE
, a two-element list is returned that also includes the WRTDS model used as a basis for errors and scale values.
## Not run: ## example data data(daydat) ## get errors lnres_err(daydat) ## End(Not run)
## Not run: ## example data data(daydat) ## get errors lnres_err(daydat) ## End(Not run)
Simulate a water quality time series with an estimated error structure and simulated discharge effect
lnres_sim(dat_in, lnQ_coef = NULL)
lnres_sim(dat_in, lnQ_coef = NULL)
dat_in |
input |
lnQ_coef |
numeric vector of coefficients of the same length of the time series in |
This function creates a simulated water quality time series and requires error estimates for an observed water quality dataset and a simulated discharge time series. The water quality time series is created as the additive combination of a seasonal, stationary time component (as in lnQ_sim
for discharge), a random error component from lnres_err
, and a simulated discharge time series from lnQ_sim
. The discharge time series is considered an explicit component of the water quality time series and is first centered at zero prior to adding. The optional vector of coefficients passed to lnQ_coef
can mediate the influence of discharge on the water quality time series. For example, a vector of all zeroes implies no effect, whereas a vector of all ones implies a constant effect (default).
The original data frame with additional columns for the seasonal water quality model (lnres_seas
), a flow-independent water quality time series (lnres_noQ
), and a flow-dependent time series (lnres_Q
).
daydat
for the format of an input dataset, lnQ_sim
for simulating discharge, and lnres_err
for estimating the error distribution of the water quality time series, all_sims
for completing all steps at once.
## Not run: ## example data data(daydat) ## get simulated discharge sims <- lnQ_sim(daydat) ## get error structure of wq time series sims <- lnres_err(sims) ## get simulated wq time series using results from previous lnres_sim(sims) ## End(Not run)
## Not run: ## example data data(daydat) ## get simulated discharge sims <- lnQ_sim(daydat) ## get error structure of wq time series sims <- lnres_err(sims) ## get simulated wq time series using results from previous lnres_sim(sims) ## End(Not run)
Fit weighted regression and get predicted/normalized response variable from a data frame. This is a wrapper for multiple function used to create a weighted regression model and should be used rather than the individual functions.
modfit(dat_in, ...) ## Default S3 method: modfit(dat_in, ...) ## S3 method for class 'tidal' modfit(dat_in, ...) ## S3 method for class 'tidalmean' modfit(dat_in, ...) ## S3 method for class 'data.frame' modfit(dat_in, resp_type = "quantile", ...)
modfit(dat_in, ...) ## Default S3 method: modfit(dat_in, ...) ## S3 method for class 'tidal' modfit(dat_in, ...) ## S3 method for class 'tidalmean' modfit(dat_in, ...) ## S3 method for class 'data.frame' modfit(dat_in, resp_type = "quantile", ...)
dat_in |
input |
... |
arguments passed to or from other methods |
resp_type |
chr string indicating the type of model response to use, quantile or mean model |
This function is used as a convenience to combine several functions that accomplish specific tasks, primarily the creation of a tidal or tidalmean object, fitting of the weighted regression models with wrtds
, extraction of fitted values from the interpolation grids using respred
, and normalization of the fitted values from the interpolation grid using resnorm
. The format of the input should be a data.frame
with response variable observations as rows and the first four columns as date, response variable, salinity/flow, and detection limits. The order of the columns may vary provided the order of each of the four critical variables is specified by the ind
argument that is passed to the tidal
or tidalmean
function. The response variable data are also assumed to be in log-space, otherwise use reslog = FALSE
which is also passed to the tidal
or tidalmean
function. The dataset described in chldat
is an example of the correct format.
For quantile models, the default conditional quantile that is predicted is the median (tau = 0.5
, passed to the wrtds
function). Numerous other arguments affect the output and the default parameters may not be appropriate for all scenarios. Arguments used by other functions can be specified explicitly with the initial call. The documentation for the functions under ‘see also’ should be consulted for available arguments, as well as the examples that illustrate common changes to the default values.
A tidal object with predicted and normalized response variable predictions, attributes updated accordingly.
See the help files for tidal
, tidalmean
, wrtds
, getwts
, respred
, and resnorm
for arguments that can be passed to this function.
## Not run: ## load data data(chldat) ## fit the model and get predicted/normalized data for response variable # default median fit # grids predicted across salinity range with ten values res <- modfit(chldat) # for mean models res <- modfit(chldat, resp_type = 'mean') ## fit different quantiles and smaller interpolation grid res <- modfit(chldat, tau = c(0.2, 0.8), flo_div = 5) ## fit with different window widths # half-window widths of one day, five years, and 0.3 salinity res <- modfit(chldat, wins = list(1, 5, 0.3)) ## suppress console output res <- modfit(chldat, trace = FALSE) ## End(Not run)
## Not run: ## load data data(chldat) ## fit the model and get predicted/normalized data for response variable # default median fit # grids predicted across salinity range with ten values res <- modfit(chldat) # for mean models res <- modfit(chldat, resp_type = 'mean') ## fit different quantiles and smaller interpolation grid res <- modfit(chldat, tau = c(0.2, 0.8), flo_div = 5) ## fit with different window widths # half-window widths of one day, five years, and 0.3 salinity res <- modfit(chldat, wins = list(1, 5, 0.3)) ## suppress console output res <- modfit(chldat, trace = FALSE) ## End(Not run)
Plot number of observations for each point in a WRTDS interpolation grid. This is a diagnostic plot to identify sample size for each unique location in the domain of the time series that is considered during model fitting.
nobsplot(dat_in, ...) ## Default S3 method: nobsplot( dat_in, month = "all", years = NULL, col_vec = NULL, allflo = TRUE, ncol = NULL, grids = FALSE, pretty = TRUE, ... ) ## S3 method for class 'tidal' nobsplot(dat_in, ...) ## S3 method for class 'tidalmean' nobsplot(dat_in, ...)
nobsplot(dat_in, ...) ## Default S3 method: nobsplot( dat_in, month = "all", years = NULL, col_vec = NULL, allflo = TRUE, ncol = NULL, grids = FALSE, pretty = TRUE, ... ) ## S3 method for class 'tidal' nobsplot(dat_in, ...) ## S3 method for class 'tidalmean' nobsplot(dat_in, ...)
dat_in |
input tidal or tidalmean object |
... |
arguments passed to other methods |
month |
numeric indicating months to plot or chr string 'all' to indicate all months with no plot facets |
years |
numeric vector of years to plot, defaults to all |
col_vec |
chr string of plot colors to use, passed to |
allflo |
logical indicating if the salinity/flow values for plotting are limited to the fifth and ninety-fifth percentile of observed values for the month of interest |
ncol |
numeric argument passed to |
grids |
logical indicating if grid lines are present |
pretty |
logical indicating if my subjective idea of plot aesthetics is applied, otherwise the |
The plots can be used sample size as an indication of model fit for each unique location in the domain space of the time series. The plots show grids of the number of observations with weights greater than zero for each unique date and salinity/flow combination. The obs
attribute in the tidal
or tidalmean
object is created during model fitting and has the same dimensions as the interpolation grid. Each row is a unique date in the original dataset and each column is a salinity/flow value used to fit each regression (i.e., values in the flo_grd
attribute). In general, low points in the grid may indicate locations in the time series where insufficient data could affect model fit.
Unlike gridplot
, interpolation of the grids for a smoother appearance is not allowed because the objective is to identify specific locations with low sample size. For the former function, the objective is to characterize general trends over time rather values at specific locations.
A ggplot
object that can be further modified
wtsplot
for an alternative to evaluating weights with different window width combinations
## Not run: ## load a fitted tidal object data(tidfit) ## default plot nobsplot(tidfit) ## no facets, all months nobsplot(tidfit) ## change the defaults nobsplot(tidfit, tau = c(0.1), month = c(3, 6, 9, 12), col_vec = c('red', 'blue', 'green'), flo_fac = 1) ## plot a tidalmean object data(tidfitmean) nobsplot(tidfitmean) ## End(Not run)
## Not run: ## load a fitted tidal object data(tidfit) ## default plot nobsplot(tidfit) ## no facets, all months nobsplot(tidfit) ## change the defaults nobsplot(tidfit, tau = c(0.1), month = c(3, 6, 9, 12), col_vec = c('red', 'blue', 'green'), flo_fac = 1) ## plot a tidalmean object data(tidfitmean) nobsplot(tidfitmean) ## End(Not run)
Plot observed response variable and salinity/flow time series from a tidal object
obsplot(dat_in, ...) ## Default S3 method: obsplot( dat_in, lines = TRUE, logspace = TRUE, dt_rng = NULL, pretty = TRUE, col = "black", lwd = 1, size = 2, alpha = 1, ... ) ## S3 method for class 'tidal' obsplot(dat_in, ...) ## S3 method for class 'tidalmean' obsplot(dat_in, ...)
obsplot(dat_in, ...) ## Default S3 method: obsplot( dat_in, lines = TRUE, logspace = TRUE, dt_rng = NULL, pretty = TRUE, col = "black", lwd = 1, size = 2, alpha = 1, ... ) ## S3 method for class 'tidal' obsplot(dat_in, ...) ## S3 method for class 'tidalmean' obsplot(dat_in, ...)
dat_in |
input tidal or tidalmean object |
... |
arguments passed to |
lines |
logical indicating if a line plot is used, otherwise points |
logspace |
logical indicating if plots are in log space |
dt_rng |
Optional chr string indicating the date range of the plot. Must be two values in the format 'YYYY-mm-dd' which is passed to |
pretty |
logical indicating if my subjective idea of plot aesthetics is applied, otherwise the |
col |
chr string of plot color to use |
lwd |
numeric value indicating width of lines |
size |
numeric value indicating size of points |
alpha |
numeric value indicating transparency of points or lines |
A ggplot
object that can be further modified
## load a fitted tidal object data(tidfit) ## plot using defaults obsplot(tidfit) ## changing default obsplot(tidfit, alpha = 0.5, size = 4, col = 'blue', lines = FALSE) ## plot a tidalmean object data(tidfitmean) obsplot(tidfitmean)
## load a fitted tidal object data(tidfit) ## plot using defaults obsplot(tidfit) ## changing default obsplot(tidfit, alpha = 0.5, size = 4, col = 'blue', lines = FALSE) ## plot a tidalmean object data(tidfitmean) obsplot(tidfitmean)
Plot combined predicted and normalized results from a tidal object to evaluate the influence of salinity or flow changes on the response variable. The plot is similar to that produced by fitplot
except predicted values are shown as points and observed values are removed.
prdnrmplot(dat_in, ...) ## S3 method for class 'tidal' prdnrmplot( dat_in, tau = NULL, annuals = TRUE, logspace = TRUE, dt_rng = NULL, col_vec = NULL, lwd = 1, size = 2, alpha = 1, min_mo = 9, mo_strt = 10, pretty = TRUE, plot = TRUE, ... ) ## S3 method for class 'tidalmean' prdnrmplot( dat_in, annuals = TRUE, logspace = TRUE, dt_rng = NULL, col_vec = NULL, lwd = 1, size = 2, alpha = 1, min_mo = 9, mo_strt = 10, pretty = TRUE, plot = TRUE, ... )
prdnrmplot(dat_in, ...) ## S3 method for class 'tidal' prdnrmplot( dat_in, tau = NULL, annuals = TRUE, logspace = TRUE, dt_rng = NULL, col_vec = NULL, lwd = 1, size = 2, alpha = 1, min_mo = 9, mo_strt = 10, pretty = TRUE, plot = TRUE, ... ) ## S3 method for class 'tidalmean' prdnrmplot( dat_in, annuals = TRUE, logspace = TRUE, dt_rng = NULL, col_vec = NULL, lwd = 1, size = 2, alpha = 1, min_mo = 9, mo_strt = 10, pretty = TRUE, plot = TRUE, ... )
dat_in |
input tidal or tidalmean object |
... |
arguments passed to |
tau |
numeric vector of quantiles to plot, defaults to all in object if not supplied |
annuals |
logical indicating if plots are annual aggregations of results |
logspace |
logical indicating if plots are in log space |
dt_rng |
Optional chr string indicating the date range of the plot. Must be two values in the format 'YYYY-mm-dd' which is passed to |
col_vec |
chr string of plot colors to use, passed to |
lwd |
numeric value indicating width of lines |
size |
numeric value indicating size of points |
alpha |
numeric value indicating transparency of points or lines |
min_mo |
numeric value from one to twelve indicating the minimum number of months with observations for averaging by years, applies only if |
mo_strt |
numeric indicating month to start aggregation years, defaults to October for USGS water year from October to September, applies only if |
pretty |
logical indicating if my subjective idea of plot aesthetics is applied, otherwise the |
plot |
logical if plot is returned, otherwise data used in the plot |
A ggplot
object that can be further modified
## load a fitted tidal object data(tidfit) ## plot using defaults prdnrmplot(tidfit) ## get the same plot but use default ggplot settings prdnrmplot(tidfit, pretty = FALSE) ## plot in log space prdnrmplot(tidfit, logspace = TRUE) ## plot specific quantiles prdnrmplot(tidfit, tau = c(0.1, 0.9)) ## plot the normalized predictions prdnrmplot(tidfit, predicted = FALSE) ## plot as monthly values prdnrmplot(tidfit, annuals = FALSE) ## format the x-axis is using annual aggregations library(ggplot2) prdnrmplot(tidfit, annual = TRUE) + scale_x_date(limits = as.Date(c('2000-01-01', '2012-01-01'))) ## modify the plot as needed using ggplot scales, etc. prdnrmplot(tidfit, pretty = FALSE, linetype = 'dashed') + theme_classic() + scale_y_continuous( 'Chlorophyll', limits = c(0, 50) ) + scale_colour_manual( '', labels = c('lo', 'md', 'hi'), values = c('red', 'green', 'blue'), guide = guide_legend(reverse = TRUE) ) ## plot a tidalmean object data(tidfitmean) prdnrmplot(tidfitmean)
## load a fitted tidal object data(tidfit) ## plot using defaults prdnrmplot(tidfit) ## get the same plot but use default ggplot settings prdnrmplot(tidfit, pretty = FALSE) ## plot in log space prdnrmplot(tidfit, logspace = TRUE) ## plot specific quantiles prdnrmplot(tidfit, tau = c(0.1, 0.9)) ## plot the normalized predictions prdnrmplot(tidfit, predicted = FALSE) ## plot as monthly values prdnrmplot(tidfit, annuals = FALSE) ## format the x-axis is using annual aggregations library(ggplot2) prdnrmplot(tidfit, annual = TRUE) + scale_x_date(limits = as.Date(c('2000-01-01', '2012-01-01'))) ## modify the plot as needed using ggplot scales, etc. prdnrmplot(tidfit, pretty = FALSE, linetype = 'dashed') + theme_classic() + scale_y_continuous( 'Chlorophyll', limits = c(0, 50) ) + scale_colour_manual( '', labels = c('lo', 'md', 'hi'), values = c('red', 'green', 'blue'), guide = guide_legend(reverse = TRUE) ) ## plot a tidalmean object data(tidfitmean) prdnrmplot(tidfitmean)
Get normalized model predictions from WRTDS to remove the effect of salinity/flow on the response variable. Predicted values in the interpolation grids are averaged across dates.
resnorm(dat_in, ...) ## S3 method for class 'tidal' resnorm(dat_in, trace = TRUE, ...) ## S3 method for class 'tidalmean' resnorm(dat_in, trace = TRUE, ...)
resnorm(dat_in, ...) ## S3 method for class 'tidal' resnorm(dat_in, trace = TRUE, ...) ## S3 method for class 'tidalmean' resnorm(dat_in, trace = TRUE, ...)
dat_in |
input tidal or tidalmean object |
... |
arguments passed to or from other methods |
trace |
logical indicating if progress is shown in the console |
This function is used after wrtds
to normalize predicted values of the response variable from the interpolation grid for each model. The normalized values are based on the average of all predicted estimates across the range of salinity/flow values that have occurred on the same date throughout each year. For example, normalized values for July 2000 are the mean predicted response at that date using the observed salinity/flow values that occur in July of all years. The normalized values allow an interpretation of trends in the response variable that are independent of changes in salinity or freshwater inputs.
Appends columns to the data.frame for normalized values. For, tidal objects, columns are named starting with the prefix ‘norm’, e.g., ‘norm0.5’ are the normalized values for the fit through the median. For tidalmean objects, columns are appended for the log-transformed and back-transformed normalized values, named ‘norm’ and ‘bt_norm’.
## Not run: ## # load a tidal object data(tidobj) # get flow-normalized values for each quantile res <- resnorm(tidobj) # load a tidalmean object data(tidobjmean) # get flow-normalized values res <- resnorm(tidobjmean) ## End(Not run)
## Not run: ## # load a tidal object data(tidobj) # get flow-normalized values for each quantile res <- resnorm(tidobj) # load a tidalmean object data(tidobjmean) # get flow-normalized values res <- resnorm(tidobjmean) ## End(Not run)
Get model predictions from WRTDS using linear interpolation of values in grids
respred(dat_in, ...) ## S3 method for class 'tidal' respred(dat_in, dat_pred = NULL, trace = TRUE, omit = TRUE, ...) ## S3 method for class 'tidalmean' respred(dat_in, dat_pred = NULL, trace = TRUE, omit = TRUE, ...)
respred(dat_in, ...) ## S3 method for class 'tidal' respred(dat_in, dat_pred = NULL, trace = TRUE, omit = TRUE, ...) ## S3 method for class 'tidalmean' respred(dat_in, dat_pred = NULL, trace = TRUE, omit = TRUE, ...)
dat_in |
input tidal or tidalmean object |
... |
arguments passed to or from other methods |
dat_pred |
optional data to predict using the interpolation grids in dat_in, defaults to observed data in |
trace |
logical indicating if progress is shown in the console |
omit |
logical indicating if observations in |
This function is used after wrtds
to estimate predicted values of the response variable from the interpolation grids. The estimated values are based on a bilinear interpolation of the four predicted response values at two salinity/flow and two date values nearest to the observed salinity/flow and date values to predict.
Data for dat_pred
must be a data frame of two columns for date and flow variables (date
and numeric
objects). The columns must be named 'date' and 'flo'. Values that are outside of the range of data used to fit the model are removed with a warning. It is assumed that the flow variable is not scaled (i.e., raw data) as in a tidal
or tidalmean
object. The dimensions of the output data are modified to match dat_pred
if observations are removed. The omit
argument should not equal FALSE
and is included only for use with wrtdscv
to evaluate folds of the original dataset.
Appends columns to the input data.frame for the predicted values. For tidal objects, columns are named starting with the prefix ‘fit’, e.g., ‘fit0.5’ are the predicted values for the fit through the median. For tidalmean objects, predicted values are appended for the mean model in log-space and the observed values from the back-transformed grids. Columns are named as ‘fits’ and ‘bt_fits’.
## # load a tidal object data(tidobj) # get fitted values for each quantile res <- respred(tidobj) # load a tidalmean object data(tidobjmean) # get predicted values res <- respred(tidobjmean)
## # load a tidal object data(tidobj) # get fitted values for each quantile res <- respred(tidobj) # load a tidalmean object data(tidobjmean) # get predicted values res <- respred(tidobjmean)
Get the scale parameters for predicted values of the response variable, only applies to tidalmean
objects.
resscls(dat_in, ...) ## S3 method for class 'tidalmean' resscls(dat_in, dat_pred = NULL, ...)
resscls(dat_in, ...) ## S3 method for class 'tidalmean' resscls(dat_in, dat_pred = NULL, ...)
dat_in |
input tidalmean object |
... |
arguments passed to or from other methods |
dat_pred |
optional data to predict using the interpolation grids in dat_in, defaults to observed data in |
This function is used after wrtds
to get scale parameters for predicted values of the response variable from the interpolation grids. The values are based on a bilinear interpolation of the four predicted response values at two salinity/flow and two date values nearest to the observed salinity/flow and date values to predict.
Appends columns to the data.frame for the associated scale value for the predicted values. A column is appended to the dat_in
object, named ‘scls’.
## # load a tidalmean object data(tidobjmean) # get predicted values res <- resscls(tidobjmean)
## # load a tidalmean object data(tidobjmean) # get predicted values res <- resscls(tidobjmean)
Sample a daily water quality time series at a set monthly frequency
samp_sim( dat_in, unit = "month", irregular = TRUE, missper = 0, blck = 1, blckper = FALSE )
samp_sim( dat_in, unit = "month", irregular = TRUE, missper = 0, blck = 1, blckper = FALSE )
dat_in |
input |
unit |
chr string indicating sampling unit, must be year, quarter, month, week, or yday for equivalent lubridate function |
irregular |
logical indicating if monthly sampling is done randomly within each |
missper |
numeric from 0-1 indicating percentage of observations used for test dataset |
blck |
numeric indicating block size for resampling test dataset, see details |
blckper |
logical indicating if the value passed to |
This function is intended for sampling a simulated daily time series of water quality that is returned by lnres_sim
or all_sims
.
The missper
argument is used to create a test dataset as a proportion of all observations in the sub-sampled output dataset. The test dataset is created with random block sampling appropriate for time series. Block sampling of the output dataset occurs until the number of unique observations is equal to the percentage defined by missper
. Overlap of blocks are not doubly considered towards the observation counts to satisfy missper
, i.e., sets of continuous observations longer than blck
can be returned because of sampling overlap. Setting blck = 1
and blockper = FALSE
is completely random sampling for missing data. Values for blck
must be 1 or greater if blockper = FALSE
and 1 or less if blckper = T
. If blck = 1
and blckper = T
, the missing data will be one continuous block.
Original data frame with rows subset based on number of desired monthly samples. If missper > 0
, a list is returned where the first element is the index values for the test dataset and the second is the complete subsampled dataset.
## Not run: ## example data data(daydat) ## simulate tosamp <- all_sims(daydat) ## sample samp_sim(tosamp) ## sample and create test dataset # test dataset is 30% size of monthly subsample using block sampling with size = 4 samp_sim(tosamp, missper = 0.3, blck = 4) ## End(Not run)
## Not run: ## example data data(daydat) ## simulate tosamp <- all_sims(daydat) ## sample samp_sim(tosamp) ## sample and create test dataset # test dataset is 30% size of monthly subsample using block sampling with size = 4 samp_sim(tosamp, missper = 0.3, blck = 4) ## End(Not run)
Plot seasonal trends by combining annual data
seasplot(dat_in, ...) ## S3 method for class 'tidal' seasplot( dat_in, tau = NULL, predicted = TRUE, span = 0.4, lwd = 1, size = 2, alpha = 1, col_vec = NULL, grids = TRUE, logspace = TRUE, ... ) ## S3 method for class 'tidalmean' seasplot( dat_in, predicted = TRUE, span = 0.4, lwd = 1, size = 2, alpha = 1, col_vec = NULL, grids = TRUE, logspace = TRUE, ... )
seasplot(dat_in, ...) ## S3 method for class 'tidal' seasplot( dat_in, tau = NULL, predicted = TRUE, span = 0.4, lwd = 1, size = 2, alpha = 1, col_vec = NULL, grids = TRUE, logspace = TRUE, ... ) ## S3 method for class 'tidalmean' seasplot( dat_in, predicted = TRUE, span = 0.4, lwd = 1, size = 2, alpha = 1, col_vec = NULL, grids = TRUE, logspace = TRUE, ... )
dat_in |
Input data object |
... |
arguments passed to other methods |
tau |
numeric of quantile to plot |
predicted |
logical indicating if seasonal smooth is based on model predictions, default |
span |
numeric indicating the smoothing parameter for the loess fit, passed to |
lwd |
numeric value indicating width of lines |
size |
numeric value indicating size of points |
alpha |
numeric value indicating transparency of points or lines |
col_vec |
chr string of plot colors to use, passed to |
grids |
logical indicating if grid lines are present |
logspace |
logical indicating if plots are in log space |
Seasonal variation across all years can be viewed by showing the observed annual data on a common y-axis. The year value is removed from the results such that the y-axis shows only the day of the year. A simple loess (locally estimated) polynomial smooth is added to show the seasonal trend in the results, where the smoother is fit through the model results for the observed data. The fit can be smoothed through the model predictions or the flow-normalized predictions, neither of which are shown on the plot.
dynaplot
, fitmoplot
, gridplot
, and sliceplot
produce similar graphics except variation in the same month across years is emphasized.
# load a fitted tidal object data(tidfit) # plot using defaults # defaults to all quantiles for tidal object seasplot(tidfit) # tidalmean object seasplot(tidfitmean)
# load a fitted tidal object data(tidfit) # plot using defaults # defaults to all quantiles for tidal object seasplot(tidfit) # tidalmean object seasplot(tidfitmean)
Plot seasonal model response by years on a common axis
seasyrplot(dat_in, ...) ## S3 method for class 'tidal' seasyrplot( dat_in, years = NULL, tau = NULL, predicted = TRUE, logspace = TRUE, col_vec = NULL, grids = TRUE, pretty = TRUE, lwd = 0.5, alpha = 1, ... ) ## S3 method for class 'tidalmean' seasyrplot( dat_in, years = NULL, tau = NULL, predicted = TRUE, logspace = TRUE, col_vec = NULL, grids = TRUE, pretty = TRUE, lwd = 0.5, alpha = 1, ... )
seasyrplot(dat_in, ...) ## S3 method for class 'tidal' seasyrplot( dat_in, years = NULL, tau = NULL, predicted = TRUE, logspace = TRUE, col_vec = NULL, grids = TRUE, pretty = TRUE, lwd = 0.5, alpha = 1, ... ) ## S3 method for class 'tidalmean' seasyrplot( dat_in, years = NULL, tau = NULL, predicted = TRUE, logspace = TRUE, col_vec = NULL, grids = TRUE, pretty = TRUE, lwd = 0.5, alpha = 1, ... )
dat_in |
input tidal or tidalmean object |
... |
arguments passed to other methods |
years |
numeric vector of years to plot |
tau |
numeric vector of quantiles to plot, defaults to all in object if not supplied |
predicted |
logical indicating if standard predicted values are plotted, default |
logspace |
logical indicating if plots are in log space |
col_vec |
chr string of plot colors to use, passed to |
grids |
logical indicating if grid lines are present |
pretty |
logical indicating if my subjective idea of plot aesthetics is applied, otherwise the |
lwd |
numeric value indicating width of lines |
alpha |
numeric value indicating transparency of points or lines |
The plot is similar to that produced by seasplot
except the model estimates are plotted for each year as connected lines, as compared to loess lines fit to the model results. seasyrplot
is also similar to sliceplot
except the x-axis and legend grouping variable are flipped. This is useful for evaluating between-year differences in seasonal trends.
Multiple predictions per month are averaged for a smoother plot.
Note that the year variable used for color mapping is treated as a continuous variable although it is an integer by definition.
A ggplot
object that can be further modified
## load a fitted tidal object data(tidfit) # plot using defaults seasyrplot(tidfit) # get the same plot but use default ggplot settings seasyrplot(tidfit, pretty = FALSE) # plot specific quantiles seasyrplot(tidfit, tau = c(0.9)) # plot the normalized predictions seasyrplot(tidfit, predicted = FALSE) # modify the plot as needed using ggplot scales, etc. library(ggplot2) seasyrplot(tidfit, pretty = FALSE, linetype = 'dashed') + theme_classic() + scale_y_continuous( 'Chlorophyll', limits = c(0, 5) ) # plot a tidalmean object data(tidfitmean) seasyrplot(tidfitmean)
## load a fitted tidal object data(tidfit) # plot using defaults seasyrplot(tidfit) # get the same plot but use default ggplot settings seasyrplot(tidfit, pretty = FALSE) # plot specific quantiles seasyrplot(tidfit, tau = c(0.9)) # plot the normalized predictions seasyrplot(tidfit, predicted = FALSE) # modify the plot as needed using ggplot scales, etc. library(ggplot2) seasyrplot(tidfit, pretty = FALSE, linetype = 'dashed') + theme_classic() + scale_y_continuous( 'Chlorophyll', limits = c(0, 5) ) # plot a tidalmean object data(tidfitmean) seasyrplot(tidfitmean)
Plot time slices within a tidal object to view response variable observations, predictions, and normalized results at regular annual intervals.
sliceplot(dat_in, ...) ## S3 method for class 'tidal' sliceplot( dat_in, slices = c(1, 7), tau = NULL, dt_rng = NULL, col_vec = NULL, predicted = TRUE, logspace = TRUE, grids = TRUE, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... ) ## S3 method for class 'tidalmean' sliceplot( dat_in, slices = c(1, 7), predicted = TRUE, dt_rng = NULL, col_vec = NULL, logspace = TRUE, grids = TRUE, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... )
sliceplot(dat_in, ...) ## S3 method for class 'tidal' sliceplot( dat_in, slices = c(1, 7), tau = NULL, dt_rng = NULL, col_vec = NULL, predicted = TRUE, logspace = TRUE, grids = TRUE, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... ) ## S3 method for class 'tidalmean' sliceplot( dat_in, slices = c(1, 7), predicted = TRUE, dt_rng = NULL, col_vec = NULL, logspace = TRUE, grids = TRUE, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... )
dat_in |
input tidal or tidalmean object |
... |
arguments passed to other methods |
slices |
numeric vector of calender months to plot, i.e., 1 - 12 |
tau |
numeric vector of quantile to plot. The function will plot the 'middle' quantile if none is specified, e.g., if 0.2, 0.3, and 0.4 are present in the fitted model object then 0.3 will be plotted. |
dt_rng |
Optional chr string indicating the date range of the plot. Must be two values in the format 'YYYY-mm-dd' which is passed to |
col_vec |
chr string of plot colors to use, passed to |
predicted |
logical indicating if standard predicted values are plotted, default |
logspace |
logical indicating if plots are in log space |
grids |
logical indicating if grid lines are present |
pretty |
logical indicating if my subjective idea of plot aesthetics is applied, otherwise the |
lwd |
numeric value indicating width of lines |
size |
numeric value indicating size of points |
alpha |
numeric value indicating transparency of points or lines |
This is a modification of fitplot
that can be used to plot selected time slices from the results of a fitted tidal
object. For example, all results for a particular month across all years can be viewed. This is useful for evaluating between-year differences in results for constant season. Only one quantile fit can be shown per plot because the grouping variable is mapped to the slices.
A ggplot
object that can be further modified
## load a fitted tidal object data(tidfit) # plot using defaults sliceplot(tidfit) # get different months - march and september sliceplot(tidfit, slices = c(3, 9)) # normalized predictions, 10th percentile sliceplot(tidfit, tau = 0.1, predicted = FALSE) # normalized values all months, change line aesthetics, log-space, 90th # add title library(ggplot2) sliceplot(tidfit, slices = 1:12, size = 1.5, tau = 0.9, alpha = 0.6, predicted = FALSE, logspace = TRUE ) + ggtitle('Normalized predictions for all months, 90th percentile') ## plot a tidalmean object data(tidfitmean) sliceplot(tidfitmean)
## load a fitted tidal object data(tidfit) # plot using defaults sliceplot(tidfit) # get different months - march and september sliceplot(tidfit, slices = c(3, 9)) # normalized predictions, 10th percentile sliceplot(tidfit, tau = 0.1, predicted = FALSE) # normalized values all months, change line aesthetics, log-space, 90th # add title library(ggplot2) sliceplot(tidfit, slices = 1:12, size = 1.5, tau = 0.9, alpha = 0.6, predicted = FALSE, logspace = TRUE ) + ggtitle('Normalized predictions for all months, 90th percentile') ## plot a tidalmean object data(tidfitmean) sliceplot(tidfitmean)
Prepare water quality data for weighted regression by creating a tidal class object
tidal( dat_in, ind = c(1, 2, 3, 4), reslab = NULL, flolab = NULL, reslog = TRUE, rm_miss = FALSE, ... )
tidal( dat_in, ind = c(1, 2, 3, 4), reslab = NULL, flolab = NULL, reslog = TRUE, rm_miss = FALSE, ... )
dat_in |
Input data frame for a water quality time series with four columns for date (Y-m-d format), response variable, salinity/flow, and detection limit for left-censored data |
ind |
four element numeric vector indicating column positions of date, response variable, salinity/flow, and detection limit of input data frame |
reslab |
character string or expression for labelling the response variable in plots, defaults to log-chlorophyll in ug/L |
flolab |
character string or expression for labelling the flow variable in plots, defaults to Salinity |
reslog |
logical indicating if the response variable is already in log-space, default |
rm_miss |
logical indicating if missing observations in the input data are removed |
... |
arguments passed from other methods |
This function is a simple wrapper to structure
that is used to create a tidal object for use with weighted regression in tidal waters. Input data should be a four-column data.frame
with date, response variable, salinity/flow data, and detection limit for each observation of the response. The response variable is assumed to be log-transformed, otherwise use reslog = FALSE
. Salinity data can be provided as fraction of freshwater or as parts per thousand. The limit column can be entered as a sufficiently small number if all values are above the detection limit or no limit exists. The current implementation of weighted regression for tidal waters only handles left-censored data. Missing observations are also removed.
A tidal object as a data frame and attributes. The data frame has columns ordered as date, response variable, salinity/flow (rescaled to 0, 1 range), detection limit, logical for detection limit, day number, month, year, and decimal time. The attributes are as follows:
names
Column names of the data frame
row.names
Row names of the data frame
class
Class of the object
half_wins
List of numeric values used for half-window widths for model fitting, in the same order as the wt_vars argument passed to getwts
. Initially will be NULL if wrtds
has not been used.
fits
List of matrices with fits for the WRTDS interpolation grid, defaults to one list for the median quantile. Initially will be NULL
if wrtds
has not been used.
predonobs
A data.frame
of predictions using the observed data that were used to fit the model. This is required for wrtdsperf
if a novel dataset is used for predictions after fitting the model. Initially will be NULL if respred
has not been used.
flo_grd
Numeric vector of salinity/flow values that was used for the interpolation grids
floobs_rng
Two element vector indicating the salinity/flow range of the observed data
nobs
List with one matrix showing the number of weights greater than zero for each date and salinity/flow combination used to create the fit matrices in fits
. Number of observations are the same for each quantile model. Initially will be NULL
if wrtds
has not been used.
reslab
expression or character string for response variable label in plots
flolab
expression or character string for flow variable label in plots
## raw data data(chldat) ## format chldat <- tidal(chldat)
## raw data data(chldat) ## format chldat <- tidal(chldat)
Prepare water quality data for weighted regression for the mean response by creating a tidalmean class object
tidalmean( dat_in, ind = c(1, 2, 3, 4), reslab = NULL, flolab = NULL, reslog = TRUE, rm_miss = FALSE, ... )
tidalmean( dat_in, ind = c(1, 2, 3, 4), reslab = NULL, flolab = NULL, reslog = TRUE, rm_miss = FALSE, ... )
dat_in |
Input data frame for a water quality time series with four columns for date (Y-m-d format), response variable, salinity/flow, and detection limit for left-censored data |
ind |
four element numeric vector indicating column positions of date, response variable, salinity/flow, and detection limit of input data frame |
reslab |
character string or expression for labelling the response variable in plots, defaults to log-chlorophyll in ug/L |
flolab |
character string or expression for labelling the flow variable in plots, defaults to Salinity |
reslog |
logical indicating if input response variable is already in log-space, default |
rm_miss |
logical indicating if missing observations in the input data are removed |
... |
arguments passed from other methods |
This function is a simple wrapper to structure
that is used to create a tidalmean object for use with weighted regression in tidal waters, specifically to model the mean response as compared to a conditional quantile. Input data should be a four-column data.frame
with date, response variable, salinity/flow data, and detection limit for each observation of the response. The response data are assumed to be log-transformed, otherwise use reslog = FALSE
. Salinity data can be provided as fraction of freshwater or as parts per thousand. The limit column can be entered as a sufficiently small number if all values are above the detection limit or no limit exists. The current implementation of weighted regression for tidal waters only handles left-censored data. Missing observations are also removed.
The tidalmean object structure is almost identical to the tidal object, with the exception of an additional attribute for the back-transformed interpolation grid. This is included to account for retransformation bias of log-transformed variables associated with mean models.
A tidalmean object as a data frame and attributes. The data frame has columns ordered as date, response variable, salinity/flow (rescaled to 0, 1 range), detection limit, logical for detection limit, day number, month, year, and decimal time. The attributes are as follows:
names
Column names of the data frame
row.names
Row names of the data frame
class
Class of the object
half_wins
List of numeric values used for half-window widths for model fitting, in the same order as the wt_vars argument passed to getwts
. Initially will be NULL
if wrtds
has not been used.
fits
List with a single element with fits for the WRTDS mean interpolation grid. Initially will be NULL if wrtds
has not been used.
predonobs
A data.frame
of predictions using the observed data that were used to fit the model. This is required for wrtdsperf
if a novel dataset is used for predictions after fitting the model. Initially will be NULL if respred
has not been used.
bt_fits
List with a single element with back-transformed fits for the WRTDS mean interpolation grid. Initially will be NULL if wrtds
has not been used.
flo_grd
Numeric vector of salinity/flow values that was used for the interpolation grids
floobs_rng
Two element vector indicating the salinity/flow range of the observed data
nobs
List with one matrix showing the number of weights greater than zero for each date and salinity/flow combination used to create the fit matrices in fits
. Initially will be NULL
if wrtds
has not been used.
reslab
expression or character string for response variable label in plots
flolab
expression or character string for flow variable label in plots
## raw data data(chldat) ## format chldat <- tidalmean(chldat)
## raw data data(chldat) ## format chldat <- tidalmean(chldat)
An identical object as tidobj
with the addition of chlorophyll predictions and normalized estimates after running respred
and resnorm
.
tidfit
tidfit
A tidal
and data.frame
object with 156 rows and 15 variables:
date
Date
res
numeric
flo
numeric
lim
numeric
not_cens
logical
day_num
numeric
month
numeric
year
numeric
dec_time
numeric
fit0.1
numeric
fit0.5
numeric
fit0.9
numeric
norm0.1
numeric
norm0.5
numeric
norm0.9
numeric
tidal
for full list of attributes in tidal objects, wrtds
for creating the fits
interpolation grids, and respred
and resnorm
for interpolating predicted and normalized values from the grids.
An identical object as tidobjmean
with the addition of predicted and normalized chlorophyll (log-space and back-transformed) after running respred
and resnorm
.
tidfitmean
tidfitmean
A tidalmean
and data.frame
object with 156 rows and 11 variables:
date
Date
res
numeric
flo
numeric
lim
numeric
not_cens
logical
day_num
numeric
month
numeric
year
numeric
dec_time
numeric
fits
numeric
bt_fits
numeric
norm
numeric
bt_norms
numeric
tidalmean
for full list of attributes in tidalmean objects, wrtds
for creating the fits
, bt_fits
, and scls
interpolation grids in the attributes, and respred
and resnorm
for interpolating predicted and normalized values from the grids.
Monthly chlorophyll time series for the Hillsborough Bay segment of Tampa Bay as a tidal object. Raw data are those in chldat
with the addition of further processing using the tidal
and wrtds
functions. Model predictions are obtained for the tenth, median, and ninetieth conditional quantile of chlorophyll. The object also inherits methods from the data.frame
class. The processed data includes columns for date, chlorophyll-a (res
, in log-space), salinity as fraction of freshwater (flo
, i.e., 0 - 1, with higher values indicating more freshwater), the detection limit for all stations for the respective date, a logical column indicating if the observed chlorophyll is at or below the detection limit, the date as decimal time minus the year, the month from 1 to 12, the year, and total decimal time. Attributes include column names, row names, class of the object, matrices of interpolation grids from the weighted regression for each conditional quantile, and vector of salinity values that were used to create the interpolation grids.
tidobj
tidobj
A tidal
and data.frame
object with 156 rows and 9 variables:
date
Date
res
numeric
flo
numeric
lim
numeric
not_cens
logical
day_num
numeric
month
numeric
year
numeric
dec_time
numeric
tidal
for full list of attributes in tidal objects and wrtds
for creating the fits
interpolation grids.
Monthly chlorophyll time series for the Hillsborough Bay segment of Tampa Bay as a tidal object. Raw data are those in chldat
with the addition of further processing using the tidalmean
and wrtds
functions. Model predictions are obtained using weighted regression for the conditional mean of chlorophyll. The object also inherits methods from the data.frame
class. The processed data includes columns for date, chlorophyll-a (res
, in log-space), salinity as fraction of freshwater (flo
, i.e., 0 - 1, with higher values indicating more freshwater), the detection limit for all stations for the respective date, a logical column indicating if the observed chlorophyll is at or below the detection limit, the date as decimal time minus the year, the month from 1 to 12, the year, and total decimal time. Attributes include column names, row names, class of the object, an interpolation grid from the weighted regression for the mean response, a back-transformed interpolation grid from the weighted regression for the mean response, and vector of salinity values that were used to create the interpolation grids.
tidobjmean
tidobjmean
A tidalmean
and data.frame
object with 156 rows and 9 variables:
date
Date
res
numeric
flo
numeric
lim
numeric
not_cens
logical
day_num
numeric
month
numeric
year
numeric
dec_time
numeric
tidalmean
for full list of attributes in tidalmean objects and wrtds
for creating the fits
, bt_fits
, and scls
interpolation grids.
Find the optimal half-window width combination to use for weighted regression. This differs from winsrch_optim
by using constrOptim
winsrch_constrOptim(dat_in, ...) ## Default S3 method: winsrch_constrOptim( dat_in, wins_in = NULL, control = list(), lower = c(0.1, 1, 0.1), upper = c(2, 15, 2), ... )
winsrch_constrOptim(dat_in, ...) ## Default S3 method: winsrch_constrOptim( dat_in, wins_in = NULL, control = list(), lower = c(0.1, 1, 0.1), upper = c(2, 15, 2), ... )
dat_in |
input data object to use with weighted regression |
... |
|
wins_in |
starting list of window weights for initializing the search algorithm |
control |
A list of control parameters passed to |
lower |
vector of minimum half-window widths to evaluate |
upper |
vector of maximum half-window widths to evaluate |
This function uses optim
to minimize the error returned by wrtdscv
for a given window combination. The search algorithm uses the limited-memory modification of the BFGS quasi-Newton method to impose upper and lower limits on the optimization search. These limits can be changed using the lower
and upper
arguments.
Some stuff
## Not run: # setup parallel backend library(doParallel) ncores <- detectCores() - 1 registerDoParallel(cores = ncores) # run search function - takes a while res <- winsrch_optim(tidobjmean) ## End(Not run)
## Not run: # setup parallel backend library(doParallel) ncores <- detectCores() - 1 registerDoParallel(cores = ncores) # run search function - takes a while res <- winsrch_optim(tidobjmean) ## End(Not run)
Evaluate a grid of half-window width combinations to use for weighted regression
winsrch_grid(dat_in, ...) ## Default S3 method: winsrch_grid(dat_in, grid_in = NULL, ...)
winsrch_grid(dat_in, ...) ## Default S3 method: winsrch_grid(dat_in, grid_in = NULL, ...)
dat_in |
input data object to use with weighted regression |
... |
arguments passed to or from other methods |
grid_in |
optional input matrix of half-window widths created with |
Processing time can be reduced by setting up a parallel backend, as in the examples. Note that this is not effective for small k-values (e.g., < 4) because each fold is sent to a processor, whereas the window width combinations in grid_in
are evaluated in sequence.
This function should only be used to view the error surface associated with finite combinations of window-width combinations. A faster function to identify the optimal window widths is provided by winsrch_optim
.
A data frame of the search grid with associated errors for each cross-validation result. Errors for each grid row are averages of all errors for each fold used in cross-validation.
createsrch
, wrtdscv
, winsrch_optim
## Not run: ## # setup parallel backend library(doParallel) ncores <- detectCores() - 2 registerDoParallel(cores = ncores) # run search function using default search grid - takes a while res <- winsrch_grid(tidobjmean) # view the error surface library(ggplot2) ggplot(res, aes(x = factor(mos), y = factor(yrs), fill = err)) + geom_tile() + facet_wrap(~ flo) + scale_x_discrete(expand = c(0, 0)) + scale_y_discrete(expand = c(0,0)) + scale_fill_gradientn(colours = gradcols()) # optimal combo res[which.min(res$err), ] ## # create a custom search grid, e.g. years only grid_in <- createsrch(mos = 1, yrs = seq(1, 10), flo = 1) res <- winsrch_grid(tidobjmean, grid_in) ## End(Not run)
## Not run: ## # setup parallel backend library(doParallel) ncores <- detectCores() - 2 registerDoParallel(cores = ncores) # run search function using default search grid - takes a while res <- winsrch_grid(tidobjmean) # view the error surface library(ggplot2) ggplot(res, aes(x = factor(mos), y = factor(yrs), fill = err)) + geom_tile() + facet_wrap(~ flo) + scale_x_discrete(expand = c(0, 0)) + scale_y_discrete(expand = c(0,0)) + scale_fill_gradientn(colours = gradcols()) # optimal combo res[which.min(res$err), ] ## # create a custom search grid, e.g. years only grid_in <- createsrch(mos = 1, yrs = seq(1, 10), flo = 1) res <- winsrch_grid(tidobjmean, grid_in) ## End(Not run)
Find the optimal half-window width combination to use for weighted regression.
winsrch_optim(dat_in, ...) ## Default S3 method: winsrch_optim( dat_in, wins_in = NULL, control = list(factr = 1e+07, parscale = c(1, 10, 1)), lower = c(0.1, 1, 0.1), upper = c(2, 15, 2), ... )
winsrch_optim(dat_in, ...) ## Default S3 method: winsrch_optim( dat_in, wins_in = NULL, control = list(factr = 1e+07, parscale = c(1, 10, 1)), lower = c(0.1, 1, 0.1), upper = c(2, 15, 2), ... )
dat_in |
input data object to use with weighted regression |
... |
|
wins_in |
starting list of window weights for initializing the search algorithm |
control |
A list of control parameters passed to |
lower |
vector of minimum half-window widths to evaluate |
upper |
vector of maximum half-window widths to evaluate |
This function uses optim
to minimize the error returned by wrtdscv
for a given window combination. The search algorithm uses the limited-memory modification of the BFGS quasi-Newton method to impose upper and lower limits on the optimization search. These limits can be changed using the lower
and upper
arguments.
Some stuff
## Not run: # setup parallel backend library(doParallel) ncores <- detectCores() - 1 registerDoParallel(cores = ncores) # run search function - takes a while res <- winsrch_optim(tidobjmean) ## End(Not run)
## Not run: # setup parallel backend library(doParallel) ncores <- detectCores() - 1 registerDoParallel(cores = ncores) # run search function - takes a while res <- winsrch_optim(tidobjmean) ## End(Not run)
Get WRTDS prediction grid for observations of the response variable in a tidal or tidalmean object
wrtds(dat_in, ...) ## S3 method for class 'tidal' wrtds(dat_in, flo_div = 10, tau = 0.5, trace = TRUE, fill_empty = FALSE, ...) ## S3 method for class 'tidalmean' wrtds(dat_in, flo_div = 10, fill_empty = FALSE, trace = TRUE, ...)
wrtds(dat_in, ...) ## S3 method for class 'tidal' wrtds(dat_in, flo_div = 10, tau = 0.5, trace = TRUE, fill_empty = FALSE, ...) ## S3 method for class 'tidalmean' wrtds(dat_in, flo_div = 10, fill_empty = FALSE, trace = TRUE, ...)
dat_in |
input tidal or tidalmean object |
... |
arguments passed to or from other methods |
flo_div |
numeric indicating number of divisions across the range of salinity/flow to create the interpolation grid |
tau |
numeric vector indicating conditional quantiles to fit in the weighted regression, can be many |
trace |
logical indicating if progress is shown in the console |
fill_empty |
logical to fill missing values in interpolation grid using bilinear interpolation by season, see details |
Appends interpolation grid attributes to the input object. For a tidal object, this could include multiple grids for each quantile. For tidalmean objects, only one grid is appended to the ‘fits’ attribute, in addition to a back-transformed grid as the ‘bt_fits’ attribute and a grid of the scale parameter of each prediction as the ‘scls’ attribute. Grid rows correspond to the dates in the input data.
The fill_empty
arguments uses bilinear interpolation of time by flow to fill missing data in the interpolation grids. The grids are subset by month before interpolating to retain the seasonal variation captured by the models. In general, this argument should not be used if more than ten percent of the interpolation grids are missing data. It may be helpful to improve visual appearance of some of the plotting results.
## Not run: ## load data data(chldat) ## as tidal object dat_in <- tidal(chldat) res <- wrtds(dat_in) ## as tidalmean object dat_in <- tidalmean(chldat) res <- wrtds(dat_in) ## multiple quantiles res <- wrtds(dat_in, tau = c(0.1, 0.5, 0.9)) ## End(Not run)
## Not run: ## load data data(chldat) ## as tidal object dat_in <- tidal(chldat) res <- wrtds(dat_in) ## as tidalmean object dat_in <- tidalmean(chldat) res <- wrtds(dat_in) ## multiple quantiles res <- wrtds(dat_in, tau = c(0.1, 0.5, 0.9)) ## End(Not run)
Use k-fold cross-validation to evaluate WRTDS model fit based on supplied half-window widths.
wrtdscv(dat_in, ...) ## Default S3 method: wrtdscv(dat_in, wins, k = 10, seed_val = 123, trace = TRUE, ...)
wrtdscv(dat_in, ...) ## Default S3 method: wrtdscv(dat_in, wins, k = 10, seed_val = 123, trace = TRUE, ...)
dat_in |
input tidal or tidalmean object |
... |
arguments passed to |
wins |
list of input half-window widths of the order months, years, and salinity/flow, passed to |
k |
number of folds to evaluate |
seed_val |
seed to keep the same dataset divisions between window width comparisons |
trace |
logical indicating if progress is printed in the console |
Default number of folds is ten. Each fold can be evaluated with multiple cores if a parallel back end is created prior to running the function (see the examples). This will greatly increase processing speed unless k is set to a small number.
Overall error is the average of all errors for each fold.
getwts
, wtsplot
, winsrch_grid
, winsrch_optim
## Not run: library(doParallel) ncores <- detectCores() - 1 registerDoParallel(cores = ncores) # half-window widths to evaluate # months, years, and salinity/flow wins <- list(0.5, 10, 0.5) # get ocv score for k = 10 wrtdscv(tidobjmean, wins = wins) # get ocv score k = 2, tau = 0.2 wrtdscv(tidobj, wins = wins, tau = 0.2) ## End(Not run)
## Not run: library(doParallel) ncores <- detectCores() - 1 registerDoParallel(cores = ncores) # half-window widths to evaluate # months, years, and salinity/flow wins <- list(0.5, 10, 0.5) # get ocv score for k = 10 wrtdscv(tidobjmean, wins = wins) # get ocv score k = 2, tau = 0.2 wrtdscv(tidobj, wins = wins, tau = 0.2) ## End(Not run)
Get WRTDS performance metrics including goodness of fit, root mean square error, and normalized mean square error.
wrtdsperf(dat_in, ...) ## S3 method for class 'tidal' wrtdsperf(dat_in, logspace = TRUE, ...) ## S3 method for class 'tidalmean' wrtdsperf(dat_in, logspace = TRUE, ...)
wrtdsperf(dat_in, ...) ## S3 method for class 'tidal' wrtdsperf(dat_in, logspace = TRUE, ...) ## S3 method for class 'tidalmean' wrtdsperf(dat_in, logspace = TRUE, ...)
dat_in |
input tidal object which must already have fitted model data |
... |
arguments passed to or from other methods |
logspace |
logical if performance metrics use back-transformed residuals |
Goodness of fit is calculated using the goodfit
function for quantile regression described in Koenker and Mochado 1999. Root mean square error is based on square root of the mean of the squared residuals. Normalized mean square error described in Gershenfeld and Weigend 1993 is the sum of the squared errors divided by the sum of the non-conditional errors (i.e., sum of the squared values of the observed minus the mean of the observed). This measure allows comparability of error values for data with different ranges, although the interpretation for quantile models is not clear. The value is provided as a means of comparison for WRTDS models created from the same data set but with different window widths during model fitting.
Performance metrics are only valid for observations and model residuals in log-space. Metrics also only apply to the data used to fit the model, i.e., performance will not be evaluated for novel data if the dat_pred
argument was used with respred
.
A data.frame
with the metrics for each quantile model
Gershenfeld, N.A., Weigend, A.S. 1993. The future of time series: learning and understanding. In: Weigend, A.S., Gershenfeld, N.A. (eds). Time Series Prediction: Forecasting the Future and Understanding the Past., second ed. Addison-Wesley, Santa Fe, New Mexico. pp. 1-70.
Koenker, R., Machado, J.A.F. 1999. Goodness of fit and related inference processes for quantile regression. Journal of the American Statistical Association. 94(448):1296-1310.
wrtdsrsd
for residuals, goodfit
## load a fitted model object data(tidfit) ## get performance metrics wrtdsperf(tidfit)
## load a fitted model object data(tidfit) ## get performance metrics wrtdsperf(tidfit)
Get WRTDS residuals for each quantile model. These are used to estimate goodness of fit of the model predictions.
wrtdsrsd(dat_in, ...) ## S3 method for class 'tidal' wrtdsrsd(dat_in, trace = TRUE, ...) ## S3 method for class 'tidalmean' wrtdsrsd(dat_in, trace = TRUE, ...)
wrtdsrsd(dat_in, ...) ## S3 method for class 'tidal' wrtdsrsd(dat_in, trace = TRUE, ...) ## S3 method for class 'tidalmean' wrtdsrsd(dat_in, trace = TRUE, ...)
dat_in |
input tidal object which must already have fitted model data |
... |
arguments passed to or from other methods |
trace |
logical indicating if progress is shown in the console |
Columns are added to the data of the tidal object for residuals and non-conditional residuals. Both are required to assess the goodness of fit measure described for quantile regression in Koenker and Machado (1999).
A tidal object with columns added to the predonobs
attribute for the residuals ('rsd') and non-conditional residuals ('rsdnl') of each quantile model or a tidalmean object with columns added to the predonobs
attribute for the residuals ('rsd') and back-transformed residuals ('bt_rsd').
Koenker, R., Machado, J.A.F. 1999. Goodness of fit and related inference processes for quantile regression. Journal of the American Statistical Association. 94(448):1296-1310.
## load a fitted model object data(tidfit) ## run the function res <- wrtdsrsd(tidfit) head(res)
## load a fitted model object data(tidfit) ## run the function res <- wrtdsrsd(tidfit) head(res)
Get WRTDS trends for annual and monthly groupings
wrtdstrnd(dat_in, ...) ## Default S3 method: wrtdstrnd( dat_in, mobrks, yrbrks, molabs, yrlabs, aves = FALSE, mo_strt = 10, min_mo = 9, ... ) ## S3 method for class 'tidal' wrtdstrnd( dat_in, mobrks, yrbrks, molabs, yrlabs, tau = NULL, aves = FALSE, mo_strt = 10, min_mo = 9, ... ) ## S3 method for class 'tidalmean' wrtdstrnd( dat_in, mobrks, yrbrks, molabs, yrlabs, aves = FALSE, mo_strt = 10, min_mo = 9, ... )
wrtdstrnd(dat_in, ...) ## Default S3 method: wrtdstrnd( dat_in, mobrks, yrbrks, molabs, yrlabs, aves = FALSE, mo_strt = 10, min_mo = 9, ... ) ## S3 method for class 'tidal' wrtdstrnd( dat_in, mobrks, yrbrks, molabs, yrlabs, tau = NULL, aves = FALSE, mo_strt = 10, min_mo = 9, ... ) ## S3 method for class 'tidalmean' wrtdstrnd( dat_in, mobrks, yrbrks, molabs, yrlabs, aves = FALSE, mo_strt = 10, min_mo = 9, ... )
dat_in |
input tidal or tidalmean object which must already have fitted model data |
... |
methods passed to or from other methods |
mobrks |
list of month groupings where each month is an integer from 1 to 12, see examples |
yrbrks |
numeric vector of breaks for years, see examples |
molabs |
character vector of names for month breaks, see examples |
yrlabs |
character vector of names for year breaks, see examples |
aves |
logical if averages within each period are also returned |
mo_strt |
numeric indicating month to start aggregation years for annual trends, defaults to October for USGS water year from October to September, passed to |
min_mo |
numeric value from one to twelve indicating the minimum number of months with observations for averaging by years, passed to |
tau |
numeric vector of quantile for estimating trends |
Trends are reported as percent changes of annual averages from the beginning to the end of each period. To reduce the effects of odd years at the beginning and end of each period, percent changes are based on an average of the first three and last three annual averages. For example, percent changes for January throughout an an entire time series from 1980 to 2000 would be the change of the average from January in 1980-1982 to the average from January in 1998-2000. Annual trends, e.g., percent changes from 1980-1986, 1987-1993, etc. do not average by the first and last three years in each grouping because the values are already based on annual averages as returned by annual_agg
.
Note that the default minimum number of months argument (min_mo
) may not be appropriate for all cases. Annual estimates should first be evaluated with prdnrmplot
to verify that odd years with missing months are not driving results for the annual percent changes.
Averages in each period can be returned if aves = TRUE
. These averages are based on annual averages within each period for congruency with the trend estimates.
All trends are based on back-transformed, flow-normalized results.
The user must supply the annual and monthly aggregation periods to the appropriate arguments. These are passed to cut
and are left-open, right-closed along the interval.
A data.frame
with summary trends for each grouping
## load a fitted model object data(tidfit) data(tidfitmean) ## get trends # setup month, year categories mobrks <- list(c(1, 2, 3), c(4, 5, 6), c(7, 8, 9), c(10, 11, 12)) yrbrks <- c(1973, 1985, 1994, 2003, 2012) molabs <- c('JFM', 'AMJ', 'JAS', 'OND') yrlabs <- c('1974-1985', '1986-1994', '1995-2003', '2004-2012') wrtdstrnd(tidfit, mobrks, yrbrks, molabs, yrlabs) wrtdstrnd(tidfitmean, mobrks, yrbrks, molabs, yrlabs) # get averages in each period wrtdstrnd(tidfit, mobrks, yrbrks, molabs, yrlabs, aves = TRUE)
## load a fitted model object data(tidfit) data(tidfitmean) ## get trends # setup month, year categories mobrks <- list(c(1, 2, 3), c(4, 5, 6), c(7, 8, 9), c(10, 11, 12)) yrbrks <- c(1973, 1985, 1994, 2003, 2012) molabs <- c('JFM', 'AMJ', 'JAS', 'OND') yrlabs <- c('1974-1985', '1986-1994', '1995-2003', '2004-2012') wrtdstrnd(tidfit, mobrks, yrbrks, molabs, yrlabs) wrtdstrnd(tidfitmean, mobrks, yrbrks, molabs, yrlabs) # get averages in each period wrtdstrnd(tidfit, mobrks, yrbrks, molabs, yrlabs, aves = TRUE)
Get WRTDS trends using seasonal Kendall tests
wrtdstrnd_sk(dat_in, ...) ## Default S3 method: wrtdstrnd_sk(dat_in, mobrks, yrbrks, molabs, yrlabs, ...) ## S3 method for class 'tidal' wrtdstrnd_sk( dat_in, mobrks, yrbrks, molabs, yrlabs, tau = NULL, trndvar = "norm", ... ) ## S3 method for class 'tidalmean' wrtdstrnd_sk(dat_in, mobrks, yrbrks, molabs, yrlabs, trndvar = "bt_norm", ...)
wrtdstrnd_sk(dat_in, ...) ## Default S3 method: wrtdstrnd_sk(dat_in, mobrks, yrbrks, molabs, yrlabs, ...) ## S3 method for class 'tidal' wrtdstrnd_sk( dat_in, mobrks, yrbrks, molabs, yrlabs, tau = NULL, trndvar = "norm", ... ) ## S3 method for class 'tidalmean' wrtdstrnd_sk(dat_in, mobrks, yrbrks, molabs, yrlabs, trndvar = "bt_norm", ...)
dat_in |
input tidal or tidalmean object which must already have fitted model data |
... |
methods passed to or from other methods |
mobrks |
list of month groupings where each month is an integer from 1 to 12, see examples |
yrbrks |
numeric vector of breaks for years, see examples |
molabs |
character vector of names for month breaks, see examples |
yrlabs |
character vector of names for year breaks, see examples |
tau |
numeric vector of quantile for estimating trends |
trndvar |
chr string of variable for trend evaluation, usually back-transformed, flow-normalized results, see details |
Trends are based on kendallSeasonalTrendTest
for user-specified time periods. In general, the seasonal Kendall test evaluates monotonic trends using a non-parametric approach that accounts for seasonal variation in the time series.
All trends are based on back-transformed, flow-normalized results by default. The variable for evaluating trends can be changed with 'trndvar'
as 'res'
, 'norm'
, or 'fit'
for tidal
objects and as 'res'
, 'bt_norm'
, or 'bt_fits'
for tidalmean
objects. In all cases, back-transformed variables are evaluated.
The user must supply the annual and monthly aggregation periods to the appropriate arguments. These are passed to cut
and are left-open, right-closed along the interval.
A data.frame
with summary trends for each grouping, including med
as the median value for the period of observation, tau
as the magnitude and direction of the trend, slope
as the Thiel-Sen slope for change per year, chitest
as the significance test evaluating heterogeneity between seasons, ztest
indicating significance of the overall trend, and perchg
as 100 multiplied by the ratio of the annual slope to the median estimate of the time period (percent change per year).
As noted in kendallSeasonalTrendTest
, the overall test is not appropriate if chitest
indicates a small p-value.
Hirsch, R.M., Slack, J.R., Smith, R.A. 1982. Techniques of trend analysis for monthly water quality data. Water Resources Research, 18:107-121.
Millard, S. P. 2013. EnvStats: An R Package for Environmental Statistics. Springer, New York.
## load a fitted model object data(tidfit) data(tidfitmean) ## get trends # setup month, year categories mobrks <- list(c(1, 2, 3), c(4, 5, 6), c(7, 8, 9), c(10, 11, 12)) yrbrks <- c(1973, 1985, 1994, 2003, 2012) molabs <- c('JFM', 'AMJ', 'JAS', 'OND') yrlabs <- c('1974-1985', '1986-1994', '1995-2003', '2004-2012') wrtdstrnd_sk(tidfit, mobrks, yrbrks, molabs, yrlabs) wrtdstrnd_sk(tidfitmean, mobrks, yrbrks, molabs, yrlabs)
## load a fitted model object data(tidfit) data(tidfitmean) ## get trends # setup month, year categories mobrks <- list(c(1, 2, 3), c(4, 5, 6), c(7, 8, 9), c(10, 11, 12)) yrbrks <- c(1973, 1985, 1994, 2003, 2012) molabs <- c('JFM', 'AMJ', 'JAS', 'OND') yrlabs <- c('1974-1985', '1986-1994', '1995-2003', '2004-2012') wrtdstrnd_sk(tidfit, mobrks, yrbrks, molabs, yrlabs) wrtdstrnd_sk(tidfitmean, mobrks, yrbrks, molabs, yrlabs)
Create several plots showing the weights used to fit a model for a single observation.
wtsplot(dat_in, ...) ## Default S3 method: wtsplot( dat_in, ref = NULL, wins = list(0.5, 10, NULL), min_obs = TRUE, slice = FALSE, dt_rng = NULL, pt_rng = c(1, 12), col_vec = NULL, col_lns = NULL, alpha = 1, as_list = FALSE, ... ) ## S3 method for class 'tidal' wtsplot(dat_in, ...) ## S3 method for class 'tidalmean' wtsplot(dat_in, ...)
wtsplot(dat_in, ...) ## Default S3 method: wtsplot( dat_in, ref = NULL, wins = list(0.5, 10, NULL), min_obs = TRUE, slice = FALSE, dt_rng = NULL, pt_rng = c(1, 12), col_vec = NULL, col_lns = NULL, alpha = 1, as_list = FALSE, ... ) ## S3 method for class 'tidal' wtsplot(dat_in, ...) ## S3 method for class 'tidalmean' wtsplot(dat_in, ...)
dat_in |
input tidal object |
... |
arguments passed to other methods |
ref |
chr string indicating the date at the center of the weighting window. Must be in the format 'YYYY-mm-dd' which is passed to |
wins |
list with three elements passed to |
min_obs |
logical to use window widening if less than 100 non-zero weights are found, passed to |
slice |
logical indicating if only weights bounded by the year window (i.e., the limiting window for the combined weights) are shown, passed to |
dt_rng |
Optional chr string indicating the date range for all plots except seasonal (day) weights. Must be two values in the format 'YYYY-mm-dd' which is passed to |
pt_rng |
numeric vector of two elements indicating point scaling for all weights in the plot of salinity/flow vs time. |
col_vec |
chr string of plot colors to use, passed to |
col_lns |
chr string of line color in plots |
alpha |
numeric value from zero to one indicating transparency of points and lines |
as_list |
logical indicating if plots should be returned in a list |
Create diagnostic plots to view the effects of different weighting windows on model predictions. The plots illustrate the weights that are used when fitting a weighted regression in reference to a single observation. The process is repeated for all observations when the entire model is fit. Five plots are produced by the function, each showing the weights in relation to time and the selected observation (i.e., center of the weighting window). The top plot shows salinity/flow over time with the points colored and sized by the combined weight vector. The remaining four plots show the weights over time for each separate weighting component (months/days, year, and salinity/flow) and the final combined vector.
A combined ggplot
object created using grid.arrange
. A list with elements for each individual plot will be returned if as_list = TRUE
.
## load a fitted tidal object data(tidfit) ## plot using defaults, wtsplot(tidfit) ## change the defaults wtsplot(tidfit, ref = '2000-01-01', wins = list(0.5, 15, Inf), dt_rng = c('1990-01-01', '2010-01-01'), pt_rng = c(3, 8), col_vec = c('lightgreen', 'lightblue', 'purple'), alpha = 0.7)
## load a fitted tidal object data(tidfit) ## plot using defaults, wtsplot(tidfit) ## change the defaults wtsplot(tidfit, ref = '2000-01-01', wins = list(0.5, 15, Inf), dt_rng = c('1990-01-01', '2010-01-01'), pt_rng = c(3, 8), col_vec = c('lightgreen', 'lightblue', 'purple'), alpha = 0.7)