| Title: | Weighted Regression for Water Quality Evaluation in Tidal Waters |
|---|---|
| Description: | An adaptation for estuaries (tidal waters) of weighted regression on time, discharge, and season to evaluate trends in water quality time series. Please see Beck and Hagy (2015) <doi:10.1007/s10666-015-9452-8> for details. |
| Authors: | Marcus W. Beck [aut, cre] |
| Maintainer: | Marcus W. Beck <[email protected]> |
| License: | CC0 |
| Version: | 1.1.5 |
| Built: | 2026-05-15 09:30:45 UTC |
| Source: | https://github.com/fawda123/WRTDStidal |
Get AIC values for a single weighted quantile regression as used in WRTDS models
aiccrq(mod_in, tau = 0.5)aiccrq(mod_in, tau = 0.5)
mod_in |
input crq model |
tau |
numeric indicating quantile to evaluate |
The AIC value is based on the log-likelihood estimate of the model that accounts for the specific quantile, the minimum of the objective function (rho), and the number of model parameters. The residuals are specific to the WRTDS model such that this function cannot be applied to arbitrary crq models.
AIC estimate
# get wts for a model centered on the first observation ref_in <- tidobj[1, ] ref_wts <- getwts(tidobj, ref_in) # get the model mod <- quantreg::crq( survival::Surv(res, not_cens, type = "left") ~ dec_time + flo + sin(2*pi*dec_time) + cos(2*pi*dec_time), weights = ref_wts, data = tidobj, method = "Portnoy" ) aiccrq(mod)# get wts for a model centered on the first observation ref_in <- tidobj[1, ] ref_wts <- getwts(tidobj, ref_in) # get the model mod <- quantreg::crq( survival::Surv(res, not_cens, type = "left") ~ dec_time + flo + sin(2*pi*dec_time) + cos(2*pi*dec_time), weights = ref_wts, data = tidobj, method = "Portnoy" ) aiccrq(mod)
Simulate a response variable time series using all simulation functions
all_sims(dat_in, ...)all_sims(dat_in, ...)
dat_in |
input |
... |
additional arguments passed to |
This is a convenience function that combines lnQ_sim, lnres_err, and lnres_sim. See the help documentation function for more details on each.
Original data frame with additional columns for simulated discharge (lnQ_sim), the random errors of the response variable (errs), the standard error estimates for each residual (scls), a flow-independent response variable time series (lnres_noQ), and a flow-dependent time series (lnres_Q).
daydat for example data, lnQ_sim for simulating discharge, lnres_err for estimating the error structure of the response variable, and lnres_sim for simulating the response variable
## Not run: ## example data data(daydat) ## simulate tmp <- all_sims(daydat) ## End(Not run)## Not run: ## example data data(daydat) ## simulate tmp <- all_sims(daydat) ## End(Not run)
Create annual aggregations of WRTDS output
annual_agg(dat_in, ...) ## Default S3 method: annual_agg(dat_in, mo_strt = 10, min_mo = 9, logspace = TRUE, ...)annual_agg(dat_in, ...) ## Default S3 method: annual_agg(dat_in, mo_strt = 10, min_mo = 9, logspace = TRUE, ...)
dat_in |
input tidal or tidalmean object |
... |
arguments passed to or from other methods |
mo_strt |
numeric indicating month to start aggregation years, defaults to October for USGS water year from October to September |
min_mo |
numeric value from one to twelve indicating the minimum number of months with observations for averaging by years |
logspace |
logical indicating if aggregated data are to be shown in log-space or not |
WRTDS output is averaged by year for both predictions and flow-normalized predictions. Years are averaged only if one observation is contained in each of the minimum number of months specified by min_mo averaging, otherwise results are not returned for the given year. Note that setting min_mo to values smaller than the default can produce inaccurate trends for years with very few results.
The function is used internally within prdnrmplot and fitplot.
An aggregated data object for plotting, returns only model output and response variable.
## tidal object annual_agg(tidfit) ## tidalmean object annual_agg(tidfitmean)## tidal object annual_agg(tidfit) ## tidalmean object annual_agg(tidfitmean)
Monthly chlorophyll time series for the Hillsborough Bay segment of Tampa Bay. Data are the median values of monthly observations across all water quality stations in Hillsborough Bay. Date ranges are from January 1974 to December 2012 (452 observations). Variables are date, chlorophyll-a (in log-space, labelled res for response), salinity as fraction of freshwater (i.e., 0 - 1, with higher values indicating more freshwater, label flo for flow), and the detection limit for all stations for the respective date.
chldatchldat
A data frame with 452 rows and 4 variables:
dateDate
resnumeric
flonumeric
limnumeric
Get the chlorophyll axis label for observed or log-space, including units
chllab(logspace = TRUE)chllab(logspace = TRUE)
logspace |
logical indicating if chlorophyll is in log space, otherwise observed |
An expression indicating (log)-chlorophyll in the appropriate units
## default chllab()## default chllab()
Get trend for a single time period
chngest(x, y, single = F)chngest(x, y, single = F)
x |
numeric of year values |
y |
norm numeric of annually averaged response variable |
single |
logical if trends are based on only first and last year of aggregated, i.e., no averaging of first/last three |
Estimates trends using methods described in wrtdstrnd. Used internally and is not to be called by the user. Function runs on individual time period groups defined by arguments in wrtdstrnd.
Create a grid of all unique combinations of half-window widths to evaluate. The result can be passed to winsrch_grid.
createsrch( mos = c(seq(0.5, 1, by = 0.25), 2, 10), yrs = c(seq(5, 15, by = 3), 50), flo = c(seq(0.5, 1, by = 0.1), 5) )createsrch( mos = c(seq(0.5, 1, by = 0.25), 2, 10), yrs = c(seq(5, 15, by = 3), 50), flo = c(seq(0.5, 1, by = 0.1), 5) )
mos |
numeric vector of half-window widths for months, a value of one indicates twelve months |
yrs |
numeric vector of half-window widths for years, a value of one indicates one-year |
flo |
numeric vector of half-window widths for salinity or flow, a value of one indicates the full range of values (100 percent) |
The weighting function uses a tri-cube weighting scheme such that weights diminish with distance from the center of the window. For example, a value of one for the month window does not mean that all months are weighted equally even though the window covers an entire calendar year.
A matrix with number of rows equal to the product of the lengths of each input vector, where each row is a unique combination for the selected half-window widths.
createsrch() createsrch(1, 1, 1)createsrch() createsrch(1, 1, 1)
Combined daily flow observations from the USGS stream gage station 01594440 near Bowie, Maryland and daily chlorophyll and salinity records from the Jug Bay station of the Chesapeake Bay Maryland National Estuarine Research Reserve. Daily chlorophyll concentrations were estimated from fluorescence values that did not include blue-green algae blooms. These date are provided to illustrate examples with time series simulation.
daydatdaydat
A data frame with 3506 rows and 9 variables, 1985 to 2014:
dateDate as daily time step
flonumeric for salinity, ppt
lnresnumeric for chlorophyll fluorescence as ln-transformed ug/l
Qnumeric for daily discharge m3/s
lnQnumeric for daily discharge ln-transformed m3/s
jdaynumeric for julian day
yearnumeric for year
daynumeric for day from 1-31
dec_timenumeric for decimal time on the annual period
Create decimal time on an annual scale from an input time vector
dec_time(date_in) ## S3 method for class 'Date' dec_time(date_in)dec_time(date_in) ## S3 method for class 'Date' dec_time(date_in)
date_in |
input time vector, usually a |
Function is used internally within the package.
A named list of four numeric vectors including day_num (decimal day on an annual scale), month (month of the year as integer), year, and dec_time (decimal time as sum of year and day_num)
dt <- Sys.Date() dts <- seq.Date(dt - 365, dt, by = 'day') dec_time(dts)dt <- Sys.Date() dts <- seq.Date(dt - 365, dt, by = 'day') dec_time(dts)
Plot the relationship between the modelled response and salinity/flow across the time series using line plots for each month. Each line corresponds to a unique year. This can be used to evaluate temporal variation between the two.
dynaplot(dat_in, ...) ## S3 method for class 'tidal' dynaplot( dat_in, month = c(1:12), tau = NULL, years = NULL, col_vec = NULL, alpha = 1, size = 1, logspace = TRUE, floscl = TRUE, allflo = FALSE, ncol = NULL, grids = TRUE, scales = NULL, pretty = TRUE, use_bw = TRUE, fac_nms = NULL, ... ) ## S3 method for class 'tidalmean' dynaplot( dat_in, month = c(1:12), years = NULL, col_vec = NULL, alpha = 1, size = 1, logspace = TRUE, floscl = TRUE, allflo = FALSE, ncol = NULL, grids = TRUE, scales = NULL, pretty = TRUE, use_bw = TRUE, fac_nms = NULL, ... )dynaplot(dat_in, ...) ## S3 method for class 'tidal' dynaplot( dat_in, month = c(1:12), tau = NULL, years = NULL, col_vec = NULL, alpha = 1, size = 1, logspace = TRUE, floscl = TRUE, allflo = FALSE, ncol = NULL, grids = TRUE, scales = NULL, pretty = TRUE, use_bw = TRUE, fac_nms = NULL, ... ) ## S3 method for class 'tidalmean' dynaplot( dat_in, month = c(1:12), years = NULL, col_vec = NULL, alpha = 1, size = 1, logspace = TRUE, floscl = TRUE, allflo = FALSE, ncol = NULL, grids = TRUE, scales = NULL, pretty = TRUE, use_bw = TRUE, fac_nms = NULL, ... )
dat_in |
input tidal or tidalmean object |
... |
arguments passed to other methods |
month |
numeric input from 1 to 12 indicating the monthly predictions to plot |
tau |
numeric vector of quantile to plot. The function will plot the 'middle' quantile if none is specified, e.g., if 0.2, 0.3, and 0.4 are present in the fitted model object then 0.3 will be plotted. |
years |
numeric vector of years to plot, one to many, defaults to all |
col_vec |
chr string of plot colors to use, passed to |
alpha |
numeric value from zero to one indicating line transparency |
size |
numeric value for line size |
logspace |
logical indicating if plots are in log space |
floscl |
logical indicating if salinity/flow on x-axis is standardized (default) or in original scale |
allflo |
logical indicating if the salinity or flow values for plotting are limited to the fifth and ninety-fifth percentile of observed values for the month of interest |
ncol |
numeric argument passed to |
grids |
logical indicating if grid lines are present |
scales |
chr string passed to ggplot to change x/y axis scaling on facets, acceptable values are |
pretty |
logical indicating if my subjective idea of plot aesthetics is applied, otherwise the |
use_bw |
logical indicating if the |
fac_nms |
optional chr string for facet labels, which must be equal in length to |
These plots can be used to examine how the relationship between the response variable and flow varies throughout the time series. It is essentially identical to the plot produced by gridplot, except lines plots are returned that show the relationship of the response variable with salinity/flow using different lines for each year. The interpolation grid that is stored as an attribute in a fitted tidal object is used to create the plot. Each plot is limited to the same month throughout the time series to limit seasonal variation. Plots are also constrained to the fifth and ninety-fifth percentile of observed salinity/flow values during the month of interest to limit the predictions within the data domain. This behavior can be suppressed by changing the allflo argument.
Note that the year variable used for color mapping is treated as a continuous variable although it is an integer by definition.
A ggplot object that can be further modified
# load a fitted tidal object data(tidfit) # plot using defaults, # defaults to the fiftieth quantile for all years dynaplot(tidfit) ## Not run: # change the defaults dynaplot(tidfit, tau = 0.9, month = 2, years = seq(1980, 1990), col_vec = rainbow(7), alpha = 0.5, size = 3) # plot a tidalmean object data(tidfitmean) dynaplot(tidfitmean) ## End(Not run)# load a fitted tidal object data(tidfit) # plot using defaults, # defaults to the fiftieth quantile for all years dynaplot(tidfit) ## Not run: # change the defaults dynaplot(tidfit, tau = 0.9, month = 2, years = seq(1980, 1990), col_vec = rainbow(7), alpha = 0.5, size = 3) # plot a tidalmean object data(tidfitmean) dynaplot(tidfitmean) ## End(Not run)
Add date, year, month, and day columns to the interpolation grids using dates from dat_in. The interp.surface function is used after splitting the interpolation matrix by month to fill missing values. Function is used in wrtds.
fill_grd(grd_in, dat_in, interp = FALSE)fill_grd(grd_in, dat_in, interp = FALSE)
grd_in |
interpolation grid to fill, either a single mean grid or an individual quantile grid created within |
dat_in |
|
interp |
logical for interpolation |
predonobs attributeFill the predonobs attribute
fillpo(dat_in, ...) ## S3 method for class 'tidal' fillpo(dat_in, ...) ## S3 method for class 'tidalmean' fillpo(dat_in, ...)fillpo(dat_in, ...) ## S3 method for class 'tidal' fillpo(dat_in, ...) ## S3 method for class 'tidalmean' fillpo(dat_in, ...)
dat_in |
input tidal or tidalmean object |
... |
arguments passed to or from other methods |
Used in respred to fill the predonobs attribute. This attribute is used to estimate performance metrics with wrtdsperf.
The input tidal or tidalmean object with the filled predonobs attribute as predictions for the observed data as a data frame.
Plot a tidal object to view the response variable observations, predictions, and normalized results separately for each month.
fitmoplot(dat_in, ...) ## S3 method for class 'tidal' fitmoplot( dat_in, month = c(1:12), tau = NULL, predicted = TRUE, logspace = TRUE, dt_rng = NULL, ncol = NULL, col_vec = NULL, grids = TRUE, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... ) ## S3 method for class 'tidalmean' fitmoplot( dat_in, month = c(1:12), predicted = TRUE, logspace = TRUE, dt_rng = NULL, ncol = NULL, col_vec = NULL, grids = TRUE, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... )fitmoplot(dat_in, ...) ## S3 method for class 'tidal' fitmoplot( dat_in, month = c(1:12), tau = NULL, predicted = TRUE, logspace = TRUE, dt_rng = NULL, ncol = NULL, col_vec = NULL, grids = TRUE, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... ) ## S3 method for class 'tidalmean' fitmoplot( dat_in, month = c(1:12), predicted = TRUE, logspace = TRUE, dt_rng = NULL, ncol = NULL, col_vec = NULL, grids = TRUE, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... )
dat_in |
input tidal or tidalmean object |
... |
arguments passed to other methods |
month |
numeric indicating months to plot |
tau |
numeric vector of quantiles to plot, defaults to all in object if not supplied |
predicted |
logical indicating if standard predicted values are plotted, default |
logspace |
logical indicating if plots are in log space |
dt_rng |
Optional chr string indicating the date range of the plot. Must be two values in the format 'YYYY-mm-dd' which is passed to |
ncol |
numeric argument passed to |
col_vec |
chr string of plot colors to use, passed to |
grids |
logical indicating if grid lines are present |
pretty |
logical indicating if my subjective idea of plot aesthetics is applied, otherwise the |
lwd |
numeric value indicating width of lines |
size |
numeric value indicating size of points |
alpha |
numeric value indicating transparency of points or lines |
The plots are similar to those produced by fitplot except the values are faceted by month. This allows an evaluation of trends over time independent of seasonal variation. Multiple observations within each month for each year are averaged for a smoother plot.
A ggplot object that can be further modified
fitplot, prdnrmplot, sliceplot
## load a fitted tidal object data(tidfit) # plot using defaults fitmoplot(tidfit) ## Not run: # get the same plot but use default ggplot settings fitmoplot(tidfit, pretty = FALSE) # plot specific quantiles fitmoplot(tidfit, tau = c(0.1, 0.9)) # plot the normalized predictions fitmoplot(tidfit, predicted = FALSE) # modify the plot as needed using ggplot scales, etc. library(ggplot2) fitmoplot(tidfit, pretty = FALSE, linetype = 'dashed') + theme_classic() + scale_y_continuous( 'Chlorophyll', limits = c(0, 5) ) + scale_colour_manual( 'Predictions', labels = c('lo', 'md', 'hi'), values = c('red', 'green', 'blue'), guide = guide_legend(reverse = TRUE) ) # plot a tidalmean object data(tidfitmean) fitmoplot(tidfitmean) ## End(Not run)## load a fitted tidal object data(tidfit) # plot using defaults fitmoplot(tidfit) ## Not run: # get the same plot but use default ggplot settings fitmoplot(tidfit, pretty = FALSE) # plot specific quantiles fitmoplot(tidfit, tau = c(0.1, 0.9)) # plot the normalized predictions fitmoplot(tidfit, predicted = FALSE) # modify the plot as needed using ggplot scales, etc. library(ggplot2) fitmoplot(tidfit, pretty = FALSE, linetype = 'dashed') + theme_classic() + scale_y_continuous( 'Chlorophyll', limits = c(0, 5) ) + scale_colour_manual( 'Predictions', labels = c('lo', 'md', 'hi'), values = c('red', 'green', 'blue'), guide = guide_legend(reverse = TRUE) ) # plot a tidalmean object data(tidfitmean) fitmoplot(tidfitmean) ## End(Not run)
Plot a tidal object to view response variable observations, predictions, and normalized results.
fitplot(dat_in, ...) ## S3 method for class 'tidal' fitplot( dat_in, tau = NULL, predicted = TRUE, annuals = TRUE, logspace = TRUE, dt_rng = NULL, col_vec = NULL, grids = TRUE, min_mo = 9, mo_strt = 10, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... ) ## S3 method for class 'tidalmean' fitplot( dat_in, predicted = TRUE, annuals = TRUE, logspace = TRUE, dt_rng = NULL, col_vec = NULL, grids = TRUE, min_mo = 9, mo_strt = 10, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... )fitplot(dat_in, ...) ## S3 method for class 'tidal' fitplot( dat_in, tau = NULL, predicted = TRUE, annuals = TRUE, logspace = TRUE, dt_rng = NULL, col_vec = NULL, grids = TRUE, min_mo = 9, mo_strt = 10, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... ) ## S3 method for class 'tidalmean' fitplot( dat_in, predicted = TRUE, annuals = TRUE, logspace = TRUE, dt_rng = NULL, col_vec = NULL, grids = TRUE, min_mo = 9, mo_strt = 10, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... )
dat_in |
input tidal or tidalmean object |
... |
arguments passed to other methods |
tau |
numeric vector of quantiles to plot, defaults to all in object if not supplied |
predicted |
logical indicating if standard predicted values are plotted, default |
annuals |
logical indicating if plots are annual aggregations of results |
logspace |
logical indicating if plots are in log space |
dt_rng |
Optional chr string indicating the date range of the plot. Must be two values in the format 'YYYY-mm-dd' which is passed to |
col_vec |
chr string of plot colors to use, passed to |
grids |
logical indicating if grid lines are present |
min_mo |
numeric value from one to twelve indicating the minimum number of months with observations for averaging by years, applies only if |
mo_strt |
numeric indicating month to start aggregation years, defaults to October for USGS water year from October to September, applies only if |
pretty |
logical indicating if my subjective idea of plot aesthetics is applied, otherwise the |
lwd |
numeric value indicating width of lines |
size |
numeric value indicating size of points |
alpha |
numeric value indicating transparency of points or lines |
A ggplot object that can be further modified
fitmoplot, prdnrmplot, sliceplot
## load a fitted tidal object data(tidfit) # plot using defaults fitplot(tidfit) # get the same plot but use default ggplot settings fitplot(tidfit, pretty = FALSE) # plot in log space fitplot(tidfit, logspace = TRUE) # plot specific quantiles fitplot(tidfit, tau = c(0.1, 0.9)) # plot the normalized predictions fitplot(tidfit, predicted = FALSE) # plot as monthly values fitplot(tidfit, annuals = FALSE) # format the x-axis is using annual aggregations library(ggplot2) fitplot(tidfit, annual = TRUE) + scale_x_date(limits = as.Date(c('2000-01-01', '2012-01-01'))) # modify the plot as needed using ggplot scales, etc. fitplot(tidfit, pretty = FALSE, linetype = 'dashed') + theme_classic() + scale_y_continuous( 'Chlorophyll', limits = c(0, 50) ) + scale_colour_manual( 'Predictions', labels = c('lo', 'md', 'hi'), values = c('red', 'green', 'blue'), guide = guide_legend(reverse = TRUE) ) # plot a tidalmean object data(tidfitmean) fitplot(tidfitmean)## load a fitted tidal object data(tidfit) # plot using defaults fitplot(tidfit) # get the same plot but use default ggplot settings fitplot(tidfit, pretty = FALSE) # plot in log space fitplot(tidfit, logspace = TRUE) # plot specific quantiles fitplot(tidfit, tau = c(0.1, 0.9)) # plot the normalized predictions fitplot(tidfit, predicted = FALSE) # plot as monthly values fitplot(tidfit, annuals = FALSE) # format the x-axis is using annual aggregations library(ggplot2) fitplot(tidfit, annual = TRUE) + scale_x_date(limits = as.Date(c('2000-01-01', '2012-01-01'))) # modify the plot as needed using ggplot scales, etc. fitplot(tidfit, pretty = FALSE, linetype = 'dashed') + theme_classic() + scale_y_continuous( 'Chlorophyll', limits = c(0, 50) ) + scale_colour_manual( 'Predictions', labels = c('lo', 'md', 'hi'), values = c('red', 'green', 'blue'), guide = guide_legend(reverse = TRUE) ) # plot a tidalmean object data(tidfitmean) fitplot(tidfitmean)
Get weights for WRTDS for a single observation using a tri-cubic weighting function
getwts(dat_in, ...) ## Default S3 method: getwts( dat_in, ref_in, wt_vars = c("day_num", "dec_time", "flo"), wins = list(0.5, 10, NULL), all = FALSE, slice = TRUE, ngrzero = FALSE, wins_only = FALSE, min_obs = 100, ... )getwts(dat_in, ...) ## Default S3 method: getwts( dat_in, ref_in, wt_vars = c("day_num", "dec_time", "flo"), wins = list(0.5, 10, NULL), all = FALSE, slice = TRUE, ngrzero = FALSE, wins_only = FALSE, min_obs = 100, ... )
dat_in |
input tidal object |
... |
arguments passed to or from other methods |
ref_in |
row of tidal object as reference for weights |
wt_vars |
chr string of three elements indicating names of columns in tidal object that are used for reference row weights |
wins |
list of half-window widths for time, year, and flow |
all |
logical to return individual weights rather than the product of all three, default |
slice |
logical indicating if data are subset by observations within the maximum window width for faster calculations |
ngrzero |
logical indicating if count of observations with weights greater than zero is returned |
wins_only |
logical if the half-window widths should be returned as a list |
min_obs |
numeric vector for window widening if the number of observations with non-zero weights is less than the specified value, use |
The default half-window widths for day_num, year, and flow are half a day (12 hours), 10 years, and half the range of salinity/flow in the input data. The half-window widths are expanded by 10% until at least 100 observations have weights greater than zero. This behavior can be suppressed by setting min_obs = NULL.
A vector of weights with length equal to the number of observations (rows) in the tidal object. Vectors for all three weighting variables are returned if all = TRUE.
## data(tidobj) # get weights for first row first <- tidobj[1, ] wts <- getwts(tidobj, first) plot(wts, type = 'l') ## Not run: # get count of observations with grzero weights sapply(1:nrow(tidobj), function(row) getwts(tidobj, tidobj[row, ], ngrzero = TRUE)) ## End(Not run)## data(tidobj) # get weights for first row first <- tidobj[1, ] wts <- getwts(tidobj, first) plot(wts, type = 'l') ## Not run: # get count of observations with grzero weights sapply(1:nrow(tidobj), function(row) getwts(tidobj, tidobj[row, ], ngrzero = TRUE)) ## End(Not run)
Calculate quantile regression goodness of fit using residuals and non-conditional residuals
goodfit(resid, resid_nl, tau)goodfit(resid, resid_nl, tau)
resid |
numeric vector of residuals from the conditional quantile model |
resid_nl |
numeric vector of residuals from the non-conditional (null) quantile model |
tau |
numeric value from zero to one for the estimated quantile |
The goodness of fit measure for quantile regression is estimated as 1 minus the ratio between the sum of absolute deviations in the fully parameterized models and the sum of absolute deviations in the null (non-conditional) quantile model. The values are useful for comparisons between quantile models, but they are not comparable to standard coefficients of determination. The latter is based on the variance of squared deviations, whereas goodness of fit values for quantile regression are based on absolute deviations. Goodness of fit values will always be smaller than R2 values.
A numeric value from 0 to 1 indicating goodness of fit
Koenker, R., Machado, J.A.F. 1999. Goodness of fit and related inference processes for quantile regression. Journal of the American Statistical Association. 94(448):1296-1310.
wrtdsrsd for residuals
library(quantreg) ## random variables x <- runif(100, 0, 10) y <- x + rnorm(100) ## quantile model mod <- rq(y ~ x, tau = 0.5) res <- resid(mod) ## non-conditional quantile model mod_nl <- rq(y ~ 1, tau = 0.5) rsd_nl <- resid(mod_nl) goodfit(res, rsd_nl, 0.5) ## r2 of mean model for comparison mod_lm <- lm(y ~ x) summary(mod_lm)$r.squaredlibrary(quantreg) ## random variables x <- runif(100, 0, 10) y <- x + rnorm(100) ## quantile model mod <- rq(y ~ x, tau = 0.5) res <- resid(mod) ## non-conditional quantile model mod_nl <- rq(y ~ 1, tau = 0.5) rsd_nl <- resid(mod_nl) goodfit(res, rsd_nl, 0.5) ## r2 of mean model for comparison mod_lm <- lm(y ~ x) summary(mod_lm)$r.squared
Gets colors used for WRTDS plots
gradcols(col_vec = NULL)gradcols(col_vec = NULL)
col_vec |
chr string of plot colors to use, typically passed to |
This is a convenience function for retrieving a color palette that is used by most of the plotting functions. Palettes from RColorBrewer will use the maximum number of colors. The default palette is 'Spectral'.
A character vector of colors in hexadecimal notation.
## defaults gradcols() ## another RColorBrewer palette gradcols('Pastel2') ## a silly example gradcols(rainbow(7))## defaults gradcols() ## another RColorBrewer palette gradcols('Pastel2') ## a silly example gradcols(rainbow(7))
Plot the relationship between the response variable and salinity/flow across the time series using a gridded surface for all months. The response is shaded by relative values across all dates for comparison.
gridplot(dat_in, ...) ## S3 method for class 'tidal' gridplot( dat_in, month = c(1:12), tau = NULL, years = NULL, col_vec = NULL, col_lim = NULL, logspace = TRUE, floscl = TRUE, allflo = FALSE, flo_fac = 3, yr_fac = 3, ncol = NULL, grids = FALSE, pretty = TRUE, ... ) ## S3 method for class 'tidalmean' gridplot( dat_in, month = c(1:12), years = NULL, col_vec = NULL, col_lim = NULL, logspace = TRUE, floscl = TRUE, allflo = FALSE, flo_fac = 3, yr_fac = 3, ncol = NULL, grids = FALSE, pretty = TRUE, ... )gridplot(dat_in, ...) ## S3 method for class 'tidal' gridplot( dat_in, month = c(1:12), tau = NULL, years = NULL, col_vec = NULL, col_lim = NULL, logspace = TRUE, floscl = TRUE, allflo = FALSE, flo_fac = 3, yr_fac = 3, ncol = NULL, grids = FALSE, pretty = TRUE, ... ) ## S3 method for class 'tidalmean' gridplot( dat_in, month = c(1:12), years = NULL, col_vec = NULL, col_lim = NULL, logspace = TRUE, floscl = TRUE, allflo = FALSE, flo_fac = 3, yr_fac = 3, ncol = NULL, grids = FALSE, pretty = TRUE, ... )
dat_in |
input tidal or tidalmean object |
... |
arguments passed to other methods |
month |
numeric indicating months to plot or chr string 'all' to indicate all months with no plot facets |
tau |
numeric vector of quantile to plot. The function will plot the 'middle' quantile if none is specified, e.g., if 0.2, 0.3, and 0.4 are present in the fitted model object then 0.3 will be plotted. |
years |
numeric vector for range of years to plot |
col_vec |
chr string of plot colors to use, passed to |
col_lim |
numeric vector of length two that defines the range of the color ramp in the legend, passed to |
logspace |
logical indicating if plots are in log space |
floscl |
logical indicating if salinity/flow on x-axis is standardized (default) or in original scale |
allflo |
logical indicating if the salinity/flow values for plotting are limited to the fifth and ninety-fifth percentile of observed values for the month of interest |
flo_fac |
numeric value indicating the factor for smoothing the response variable across salinity/flow values. Increasing the value creates more smoothing and setting the value to 1 removes all smoothing. |
yr_fac |
numeric value indicating the factor for smoothing the response variable across integer years. Increasing the value creates more smoothing and setting the value to 1 removes all smoothing. |
ncol |
numeric argument passed to |
grids |
logical indicating if grid lines are present |
pretty |
logical indicating if my subjective idea of plot aesthetics is applied, otherwise the |
These plots can be used to examine how the relationship between the response variable and salinity/flow varies throughout the time series for multiple months. The plot is similar to that returned by dynaplot except changes in the response are shown on a gridded surface of salinity/flow versus time. Multiple months can also be viewed. Color shading is in proportion to the value of the response variable and is relative across the plotted months. The interpolation grid that is stored as an attribute in a fitted tidal object is used to create the plot. By default, the plots are constrained to the fifth and ninety-fifth percentile of observed salinity/flow values during each month to limit the predictions within the data domain. This behavior can be suppressed by changing the allflo argument, although the predicted values of the response variable that are outside of the salinity/flow range for the plotted month are typically unrealistic.
A ggplot object that can be further modified
dynaplot, fitplot, gridplot, prdnrmplot
## Not run: ## load a fitted tidal object data(tidfit) ## defaults to the fiftieth quantile gridplot(tidfit) ## no facets, all months gridplot(tidfit, month = 'all') ## change the defaults gridplot(tidfit, tau = c(0.1), month = c(3, 6, 9, 12), col_vec = c('red', 'blue', 'green'), flo_fac = 1) ## plot a tidalmean object data(tidfitmean) gridplot(tidfitmean) ## End(Not run)## Not run: ## load a fitted tidal object data(tidfit) ## defaults to the fiftieth quantile gridplot(tidfit) ## no facets, all months gridplot(tidfit, month = 'all') ## change the defaults gridplot(tidfit, tau = c(0.1), month = c(3, 6, 9, 12), col_vec = c('red', 'blue', 'green'), flo_fac = 1) ## plot a tidalmean object data(tidfitmean) gridplot(tidfitmean) ## End(Not run)
Nonparametric test for monotonic trend Within each season based on Kendall's Tau statistic
kendallSeasonalTrendTest(y, ...) ## Default S3 method: kendallSeasonalTrendTest( y, season, year, alternative = "two.sided", correct = TRUE, ci.slope = TRUE, conf.level = 0.95, independent.obs = TRUE, data.name = NULL, season.name = NULL, year.name = NULL, parent.of.data = NULL, subset.expression = NULL, ... ) ## S3 method for class 'data.frame' kendallSeasonalTrendTest(y, ...) ## S3 method for class 'formula' kendallSeasonalTrendTest(y, data = NULL, subset, na.action = na.pass, ...) ## S3 method for class 'matrix' kendallSeasonalTrendTest(y, ...)kendallSeasonalTrendTest(y, ...) ## Default S3 method: kendallSeasonalTrendTest( y, season, year, alternative = "two.sided", correct = TRUE, ci.slope = TRUE, conf.level = 0.95, independent.obs = TRUE, data.name = NULL, season.name = NULL, year.name = NULL, parent.of.data = NULL, subset.expression = NULL, ... ) ## S3 method for class 'data.frame' kendallSeasonalTrendTest(y, ...) ## S3 method for class 'formula' kendallSeasonalTrendTest(y, data = NULL, subset, na.action = na.pass, ...) ## S3 method for class 'matrix' kendallSeasonalTrendTest(y, ...)
y |
an object containing data for the trend test. In the default method, the argument |
... |
methods passed to or from other methods |
season |
numeric or character vector or a factor indicating the seasons in which the observations in y were taken. The length of |
year |
numeric vector indicating the years in which the observations in |
alternative |
character string indicating the kind of alternative hypothesis. The possible values are |
correct |
logical scalar indicating whether to use the correction for continuity in computing the z-statistic that is based on the test statistic S'. The default value is |
ci.slope |
logical scalar indicating whether to compute a confidence interval for the slope. The default value is |
conf.level |
numeric scalar between 0 and 1 indicating the confidence level associated with the confidence interval for the slope. The default value is |
independent.obs |
logical scalar indicating whether to assume the observations in y are seially independent. The default value is |
data.name |
character string indicating the name of the data used for the trend test. The default value is |
season.name |
character string indicating the name of the data used for the season. The default value is |
year.name |
character string indicating the name of the data used for the year. The default value is |
parent.of.data |
character string indicating the source of the data used for the trend test. |
subset.expression |
character string indicating the expression used to subset the data. |
data |
specifies an optional data frame, list or environment (or object coercible by |
subset |
specifies an optional vector specifying a subset of observations to be used. |
na.action |
specifies a function which indicates what should happen when the data contain |
Perform a nonparametric test for a monotonic trend within each season based on Kendall's tau statistic, and optionally compute a confidence interval for the slope across all seasons.
A list object with elements for results of the test
Hirsch, R.M., Slack, J.R., Smith, R.A. 1982. Techniques of trend analysis for monthly water quality data. Water Resources Research, 18:107-121.
Millard, S. P. 2013. EnvStats: An R Package for Environmental Statistics. Springer, New York.
kendallSeasonalTrendTest(res ~ month + year, tidfitmean)kendallSeasonalTrendTest(res ~ month + year, tidfitmean)
Nonparametric test for monotonic trend based on Kendall's Tau statistic
kendallTrendTest(y, ...) ## Default S3 method: kendallTrendTest( y, x = seq(along = y), alternative = "two.sided", correct = TRUE, ci.slope = TRUE, conf.level = 0.95, warn = TRUE, data.name = NULL, data.name.x = NULL, parent.of.data = NULL, subset.expression = NULL, ... ) ## S3 method for class 'formula' kendallTrendTest(y, data = NULL, subset, na.action = na.pass, ...)kendallTrendTest(y, ...) ## Default S3 method: kendallTrendTest( y, x = seq(along = y), alternative = "two.sided", correct = TRUE, ci.slope = TRUE, conf.level = 0.95, warn = TRUE, data.name = NULL, data.name.x = NULL, parent.of.data = NULL, subset.expression = NULL, ... ) ## S3 method for class 'formula' kendallTrendTest(y, data = NULL, subset, na.action = na.pass, ...)
y |
an object containing data for the trend test. In the default method, the argument |
... |
methods passed to or from other methods |
x |
numeric vector of "predictor" values. The length of |
alternative |
character string indicating the kind of alternative hypothesis. The possible values are |
correct |
logical scalar indicating whether to use the correction for continuity in computing the z-statistic that is based on the test statistic S'. The default value is |
ci.slope |
logical scalar indicating whether to compute a confidence interval for the slope. The default value is |
conf.level |
numeric scalar between 0 and 1 indicating the confidence level associated with the confidence interval for the slope. The default value is |
warn |
logical scalar indicating whether to print a warning message when |
data.name |
character string indicating the name of the data used for the trend test. The default value is |
data.name.x |
character string indicating the name of the data used for the predictor variable |
parent.of.data |
character string indicating the source of the data used for the trend test. |
subset.expression |
character string indicating the expression used to subset the data. |
data |
specifies an optional data frame, list or environment (or object coercible by |
subset |
specifies an optional vector specifying a subset of observations to be used. |
na.action |
specifies a function which indicates what should happen when the data contain |
kendallTrendTest performs Kendall's nonparametric test for a monotonic trend, which is a special case of the test for independence based on Kendall's tau statistic (see cor.test). The slope is estimated using the method of Theil (1950) and Sen (1968). When ci.slope=TRUE, the confidence interval for the slope is computed using Gilbert's (1987) Modification of the Theil/Sen Method.
Kendall's test for a monotonic trend is a special case of the test for independence based on Kendall's tau statistic. The first section below explains the general case of testing for independence. The second section explains the special case of testing for monotonic trend. The last section explains how a simple linear regression model is a special case of a monotonic trend and how the slope may be estimated.
A list object with elements for results of the test
Hirsch, R.M., Slack, J.R., Smith, R.A. 1982. Techniques of trend analysis for monthly water quality data. Water Resources Research, 18:107-121.
Millard, S. P. 2013. EnvStats: An R Package for Environmental Statistics. Springer, New York.
kendallTrendTest(res ~ dec_time, tidfitmean)kendallTrendTest(res ~ dec_time, tidfitmean)
Simulate a discharge time series by modelling the statistical properties of an existing daily time series
lnQ_sim(dat_in, comps = FALSE, seed = NULL)lnQ_sim(dat_in, comps = FALSE, seed = NULL)
dat_in |
input |
comps |
logical indicating if components of the simulated time series are returned, see value. |
seed |
optional numeric value for random generation seed |
Daily flow data are simulated as the additive combination of a stationary seasonal component and serially-correlated errors estimated from the observed data. The stationary seasonal component is based on a seasonal regression of discharge over time. The residuals from this regression are used to estimate the error distribution using an ARIMA model. Parameters of the ARIMA model are chosen using stepwise estimation for nonseasonal univariate time series with the auto.arima function. Random errors from a standard normal distribution for the length of the original time series are generated using the model estimates with the arima.sim function. Finally, the errors are multiplied by the standard deviation of the original residuals and added to the seasonal component to create a simulated, daily log-flow time series.
The original data frame with an additional column of simulated data named lnQ_sim if comps = FALSE. Otherwise, a two-element list is returned where the first element is 1) a list with the linear seasonal model fit to the observed time series and ARIMA model fit to the seasonal residuals, and 2) a data frame with the original data, the fit from the seasonal linear model (seas_fit), residuals from observed flow and seasonal fit (seas_res), standard deviation of seasonal residuals (sd_seas), simulated errors from the ARIMA model (errs), and simulated discharge time series (sim_out). Note that sim_out vector is converted to the same range as the input flow record.
## example data data(daydat) ## simulate lnQ_sim(daydat)## example data data(daydat) ## simulate lnQ_sim(daydat)
Simulate random errors of a water quality time series by modelling the statistical properties of an observed dataset
lnres_err(dat_in, yr = NULL, comps = FALSE, seed = NULL)lnres_err(dat_in, yr = NULL, comps = FALSE, seed = NULL)
dat_in |
input |
yr |
numeric year value to use for the stationary model, defaults to the median year |
comps |
logical indicating if the WRTDS model used to get response error measures is also returned, see value. |
seed |
optional numeric value for random generation seed |
Random errors for a stationary seasonal water quality time series on a daily time step are generated by modelling residuals from an observed dataset. First, a stationary seasonal model is created by fitting a wrtds model and estimating an error distribution the residuals using the auto.arima function. Accumulated standard errors from the regression are also retained for each residual. Random errors using the estimated auto-regressive structures are simulated using arima.sim for the entire year and multiplied by the corresponding standard error estimate from the regression. The entire year is then repeated for every year in the observed time series. The final simulated errors are rescaled to the range of the original residuals that were used to estimate the distribution.
The original data frame with additional columns for the random errors (errs) and the standard error estimates for each residual (scls). If comps = TRUE, a two-element list is returned that also includes the WRTDS model used as a basis for errors and scale values.
## Not run: ## example data data(daydat) ## get errors lnres_err(daydat) ## End(Not run)## Not run: ## example data data(daydat) ## get errors lnres_err(daydat) ## End(Not run)
Simulate a water quality time series with an estimated error structure and simulated discharge effect
lnres_sim(dat_in, lnQ_coef = NULL)lnres_sim(dat_in, lnQ_coef = NULL)
dat_in |
input |
lnQ_coef |
numeric vector of coefficients of the same length of the time series in |
This function creates a simulated water quality time series and requires error estimates for an observed water quality dataset and a simulated discharge time series. The water quality time series is created as the additive combination of a seasonal, stationary time component (as in lnQ_sim for discharge), a random error component from lnres_err, and a simulated discharge time series from lnQ_sim. The discharge time series is considered an explicit component of the water quality time series and is first centered at zero prior to adding. The optional vector of coefficients passed to lnQ_coef can mediate the influence of discharge on the water quality time series. For example, a vector of all zeroes implies no effect, whereas a vector of all ones implies a constant effect (default).
The original data frame with additional columns for the seasonal water quality model (lnres_seas), a flow-independent water quality time series (lnres_noQ), and a flow-dependent time series (lnres_Q).
daydat for the format of an input dataset, lnQ_sim for simulating discharge, and lnres_err for estimating the error distribution of the water quality time series, all_sims for completing all steps at once.
## Not run: ## example data data(daydat) ## get simulated discharge sims <- lnQ_sim(daydat) ## get error structure of wq time series sims <- lnres_err(sims) ## get simulated wq time series using results from previous lnres_sim(sims) ## End(Not run)## Not run: ## example data data(daydat) ## get simulated discharge sims <- lnQ_sim(daydat) ## get error structure of wq time series sims <- lnres_err(sims) ## get simulated wq time series using results from previous lnres_sim(sims) ## End(Not run)
Fit weighted regression and get predicted/normalized response variable from a data frame. This is a wrapper for multiple function used to create a weighted regression model and should be used rather than the individual functions.
modfit(dat_in, ...) ## Default S3 method: modfit(dat_in, ...) ## S3 method for class 'tidal' modfit(dat_in, ...) ## S3 method for class 'tidalmean' modfit(dat_in, ...) ## S3 method for class 'data.frame' modfit(dat_in, resp_type = "quantile", ...)modfit(dat_in, ...) ## Default S3 method: modfit(dat_in, ...) ## S3 method for class 'tidal' modfit(dat_in, ...) ## S3 method for class 'tidalmean' modfit(dat_in, ...) ## S3 method for class 'data.frame' modfit(dat_in, resp_type = "quantile", ...)
dat_in |
input |
... |
arguments passed to or from other methods |
resp_type |
chr string indicating the type of model response to use, quantile or mean model |
This function is used as a convenience to combine several functions that accomplish specific tasks, primarily the creation of a tidal or tidalmean object, fitting of the weighted regression models with wrtds, extraction of fitted values from the interpolation grids using respred, and normalization of the fitted values from the interpolation grid using resnorm. The format of the input should be a data.frame with response variable observations as rows and the first four columns as date, response variable, salinity/flow, and detection limits. The order of the columns may vary provided the order of each of the four critical variables is specified by the ind argument that is passed to the tidal or tidalmean function. The response variable data are also assumed to be in log-space, otherwise use reslog = FALSE which is also passed to the tidal or tidalmean function. The dataset described in chldat is an example of the correct format.
For quantile models, the default conditional quantile that is predicted is the median (tau = 0.5, passed to the wrtds function). Numerous other arguments affect the output and the default parameters may not be appropriate for all scenarios. Arguments used by other functions can be specified explicitly with the initial call. The documentation for the functions under ‘see also’ should be consulted for available arguments, as well as the examples that illustrate common changes to the default values.
A tidal object with predicted and normalized response variable predictions, attributes updated accordingly.
See the help files for tidal, tidalmean, wrtds, getwts, respred, and resnorm for arguments that can be passed to this function.
## Not run: ## load data data(chldat) ## fit the model and get predicted/normalized data for response variable # default median fit # grids predicted across salinity range with ten values res <- modfit(chldat) # for mean models res <- modfit(chldat, resp_type = 'mean') ## fit different quantiles and smaller interpolation grid res <- modfit(chldat, tau = c(0.2, 0.8), flo_div = 5) ## fit with different window widths # half-window widths of one day, five years, and 0.3 salinity res <- modfit(chldat, wins = list(1, 5, 0.3)) ## suppress console output res <- modfit(chldat, trace = FALSE) ## End(Not run)## Not run: ## load data data(chldat) ## fit the model and get predicted/normalized data for response variable # default median fit # grids predicted across salinity range with ten values res <- modfit(chldat) # for mean models res <- modfit(chldat, resp_type = 'mean') ## fit different quantiles and smaller interpolation grid res <- modfit(chldat, tau = c(0.2, 0.8), flo_div = 5) ## fit with different window widths # half-window widths of one day, five years, and 0.3 salinity res <- modfit(chldat, wins = list(1, 5, 0.3)) ## suppress console output res <- modfit(chldat, trace = FALSE) ## End(Not run)
Plot number of observations for each point in a WRTDS interpolation grid. This is a diagnostic plot to identify sample size for each unique location in the domain of the time series that is considered during model fitting.
nobsplot(dat_in, ...) ## Default S3 method: nobsplot( dat_in, month = "all", years = NULL, col_vec = NULL, allflo = TRUE, ncol = NULL, grids = FALSE, pretty = TRUE, ... ) ## S3 method for class 'tidal' nobsplot(dat_in, ...) ## S3 method for class 'tidalmean' nobsplot(dat_in, ...)nobsplot(dat_in, ...) ## Default S3 method: nobsplot( dat_in, month = "all", years = NULL, col_vec = NULL, allflo = TRUE, ncol = NULL, grids = FALSE, pretty = TRUE, ... ) ## S3 method for class 'tidal' nobsplot(dat_in, ...) ## S3 method for class 'tidalmean' nobsplot(dat_in, ...)
dat_in |
input tidal or tidalmean object |
... |
arguments passed to other methods |
month |
numeric indicating months to plot or chr string 'all' to indicate all months with no plot facets |
years |
numeric vector of years to plot, defaults to all |
col_vec |
chr string of plot colors to use, passed to |
allflo |
logical indicating if the salinity/flow values for plotting are limited to the fifth and ninety-fifth percentile of observed values for the month of interest |
ncol |
numeric argument passed to |
grids |
logical indicating if grid lines are present |
pretty |
logical indicating if my subjective idea of plot aesthetics is applied, otherwise the |
The plots can be used sample size as an indication of model fit for each unique location in the domain space of the time series. The plots show grids of the number of observations with weights greater than zero for each unique date and salinity/flow combination. The obs attribute in the tidal or tidalmean object is created during model fitting and has the same dimensions as the interpolation grid. Each row is a unique date in the original dataset and each column is a salinity/flow value used to fit each regression (i.e., values in the flo_grd attribute). In general, low points in the grid may indicate locations in the time series where insufficient data could affect model fit.
Unlike gridplot, interpolation of the grids for a smoother appearance is not allowed because the objective is to identify specific locations with low sample size. For the former function, the objective is to characterize general trends over time rather values at specific locations.
A ggplot object that can be further modified
wtsplot for an alternative to evaluating weights with different window width combinations
## Not run: ## load a fitted tidal object data(tidfit) ## default plot nobsplot(tidfit) ## no facets, all months nobsplot(tidfit) ## change the defaults nobsplot(tidfit, tau = c(0.1), month = c(3, 6, 9, 12), col_vec = c('red', 'blue', 'green'), flo_fac = 1) ## plot a tidalmean object data(tidfitmean) nobsplot(tidfitmean) ## End(Not run)## Not run: ## load a fitted tidal object data(tidfit) ## default plot nobsplot(tidfit) ## no facets, all months nobsplot(tidfit) ## change the defaults nobsplot(tidfit, tau = c(0.1), month = c(3, 6, 9, 12), col_vec = c('red', 'blue', 'green'), flo_fac = 1) ## plot a tidalmean object data(tidfitmean) nobsplot(tidfitmean) ## End(Not run)
Plot observed response variable and salinity/flow time series from a tidal object
obsplot(dat_in, ...) ## Default S3 method: obsplot( dat_in, lines = TRUE, logspace = TRUE, dt_rng = NULL, pretty = TRUE, col = "black", lwd = 1, size = 2, alpha = 1, ... ) ## S3 method for class 'tidal' obsplot(dat_in, ...) ## S3 method for class 'tidalmean' obsplot(dat_in, ...)obsplot(dat_in, ...) ## Default S3 method: obsplot( dat_in, lines = TRUE, logspace = TRUE, dt_rng = NULL, pretty = TRUE, col = "black", lwd = 1, size = 2, alpha = 1, ... ) ## S3 method for class 'tidal' obsplot(dat_in, ...) ## S3 method for class 'tidalmean' obsplot(dat_in, ...)
dat_in |
input tidal or tidalmean object |
... |
arguments passed to |
lines |
logical indicating if a line plot is used, otherwise points |
logspace |
logical indicating if plots are in log space |
dt_rng |
Optional chr string indicating the date range of the plot. Must be two values in the format 'YYYY-mm-dd' which is passed to |
pretty |
logical indicating if my subjective idea of plot aesthetics is applied, otherwise the |
col |
chr string of plot color to use |
lwd |
numeric value indicating width of lines |
size |
numeric value indicating size of points |
alpha |
numeric value indicating transparency of points or lines |
A ggplot object that can be further modified
## load a fitted tidal object data(tidfit) ## plot using defaults obsplot(tidfit) ## changing default obsplot(tidfit, alpha = 0.5, size = 4, col = 'blue', lines = FALSE) ## plot a tidalmean object data(tidfitmean) obsplot(tidfitmean)## load a fitted tidal object data(tidfit) ## plot using defaults obsplot(tidfit) ## changing default obsplot(tidfit, alpha = 0.5, size = 4, col = 'blue', lines = FALSE) ## plot a tidalmean object data(tidfitmean) obsplot(tidfitmean)
Plot combined predicted and normalized results from a tidal object to evaluate the influence of salinity or flow changes on the response variable. The plot is similar to that produced by fitplot except predicted values are shown as points and observed values are removed.
prdnrmplot(dat_in, ...) ## S3 method for class 'tidal' prdnrmplot( dat_in, tau = NULL, annuals = TRUE, logspace = TRUE, dt_rng = NULL, col_vec = NULL, lwd = 1, size = 2, alpha = 1, min_mo = 9, mo_strt = 10, pretty = TRUE, plot = TRUE, ... ) ## S3 method for class 'tidalmean' prdnrmplot( dat_in, annuals = TRUE, logspace = TRUE, dt_rng = NULL, col_vec = NULL, lwd = 1, size = 2, alpha = 1, min_mo = 9, mo_strt = 10, pretty = TRUE, plot = TRUE, ... )prdnrmplot(dat_in, ...) ## S3 method for class 'tidal' prdnrmplot( dat_in, tau = NULL, annuals = TRUE, logspace = TRUE, dt_rng = NULL, col_vec = NULL, lwd = 1, size = 2, alpha = 1, min_mo = 9, mo_strt = 10, pretty = TRUE, plot = TRUE, ... ) ## S3 method for class 'tidalmean' prdnrmplot( dat_in, annuals = TRUE, logspace = TRUE, dt_rng = NULL, col_vec = NULL, lwd = 1, size = 2, alpha = 1, min_mo = 9, mo_strt = 10, pretty = TRUE, plot = TRUE, ... )
dat_in |
input tidal or tidalmean object |
... |
arguments passed to |
tau |
numeric vector of quantiles to plot, defaults to all in object if not supplied |
annuals |
logical indicating if plots are annual aggregations of results |
logspace |
logical indicating if plots are in log space |
dt_rng |
Optional chr string indicating the date range of the plot. Must be two values in the format 'YYYY-mm-dd' which is passed to |
col_vec |
chr string of plot colors to use, passed to |
lwd |
numeric value indicating width of lines |
size |
numeric value indicating size of points |
alpha |
numeric value indicating transparency of points or lines |
min_mo |
numeric value from one to twelve indicating the minimum number of months with observations for averaging by years, applies only if |
mo_strt |
numeric indicating month to start aggregation years, defaults to October for USGS water year from October to September, applies only if |
pretty |
logical indicating if my subjective idea of plot aesthetics is applied, otherwise the |
plot |
logical if plot is returned, otherwise data used in the plot |
A ggplot object that can be further modified
## load a fitted tidal object data(tidfit) ## plot using defaults prdnrmplot(tidfit) ## get the same plot but use default ggplot settings prdnrmplot(tidfit, pretty = FALSE) ## plot in log space prdnrmplot(tidfit, logspace = TRUE) ## plot specific quantiles prdnrmplot(tidfit, tau = c(0.1, 0.9)) ## plot the normalized predictions prdnrmplot(tidfit, predicted = FALSE) ## plot as monthly values prdnrmplot(tidfit, annuals = FALSE) ## format the x-axis is using annual aggregations library(ggplot2) prdnrmplot(tidfit, annual = TRUE) + scale_x_date(limits = as.Date(c('2000-01-01', '2012-01-01'))) ## modify the plot as needed using ggplot scales, etc. prdnrmplot(tidfit, pretty = FALSE, linetype = 'dashed') + theme_classic() + scale_y_continuous( 'Chlorophyll', limits = c(0, 50) ) + scale_colour_manual( '', labels = c('lo', 'md', 'hi'), values = c('red', 'green', 'blue'), guide = guide_legend(reverse = TRUE) ) ## plot a tidalmean object data(tidfitmean) prdnrmplot(tidfitmean)## load a fitted tidal object data(tidfit) ## plot using defaults prdnrmplot(tidfit) ## get the same plot but use default ggplot settings prdnrmplot(tidfit, pretty = FALSE) ## plot in log space prdnrmplot(tidfit, logspace = TRUE) ## plot specific quantiles prdnrmplot(tidfit, tau = c(0.1, 0.9)) ## plot the normalized predictions prdnrmplot(tidfit, predicted = FALSE) ## plot as monthly values prdnrmplot(tidfit, annuals = FALSE) ## format the x-axis is using annual aggregations library(ggplot2) prdnrmplot(tidfit, annual = TRUE) + scale_x_date(limits = as.Date(c('2000-01-01', '2012-01-01'))) ## modify the plot as needed using ggplot scales, etc. prdnrmplot(tidfit, pretty = FALSE, linetype = 'dashed') + theme_classic() + scale_y_continuous( 'Chlorophyll', limits = c(0, 50) ) + scale_colour_manual( '', labels = c('lo', 'md', 'hi'), values = c('red', 'green', 'blue'), guide = guide_legend(reverse = TRUE) ) ## plot a tidalmean object data(tidfitmean) prdnrmplot(tidfitmean)
Get normalized model predictions from WRTDS to remove the effect of salinity/flow on the response variable. Predicted values in the interpolation grids are averaged across dates.
resnorm(dat_in, ...) ## S3 method for class 'tidal' resnorm(dat_in, trace = TRUE, ...) ## S3 method for class 'tidalmean' resnorm(dat_in, trace = TRUE, ...)resnorm(dat_in, ...) ## S3 method for class 'tidal' resnorm(dat_in, trace = TRUE, ...) ## S3 method for class 'tidalmean' resnorm(dat_in, trace = TRUE, ...)
dat_in |
input tidal or tidalmean object |
... |
arguments passed to or from other methods |
trace |
logical indicating if progress is shown in the console |
This function is used after wrtds to normalize predicted values of the response variable from the interpolation grid for each model. The normalized values are based on the average of all predicted estimates across the range of salinity/flow values that have occurred on the same date throughout each year. For example, normalized values for July 2000 are the mean predicted response at that date using the observed salinity/flow values that occur in July of all years. The normalized values allow an interpretation of trends in the response variable that are independent of changes in salinity or freshwater inputs.
Appends columns to the data.frame for normalized values. For, tidal objects, columns are named starting with the prefix ‘norm’, e.g., ‘norm0.5’ are the normalized values for the fit through the median. For tidalmean objects, columns are appended for the log-transformed and back-transformed normalized values, named ‘norm’ and ‘bt_norm’.
## Not run: ## # load a tidal object data(tidobj) # get flow-normalized values for each quantile res <- resnorm(tidobj) # load a tidalmean object data(tidobjmean) # get flow-normalized values res <- resnorm(tidobjmean) ## End(Not run)## Not run: ## # load a tidal object data(tidobj) # get flow-normalized values for each quantile res <- resnorm(tidobj) # load a tidalmean object data(tidobjmean) # get flow-normalized values res <- resnorm(tidobjmean) ## End(Not run)
Get model predictions from WRTDS using linear interpolation of values in grids
respred(dat_in, ...) ## S3 method for class 'tidal' respred(dat_in, dat_pred = NULL, trace = TRUE, omit = TRUE, ...) ## S3 method for class 'tidalmean' respred(dat_in, dat_pred = NULL, trace = TRUE, omit = TRUE, ...)respred(dat_in, ...) ## S3 method for class 'tidal' respred(dat_in, dat_pred = NULL, trace = TRUE, omit = TRUE, ...) ## S3 method for class 'tidalmean' respred(dat_in, dat_pred = NULL, trace = TRUE, omit = TRUE, ...)
dat_in |
input tidal or tidalmean object |
... |
arguments passed to or from other methods |
dat_pred |
optional data to predict using the interpolation grids in dat_in, defaults to observed data in |
trace |
logical indicating if progress is shown in the console |
omit |
logical indicating if observations in |
This function is used after wrtds to estimate predicted values of the response variable from the interpolation grids. The estimated values are based on a bilinear interpolation of the four predicted response values at two salinity/flow and two date values nearest to the observed salinity/flow and date values to predict.
Data for dat_pred must be a data frame of two columns for date and flow variables (date and numeric objects). The columns must be named 'date' and 'flo'. Values that are outside of the range of data used to fit the model are removed with a warning. It is assumed that the flow variable is not scaled (i.e., raw data) as in a tidal or tidalmean object. The dimensions of the output data are modified to match dat_pred if observations are removed. The omit argument should not equal FALSE and is included only for use with wrtdscv to evaluate folds of the original dataset.
Appends columns to the input data.frame for the predicted values. For tidal objects, columns are named starting with the prefix ‘fit’, e.g., ‘fit0.5’ are the predicted values for the fit through the median. For tidalmean objects, predicted values are appended for the mean model in log-space and the observed values from the back-transformed grids. Columns are named as ‘fits’ and ‘bt_fits’.
## # load a tidal object data(tidobj) # get fitted values for each quantile res <- respred(tidobj) # load a tidalmean object data(tidobjmean) # get predicted values res <- respred(tidobjmean)## # load a tidal object data(tidobj) # get fitted values for each quantile res <- respred(tidobj) # load a tidalmean object data(tidobjmean) # get predicted values res <- respred(tidobjmean)
Get the scale parameters for predicted values of the response variable, only applies to tidalmean objects.
resscls(dat_in, ...) ## S3 method for class 'tidalmean' resscls(dat_in, dat_pred = NULL, ...)resscls(dat_in, ...) ## S3 method for class 'tidalmean' resscls(dat_in, dat_pred = NULL, ...)
dat_in |
input tidalmean object |
... |
arguments passed to or from other methods |
dat_pred |
optional data to predict using the interpolation grids in dat_in, defaults to observed data in |
This function is used after wrtds to get scale parameters for predicted values of the response variable from the interpolation grids. The values are based on a bilinear interpolation of the four predicted response values at two salinity/flow and two date values nearest to the observed salinity/flow and date values to predict.
Appends columns to the data.frame for the associated scale value for the predicted values. A column is appended to the dat_in object, named ‘scls’.
## # load a tidalmean object data(tidobjmean) # get predicted values res <- resscls(tidobjmean)## # load a tidalmean object data(tidobjmean) # get predicted values res <- resscls(tidobjmean)
Sample a daily water quality time series at a set monthly frequency
samp_sim( dat_in, unit = "month", irregular = TRUE, missper = 0, blck = 1, blckper = FALSE )samp_sim( dat_in, unit = "month", irregular = TRUE, missper = 0, blck = 1, blckper = FALSE )
dat_in |
input |
unit |
chr string indicating sampling unit, must be year, quarter, month, week, or yday for equivalent lubridate function |
irregular |
logical indicating if monthly sampling is done randomly within each |
missper |
numeric from 0-1 indicating percentage of observations used for test dataset |
blck |
numeric indicating block size for resampling test dataset, see details |
blckper |
logical indicating if the value passed to |
This function is intended for sampling a simulated daily time series of water quality that is returned by lnres_sim or all_sims.
The missper argument is used to create a test dataset as a proportion of all observations in the sub-sampled output dataset. The test dataset is created with random block sampling appropriate for time series. Block sampling of the output dataset occurs until the number of unique observations is equal to the percentage defined by missper. Overlap of blocks are not doubly considered towards the observation counts to satisfy missper, i.e., sets of continuous observations longer than blck can be returned because of sampling overlap. Setting blck = 1 and blockper = FALSE is completely random sampling for missing data. Values for blck must be 1 or greater if blockper = FALSE and 1 or less if blckper = T. If blck = 1 and blckper = T, the missing data will be one continuous block.
Original data frame with rows subset based on number of desired monthly samples. If missper > 0, a list is returned where the first element is the index values for the test dataset and the second is the complete subsampled dataset.
## Not run: ## example data data(daydat) ## simulate tosamp <- all_sims(daydat) ## sample samp_sim(tosamp) ## sample and create test dataset # test dataset is 30% size of monthly subsample using block sampling with size = 4 samp_sim(tosamp, missper = 0.3, blck = 4) ## End(Not run)## Not run: ## example data data(daydat) ## simulate tosamp <- all_sims(daydat) ## sample samp_sim(tosamp) ## sample and create test dataset # test dataset is 30% size of monthly subsample using block sampling with size = 4 samp_sim(tosamp, missper = 0.3, blck = 4) ## End(Not run)
Plot seasonal trends by combining annual data
seasplot(dat_in, ...) ## S3 method for class 'tidal' seasplot( dat_in, tau = NULL, predicted = TRUE, span = 0.4, lwd = 1, size = 2, alpha = 1, col_vec = NULL, grids = TRUE, logspace = TRUE, ... ) ## S3 method for class 'tidalmean' seasplot( dat_in, predicted = TRUE, span = 0.4, lwd = 1, size = 2, alpha = 1, col_vec = NULL, grids = TRUE, logspace = TRUE, ... )seasplot(dat_in, ...) ## S3 method for class 'tidal' seasplot( dat_in, tau = NULL, predicted = TRUE, span = 0.4, lwd = 1, size = 2, alpha = 1, col_vec = NULL, grids = TRUE, logspace = TRUE, ... ) ## S3 method for class 'tidalmean' seasplot( dat_in, predicted = TRUE, span = 0.4, lwd = 1, size = 2, alpha = 1, col_vec = NULL, grids = TRUE, logspace = TRUE, ... )
dat_in |
Input data object |
... |
arguments passed to other methods |
tau |
numeric of quantile to plot |
predicted |
logical indicating if seasonal smooth is based on model predictions, default |
span |
numeric indicating the smoothing parameter for the loess fit, passed to |
lwd |
numeric value indicating width of lines |
size |
numeric value indicating size of points |
alpha |
numeric value indicating transparency of points or lines |
col_vec |
chr string of plot colors to use, passed to |
grids |
logical indicating if grid lines are present |
logspace |
logical indicating if plots are in log space |
Seasonal variation across all years can be viewed by showing the observed annual data on a common y-axis. The year value is removed from the results such that the y-axis shows only the day of the year. A simple loess (locally estimated) polynomial smooth is added to show the seasonal trend in the results, where the smoother is fit through the model results for the observed data. The fit can be smoothed through the model predictions or the flow-normalized predictions, neither of which are shown on the plot.
dynaplot, fitmoplot, gridplot, and sliceplot produce similar graphics except variation in the same month across years is emphasized.
# load a fitted tidal object data(tidfit) # plot using defaults # defaults to all quantiles for tidal object seasplot(tidfit) # tidalmean object seasplot(tidfitmean)# load a fitted tidal object data(tidfit) # plot using defaults # defaults to all quantiles for tidal object seasplot(tidfit) # tidalmean object seasplot(tidfitmean)
Plot seasonal model response by years on a common axis
seasyrplot(dat_in, ...) ## S3 method for class 'tidal' seasyrplot( dat_in, years = NULL, tau = NULL, predicted = TRUE, logspace = TRUE, col_vec = NULL, grids = TRUE, pretty = TRUE, lwd = 0.5, alpha = 1, ... ) ## S3 method for class 'tidalmean' seasyrplot( dat_in, years = NULL, tau = NULL, predicted = TRUE, logspace = TRUE, col_vec = NULL, grids = TRUE, pretty = TRUE, lwd = 0.5, alpha = 1, ... )seasyrplot(dat_in, ...) ## S3 method for class 'tidal' seasyrplot( dat_in, years = NULL, tau = NULL, predicted = TRUE, logspace = TRUE, col_vec = NULL, grids = TRUE, pretty = TRUE, lwd = 0.5, alpha = 1, ... ) ## S3 method for class 'tidalmean' seasyrplot( dat_in, years = NULL, tau = NULL, predicted = TRUE, logspace = TRUE, col_vec = NULL, grids = TRUE, pretty = TRUE, lwd = 0.5, alpha = 1, ... )
dat_in |
input tidal or tidalmean object |
... |
arguments passed to other methods |
years |
numeric vector of years to plot |
tau |
numeric vector of quantiles to plot, defaults to all in object if not supplied |
predicted |
logical indicating if standard predicted values are plotted, default |
logspace |
logical indicating if plots are in log space |
col_vec |
chr string of plot colors to use, passed to |
grids |
logical indicating if grid lines are present |
pretty |
logical indicating if my subjective idea of plot aesthetics is applied, otherwise the |
lwd |
numeric value indicating width of lines |
alpha |
numeric value indicating transparency of points or lines |
The plot is similar to that produced by seasplot except the model estimates are plotted for each year as connected lines, as compared to loess lines fit to the model results. seasyrplot is also similar to sliceplot except the x-axis and legend grouping variable are flipped. This is useful for evaluating between-year differences in seasonal trends.
Multiple predictions per month are averaged for a smoother plot.
Note that the year variable used for color mapping is treated as a continuous variable although it is an integer by definition.
A ggplot object that can be further modified
## load a fitted tidal object data(tidfit) # plot using defaults seasyrplot(tidfit) # get the same plot but use default ggplot settings seasyrplot(tidfit, pretty = FALSE) # plot specific quantiles seasyrplot(tidfit, tau = c(0.9)) # plot the normalized predictions seasyrplot(tidfit, predicted = FALSE) # modify the plot as needed using ggplot scales, etc. library(ggplot2) seasyrplot(tidfit, pretty = FALSE, linetype = 'dashed') + theme_classic() + scale_y_continuous( 'Chlorophyll', limits = c(0, 5) ) # plot a tidalmean object data(tidfitmean) seasyrplot(tidfitmean)## load a fitted tidal object data(tidfit) # plot using defaults seasyrplot(tidfit) # get the same plot but use default ggplot settings seasyrplot(tidfit, pretty = FALSE) # plot specific quantiles seasyrplot(tidfit, tau = c(0.9)) # plot the normalized predictions seasyrplot(tidfit, predicted = FALSE) # modify the plot as needed using ggplot scales, etc. library(ggplot2) seasyrplot(tidfit, pretty = FALSE, linetype = 'dashed') + theme_classic() + scale_y_continuous( 'Chlorophyll', limits = c(0, 5) ) # plot a tidalmean object data(tidfitmean) seasyrplot(tidfitmean)
Plot time slices within a tidal object to view response variable observations, predictions, and normalized results at regular annual intervals.
sliceplot(dat_in, ...) ## S3 method for class 'tidal' sliceplot( dat_in, slices = c(1, 7), tau = NULL, dt_rng = NULL, col_vec = NULL, predicted = TRUE, logspace = TRUE, grids = TRUE, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... ) ## S3 method for class 'tidalmean' sliceplot( dat_in, slices = c(1, 7), predicted = TRUE, dt_rng = NULL, col_vec = NULL, logspace = TRUE, grids = TRUE, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... )sliceplot(dat_in, ...) ## S3 method for class 'tidal' sliceplot( dat_in, slices = c(1, 7), tau = NULL, dt_rng = NULL, col_vec = NULL, predicted = TRUE, logspace = TRUE, grids = TRUE, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... ) ## S3 method for class 'tidalmean' sliceplot( dat_in, slices = c(1, 7), predicted = TRUE, dt_rng = NULL, col_vec = NULL, logspace = TRUE, grids = TRUE, pretty = TRUE, lwd = 1, size = 2, alpha = 1, ... )
dat_in |
input tidal or tidalmean object |
... |
arguments passed to other methods |
slices |
numeric vector of calender months to plot, i.e., 1 - 12 |
tau |
numeric vector of quantile to plot. The function will plot the 'middle' quantile if none is specified, e.g., if 0.2, 0.3, and 0.4 are present in the fitted model object then 0.3 will be plotted. |
dt_rng |
Optional chr string indicating the date range of the plot. Must be two values in the format 'YYYY-mm-dd' which is passed to |
col_vec |
chr string of plot colors to use, passed to |
predicted |
logical indicating if standard predicted values are plotted, default |
logspace |
logical indicating if plots are in log space |
grids |
logical indicating if grid lines are present |
pretty |
logical indicating if my subjective idea of plot aesthetics is applied, otherwise the |
lwd |
numeric value indicating width of lines |
size |
numeric value indicating size of points |
alpha |
numeric value indicating transparency of points or lines |
This is a modification of fitplot that can be used to plot selected time slices from the results of a fitted tidal object. For example, all results for a particular month across all years can be viewed. This is useful for evaluating between-year differences in results for constant season. Only one quantile fit can be shown per plot because the grouping variable is mapped to the slices.
A ggplot object that can be further modified
## load a fitted tidal object data(tidfit) # plot using defaults sliceplot(tidfit) # get different months - march and september sliceplot(tidfit, slices = c(3, 9)) # normalized predictions, 10th percentile sliceplot(tidfit, tau = 0.1, predicted = FALSE) # normalized values all months, change line aesthetics, log-space, 90th # add title library(ggplot2) sliceplot(tidfit, slices = 1:12, size = 1.5, tau = 0.9, alpha = 0.6, predicted = FALSE, logspace = TRUE ) + ggtitle('Normalized predictions for all months, 90th percentile') ## plot a tidalmean object data(tidfitmean) sliceplot(tidfitmean)## load a fitted tidal object data(tidfit) # plot using defaults sliceplot(tidfit) # get different months - march and september sliceplot(tidfit, slices = c(3, 9)) # normalized predictions, 10th percentile sliceplot(tidfit, tau = 0.1, predicted = FALSE) # normalized values all months, change line aesthetics, log-space, 90th # add title library(ggplot2) sliceplot(tidfit, slices = 1:12, size = 1.5, tau = 0.9, alpha = 0.6, predicted = FALSE, logspace = TRUE ) + ggtitle('Normalized predictions for all months, 90th percentile') ## plot a tidalmean object data(tidfitmean) sliceplot(tidfitmean)
Prepare water quality data for weighted regression by creating a tidal class object
tidal( dat_in, ind = c(1, 2, 3, 4), reslab = NULL, flolab = NULL, reslog = TRUE, rm_miss = FALSE, ... )tidal( dat_in, ind = c(1, 2, 3, 4), reslab = NULL, flolab = NULL, reslog = TRUE, rm_miss = FALSE, ... )
dat_in |
Input data frame for a water quality time series with four columns for date (Y-m-d format), response variable, salinity/flow, and detection limit for left-censored data |
ind |
four element numeric vector indicating column positions of date, response variable, salinity/flow, and detection limit of input data frame |
reslab |
character string or expression for labelling the response variable in plots, defaults to log-chlorophyll in ug/L |
flolab |
character string or expression for labelling the flow variable in plots, defaults to Salinity |
reslog |
logical indicating if the response variable is already in log-space, default |
rm_miss |
logical indicating if missing observations in the input data are removed |
... |
arguments passed from other methods |
This function is a simple wrapper to structure that is used to create a tidal object for use with weighted regression in tidal waters. Input data should be a four-column data.frame with date, response variable, salinity/flow data, and detection limit for each observation of the response. The response variable is assumed to be log-transformed, otherwise use reslog = FALSE. Salinity data can be provided as fraction of freshwater or as parts per thousand. The limit column can be entered as a sufficiently small number if all values are above the detection limit or no limit exists. The current implementation of weighted regression for tidal waters only handles left-censored data. Missing observations are also removed.
A tidal object as a data frame and attributes. The data frame has columns ordered as date, response variable, salinity/flow (rescaled to 0, 1 range), detection limit, logical for detection limit, day number, month, year, and decimal time. The attributes are as follows:
namesColumn names of the data frame
row.namesRow names of the data frame
classClass of the object
half_winsList of numeric values used for half-window widths for model fitting, in the same order as the wt_vars argument passed to getwts. Initially will be NULL if wrtds has not been used.
fitsList of matrices with fits for the WRTDS interpolation grid, defaults to one list for the median quantile. Initially will be NULL if wrtds has not been used.
predonobsA data.frame of predictions using the observed data that were used to fit the model. This is required for wrtdsperf if a novel dataset is used for predictions after fitting the model. Initially will be NULL if respred has not been used.
flo_grdNumeric vector of salinity/flow values that was used for the interpolation grids
floobs_rngTwo element vector indicating the salinity/flow range of the observed data
nobsList with one matrix showing the number of weights greater than zero for each date and salinity/flow combination used to create the fit matrices in fits. Number of observations are the same for each quantile model. Initially will be NULL if wrtds has not been used.
reslabexpression or character string for response variable label in plots
flolabexpression or character string for flow variable label in plots
## raw data data(chldat) ## format chldat <- tidal(chldat)## raw data data(chldat) ## format chldat <- tidal(chldat)
Prepare water quality data for weighted regression for the mean response by creating a tidalmean class object
tidalmean( dat_in, ind = c(1, 2, 3, 4), reslab = NULL, flolab = NULL, reslog = TRUE, rm_miss = FALSE, ... )tidalmean( dat_in, ind = c(1, 2, 3, 4), reslab = NULL, flolab = NULL, reslog = TRUE, rm_miss = FALSE, ... )
dat_in |
Input data frame for a water quality time series with four columns for date (Y-m-d format), response variable, salinity/flow, and detection limit for left-censored data |
ind |
four element numeric vector indicating column positions of date, response variable, salinity/flow, and detection limit of input data frame |
reslab |
character string or expression for labelling the response variable in plots, defaults to log-chlorophyll in ug/L |
flolab |
character string or expression for labelling the flow variable in plots, defaults to Salinity |
reslog |
logical indicating if input response variable is already in log-space, default |
rm_miss |
logical indicating if missing observations in the input data are removed |
... |
arguments passed from other methods |
This function is a simple wrapper to structure that is used to create a tidalmean object for use with weighted regression in tidal waters, specifically to model the mean response as compared to a conditional quantile. Input data should be a four-column data.frame with date, response variable, salinity/flow data, and detection limit for each observation of the response. The response data are assumed to be log-transformed, otherwise use reslog = FALSE. Salinity data can be provided as fraction of freshwater or as parts per thousand. The limit column can be entered as a sufficiently small number if all values are above the detection limit or no limit exists. The current implementation of weighted regression for tidal waters only handles left-censored data. Missing observations are also removed.
The tidalmean object structure is almost identical to the tidal object, with the exception of an additional attribute for the back-transformed interpolation grid. This is included to account for retransformation bias of log-transformed variables associated with mean models.
A tidalmean object as a data frame and attributes. The data frame has columns ordered as date, response variable, salinity/flow (rescaled to 0, 1 range), detection limit, logical for detection limit, day number, month, year, and decimal time. The attributes are as follows:
namesColumn names of the data frame
row.namesRow names of the data frame
classClass of the object
half_winsList of numeric values used for half-window widths for model fitting, in the same order as the wt_vars argument passed to getwts. Initially will be NULL if wrtds has not been used.
fitsList with a single element with fits for the WRTDS mean interpolation grid. Initially will be NULL if wrtds has not been used.
predonobsA data.frame of predictions using the observed data that were used to fit the model. This is required for wrtdsperf if a novel dataset is used for predictions after fitting the model. Initially will be NULL if respred has not been used.
bt_fitsList with a single element with back-transformed fits for the WRTDS mean interpolation grid. Initially will be NULL if wrtds has not been used.
flo_grdNumeric vector of salinity/flow values that was used for the interpolation grids
floobs_rngTwo element vector indicating the salinity/flow range of the observed data
nobsList with one matrix showing the number of weights greater than zero for each date and salinity/flow combination used to create the fit matrices in fits. Initially will be NULL if wrtds has not been used.
reslabexpression or character string for response variable label in plots
flolabexpression or character string for flow variable label in plots
## raw data data(chldat) ## format chldat <- tidalmean(chldat)## raw data data(chldat) ## format chldat <- tidalmean(chldat)
An identical object as tidobj with the addition of chlorophyll predictions and normalized estimates after running respred and resnorm.
tidfittidfit
A tidal and data.frame object with 156 rows and 15 variables:
dateDate
resnumeric
flonumeric
limnumeric
not_censlogical
day_numnumeric
monthnumeric
yearnumeric
dec_timenumeric
fit0.1numeric
fit0.5numeric
fit0.9numeric
norm0.1numeric
norm0.5numeric
norm0.9numeric
tidal for full list of attributes in tidal objects, wrtds for creating the fits interpolation grids, and respred and resnorm for interpolating predicted and normalized values from the grids.
An identical object as tidobjmean with the addition of predicted and normalized chlorophyll (log-space and back-transformed) after running respred and resnorm.
tidfitmeantidfitmean
A tidalmean and data.frame object with 156 rows and 11 variables:
dateDate
resnumeric
flonumeric
limnumeric
not_censlogical
day_numnumeric
monthnumeric
yearnumeric
dec_timenumeric
fitsnumeric
bt_fitsnumeric
normnumeric
bt_normsnumeric
tidalmean for full list of attributes in tidalmean objects, wrtds for creating the fits, bt_fits, and scls interpolation grids in the attributes, and respred and resnorm for interpolating predicted and normalized values from the grids.
Monthly chlorophyll time series for the Hillsborough Bay segment of Tampa Bay as a tidal object. Raw data are those in chldat with the addition of further processing using the tidal and wrtds functions. Model predictions are obtained for the tenth, median, and ninetieth conditional quantile of chlorophyll. The object also inherits methods from the data.frame class. The processed data includes columns for date, chlorophyll-a (res, in log-space), salinity as fraction of freshwater (flo, i.e., 0 - 1, with higher values indicating more freshwater), the detection limit for all stations for the respective date, a logical column indicating if the observed chlorophyll is at or below the detection limit, the date as decimal time minus the year, the month from 1 to 12, the year, and total decimal time. Attributes include column names, row names, class of the object, matrices of interpolation grids from the weighted regression for each conditional quantile, and vector of salinity values that were used to create the interpolation grids.
tidobjtidobj
A tidal and data.frame object with 156 rows and 9 variables:
dateDate
resnumeric
flonumeric
limnumeric
not_censlogical
day_numnumeric
monthnumeric
yearnumeric
dec_timenumeric
tidal for full list of attributes in tidal objects and wrtds for creating the fits interpolation grids.
Monthly chlorophyll time series for the Hillsborough Bay segment of Tampa Bay as a tidal object. Raw data are those in chldat with the addition of further processing using the tidalmean and wrtds functions. Model predictions are obtained using weighted regression for the conditional mean of chlorophyll. The object also inherits methods from the data.frame class. The processed data includes columns for date, chlorophyll-a (res, in log-space), salinity as fraction of freshwater (flo, i.e., 0 - 1, with higher values indicating more freshwater), the detection limit for all stations for the respective date, a logical column indicating if the observed chlorophyll is at or below the detection limit, the date as decimal time minus the year, the month from 1 to 12, the year, and total decimal time. Attributes include column names, row names, class of the object, an interpolation grid from the weighted regression for the mean response, a back-transformed interpolation grid from the weighted regression for the mean response, and vector of salinity values that were used to create the interpolation grids.
tidobjmeantidobjmean
A tidalmean and data.frame object with 156 rows and 9 variables:
dateDate
resnumeric
flonumeric
limnumeric
not_censlogical
day_numnumeric
monthnumeric
yearnumeric
dec_timenumeric
tidalmean for full list of attributes in tidalmean objects and wrtds for creating the fits, bt_fits, and scls interpolation grids.
Find the optimal half-window width combination to use for weighted regression. This differs from winsrch_optim by using constrOptim
winsrch_constrOptim(dat_in, ...) ## Default S3 method: winsrch_constrOptim( dat_in, wins_in = NULL, control = list(), lower = c(0.1, 1, 0.1), upper = c(2, 15, 2), ... )winsrch_constrOptim(dat_in, ...) ## Default S3 method: winsrch_constrOptim( dat_in, wins_in = NULL, control = list(), lower = c(0.1, 1, 0.1), upper = c(2, 15, 2), ... )
dat_in |
input data object to use with weighted regression |
... |
|
wins_in |
starting list of window weights for initializing the search algorithm |
control |
A list of control parameters passed to |
lower |
vector of minimum half-window widths to evaluate |
upper |
vector of maximum half-window widths to evaluate |
This function uses optim to minimize the error returned by wrtdscv for a given window combination. The search algorithm uses the limited-memory modification of the BFGS quasi-Newton method to impose upper and lower limits on the optimization search. These limits can be changed using the lower and upper arguments.
Some stuff
## Not run: # setup parallel backend library(doParallel) ncores <- detectCores() - 1 registerDoParallel(cores = ncores) # run search function - takes a while res <- winsrch_optim(tidobjmean) ## End(Not run)## Not run: # setup parallel backend library(doParallel) ncores <- detectCores() - 1 registerDoParallel(cores = ncores) # run search function - takes a while res <- winsrch_optim(tidobjmean) ## End(Not run)
Evaluate a grid of half-window width combinations to use for weighted regression
winsrch_grid(dat_in, ...) ## Default S3 method: winsrch_grid(dat_in, grid_in = NULL, ...)winsrch_grid(dat_in, ...) ## Default S3 method: winsrch_grid(dat_in, grid_in = NULL, ...)
dat_in |
input data object to use with weighted regression |
... |
arguments passed to or from other methods |
grid_in |
optional input matrix of half-window widths created with |
Processing time can be reduced by setting up a parallel backend, as in the examples. Note that this is not effective for small k-values (e.g., < 4) because each fold is sent to a processor, whereas the window width combinations in grid_in are evaluated in sequence.
This function should only be used to view the error surface associated with finite combinations of window-width combinations. A faster function to identify the optimal window widths is provided by winsrch_optim.
A data frame of the search grid with associated errors for each cross-validation result. Errors for each grid row are averages of all errors for each fold used in cross-validation.
createsrch, wrtdscv, winsrch_optim
## Not run: ## # setup parallel backend library(doParallel) ncores <- detectCores() - 2 registerDoParallel(cores = ncores) # run search function using default search grid - takes a while res <- winsrch_grid(tidobjmean) # view the error surface library(ggplot2) ggplot(res, aes(x = factor(mos), y = factor(yrs), fill = err)) + geom_tile() + facet_wrap(~ flo) + scale_x_discrete(expand = c(0, 0)) + scale_y_discrete(expand = c(0,0)) + scale_fill_gradientn(colours = gradcols()) # optimal combo res[which.min(res$err), ] ## # create a custom search grid, e.g. years only grid_in <- createsrch(mos = 1, yrs = seq(1, 10), flo = 1) res <- winsrch_grid(tidobjmean, grid_in) ## End(Not run)## Not run: ## # setup parallel backend library(doParallel) ncores <- detectCores() - 2 registerDoParallel(cores = ncores) # run search function using default search grid - takes a while res <- winsrch_grid(tidobjmean) # view the error surface library(ggplot2) ggplot(res, aes(x = factor(mos), y = factor(yrs), fill = err)) + geom_tile() + facet_wrap(~ flo) + scale_x_discrete(expand = c(0, 0)) + scale_y_discrete(expand = c(0,0)) + scale_fill_gradientn(colours = gradcols()) # optimal combo res[which.min(res$err), ] ## # create a custom search grid, e.g. years only grid_in <- createsrch(mos = 1, yrs = seq(1, 10), flo = 1) res <- winsrch_grid(tidobjmean, grid_in) ## End(Not run)
Find the optimal half-window width combination to use for weighted regression.
winsrch_optim(dat_in, ...) ## Default S3 method: winsrch_optim( dat_in, wins_in = NULL, control = list(factr = 1e+07, parscale = c(1, 10, 1)), lower = c(0.1, 1, 0.1), upper = c(2, 15, 2), ... )winsrch_optim(dat_in, ...) ## Default S3 method: winsrch_optim( dat_in, wins_in = NULL, control = list(factr = 1e+07, parscale = c(1, 10, 1)), lower = c(0.1, 1, 0.1), upper = c(2, 15, 2), ... )
dat_in |
input data object to use with weighted regression |
... |
|
wins_in |
starting list of window weights for initializing the search algorithm |
control |
A list of control parameters passed to |
lower |
vector of minimum half-window widths to evaluate |
upper |
vector of maximum half-window widths to evaluate |
This function uses optim to minimize the error returned by wrtdscv for a given window combination. The search algorithm uses the limited-memory modification of the BFGS quasi-Newton method to impose upper and lower limits on the optimization search. These limits can be changed using the lower and upper arguments.
Some stuff
## Not run: # setup parallel backend library(doParallel) ncores <- detectCores() - 1 registerDoParallel(cores = ncores) # run search function - takes a while res <- winsrch_optim(tidobjmean) ## End(Not run)## Not run: # setup parallel backend library(doParallel) ncores <- detectCores() - 1 registerDoParallel(cores = ncores) # run search function - takes a while res <- winsrch_optim(tidobjmean) ## End(Not run)
Get WRTDS prediction grid for observations of the response variable in a tidal or tidalmean object
wrtds(dat_in, ...) ## S3 method for class 'tidal' wrtds(dat_in, flo_div = 10, tau = 0.5, trace = TRUE, fill_empty = FALSE, ...) ## S3 method for class 'tidalmean' wrtds(dat_in, flo_div = 10, fill_empty = FALSE, trace = TRUE, ...)wrtds(dat_in, ...) ## S3 method for class 'tidal' wrtds(dat_in, flo_div = 10, tau = 0.5, trace = TRUE, fill_empty = FALSE, ...) ## S3 method for class 'tidalmean' wrtds(dat_in, flo_div = 10, fill_empty = FALSE, trace = TRUE, ...)
dat_in |
input tidal or tidalmean object |
... |
arguments passed to or from other methods |
flo_div |
numeric indicating number of divisions across the range of salinity/flow to create the interpolation grid |
tau |
numeric vector indicating conditional quantiles to fit in the weighted regression, can be many |
trace |
logical indicating if progress is shown in the console |
fill_empty |
logical to fill missing values in interpolation grid using bilinear interpolation by season, see details |
Appends interpolation grid attributes to the input object. For a tidal object, this could include multiple grids for each quantile. For tidalmean objects, only one grid is appended to the ‘fits’ attribute, in addition to a back-transformed grid as the ‘bt_fits’ attribute and a grid of the scale parameter of each prediction as the ‘scls’ attribute. Grid rows correspond to the dates in the input data.
The fill_empty arguments uses bilinear interpolation of time by flow to fill missing data in the interpolation grids. The grids are subset by month before interpolating to retain the seasonal variation captured by the models. In general, this argument should not be used if more than ten percent of the interpolation grids are missing data. It may be helpful to improve visual appearance of some of the plotting results.
## Not run: ## load data data(chldat) ## as tidal object dat_in <- tidal(chldat) res <- wrtds(dat_in) ## as tidalmean object dat_in <- tidalmean(chldat) res <- wrtds(dat_in) ## multiple quantiles res <- wrtds(dat_in, tau = c(0.1, 0.5, 0.9)) ## End(Not run)## Not run: ## load data data(chldat) ## as tidal object dat_in <- tidal(chldat) res <- wrtds(dat_in) ## as tidalmean object dat_in <- tidalmean(chldat) res <- wrtds(dat_in) ## multiple quantiles res <- wrtds(dat_in, tau = c(0.1, 0.5, 0.9)) ## End(Not run)
Use k-fold cross-validation to evaluate WRTDS model fit based on supplied half-window widths.
wrtdscv(dat_in, ...) ## Default S3 method: wrtdscv(dat_in, wins, k = 10, seed_val = 123, trace = TRUE, ...)wrtdscv(dat_in, ...) ## Default S3 method: wrtdscv(dat_in, wins, k = 10, seed_val = 123, trace = TRUE, ...)
dat_in |
input tidal or tidalmean object |
... |
arguments passed to |
wins |
list of input half-window widths of the order months, years, and salinity/flow, passed to |
k |
number of folds to evaluate |
seed_val |
seed to keep the same dataset divisions between window width comparisons |
trace |
logical indicating if progress is printed in the console |
Default number of folds is ten. Each fold can be evaluated with multiple cores if a parallel back end is created prior to running the function (see the examples). This will greatly increase processing speed unless k is set to a small number.
Overall error is the average of all errors for each fold.
getwts, wtsplot, winsrch_grid, winsrch_optim
## Not run: library(doParallel) ncores <- detectCores() - 1 registerDoParallel(cores = ncores) # half-window widths to evaluate # months, years, and salinity/flow wins <- list(0.5, 10, 0.5) # get ocv score for k = 10 wrtdscv(tidobjmean, wins = wins) # get ocv score k = 2, tau = 0.2 wrtdscv(tidobj, wins = wins, tau = 0.2) ## End(Not run)## Not run: library(doParallel) ncores <- detectCores() - 1 registerDoParallel(cores = ncores) # half-window widths to evaluate # months, years, and salinity/flow wins <- list(0.5, 10, 0.5) # get ocv score for k = 10 wrtdscv(tidobjmean, wins = wins) # get ocv score k = 2, tau = 0.2 wrtdscv(tidobj, wins = wins, tau = 0.2) ## End(Not run)
Get WRTDS performance metrics including goodness of fit, root mean square error, and normalized mean square error.
wrtdsperf(dat_in, ...) ## S3 method for class 'tidal' wrtdsperf(dat_in, logspace = TRUE, ...) ## S3 method for class 'tidalmean' wrtdsperf(dat_in, logspace = TRUE, ...)wrtdsperf(dat_in, ...) ## S3 method for class 'tidal' wrtdsperf(dat_in, logspace = TRUE, ...) ## S3 method for class 'tidalmean' wrtdsperf(dat_in, logspace = TRUE, ...)
dat_in |
input tidal object which must already have fitted model data |
... |
arguments passed to or from other methods |
logspace |
logical if performance metrics use back-transformed residuals |
Goodness of fit is calculated using the goodfit function for quantile regression described in Koenker and Mochado 1999. Root mean square error is based on square root of the mean of the squared residuals. Normalized mean square error described in Gershenfeld and Weigend 1993 is the sum of the squared errors divided by the sum of the non-conditional errors (i.e., sum of the squared values of the observed minus the mean of the observed). This measure allows comparability of error values for data with different ranges, although the interpretation for quantile models is not clear. The value is provided as a means of comparison for WRTDS models created from the same data set but with different window widths during model fitting.
Performance metrics are only valid for observations and model residuals in log-space. Metrics also only apply to the data used to fit the model, i.e., performance will not be evaluated for novel data if the dat_pred argument was used with respred.
A data.frame with the metrics for each quantile model
Gershenfeld, N.A., Weigend, A.S. 1993. The future of time series: learning and understanding. In: Weigend, A.S., Gershenfeld, N.A. (eds). Time Series Prediction: Forecasting the Future and Understanding the Past., second ed. Addison-Wesley, Santa Fe, New Mexico. pp. 1-70.
Koenker, R., Machado, J.A.F. 1999. Goodness of fit and related inference processes for quantile regression. Journal of the American Statistical Association. 94(448):1296-1310.
wrtdsrsd for residuals, goodfit
## load a fitted model object data(tidfit) ## get performance metrics wrtdsperf(tidfit)## load a fitted model object data(tidfit) ## get performance metrics wrtdsperf(tidfit)
Get WRTDS residuals for each quantile model. These are used to estimate goodness of fit of the model predictions.
wrtdsrsd(dat_in, ...) ## S3 method for class 'tidal' wrtdsrsd(dat_in, trace = TRUE, ...) ## S3 method for class 'tidalmean' wrtdsrsd(dat_in, trace = TRUE, ...)wrtdsrsd(dat_in, ...) ## S3 method for class 'tidal' wrtdsrsd(dat_in, trace = TRUE, ...) ## S3 method for class 'tidalmean' wrtdsrsd(dat_in, trace = TRUE, ...)
dat_in |
input tidal object which must already have fitted model data |
... |
arguments passed to or from other methods |
trace |
logical indicating if progress is shown in the console |
Columns are added to the data of the tidal object for residuals and non-conditional residuals. Both are required to assess the goodness of fit measure described for quantile regression in Koenker and Machado (1999).
A tidal object with columns added to the predonobs attribute for the residuals ('rsd') and non-conditional residuals ('rsdnl') of each quantile model or a tidalmean object with columns added to the predonobs attribute for the residuals ('rsd') and back-transformed residuals ('bt_rsd').
Koenker, R., Machado, J.A.F. 1999. Goodness of fit and related inference processes for quantile regression. Journal of the American Statistical Association. 94(448):1296-1310.
## load a fitted model object data(tidfit) ## run the function res <- wrtdsrsd(tidfit) head(res)## load a fitted model object data(tidfit) ## run the function res <- wrtdsrsd(tidfit) head(res)
Get WRTDS trends for annual and monthly groupings
wrtdstrnd(dat_in, ...) ## Default S3 method: wrtdstrnd( dat_in, mobrks, yrbrks, molabs, yrlabs, aves = FALSE, mo_strt = 10, min_mo = 9, ... ) ## S3 method for class 'tidal' wrtdstrnd( dat_in, mobrks, yrbrks, molabs, yrlabs, tau = NULL, aves = FALSE, mo_strt = 10, min_mo = 9, ... ) ## S3 method for class 'tidalmean' wrtdstrnd( dat_in, mobrks, yrbrks, molabs, yrlabs, aves = FALSE, mo_strt = 10, min_mo = 9, ... )wrtdstrnd(dat_in, ...) ## Default S3 method: wrtdstrnd( dat_in, mobrks, yrbrks, molabs, yrlabs, aves = FALSE, mo_strt = 10, min_mo = 9, ... ) ## S3 method for class 'tidal' wrtdstrnd( dat_in, mobrks, yrbrks, molabs, yrlabs, tau = NULL, aves = FALSE, mo_strt = 10, min_mo = 9, ... ) ## S3 method for class 'tidalmean' wrtdstrnd( dat_in, mobrks, yrbrks, molabs, yrlabs, aves = FALSE, mo_strt = 10, min_mo = 9, ... )
dat_in |
input tidal or tidalmean object which must already have fitted model data |
... |
methods passed to or from other methods |
mobrks |
list of month groupings where each month is an integer from 1 to 12, see examples |
yrbrks |
numeric vector of breaks for years, see examples |
molabs |
character vector of names for month breaks, see examples |
yrlabs |
character vector of names for year breaks, see examples |
aves |
logical if averages within each period are also returned |
mo_strt |
numeric indicating month to start aggregation years for annual trends, defaults to October for USGS water year from October to September, passed to |
min_mo |
numeric value from one to twelve indicating the minimum number of months with observations for averaging by years, passed to |
tau |
numeric vector of quantile for estimating trends |
Trends are reported as percent changes of annual averages from the beginning to the end of each period. To reduce the effects of odd years at the beginning and end of each period, percent changes are based on an average of the first three and last three annual averages. For example, percent changes for January throughout an an entire time series from 1980 to 2000 would be the change of the average from January in 1980-1982 to the average from January in 1998-2000. Annual trends, e.g., percent changes from 1980-1986, 1987-1993, etc. do not average by the first and last three years in each grouping because the values are already based on annual averages as returned by annual_agg.
Note that the default minimum number of months argument (min_mo) may not be appropriate for all cases. Annual estimates should first be evaluated with prdnrmplot to verify that odd years with missing months are not driving results for the annual percent changes.
Averages in each period can be returned if aves = TRUE. These averages are based on annual averages within each period for congruency with the trend estimates.
All trends are based on back-transformed, flow-normalized results.
The user must supply the annual and monthly aggregation periods to the appropriate arguments. These are passed to cut and are left-open, right-closed along the interval.
A data.frame with summary trends for each grouping
## load a fitted model object data(tidfit) data(tidfitmean) ## get trends # setup month, year categories mobrks <- list(c(1, 2, 3), c(4, 5, 6), c(7, 8, 9), c(10, 11, 12)) yrbrks <- c(1973, 1985, 1994, 2003, 2012) molabs <- c('JFM', 'AMJ', 'JAS', 'OND') yrlabs <- c('1974-1985', '1986-1994', '1995-2003', '2004-2012') wrtdstrnd(tidfit, mobrks, yrbrks, molabs, yrlabs) wrtdstrnd(tidfitmean, mobrks, yrbrks, molabs, yrlabs) # get averages in each period wrtdstrnd(tidfit, mobrks, yrbrks, molabs, yrlabs, aves = TRUE)## load a fitted model object data(tidfit) data(tidfitmean) ## get trends # setup month, year categories mobrks <- list(c(1, 2, 3), c(4, 5, 6), c(7, 8, 9), c(10, 11, 12)) yrbrks <- c(1973, 1985, 1994, 2003, 2012) molabs <- c('JFM', 'AMJ', 'JAS', 'OND') yrlabs <- c('1974-1985', '1986-1994', '1995-2003', '2004-2012') wrtdstrnd(tidfit, mobrks, yrbrks, molabs, yrlabs) wrtdstrnd(tidfitmean, mobrks, yrbrks, molabs, yrlabs) # get averages in each period wrtdstrnd(tidfit, mobrks, yrbrks, molabs, yrlabs, aves = TRUE)
Get WRTDS trends using seasonal Kendall tests
wrtdstrnd_sk(dat_in, ...) ## Default S3 method: wrtdstrnd_sk(dat_in, mobrks, yrbrks, molabs, yrlabs, ...) ## S3 method for class 'tidal' wrtdstrnd_sk( dat_in, mobrks, yrbrks, molabs, yrlabs, tau = NULL, trndvar = "norm", ... ) ## S3 method for class 'tidalmean' wrtdstrnd_sk(dat_in, mobrks, yrbrks, molabs, yrlabs, trndvar = "bt_norm", ...)wrtdstrnd_sk(dat_in, ...) ## Default S3 method: wrtdstrnd_sk(dat_in, mobrks, yrbrks, molabs, yrlabs, ...) ## S3 method for class 'tidal' wrtdstrnd_sk( dat_in, mobrks, yrbrks, molabs, yrlabs, tau = NULL, trndvar = "norm", ... ) ## S3 method for class 'tidalmean' wrtdstrnd_sk(dat_in, mobrks, yrbrks, molabs, yrlabs, trndvar = "bt_norm", ...)
dat_in |
input tidal or tidalmean object which must already have fitted model data |
... |
methods passed to or from other methods |
mobrks |
list of month groupings where each month is an integer from 1 to 12, see examples |
yrbrks |
numeric vector of breaks for years, see examples |
molabs |
character vector of names for month breaks, see examples |
yrlabs |
character vector of names for year breaks, see examples |
tau |
numeric vector of quantile for estimating trends |
trndvar |
chr string of variable for trend evaluation, usually back-transformed, flow-normalized results, see details |
Trends are based on kendallSeasonalTrendTest for user-specified time periods. In general, the seasonal Kendall test evaluates monotonic trends using a non-parametric approach that accounts for seasonal variation in the time series.
All trends are based on back-transformed, flow-normalized results by default. The variable for evaluating trends can be changed with 'trndvar' as 'res', 'norm', or 'fit' for tidal objects and as 'res', 'bt_norm', or 'bt_fits' for tidalmean objects. In all cases, back-transformed variables are evaluated.
The user must supply the annual and monthly aggregation periods to the appropriate arguments. These are passed to cut and are left-open, right-closed along the interval.
A data.frame with summary trends for each grouping, including med as the median value for the period of observation, tau as the magnitude and direction of the trend, slope as the Thiel-Sen slope for change per year, chitest as the significance test evaluating heterogeneity between seasons, ztest indicating significance of the overall trend, and perchg as 100 multiplied by the ratio of the annual slope to the median estimate of the time period (percent change per year).
As noted in kendallSeasonalTrendTest, the overall test is not appropriate if chitest indicates a small p-value.
Hirsch, R.M., Slack, J.R., Smith, R.A. 1982. Techniques of trend analysis for monthly water quality data. Water Resources Research, 18:107-121.
Millard, S. P. 2013. EnvStats: An R Package for Environmental Statistics. Springer, New York.
## load a fitted model object data(tidfit) data(tidfitmean) ## get trends # setup month, year categories mobrks <- list(c(1, 2, 3), c(4, 5, 6), c(7, 8, 9), c(10, 11, 12)) yrbrks <- c(1973, 1985, 1994, 2003, 2012) molabs <- c('JFM', 'AMJ', 'JAS', 'OND') yrlabs <- c('1974-1985', '1986-1994', '1995-2003', '2004-2012') wrtdstrnd_sk(tidfit, mobrks, yrbrks, molabs, yrlabs) wrtdstrnd_sk(tidfitmean, mobrks, yrbrks, molabs, yrlabs)## load a fitted model object data(tidfit) data(tidfitmean) ## get trends # setup month, year categories mobrks <- list(c(1, 2, 3), c(4, 5, 6), c(7, 8, 9), c(10, 11, 12)) yrbrks <- c(1973, 1985, 1994, 2003, 2012) molabs <- c('JFM', 'AMJ', 'JAS', 'OND') yrlabs <- c('1974-1985', '1986-1994', '1995-2003', '2004-2012') wrtdstrnd_sk(tidfit, mobrks, yrbrks, molabs, yrlabs) wrtdstrnd_sk(tidfitmean, mobrks, yrbrks, molabs, yrlabs)
Create several plots showing the weights used to fit a model for a single observation.
wtsplot(dat_in, ...) ## Default S3 method: wtsplot( dat_in, ref = NULL, wins = list(0.5, 10, NULL), min_obs = TRUE, slice = FALSE, dt_rng = NULL, pt_rng = c(1, 12), col_vec = NULL, col_lns = NULL, alpha = 1, as_list = FALSE, ... ) ## S3 method for class 'tidal' wtsplot(dat_in, ...) ## S3 method for class 'tidalmean' wtsplot(dat_in, ...)wtsplot(dat_in, ...) ## Default S3 method: wtsplot( dat_in, ref = NULL, wins = list(0.5, 10, NULL), min_obs = TRUE, slice = FALSE, dt_rng = NULL, pt_rng = c(1, 12), col_vec = NULL, col_lns = NULL, alpha = 1, as_list = FALSE, ... ) ## S3 method for class 'tidal' wtsplot(dat_in, ...) ## S3 method for class 'tidalmean' wtsplot(dat_in, ...)
dat_in |
input tidal object |
... |
arguments passed to other methods |
ref |
chr string indicating the date at the center of the weighting window. Must be in the format 'YYYY-mm-dd' which is passed to |
wins |
list with three elements passed to |
min_obs |
logical to use window widening if less than 100 non-zero weights are found, passed to |
slice |
logical indicating if only weights bounded by the year window (i.e., the limiting window for the combined weights) are shown, passed to |
dt_rng |
Optional chr string indicating the date range for all plots except seasonal (day) weights. Must be two values in the format 'YYYY-mm-dd' which is passed to |
pt_rng |
numeric vector of two elements indicating point scaling for all weights in the plot of salinity/flow vs time. |
col_vec |
chr string of plot colors to use, passed to |
col_lns |
chr string of line color in plots |
alpha |
numeric value from zero to one indicating transparency of points and lines |
as_list |
logical indicating if plots should be returned in a list |
Create diagnostic plots to view the effects of different weighting windows on model predictions. The plots illustrate the weights that are used when fitting a weighted regression in reference to a single observation. The process is repeated for all observations when the entire model is fit. Five plots are produced by the function, each showing the weights in relation to time and the selected observation (i.e., center of the weighting window). The top plot shows salinity/flow over time with the points colored and sized by the combined weight vector. The remaining four plots show the weights over time for each separate weighting component (months/days, year, and salinity/flow) and the final combined vector.
A combined ggplot object created using grid.arrange. A list with elements for each individual plot will be returned if as_list = TRUE.
## load a fitted tidal object data(tidfit) ## plot using defaults, wtsplot(tidfit) ## change the defaults wtsplot(tidfit, ref = '2000-01-01', wins = list(0.5, 15, Inf), dt_rng = c('1990-01-01', '2010-01-01'), pt_rng = c(3, 8), col_vec = c('lightgreen', 'lightblue', 'purple'), alpha = 0.7)## load a fitted tidal object data(tidfit) ## plot using defaults, wtsplot(tidfit) ## change the defaults wtsplot(tidfit, ref = '2000-01-01', wins = list(0.5, 15, Inf), dt_rng = c('1990-01-01', '2010-01-01'), pt_rng = c(3, 8), col_vec = c('lightgreen', 'lightblue', 'purple'), alpha = 0.7)