A function to request regression plots on a call to proc_reg. The function allows you to specify the type of regression plots to produce. By default, it produces a combined diagnostics panel. To create individual plots instead, set the "panel" parameter to FALSE and pass a vector of plot names to the "type" parameter.

regplot(
  type = c("diagnostics", "residuals", "fitplot"),
  panel = TRUE,
  stats = "default",
  label = FALSE,
  id = NULL
)

Arguments

type

The type(s) of plot to create. Multiple types should be passed as a vector of strings. Valid values are "diagnostics", "residualbypredicted", "rstudentbypredicted", "rstudentbyleverage", "qqplot", "observedbypredicted", "cooksd", "residualhistogram", "rfplot", "residuals", and "fitplot". The default value is a vector with "diagnostics", "residuals", and "fitplot". The "diagnostics" keyword produces a single combined chart with 8 different plots and a selection of statistics in a small table. The statistics can be controlled by the stats parameter.

panel

Whether or not to display the diagnostics plots combined into a single panel. Default is TRUE. A value of FALSE will create individual plots instead. This parameter is equivalent to the "unpack" keyword in SAS.

stats

The statistics to display on the diagnostics panel. Valid values are: "adjrsq", "aic", "coeffvar", "depmean", "default", "edf", "mse", "nobs", "nparm", "rsquare", and "sse". The default value is "default", which produces the following statistics: "nobs", "nparm", "edf", "mse", "rsquare", and "adjrsq".

label

Whether or not to label values automatically. Valid values are TRUE or FALSE. Default is FALSE. If TRUE, this option will assign labels to outlier values on some charts. Only some individual charts are labelled, not the panel diagnostics chart.

id

If the label parameter is TRUE, this parameter determines which value is assigned to the label. By default, the row number will be assigned. You may also assign a column name from the input dataset to use as the label value.
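
As an illustration of the arguments described above, the following sketch requests individual plots instead of the combined panel and turns on automatic labelling. It only uses parameters documented on this page. The particular plot types chosen and the title text are arbitrary, and the id parameter is left at its default so that row numbers are used as labels.

```r
library(procs)

# Suppress viewer output, as in the Examples section
options("procs.print" = FALSE)

# Request three individual plots (panel = FALSE) with outlier labels.
# Labels default to row numbers because id is not supplied.
res <- proc_reg(iris, model = "Sepal.Length = Petal.Length",
                output = report,
                plots = regplot(type = c("qqplot", "cooksd",
                                         "rstudentbyleverage"),
                                panel = FALSE,
                                label = TRUE),
                titles = "Iris Regression: Individual Plots")

# View results
res
```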

Details

Any requested plots will be displayed on interactive reports only. Plots are created as jpeg files, and stored in a temp directory. Those temporary files are then referenced by the interactive report to display the graphic.

If desired, you may output the report objects and pass them to proc_print. To do this, set output = report on the call to proc_reg, and pass the entire list to proc_print.
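
A minimal sketch of that workflow follows. It assumes proc_print accepts the returned list as its first argument, as the paragraph above describes. All other proc_print parameters are left at their defaults, and the variable name and title are illustrative only.

```r
library(procs)

# Suppress automatic viewer output from proc_reg
options("procs.print" = FALSE)

# Capture the report objects, including the requested plot
res <- proc_reg(iris, model = "Sepal.Length = Petal.Length",
                output = report,
                plots = "diagnostics",
                titles = "Iris Regression Statistics")

# Pass the entire list of report objects to proc_print
proc_print(res)
```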

Plots

The plots parameter of proc_reg allows you to request several types of regression plots. The supported plot types are listed below. Each entry shows the keyword needed to request the plot, followed by a brief description:

  • diagnostics: A fit diagnostics panel that contains 8 different types of plots and a table of statistics.

  • residuals: Produces a panel of residual plots against each independent variable in the model.

  • fitplot: Produces a scatter plot of the dependent variable against the regressor, including the fitted line and confidence/prediction bands. This is only available for models with a single regressor.

  • qqplot: Normal Quantile-Quantile (Q-Q) plot of residuals.

  • rfplot: Residual-Fit (RF) spread plot.

  • residualbypredicted: Residuals vs. Predicted values.

  • rstudentbypredicted: Externally Studentized Residuals (RStudent) vs. Predicted values.

  • rstudentbyleverage: Externally Studentized Residuals vs. Leverage.

  • cooksd: Cook’s D statistic vs. Observation number.

  • residualhistogram: Histogram of residuals, with a normal and kernel curve overlay.

  • observedbypredicted: Dependent variable (Observed) vs. Predicted values.

The above plots may be requested in different ways: as a vector of keywords, or as a call to the regplot function. The keyword approach will produce plots with default parameters. A call to regplot will give you control over some parameters to the charts. See the regplot function for further details.
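
To make the plot descriptions above concrete, the quantities behind several of the plot types can be computed directly with base R. This is a sketch using lm and the standard stats helpers, not the procs implementation, and the data frame name dx and its column names are chosen here only for illustration.

```r
# Base R sketch (not procs code): same model as in the Examples section
fit <- lm(Sepal.Length ~ Petal.Length, data = iris)

# One row per observation, with the values plotted by the individual charts
dx <- data.frame(
  observation = seq_len(nrow(iris)),  # x-axis for cooksd
  observed    = iris$Sepal.Length,    # observedbypredicted
  predicted   = fitted(fit),          # x-axis for the "bypredicted" plots
  residual    = residuals(fit),       # residualbypredicted, qqplot, histogram
  rstudent    = rstudent(fit),        # externally studentized residuals
  leverage    = hatvalues(fit),       # x-axis for rstudentbyleverage
  cooksd      = cooks.distance(fit)   # Cook's D statistic
)

head(dx)

# Rough base-graphics counterparts of two of the plot types
plot(dx$predicted, dx$residual,
     xlab = "Predicted", ylab = "Residual")   # residualbypredicted
abline(h = 0, lty = 2)

qqnorm(dx$residual)                           # qqplot
qqline(dx$residual)
```

The charts produced by proc_reg will differ in styling and annotation. The sketch only shows which values each plot type is built from.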

Statistics

  • adjrsq: Adjusted R-square.

  • aic: Akaike's information criterion.

  • coeffvar: Coefficient of variation.

  • depmean: Mean of the dependent variable.

  • default: A set of default statistics.

  • edf: Error degrees of freedom.

  • mse: Mean squared error.

  • nobs: Number of observations used.

  • nparm: Number of parameters in the model (including the intercept).

  • rsquare: The R-square statistic.

  • sse: Error sum of squares.
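
For reference, most of the statistics listed above can be reproduced from a fitted model in base R. The sketch below uses the usual textbook definitions for a model with an intercept. It is not taken from the procs source, so small differences in rounding or presentation are possible. The aic statistic is omitted because its exact formula is not stated on this page, and the coefficient of variation is assumed to be 100 times the root MSE divided by the dependent mean.

```r
# Base R sketch (not procs code): same model as in the Examples section
fit <- lm(Sepal.Length ~ Petal.Length, data = iris)
y   <- model.response(model.frame(fit))

nobs_val <- length(y)                    # nobs: observations used
nparm    <- length(coef(fit))            # nparm: parameters incl. intercept
edf      <- df.residual(fit)             # edf: error degrees of freedom
sse      <- sum(residuals(fit)^2)        # sse: error sum of squares
mse      <- sse / edf                    # mse: mean squared error
sst      <- sum((y - mean(y))^2)         # corrected total sum of squares
rsquare  <- 1 - sse / sst                # rsquare
adjrsq   <- 1 - (1 - rsquare) * (nobs_val - 1) / edf  # adjrsq
depmean  <- mean(y)                      # depmean: mean of dependent variable
coeffvar <- 100 * sqrt(mse) / depmean    # coeffvar (assumed definition)

stats <- c(nobs = nobs_val, nparm = nparm, edf = edf, sse = sse,
           mse = mse, rsquare = rsquare, adjrsq = adjrsq,
           depmean = depmean, coeffvar = coeffvar)
round(stats, 4)
```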

Examples

library(procs)

# Turn off printing for CRAN checks
# Set to TRUE to run in local environment
options("procs.print" = FALSE)


# Example 1: Regression statistics with default plots
res <- proc_reg(iris, model = "Sepal.Length = Petal.Length",
                 output = report,
                 plots = regplot(),
                 titles = "Iris Regression Statistics")

# View results
res

# Example 2: Regression statistics with custom plot strings and by variable
res <- proc_reg(iris, model = "Sepal.Length = Petal.Length",
                 output = report,
                 by = Species,
                 plots = v(diagnostics, residuals, residualhistogram, cooksd),
                 titles = "Iris Regression Statistics")

# View results
res

# Example 3: Regression statistics with multiple models, same plot
res <- proc_reg(iris, model = c("Sepal.Length = Petal.Length",
                                "Sepal.Length = Sepal.Width",
                                "Sepal.Length = Petal.Width"),
                 output = report,
                 plots = "diagnostics",
                 titles = "Iris Regression Statistics")

# View results
res

# Example 4: Regression statistics with multiple models, different plot strings
res <- proc_reg(iris, model = c("Sepal.Length = Petal.Length",
                                "Sepal.Length = Sepal.Width",
                                "Sepal.Length = Petal.Width"),
                 output = report,
                 plots = list("diagnostics",
                              "cooksd",
                              "residualhistogram"),
                 titles = "Iris Regression Statistics")

# View results
res

# Example 5: Regression statistics with multiple models, different plot functions
res <- proc_reg(iris, model = c("Sepal.Length = Petal.Length",
                                "Sepal.Length = Sepal.Width",
                                "Sepal.Length = Petal.Width"),
                 output = report,
                 plots = list(regplot(type = "diagnostics"),
                              regplot(type = "cooksd",
                                      label = TRUE),
                              regplot(type = "fitplot",
                                      label = TRUE,
                                      stats = c("nobs", "mse", "rsquare"))),
                 titles = "Iris Regression Statistics")

# View results
res