Below are some frequently asked questions about the procs package.
Q: There are already many statistical functions and packages in R. Why did you write procs?
A: I wrote the procs package to help SAS® programmers. There are many SAS® programmers trying to learn R and struggling with the many differences between these two languages. The procs package provides them with a set of functions that are conceptually similar to SAS® procedures. The aim is to make them more productive and comfortable working in R in a shorter time frame.
Q: My company requires that all software be validated before it is used in production. Is the procs package validated?
A: Yes. The functions were validated by comparing to SAS®. The validation documentation is here.
Q: The output dataset columns and column names are a little bit different from SAS®. Why?
A: The output datasets and column names have been standardized. They should be more predictable and easier to manipulate programmatically. They were changed intentionally as an improvement over the corresponding SAS® procedures.
Q: I see you can do one-way and two-way frequency tables. What about N-way tables? Does the package support them?
A: No. N-way tables are less common, and were not seen as a priority. These types of tables are on the list for a future enhancement.
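Until N-way tables are supported, counts across three or more variables can be obtained with base R. The following is a minimal sketch that uses xtabs() and ftable() from the stats package, not the procs package. The output column names CNT and PCT are chosen here only to resemble the package's naming and are an assumption.

```r
# Sample data: one row per Hair/Eye/Sex combination, with a Freq count
df <- as.data.frame(HairEyeColor, stringsAsFactors = FALSE)

# Three-way table of weighted counts (base R, not procs)
tab3 <- xtabs(Freq ~ Hair + Eye + Sex, data = df)

# Flatten the three dimensions into a readable two-dimensional layout
print(ftable(tab3))

# Or convert to a long data frame for further manipulation
long <- as.data.frame(tab3, responseName = "CNT")
long$PCT <- long$CNT / sum(long$CNT) * 100
head(long)
```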
Q: I’m trying to find the option to produce a plot, and can’t find it. Does the package support plots or not?
A: Not yet. Plots are expected for a future release. In the meantime, you can send the output results into ggplot2 and create the plots on your own.
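As a rough sketch of that workflow, the code below plots a one-way frequency result with ggplot2. To keep the example self-contained, the data frame is typed in by hand. It assumes the result has the columns VAR, CAT, N, CNT, and PCT, and that the ggplot2 package is installed. In practice you would pass the data frame returned by proc_freq() instead.

```r
library(ggplot2)

# Hand-built stand-in for a one-way frequency result
# (assumed column layout: VAR, CAT, N, CNT, PCT)
freq <- data.frame(
  VAR = "Eye",
  CAT = c("Green", "Hazel", "Blue", "Brown"),
  N   = 592,
  CNT = c(64, 93, 215, 220),
  stringsAsFactors = FALSE
)
freq$PCT <- freq$CNT / freq$N * 100

# Keep the categories in their current row order on the axis
freq$CAT <- factor(freq$CAT, levels = freq$CAT)

# Bar chart of counts by category
p <- ggplot(freq, aes(x = CAT, y = CNT)) +
  geom_col() +
  labs(x = "Eye color", y = "Count", title = "Frequency of eye color")

print(p)
```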
Q: I see the frequency function supports Chi-square and Fisher’s exact tests. What about Cochran-Mantel-Haenszel?
A: CMH statistics were left out of the procs package because the corresponding R function does not match SAS® reliably. You can run CMH statistics yourself using the mantelhaen.test function from the stats package.
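A minimal sketch of calling mantelhaen.test() directly is shown below. It uses the built-in HairEyeColor data, with Sex as the stratification variable. That choice of data and strata is only for illustration. As noted above, the results may not agree with SAS®.

```r
# Sample data: one row per Hair/Eye/Sex combination, with a Freq count
df <- as.data.frame(HairEyeColor, stringsAsFactors = FALSE)

# Build a three-way table; the third dimension (Sex) defines the strata
tab <- xtabs(Freq ~ Hair + Eye + Sex, data = df)

# Cochran-Mantel-Haenszel test of Hair by Eye, stratified by Sex
cmh <- mantelhaen.test(tab)

# Test statistic, degrees of freedom, and p-value
cmh$statistic
cmh$parameter
cmh$p.value
```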
Q: I see you have PROC TRANSPOSE and PROC SORT. How come you didn’t include PROC COMPARE?
A: There are already several dataset comparison tools in R, such as the diffdf package and the comparedf() function from the arsenal package. Their functionality is similar to SAS® PROC COMPARE. If you wish to do dataset comparisons, please research these existing packages.
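For orientation, here is a small sketch of a comparison with diffdf. It assumes the diffdf package is installed. The two data frames and the key variable id are made up for the example.

```r
library(diffdf)

# Two small, hypothetical datasets that differ in one value
base <- data.frame(id = 1:3, value = c(10, 20, 30))
comp <- data.frame(id = 1:3, value = c(10, 25, 30))

# Compare the datasets, matching rows on the key variable
res <- diffdf(base, comp, keys = "id", suppress_warnings = TRUE)

# Print the comparison report
print(res)

# Check programmatically whether any differences were found
has_diff <- diffdf_has_issues(res)
has_diff
```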
Q: The proc_freq() and proc_means() functions are helpful. But what about some other stats procedures? What about PROC GLM and PROC ANOVA?
A: These functions are planned for a future release. In the meantime, please see the excellent package sasLM for functions that replicate SAS® statistics related to linear modeling.
Q: I created some frequencies, and need to output them to a report. Is there a way to create a PDF or RTF directly from the procs package?
A: Yes. Put your data frames in a list, and send the list to proc_print(). You can use the parameters to assign the titles, the output location, and the output type. The function supports PDF, RTF, HTML, DOCX, and TXT. For more advanced reporting features, see the reporter package.
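The call might look like the sketch below. The parameter names file_path, output_type, and titles are assumptions based on the description above. Please confirm them against the proc_print() help page before relying on them. The output path is a temporary location chosen for the example.

```r
library(procs)

# Sample data and two frequency tables
df <- as.data.frame(HairEyeColor, stringsAsFactors = FALSE)
res1 <- proc_freq(df, tables = Eye, weight = Freq)
res2 <- proc_freq(df, tables = Hair, weight = Freq)

# Hypothetical output location for the report
out <- file.path(tempdir(), "freqs")

# Send both tables to a single RTF report
# (parameter names are assumed; see ?proc_print)
proc_print(list(res1, res2),
           file_path = out,
           output_type = "RTF",
           titles = "Eye and Hair Color Frequencies")
```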
Q: The proc_freq() function seems to always order the categories alphabetically. But I have a specific order in mind that is not alphabetical. Is there a way to order the frequency categories in a specific way?
A: Yes. If your frequency variable is defined as a factor, the proc_freq() function will order the frequency categories by that factor. Here is an example:
library(procs)
# Turn off printing for CRAN checks
options("procs.print" = FALSE)
# Create sample data
df <- as.data.frame(HairEyeColor, stringsAsFactors = FALSE)
# Assign factor for ordering
df$Eye <- factor(df$Eye, c("Green", "Hazel", "Blue", "Brown"))
# Use factor in tables parameter
res <- proc_freq(df,
                 tables = Eye,
                 weight = Freq)
# Output is now ordered by the factor
res
#   VAR   CAT   N CNT      PCT
# 1 Eye Green 592  64 10.81081
# 2 Eye Hazel 592  93 15.70946
# 3 Eye  Blue 592 215 36.31757
# 4 Eye Brown 592 220 37.16216