Below are some frequently asked questions about the procs package. Click on the links below to navigate to the full question and answer content.

Content

Why did you write the procs package?

Q: There are already many statistical functions and packages in R. Why did you write procs?

A: I wrote the procs package to help SAS® programmers. There are many SAS® programmers trying to learn R and struggling with the many differences between these two languages. The procs package provides them with a set of functions that are conceptually similar to SAS® procedures. The aim is to make them more productive and comfortable working in R in a shorter time frame.



Are these functions validated?

Q: My company requires that all software be validated before it is used in production. Is the procs package validated?

A: Yes. The functions were validated by comparing to SAS®. The validation documentation is here.



Why is the output different?

Q: The output dataset columns and column names are a little bit different from SAS®. Why?

A: The output datasets and column names have been standardized. They should be more predictable and easier to manipulate programmatically. They were changed intentionally as an improvement over the corresponding SAS® procedures.



Can you do N-way tables?

Q: I see you can do one-way and two-way frequency tables. What about N-way tables? Does the package support them?

A: No. N-way tables are less common, and were not seen as a priority. These types of tables are on the list for a future enhancement.



Can these procedures create plots?

Q: I’m trying to find the option to produce a plot, and can’t find it. Does the package support plots or not?

A: Not yet. Plots are expected for a future release. In the meantime, you can send the output results into ggplot2 and create the plots on your own.
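As a rough sketch of that workaround, the snippet below draws a bar chart from a data frame shaped like proc_freq() output. The values are copied by hand from the eye color frequencies shown in the ordering example in this FAQ, so the snippet does not depend on procs itself. The chart type, labels, and variable names are illustrative choices, not something the package prescribes.

```r
library(ggplot2)

# Data frame shaped like proc_freq() output (VAR, CAT, N, CNT, PCT).
# Values are copied from the eye color example in this FAQ.
eye_levels <- c("Green", "Hazel", "Blue", "Brown")
freq <- data.frame(
  VAR = "Eye",
  CAT = factor(eye_levels, levels = eye_levels),
  N   = 592,
  CNT = c(64, 93, 215, 220)
)
freq$PCT <- 100 * freq$CNT / freq$N

# One bar per category, in factor order
p <- ggplot(freq, aes(x = CAT, y = CNT)) +
  geom_col() +
  labs(x = "Eye color", y = "Count", title = "Frequency of eye color")

print(p)
```

In practice you would pass the data frame returned by proc_freq() straight to ggplot() instead of rebuilding it by hand.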



Does the package support Cochran-Mantel-Haenszel Statistics?

Q: I see the frequency function supports Chi-square and Fisher’s exact tests. What about Cochran-Mantel-Haenszel?

A: CMH statistics were left out of the procs package because the corresponding R function does not match SAS® reliably. You can run CMH statistics yourself using the mantelhaen.test function from the stats package.
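A minimal sketch of calling mantelhaen.test() directly is shown below. The built-in UCBAdmissions table is used only as a convenient three-way example and is not taken from the procs documentation. Since the results may not agree with SAS®, compare them against a known reference before relying on them.

```r
# UCBAdmissions is a built-in 2 x 2 x 6 table:
# Admit x Gender, stratified by Dept (chosen here for illustration only)
cmh <- mantelhaen.test(UCBAdmissions)

# The result is a standard "htest" object
cmh$statistic   # Mantel-Haenszel chi-squared statistic
cmh$parameter   # degrees of freedom
cmh$p.value     # p-value
```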



What about PROC COMPARE?

Q: I see you have PROC TRANSPOSE and PROC SORT. How come you didn’t include PROC COMPARE?

A: There are already several dataset comparison packages in R, such as diffdf and comparedf. The functionality of these packages is also similar to SAS® PROC COMPARE. If you wish to do dataset comparisons, please research these existing packages.
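For orientation, here is a small sketch of a comparison with diffdf. The two data frames and their column names are made up for illustration, and the arguments shown reflect my understanding of the diffdf interface, so check the package documentation before relying on them.

```r
library(diffdf)

# Two hypothetical datasets that differ in one value
base_df <- data.frame(id = 1:3, value = c(10, 20, 30))
comp_df <- data.frame(id = 1:3, value = c(10, 25, 30))

# Compare row by row, matching on the key column.
# diffdf() warns when differences are found; suppress that here.
res <- diffdf(base_df, comp_df, keys = "id", suppress_warnings = TRUE)

# Print the difference report
print(res)

# TRUE when the comparison found any differences
diffdf_has_issues(res)
```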



Are there any other statistics procedures coming?

Q: The proc_freq() and proc_means() functions are helpful. But what about some other stats procedures? What about PROC GLM and PROC ANOVA?

A: These functions are planned for a future release. In the meantime, please see the excellent package sasLM for functions that replicate SAS® statistics related to linear modeling.



How do I get my statistics into a report?

Q: I created some frequencies, and need to output them to a report. Is there a way to create a PDF or RTF directly from the procs package?

A: Yes. Put your data frames in a list and send the list to proc_print(). Use the function's parameters to assign the titles, the output location, and the output type. The function supports PDF, RTF, HTML, DOCX, and TXT. For more advanced reporting features, see the reporter package.
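The sketch below shows what that might look like for two frequency tables. The parameter names (file_path, output_type, titles, view) are given as I understand them from the package reference, and the output path and title are hypothetical, so confirm the details with ?proc_print.

```r
library(procs)

# Suppress interactive printing, as in the other examples in this FAQ
options("procs.print" = FALSE)

# Sample data
df <- as.data.frame(HairEyeColor, stringsAsFactors = FALSE)

# Two frequency tables, each returned as a data frame
eye_freq  <- proc_freq(df, tables = Eye,  weight = Freq)
hair_freq <- proc_freq(df, tables = Hair, weight = Freq)

# Hypothetical output location
out_path <- file.path(tempdir(), "frequencies.rtf")

# Send both tables to a single RTF report
proc_print(list(eye_freq, hair_freq),
           file_path = out_path,
           output_type = "RTF",
           titles = "Eye and Hair Color Frequencies",
           view = FALSE)
```

Changing output_type to one of the other supported formats should produce the same report in that format.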



How do I order my frequencies?

Q: The proc_freq() function seems to always order the categories alphabetically. But I have a specific order in mind that is not alphabetical. Is there a way to order the frequency categories in a specific way?

A: Yes. If your frequency variable is defined as a factor, the proc_freq() function will order the frequency categories by that factor. Here is an example:

library(procs)

# Turn off printing for CRAN checks
options("procs.print" = FALSE)

# Create sample data
df <- as.data.frame(HairEyeColor, stringsAsFactors = FALSE)

# Assign factor for ordering; the levels define the display order
df$Eye <- factor(df$Eye, levels = c("Green", "Hazel", "Blue", "Brown"))

# Use factor in tables parameter
res <- proc_freq(df,
                 tables = Eye,
                 weight = Freq)

# Output is now ordered by the factor
res
#   VAR   CAT   N CNT      PCT
# 1 Eye Green 592  64 10.81081
# 2 Eye Hazel 592  93 15.70946
# 3 Eye  Blue 592 215 36.31757
# 4 Eye Brown 592 220 37.16216
