Let’s compare bits of code in R and SAS that do the exact same thing. I’ve got a data set with lots of information about wines. The data is already split into 3 chunks: train, test, and score. For each of the eleven numeric variables that already exist, I’d like to create five new variables: an inverse, a square root, a natural log, a square and a cube. That works out to 55 new columns per chunk.

Here’s the code in SAS (sorry if that’s small):

Now here’s the bit of code that does the exact same thing in R using the dplyr package:
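In case the snippet doesn’t come through, here’s a minimal sketch of the general idea rather than my exact code. It assumes dplyr 1.0 or later (for `across()`), and the toy data frame, its column names, and the `add_transforms()` helper are all made up for illustration.

```r
library(dplyr)

# Toy stand-in for one chunk of the wine data.
# These column names are hypothetical, not the real ones.
wine_train <- tibble(
  target  = c(3, 5, 4),
  alcohol = c(9.4, 10.2, 12.8),
  ph      = c(3.51, 3.20, 3.26)
)

# Named list of transforms; each name becomes a suffix on the new column.
transforms <- list(
  inv  = function(x) 1 / x,
  sqrt = sqrt,
  log  = log,
  sq   = function(x) x^2,
  cube = function(x) x^3
)

# Hypothetical helper: apply every transform to every column in `cols`,
# keeping the original columns alongside the new ones.
add_transforms <- function(df, cols, fns = transforms) {
  df %>% mutate(across(all_of(cols), fns))
}

num_cols <- c("alcohol", "ph")
wine_train_t <- add_transforms(wine_train, num_cols)

# New columns are named <column>_<transform>, e.g. alcohol_inv, ph_cube.
names(wine_train_t)
```

The same helper could be reused on the test and score chunks, so the transform list only gets typed once. One thing to watch for: a column with zeros or negative values will give `Inf`, `-Inf`, or `NaN` for the inverse, log, and square root.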

Now, I have a lot more experience with R than I do with SAS, so maybe there’s a shorter way to do that in SAS. I tried a couple of different ways of pasting the type of transform onto the variable name from the array, but SAS kept giving me error messages. I even tried building the array from a macro variable (&cols), but SAS didn’t agree with that one either.

I also searched in vain through the SAS documentation in hopes of finding something that would automatically append the transform name to the column names, like any of the mutate functions will in R.
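To show what I mean by the automatic naming, here’s a small sketch, again assuming dplyr 1.0 or later and a made-up two-column data frame. When `across()` gets a named list of functions, it builds the new column names for you, and the `.names` argument lets you change the pattern.

```r
library(dplyr)

# Hypothetical two-column example, just to show the naming behavior.
df <- tibble(alcohol = c(9.4, 10.2), ph = c(3.51, 3.20))

# Default pattern is "{.col}_{.fn}": alcohol_log, alcohol_sqrt, ...
default_names <- df %>%
  mutate(across(everything(), list(log = log, sqrt = sqrt)))

# A custom pattern puts the transform first: log_alcohol, sqrt_alcohol, ...
prefixed_names <- df %>%
  mutate(across(
    everything(),
    list(log = log, sqrt = sqrt),
    .names = "{.fn}_{.col}"
  ))

names(default_names)
names(prefixed_names)
```

The older scoped verbs like `mutate_at()` and `mutate_all()` do the same kind of renaming when given a named list, though `across()` has superseded them in recent dplyr versions.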

If I have to do this type of task in the future (and I do it all the time at work), then I’m going to go with R for a couple of reasons. First, as is plain to see, I did a LOT less typing in R. Second, the code surprisingly ran faster in R than in SAS, though what I waited on the longest in SAS was for the output data to appear. R doesn’t automatically show me the output tables; I have to open those manually. Finally, I just like the flow a little better in R. I feel like things that are one step in R sometimes take 3-4 steps in SAS.

If you read this and you happen to be a whiz with SAS, please drop a comment with your thoughts on how I could tighten up the SAS code. I’ll have some NCAA basketball updates coming soon.