Here we are at the last project of the term for the first semester of my MSBA degree at Wake Forest. And now is a great time for a failed model building experience. The assignment was to build a logistic regression model (or random forest/gradient boosting if one wanted to give it a try) to

# Don’t Crash Your Laptop While Running Random Forest in R

Or more specifically, be careful to manage your machine's resources if you decide to run parallel jobs using RStudio. The latest version of RStudio makes it very easy to set up multiple jobs at the same time. All you need to do is put the code you'd like to run in a separate script, click

# Text Analytics

When learning to analyze text, this is a great time to be alive. The bloviator-in-chief currently occupying the Oval Office provides a cornucopia of text via Twitter to use for learning purposes. Even better for analysts, the folks over at the Trump Twitter Archive have made it very easy to grab a .csv of tweets

# More fun with R functions

PROC MEANS in SAS is a very useful procedure for getting descriptive statistics for numeric variables in a dataset. There are a few R packages that approximate or even improve on PROC MEANS: psych, stats, skimr. But for practice with functions this week we wrote an R function that would give some of the
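
A minimal sketch of what a PROC MEANS-style helper could look like in base R. This is not the function from class; the name `proc_means` and the choice of statistics (N, mean, standard deviation, minimum, maximum, which match the PROC MEANS defaults) are assumptions made for illustration.

```r
# Hypothetical helper: PROC MEANS-style summary for every numeric column.
# Non-numeric columns are dropped and missing values are ignored.
proc_means <- function(data) {
  numeric_cols <- data[vapply(data, is.numeric, logical(1))]
  stats <- vapply(
    numeric_cols,
    function(x) {
      c(
        N      = sum(!is.na(x)),
        Mean   = mean(x, na.rm = TRUE),
        StdDev = sd(x, na.rm = TRUE),
        Min    = min(x, na.rm = TRUE),
        Max    = max(x, na.rm = TRUE)
      )
    },
    numeric(5)
  )
  # One row per variable, one column per statistic.
  as.data.frame(t(stats))
}

# Usage with a built-in dataset; the Species factor column is skipped.
summary_table <- proc_means(iris)
print(summary_table)
```

Returning a data frame rather than printing inside the function keeps the result easy to filter, round, or pass on to a report.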

# R Notebooks Are Very Useful

Because I started learning R many years ago, prior to the inclusion of the notebook format in RStudio, I stubbornly stuck with using R scripts for the past few years as the notebook format grew in popularity. Now my mind has been changed. The biggest benefit to notebooks is that a couple of years ago

# Functions are my friend

We've reached the point in our programming class where we've moved on to R. I couldn't be happier. Why? Because now I can use functions. Let's face it, I'm lazy and the less typing that I have to do the better. Here's a simple example. Task: Create a histogram for twelve variables. Solution: Write a function, create a
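
One way the task above might be approached in base R is sketched here. The data frame `survey`, its column names, and the helper `plot_hist` are all hypothetical stand-ins, since the original dataset and function are not shown.

```r
# Hypothetical example data: twelve numeric columns of simulated values.
set.seed(42)
survey <- as.data.frame(matrix(rnorm(12 * 100), ncol = 12))
names(survey) <- paste0("var", 1:12)

# Hypothetical helper: draw a histogram for one column, labeled by its name.
plot_hist <- function(data, var) {
  hist(data[[var]], main = paste("Histogram of", var), xlab = var)
}

# One call per column name instead of twelve copy-pasted hist() lines.
# hist() returns the histogram object invisibly, so lapply() collects them.
plots <- lapply(names(survey), plot_hist, data = survey)
```

The payoff is that a change to the title, labels, or bin settings happens once, inside `plot_hist`, rather than in twelve places.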

# Linear Regression With SAS

It probably seems like I complain a lot about SAS (I have), so today I'll write about something that I've learned from SAS that is really useful and a huge time saver when creating a generalized linear regression model. Proc GLMSelect makes the process of variable selection and transformation really easy. Specifically, the Effect statement

# Spring Break is Upon Us

We have a week "off!" OK, not really. We have several assignments due just two weeks from now that we'll have to spend our time working on. But that's fine by me; I like having a little breathing room to get the work completed. For Stat we have the second part of an analysis project

# Proc Fastclus Is Going to be Useful

This week in learning SAS we began working with Proc Fastclus, a procedure to group data together by two or more variables. This is particularly useful for data that has little linear relationship. Once you've identified clusters it helps to find similar characteristics among the clusters to aid in predicting outcomes for other observations

# I’ll be Using SAS a Lot Soon

I find myself struggling with what to write about this week. But I've learned that at some point in the future I'll mainly be using SAS instead of R. So I suppose that learning to use SAS as part of MSBA is timely. I'll still need to use R for the dashboard report that I've