Here we are at the last project of the term for the first semester of my MSBA degree at Wake Forest. And now is a great time for a failed model building experience. The assignment was to build a logistic regression model (or random forest/gradient boosting if one wanted to give it a try) to … Continue reading Model Building is Hard. Don’t Lose Hope If You Fail.
Don’t Crash Your Laptop While Running Random Forest in R
Or more specifically, be careful to manage your machine's resources if you decide to run parallel jobs using RStudio. The latest version of RStudio makes it very easy to set up multiple jobs at the same time. All you need to do is put the code you'd like to run in a separate script, click … Continue reading Don’t Crash Your Laptop While Running Random Forest in R
Text Analytics
When learning to analyze text, this is a great time to be alive. The bloviator-in-chief currently occupying the Oval Office provides a cornucopia of text via Twitter to use for learning purposes. Even better for analysts, the folks over at the Trump Twitter Archive have made it very easy to grab a .csv of tweets … Continue reading Text Analytics
More fun with R functions
PROC MEANS in SAS is a very useful procedure for getting descriptive statistics for numeric variables in a dataset. There are a few R packages that approximate or even improve on PROC MEANS - psych, stats, skimr. But for practice with functions this week we wrote an R function that would give some of the … Continue reading More fun with R functions
R Notebooks Are Very Useful
Because I started learning R many years ago, prior to the inclusion of the notebook format in RStudio, I stubbornly stuck with using R scripts for the past few years as the notebook format grew in popularity. Now my mind has been changed. The biggest benefit to notebooks is that a couple of years ago … Continue reading R Notebooks Are Very Useful
Functions are my friend
We've reached the point in our programming class where we've moved on to R. I couldn't be happier. Why? Because now I can use functions. Let's face it, I'm lazy, and the less typing I have to do the better. Here's a simple example. Task: Create a histogram for twelve variables. Solution: Write a function, create a … Continue reading Functions are my friend
Linear Regression With SAS
It probably seems like I complain a lot about SAS (I have), so today I'll write about something that I've learned from SAS that is really useful and a huge time saver when creating a generalized linear regression model. Proc GLMSelect makes the process of variable selection and transformation really easy. Specifically, the Effect statement … Continue reading Linear Regression With SAS
NCAA Tournament Initial Thoughts
I'll have more to share later, but here are my initial tournament thoughts. I like the Heels' draw, other than the possibility of playing Kansas (if they make it out of the first weekend) in Kansas City. Auburn is an incredibly tough-looking 5 seed, but defense may be their downfall (and they can't keep … Continue reading NCAA Tournament Initial Thoughts
Spring Break is Upon Us
We have a week "off!" OK, not really. We have several assignments due just two weeks from now that we'll have to spend our time working on. But that's fine by me; I like having a little breathing room to get the work completed. For Stat we have the second part of an analysis project … Continue reading Spring Break is Upon Us
Carolina/Dook Round 2
I had to take a little break from basketball analysis due to work, illness, school, etc. But I'm back, because there's another big game tomorrow night. While all the hoopla centers on whether Zion will play or not, let's take a look at the numbers! First, an update on the KenPom ratings graphic. Since … Continue reading Carolina/Dook Round 2