Here we are at the last project of the term for the first semester of my MSBA degree at Wake Forest. And now is a great time for a failed model building experience. The assignment was to build a logistic regression model (or random forest/gradient boosting if one wanted to give it a try) to … Continue reading Model Building is Hard. Don’t Lose Hope If You Fail.
Category: R
Don’t Crash Your Laptop While Running Random Forest in R
Or more specifically, be careful to manage your machine's resources if you decide to run parallel jobs using RStudio. The latest version of RStudio makes it very easy to set up multiple jobs at the same time. All you need to do is put the code you'd like to run in a separate script, click … Continue reading Don’t Crash Your Laptop While Running Random Forest in R
Text Analytics
If you are learning to analyze text, this is a great time to be alive. The bloviator-in-chief currently occupying the Oval Office provides a cornucopia of text via Twitter to use for learning purposes. Even better for analysts, the folks over at the Trump Twitter Archive have made it very easy to grab a .csv of tweets … Continue reading Text Analytics
More fun with R functions
PROC MEANS in SAS is a very useful procedure for getting descriptive statistics for the numeric variables in a dataset. There are a few R packages that approximate or even improve on PROC MEANS: psych, stats, and skimr. But for practice with functions this week we wrote an R function that would give some of the … Continue reading More fun with R functions
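The function from the post is not shown in this excerpt, so here is only a minimal sketch of the idea. The name `proc_means`, its `digits` argument, and the choice of statistics (n, mean, sd, min, max) are assumptions for illustration, not the author's code.

```r
# Hypothetical helper (not the function from the post): a small
# PROC MEANS-style summary for the numeric columns of a data frame.
proc_means <- function(df, digits = 2) {
  # Keep only the numeric columns
  numeric_cols <- df[vapply(df, is.numeric, logical(1))]

  # One column of statistics per variable; the named template fixes
  # both the length and the row names of the result
  stats <- vapply(
    numeric_cols,
    function(x) {
      c(
        n    = sum(!is.na(x)),
        mean = mean(x, na.rm = TRUE),
        sd   = sd(x, na.rm = TRUE),
        min  = min(x, na.rm = TRUE),
        max  = max(x, na.rm = TRUE)
      )
    },
    c(n = 0, mean = 0, sd = 0, min = 0, max = 0)
  )

  # Transpose so each variable is a row, as in PROC MEANS output
  round(t(stats), digits)
}

# Example with a built-in data set; the factor column Species is skipped
proc_means(iris)
```

Returning a plain matrix keeps the sketch dependency-free; the packages named above give richer output.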
R Notebooks Are Very Useful
Because I started learning R many years ago, prior to the inclusion of the notebook format in RStudio, I stubbornly stuck with using R scripts for the past few years as the notebook format grew in popularity. Now my mind has been changed. The biggest benefit to notebooks is that a couple of years ago … Continue reading R Notebooks Are Very Useful
Functions are my friend
We've reached the point in our programming class where we've moved onto R. I couldn't be happier. Why? Because now I can use functions. Let's face it, I'm lazy and the less typing that I have to do the better. Here's a simple example. Task: Create a histogram for twelve variables. Solution: Write a function, create a … Continue reading Functions are my friend
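The excerpt cuts off before the solution, so the following is only a rough sketch of the general pattern: one function, applied to each variable in turn. The helper name `plot_hist` and the use of the built-in `mtcars` data (which has eleven columns, standing in for the post's twelve variables) are assumptions.

```r
# Hypothetical helper: draw one histogram for a named column of a data frame.
plot_hist <- function(df, var) {
  hist(df[[var]], main = paste("Histogram of", var), xlab = var)
}

# Assumed example data: the columns of mtcars stand in for the
# variables mentioned in the post.
vars <- names(mtcars)

# Call the function once per variable instead of typing hist() repeatedly;
# invisible() suppresses printing of the returned histogram objects.
invisible(lapply(vars, function(v) plot_hist(mtcars, v)))
```

Adding a variable then means adding a name to `vars` rather than writing another plotting call.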
Carolina/Dook Round 2
I had to take a little break from basketball analysis due to work, illness, school, etc. But I'm back, because there's another big game tomorrow night. While all the hoopla centers on whether or not Zion will play, let's take a look at the numbers! First, an update on the KenPom ratings graphic. Since … Continue reading Carolina/Dook Round 2
NCAA Bball Update #3
Alright, it's time to really take this NCAA graphic to a new level. I don't know about all of you, but I'm most interested in winning the NCAA tournament pool at work. I came in third last year and it came down to the championship game for me. And while you could have zoomed in … Continue reading NCAA Bball Update #3
NCAA KenPom Efficiency Rankings Updated
I've updated my chart with the latest KenPom rankings AND I've now added the past NCAA Champions (2002-2018) as diamonds on the chart. It really stands out now that past champions have always had an efficiency margin of greater than 20 - or to the right of the green line. In fact, of the 17 … Continue reading NCAA KenPom Efficiency Rankings Updated
Even more NCAA Hoops Fun
In addition to the ncaahoopR package in R, I had also been working on building a plot using Ken Pomeroy's efficiency ratings. And voila! Here it is. Unfortunately, WordPress doesn't support the embedded interactive graphic. But if you visit my site on Plotly you'll get the full experience. What I've Done I've written an R … Continue reading Even more NCAA Hoops Fun