Data Preprocessing: SQL vs. Python

Raw data can be messy and inconsistent. Before any meaningful insights can be generated, we first need to clean and explore the data, which raises an important question: what tool should we use for this crucial step? Generally speaking, there are two …

From Chaucer to Code: Why Non-STEM Backgrounds Matter in Data Analytics

There are common stereotypes that humanities degrees are “useless,” that Historically Black Colleges/Universities (HBCUs) don’t prepare you for the real world, and that being a Black woman is a deficit in STEM fields. I’m here to challenge all that. As a Black woman who graduated from the illustrious North Carolina …

How to Use a Spaced Repetition Flashcard System to Succeed at the IAA

As MSA students, we are asked to juggle a slew of tasks and projects that pull us in a million different directions. Amid the onslaught of obligations, it’s often difficult to find time to study for our assessments. Many students use the flashcard website Quizlet to generate …

The Right Amount of Wrong: Calibrating Predictive Models

[Image: graph showing predicted versus actual win percentage]

How do we evaluate a predictive model? The most intuitive answer is accuracy: a simple measure of the percentage of the time your model’s prediction is correct. Accuracy is a widely used evaluation metric, largely because it’s so easy to understand. However, a …