This is Part II of two posts demonstrating how to get test metrics from a highly unbalanced dataset. In Part I, I showed how you could theoretically estimate the precision, recall, and F1 score on highly unbalanced data. In this part, I’ll do the same thing with a real dataset, the Adult Income Dataset.
In this post, I’m going to walk through how to solve a problem that you might run into when evaluating models on highly unbalanced datasets. Let’s imagine you’re classifying whether people have a really rare disease or not. You asked 100,000 people at random and only found 10 instances of the disease. How will you get enough data to train a machine learning model? Fortunately, you know of a treatment center that treats this specific disease.
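To make the problem concrete, here is a minimal sketch of one common way to translate metrics measured on an enriched test set back to the real-world prevalence. It is not necessarily the method used in the post itself. The function name `adjusted_metrics` and the example recall and false positive rate are assumptions for illustration; only the prevalence of 10 in 100,000 comes from the scenario above.

```python
def adjusted_metrics(recall, false_positive_rate, prevalence):
    """Estimate precision and F1 at a target prevalence.

    recall and false_positive_rate are assumed to have been measured on a
    test set whose class balance may differ from the real population.
    Both rates are independent of class balance, so they can be reweighted.
    """
    # Expected fraction of the population that is a true positive
    tp = recall * prevalence
    # Expected fraction of the population that is a false positive
    fp = false_positive_rate * (1.0 - prevalence)

    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, f1


# Prevalence from the scenario: 10 cases among 100,000 people
prevalence = 10 / 100_000

# Hypothetical rates, chosen only for illustration
precision, f1 = adjusted_metrics(
    recall=0.90, false_positive_rate=0.01, prevalence=prevalence
)
print(f"precision={precision:.4f}, f1={f1:.4f}")
```

Even with a seemingly strong classifier, the estimated precision at this prevalence is below 1%, which is why metrics computed on a balanced test set can be badly misleading.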
I had too many failures for one post, so this post describes even more ways not to evaluate models with FastAI.
This post summarizes some of the paths I went down trying to figure out how to evaluate things in FastAI. I’ll start it off correctly and let you know when I go down a bad path.
This post walks through how to create a Siamese network using FastAI. It is based on a tutorial from FastAI. Some of the classes and functions are directly copied from there, but I’ve added things as well.
Saving and loading neural networks is always a little tricky. The best way to do it depends on what exactly you’re trying to do. Do you want to continue training the model? If so, you’ll need to save the optimizer state. If you just want to run it for inference, you might not need it. Things also get more complicated with custom functions. In this post, I’ll walk through how to save a FastAI model and then load it again for inference.
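The distinction between saving for continued training and saving for inference can be sketched without any deep learning library. The example below is a toy illustration using plain dictionaries and `pickle`, not the FastAI API; `save_checkpoint`, `load_checkpoint`, and the state contents are hypothetical names and values.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, model_state, optimizer_state=None):
    """Save model state, plus optimizer state only when it is provided."""
    checkpoint = {"model": model_state}
    if optimizer_state is not None:
        # Needed only if training will be resumed later
        checkpoint["optimizer"] = optimizer_state
    with open(path, "wb") as f:
        pickle.dump(checkpoint, f)


def load_checkpoint(path):
    """Load a checkpoint dictionary written by save_checkpoint."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Toy stand-ins for real model weights and optimizer buffers
model_state = {"weights": [0.1, -0.2, 0.3]}
optimizer_state = {"lr": 0.001, "momentum_buffer": [0.0, 0.01, -0.02]}

with tempfile.TemporaryDirectory() as tmp:
    train_path = os.path.join(tmp, "train.pkl")
    infer_path = os.path.join(tmp, "infer.pkl")

    save_checkpoint(train_path, model_state, optimizer_state)  # resume training
    save_checkpoint(infer_path, model_state)                   # inference only

    resumed = load_checkpoint(train_path)
    inference_only = load_checkpoint(infer_path)

print(sorted(resumed.keys()))
print(sorted(inference_only.keys()))
```

Custom functions complicate this because `pickle` stores functions by reference to their module and name, so the same definitions must be importable wherever the file is loaded.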
This post is a collection of some notes and thoughts I’ve had when working with FastAI.