Removing most of the fastai library to train a model
Lesson 2
Introduction
In this chapter we’re going to go back to the previous lesson and train on the PETs dataset again. This time, however, there is a specific set of rules we will follow:
We cannot use the fastai data API, it must be done in raw PyTorch
We cannot use vision_learner, we must create our own model
We cannot use fastai’s Optimizer, it must be a PyTorch optimizer.
If you don’t know what that last part is, that is okay. We’ll cover it briefly in this lecture.
Removing the Data API
Downloading the dataset
The only parts of fastai we are allowed to use are untar_data (to get the dataset) and imagenet_stats, so let’s import them and grab the dataset now:
from fastai.data.external import untar_data, URLs
from fastai.vision.data import imagenet_stats
from fastcore.xtras import Path # to bring in some patched functionalities we will use later

dataset_path = untar_data(URLs.PETS)
dataset_path.ls()
__init__
In Python this is the class constructor, which is what gets called when you do MyClass(). For our class this will include taking in and storing the list of filenames, the transforms, and a way to turn the label strings into numbers.
__len__
In Python this function is how you get the length of some collection of data or items when doing len(MyThing()). For our class this will return the number of items in the dataset.
__getitem__
In Python this function is what gets called when you index into an object, such as myList[x], and it returns whatever you are trying to grab when doing so. For our class this will grab and open a file, apply the transforms, and return a tuple of the image and the label.
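Before looking at the real dataset class, here is a toy sketch of how Python routes those three calls. The Playlist class and its contents are hypothetical and exist only for illustration; they are not part of the lesson’s code.

```python
class Playlist:
    "A toy collection (hypothetical) that implements the three special methods"

    def __init__(self, songs: list):
        # Runs when we call Playlist([...]); we store what we need for later
        self.songs = songs

    def __len__(self):
        # Runs when we call len(playlist)
        return len(self.songs)

    def __getitem__(self, index):
        # Runs when we index with playlist[index]
        return self.songs[index]


playlist = Playlist(["intro", "verse", "outro"])
n_songs = len(playlist)   # calls Playlist.__len__
first_song = playlist[0]  # calls Playlist.__getitem__
print(n_songs, first_song)  # 3 intro
```

A PyTorch Dataset follows the same pattern, except that indexing into it returns an (input, label) pair.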
import re
from PIL import Image
from torch import nn
from torch.utils.data import Dataset
from torchvision.transforms import ToTensor

# This example is highly based on the work of Sylvain Gugger
# for the Accelerate notebook example which can be found here:
# https://github.com/huggingface/notebooks/blob/main/examples/accelerate_examples/simple_cv_example.ipynb
class PetsDataset(Dataset):
    "A basic dataset that will return a tuple of (image, label)"
    def __init__(self, filenames:list, transforms:nn.Sequential, label_to_int:dict):
        self.filenames = filenames
        self.transforms = transforms
        self.label_to_int = label_to_int
        self.to_tensor = ToTensor()

    def __len__(self):
        return len(self.filenames)

    def apply_x_transforms(self, filename):
        image = Image.open(filename).convert("RGB")
        tensor_image = self.to_tensor(image)
        return self.transforms(tensor_image)

    def apply_y_transforms(self, filename):
        label = re.findall(r"^(.*)_\d+\.jpg$", filename.name)[0].lower()
        return self.label_to_int[label]

    def __getitem__(self, index):
        filename = self.filenames[index]
        x = self.apply_x_transforms(filename)
        y = self.apply_y_transforms(filename)
        return (x,y)
apply_x_transforms first opens an image in Pillow and converts it to RGB mode (three color channels), then turns this PIL Image into a tensor, before finally applying the transforms we want applied to it.
apply_y_transforms uses regex to extract the label from the filename, based on the expectation that it will show up as label_{some_number}.jpg, and then converts this string label into an integer based on the label-to-integer dictionary.
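To see what that regex does in isolation, here is a small sketch using only the standard library. The two sample paths are made up to follow the label_{some_number}.jpg pattern, and extract_label is a hypothetical helper name.

```python
import re
from pathlib import Path

# Same pattern as in apply_y_transforms: everything before the final "_<digits>.jpg"
label_pat = r"^(.*)_\d+\.jpg$"

def extract_label(filename: Path) -> str:
    "Hypothetical helper: pull the lowercased label out of a PETs-style filename"
    return re.findall(label_pat, filename.name)[0].lower()

# Made-up example paths that follow the dataset's naming scheme
sample_paths = [Path("images/Abyssinian_12.jpg"), Path("images/great_pyrenees_173.jpg")]
sample_labels = [extract_label(p) for p in sample_paths]
print(sample_labels)  # ['abyssinian', 'great_pyrenees']
```

Because (.*) is greedy, labels that contain underscores themselves (such as great_pyrenees) are still captured whole; only the trailing _{some_number}.jpg is stripped.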
def __getitem__(self, index):
    filename = self.filenames[index]
    x = self.apply_x_transforms(filename)
    y = self.apply_y_transforms(filename)
    return (x,y)
This function first grabs the filename we want to use based on the index passed, then calls the apply_x_transforms and apply_y_transforms functions we defined, before finally returning a tuple of the input and output.
Prepare for the Dataset
Next we need to prepare for the dataset by:
Getting a dictionary of labels to encoded classes
Splitting the dataset randomly 80/20
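The second step in the list above can be sketched with nothing but the standard library. This is one possible way to do a random 80/20 split, not necessarily the one used in the lesson; random_split, the seed, and the example filenames are all assumptions made for illustration.

```python
import random

def random_split(items: list, valid_pct: float = 0.2, seed: int = 42):
    "Hypothetical helper: shuffle a copy of `items` and cut it into (train, valid)"
    rng = random.Random(seed)      # a seeded generator keeps the split reproducible
    shuffled = list(items)         # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    cut = len(shuffled) - n_valid
    return shuffled[:cut], shuffled[cut:]

# Made-up filenames standing in for the real image paths
example_files = [f"pet_{i}.jpg" for i in range(10)]
train_files, valid_files = random_split(example_files)
print(len(train_files), len(valid_files))  # 8 2
```

Each file lands in exactly one of the two lists, so the training and validation sets never overlap.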
Labels as encoded classes
To get the labels as encoded classes, we can create a list of just labels then find the unique ones from them:
This uses functionality that fastcore monkey-patches onto pathlib.Path to perform an ls operation on the path, returning only files that end with .jpg.
A map will apply some function to every single item in a collection. Generally it’s seen as map(func, items). Since this list is a fastcore.foundation.L, we can just call its map() method directly and have it produce the labels.
lambda x:
A lambda function is what is called an anonymous function. These don’t need def name():... and instead treat whatever goes before the : as the input.
re.findall(label_pat, x.name)[0].lower()
This will apply our label_pat to the filename, return the first found item, and lowercase it.
unique()
This will look inside our resulting labels and return a list of every single unique value inside it.
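The pieces described above can be put together in plain Python. This sketch deliberately swaps fastcore’s L.map and unique for the built-in map and dict.fromkeys so that it runs without fastcore; the filenames are made up, and the way label_to_int is built here is an assumption about what the dictionary looks like.

```python
import re
from pathlib import Path

label_pat = r"^(.*)_\d+\.jpg$"

# Made-up stand-ins for the result of listing the image folder
filenames = [Path("Bengal_1.jpg"), Path("pug_7.jpg"), Path("Bengal_20.jpg"), Path("beagle_3.jpg")]

# map(func, items): apply the anonymous function to every filename
all_labels = list(map(lambda x: re.findall(label_pat, x.name)[0].lower(), filenames))

# Drop duplicates while keeping first-seen order (plain-Python stand-in for unique())
labels = list(dict.fromkeys(all_labels))

# Encode each label string as an integer class
label_to_int = {label: index for index, label in enumerate(labels)}
print(label_to_int)  # {'bengal': 0, 'pug': 1, 'beagle': 2}
```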
Indexing into the dataset returns an image, which has been transformed into a 224x224 tensor, and a class label of 3.
Creating PyTorch Dataloaders
Next we need to create a set of PyTorch DataLoader objects to use. These will get wrapped by fastai’s DataLoaders class, which lets us use them directly in the framework:
The DataLoaders class accepts any number of DataLoader objects (from fastai or PyTorch), and each is accessible through dls[index]; however, only the first two are available as dls.train and dls.valid respectively.
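As a rough sketch of this step, the snippet below builds two PyTorch DataLoader objects from random stand-in data rather than the real PetsDataset. The tensor shapes, batch size, and shuffle settings are assumptions chosen to keep the example small; the fastai wrapping is shown only as a comment so the block runs with PyTorch alone.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Random stand-in data: tiny 3x8x8 "images" with labels in [0, 37)
train_ds = TensorDataset(torch.randn(8, 3, 8, 8), torch.randint(0, 37, (8,)))
valid_ds = TensorDataset(torch.randn(2, 3, 8, 8), torch.randint(0, 37, (2,)))

# Shuffle only the training data; the validation order does not matter
train_dl = DataLoader(train_ds, batch_size=4, shuffle=True, drop_last=True)
valid_dl = DataLoader(valid_ds, batch_size=4)

# Each batch is a tuple of (inputs, labels)
xb, yb = next(iter(train_dl))
print(tuple(xb.shape), tuple(yb.shape))  # (4, 3, 8, 8) (4,)

# With fastai installed, the pair could then be wrapped like so:
#   from fastai.data.core import DataLoaders
#   dls = DataLoaders(train_dl, valid_dl)
```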
Creating a PyTorch Model
We’ll use a method similar to the one shown earlier in this lesson to create a pretrained model through PyTorch:
from torchvision.models import resnet34
model = resnet34(pretrained=True)
And change the last layer’s outputs to be our number of classes:
The last thing we need to do is perform gradual unfreezing of our layers. What is this?
Gradual Unfreezing
In Concept
When we loaded the model in through vision_learner, it made a number of changes to our model:
It “cut off” that fc layer as well as the pooling layer (avgpool) to create a body
It used create_head to make a new head that fastai uses for their cnn models
It then froze the backbone of the model, or the body
Freezing means that the parameters inside that section of the model are considered untrainable, so they won’t get updated as we train. This will be applied to both model bodies shown.
Essentially, it looks like so:
While we won’t create a custom head, we will still be performing the freezing:
All of a PyTorch model’s layers live in the children() generator. We can select every layer except the last one by turning that into a list and slicing it with [:-1]:
for layer in list(model.children())[:-1]:
    if hasattr(layer, "requires_grad_"):
        layer.requires_grad_(False)
Why iterate over all the layers including the pooling layer?
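Pooling layers have no parameters, so calling requires_grad_(False) on them changes nothing. That makes it safe to loop over every layer before the head without special-casing the pooling layer.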
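To check what freezing actually does, here is a sketch on a tiny stand-in model instead of the real ResNet. The layer sizes are arbitrary assumptions; the layout (body layers, a pooling layer, then a head) mirrors the setup described above.

```python
from torch import nn

# Tiny stand-in: two body layers, a pooling layer, then the head
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3),
    nn.BatchNorm2d(8),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 37),
)

# Freeze everything except the last layer
for layer in list(model.children())[:-1]:
    layer.requires_grad_(False)

# The pooling layer holds no parameters, so freezing it was a no-op
pool_params = list(model[2].parameters())

# Only the head's weight and bias are still trainable
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(pool_params, trainable)  # [] ['4.weight', '4.bias']
```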
And now the backbone of the model is frozen. We’re almost there!
Creating an Optimizer
The last step we will go over here is creating the Optimizer.
What is an optimizer?
It is the backbone of our training. It is what goes through and calculates how to update our weights relative to our loss for a particular batch.
By default fastai uses the AdamW optimizer (shown as Adam in fastai). As a result we’ll use it here:
from torch.optim import AdamW
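To make the description of an optimizer concrete, here is a sketch of a single update step in raw PyTorch. The small linear model, the random batch, and the learning rate are assumptions made purely for illustration.

```python
import torch
from torch import nn
from torch.optim import AdamW

torch.manual_seed(0)
model = nn.Linear(4, 2)                        # toy model with 4 inputs and 2 classes
opt = AdamW(model.parameters(), lr=1e-3)       # the optimizer owns the model's parameters

x, y = torch.randn(8, 4), torch.randint(0, 2, (8,))  # one random batch
before = model.weight.detach().clone()

loss = nn.functional.cross_entropy(model(x), y)  # how wrong are we on this batch?
loss.backward()                                  # compute gradients of the loss
opt.step()                                       # update the weights using those gradients
opt.zero_grad()                                  # clear gradients before the next batch

changed = not torch.equal(before, model.weight)
print(changed)  # True
```

fastai’s Learner runs this same backward, step, zero_grad cycle for us on every batch.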
To use a PyTorch optimizer in the fastai framework, we make use of an OptimWrapper class fastai has to convert the PyTorch optimizer into something compatible:
from functools import partial
from fastai.optimizer import OptimWrapper

opt_func = partial(OptimWrapper, opt=AdamW)
partial creates a new callable with some of a function’s arguments already filled in, so when we call opt_func() now it will automatically have the opt parameter set to the AdamW class.
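Here is a small standard-library sketch of that idea. make_optimizer is a hypothetical stand-in for a wrapper such as OptimWrapper, and it returns a plain dictionary just so we can see which arguments arrived.

```python
from functools import partial

def make_optimizer(params, opt, lr=1e-3):
    "Hypothetical stand-in for a wrapper class such as OptimWrapper"
    return {"params": params, "opt": opt, "lr": lr}

# Pre-fill the `opt` keyword; everything else is supplied at call time
opt_func = partial(make_optimizer, opt="AdamW")

result = opt_func([1, 2, 3], lr=0.01)
print(result)  # {'params': [1, 2, 3], 'opt': 'AdamW', 'lr': 0.01}
```

This matters because the caller only has to supply the parameters and a learning rate; the choice of optimizer is already baked in.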
Bringing in fastai and Training!
We have all the steps in place now to finally begin training. As mentioned previously fastai’s training magic is all within the Learner class. As a result, we will import it and any patched methods we want to use:
from fastai.losses import CrossEntropyLossFlat
from fastai.metrics import accuracy
from fastai.learner import Learner
from fastai.callback.schedule import Learner # To get `fit_one_cycle`, `lr_find`, and more
To bring in any @patched functions defined in a fastai module, we can either import the entire module or import the patched class from it. Neither approach immediately pollutes the namespace; however, the one shown here is better for code clarity.
We then pass all the items we’ve written so far to the Learner:
When performing inference, we set the model to evaluation mode. This changes the behavior of layers that act differently during training, such as BatchNorm, which stops updating its running statistics and uses the stored ones so that results are deterministic.
with torch.no_grad():
    preds = net(tfm_x.cuda())
We wrap the prediction in torch.no_grad to skip calculating the gradients. This makes inference a bit faster and saves a bit of memory. The input also needs to be on the same device as the model (cuda here).
pred = preds.argmax(dim=-1)[0]
label = list(label_to_int.keys())[pred]
To get the class result, we find which index had the highest value. Since we only predicted on one image, we can take the first item. Finally, we can take our label_to_int dictionary from earlier and index into its keys to grab the predicted label.
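The whole inference flow can be sketched end to end on a toy model. The three-class label_to_int dictionary, the tiny network, and the random input below are all stand-ins invented for this example, and the device is picked automatically so the block also runs without a GPU.

```python
import torch
from torch import nn

# Hypothetical label mapping and a tiny stand-in network
label_to_int = {"beagle": 0, "bengal": 1, "pug": 2}
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, len(label_to_int)))

device = "cuda" if torch.cuda.is_available() else "cpu"
net = net.to(device)
tfm_x = torch.randn(1, 3, 8, 8)  # a batch holding one random "image"

net.eval()                        # switch layers such as BatchNorm to inference behavior
with torch.no_grad():             # no gradients are tracked inside this block
    preds = net(tfm_x.to(device))

# Index of the highest score for the first (and only) item, as a plain int
pred = preds.argmax(dim=-1)[0].item()
label = list(label_to_int.keys())[pred]
print(label)
```

Indexing into the keys works because dictionaries preserve insertion order, so position i in the key list corresponds to class i.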