Dataset:
Our dataset today will be ImageWoof. Link
Goal: Using no pre-trained weights, see how well of accuracy we can get in x epochs
This dataset is generally harder than imagenette, both are a subset of ImageNet.
Models are leaning more towards being faster, more effecient
Let's import the library:
from fastai.vision.all import *
Below you will find the exact imports for everything we use today
import kornia
from torch import nn
from fastai.callback.progress import ProgressCallback
from fastai.callback.schedule import lr_find, fit_flat_cos, fit_one_cycle
from fastai.data.core import Datasets
from fastai.data.external import untar_data, URLs
from fastai.data.transforms import Categorize, GrandparentSplitter, IntToFloatTensor, Normalize, ToTensor, parent_label
from fastai.layers import Mish
from fastai.learner import Learner
from fastai.metrics import LabelSmoothingCrossEntropy, top_k_accuracy, accuracy
from fastai.optimizer import ranger, Lookahead, RAdam
from fastai.vision.augment import FlipItem, RandomResizedCrop, Resize
from fastai.vision.core import PILImage, imagenet_stats, get_image_files
from fastai.vision.models.xresnet import xresnet50
Let's grab our data. For the competition, we'll focus on 5 epochs at 128x128
path = untar_data(URLs.IMAGEWOOF)
There are a few more datasets available:
ImageNette: Slightly easier than ImageWoof, 10 easily classified classes from Imagenet (tench, English springer, cassette player, chain saw, church, French horn, garbage truck, gas pump, golf ball, parachute)
ImageWoof: 10 different dog breeds, Australian terrier, Border terrier, Samoyed, Beagle, Shih-Tzu, English foxhound, Rhodesian ridgeback, Dingo, Golden retriever, Old English sheepdog
Image网 (Pronounced Imagewang, "net" in Chinese): Both ImageNette and Woof but:
- The validation set is the same as Imagewoof (i.e. 30% of Imagewoof images); there are no Imagenette images in the validation set (they're all in the training set)
- Only 10% of Imagewoof images are in the training set!
- The remaining are in the unsup ("unsupervised") directory, and you can not use their labels in training!
We'll use the low-level Dataset
API for this.
tfms = [[PILImage.create], [parent_label, Categorize()]]
item_tfms = [ToTensor(), Resize(128)]
batch_tfms = [FlipItem(), RandomResizedCrop(128, min_scale=0.35),
IntToFloatTensor(), Normalize.from_stats(*imagenet_stats)]
Let's make our split
items = get_image_files(path)
split_idx = GrandparentSplitter(valid_name='val')(items)
Now let's build our Datasets
dsets = Datasets(items, tfms, splits=split_idx)
And our DataLoaders
dls = dsets.dataloaders(after_item=item_tfms, after_batch=batch_tfms, bs=64)
Let's make sure they look okay
dls.show_batch()
Wait, those aren't species names! How do I get that readable?
- We can use a dictionary as a transform on our
y
's
lbl_dict = dict(
n02086240= 'Shih-Tzu',
n02087394= 'Rhodesian ridgeback',
n02088364= 'Beagle',
n02089973= 'English foxhound',
n02093754= 'Australian terrier',
n02096294= 'Border terrier',
n02099601= 'Golden retriever',
n02105641= 'Old English sheepdog',
n02111889= 'Samoyed',
n02115641= 'Dingo'
)
To do so we pass in the __getitem__
attribute to our transforms
tfms = [[PILImage.create], [parent_label, lbl_dict.__getitem__, Categorize()]]
dsets = Datasets(items, tfms, splits=split_idx)
dls = dsets.dataloaders(after_item=item_tfms, after_batch=batch_tfms, bs=64)
Let's make sure it worked
dls.show_batch()
Much better!
The Architecture and the new Improvements
- XResNet - Based on the Bag of Tricks for Resnet paper
- Mish - A new activation function that has shown fantastic results
- Ranger - A new optimizer function based on two seperate papers:
- Self-Attention - Bringing in ideas from GAN's into image classification (Incorportated by Seb)
- MaxBlurPool - Better generalization
- Flatten + Anneal Scheduling - Mikhail Grankin
- Label Smoothing Cross Entropy - A threshold base (were you close) rather than yes or no
All of these ideas live in the library (except Dialated Convolutions for now). But where? Let's walk through it
arch = xresnet50(pretrained=False)
After the ImageWoof
competition, almost all of the above can simply be a setting in xresnet
.
arch = xresnet50(pretrained=False, act_cls=Mish)
arch[0]
arch = xresnet50(pretrained=False, act_cls=Mish, sa=True)
For this one we'll need a custom function
# https://discuss.pytorch.org/t/how-can-i-replace-an-intermediate-layer-in-a-pre-trained-network/3586/7
import kornia
def convert_MP_to_blurMP(model, layer_type_old):
conversion_count = 0
for name, module in reversed(model._modules.items()):
if len(list(module.children())) > 0:
# recurse
model._modules[name] = convert_MP_to_blurMP(module, layer_type_old)
if type(module) == layer_type_old:
layer_old = module
layer_new = kornia.contrib.MaxBlurPool2d(3, True)
model._modules[name] = layer_new
return model
net = xresnet50(pretrained=False, act_cls=Mish, sa=True, n_out=10)
net = convert_MP_to_blurMP(net, nn.MaxPool2d)
Now that we know how to use it all, let's test it out!
opt_func = ranger
That is the same thing as:
def opt_func(ps, lr=1e-3): return Lookahead(RAdam(ps, lr=lr))
opt_func = opt_func
We'll also use Label Smoothing Cross Entropy as our loss
learn = Learner(dls, model=net, loss_func=LabelSmoothingCrossEntropy(), metrics=[top_k_accuracy, accuracy])
Finally let's find and fit our model
learn.lr_find()
And fit for 5 epochs! We want to use fit_flat_cos
that Mikahil Grankin came up with. Why?
- We noticed gradient blow up. So instead of One-Cycle:
from fastai.test_utils import synth_learner
synth = synth_learner()
synth.fit_one_cycle(1)
synth.recorder.plot_sched()
We fit with Cosine Annealing:
synth.fit_flat_cos(1, pct_start=0.72)
synth.recorder.plot_sched()
net = xresnet50(pretrained=False, act_cls=Mish, sa=True, n_out=10)
net = convert_MP_to_blurMP(net, nn.MaxPool2d)
learn = Learner(dls, model=net, loss_func=LabelSmoothingCrossEntropy(), metrics=[top_k_accuracy, accuracy])
learn.fit_one_cycle(5, 2e-3)
net = xresnet50(pretrained=False, act_cls=Mish, sa=True, n_out=10)
net = convert_MP_to_blurMP(net, nn.MaxPool2d)
learn = Learner(dls, model=net, loss_func=LabelSmoothingCrossEntropy(), metrics=[top_k_accuracy, accuracy], opt_func=ranger)
learn.fit_flat_cos(5, 4e-3)