In this notebook we'll be looking at DataBlock.summary and how to interpret its output
Libraries for today:
from fastai.vision.all import *
Below you will find the exact imports for everything we use today
from fastcore.transform import Pipeline
from fastai.data.block import CategoryBlock, DataBlock
from fastai.data.core import Datasets
from fastai.data.external import untar_data, URLs
from fastai.data.transforms import Categorize, GrandparentSplitter, IntToFloatTensor, Normalize, RandomSplitter, ToTensor, parent_label
from fastai.torch_core import to_device
from fastai.vision.augment import aug_transforms, Resize, RandomResizedCrop, FlipItem
from fastai.vision.data import ImageBlock, PILImage, get_image_files, imagenet_stats
We'll use ImageWoof like we did in previous notebooks
path = untar_data(URLs.IMAGEWOOF)
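Since we'll lean on the folder structure later when splitting, it's worth a quick peek: ImageWoof ships with separate train and val directories. (path.ls() is a convenience method fastai adds onto Path.)
path.ls()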
And create our label dictionary similarly
lbl_dict = dict(
  n02086240= 'Shih-Tzu',
  n02087394= 'Rhodesian ridgeback',
  n02088364= 'Beagle',
  n02089973= 'English foxhound',
  n02093754= 'Border terrier',
  n02096294= 'Australian terrier',
  n02099601= 'Golden retriever',
  n02105641= 'Old English sheepdog',
  n02111889= 'Samoyed',
  n02115641= 'Dingo'
)
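As a quick sketch of how this dictionary gets used downstream: parent_label reads the folder an image lives in, and the dictionary turns that synset id into a readable breed name (the exact values printed depend on which file comes back first)
fname = get_image_files(path)[0]
parent_label(fname)           # a synset id such as 'n02086240'
lbl_dict[parent_label(fname)] # the readable name, e.g. 'Shih-Tzu'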
Some minimal transforms to get us by
item_tfms = Resize(128)
batch_tfms = [*aug_transforms(size=224, max_warp=0), Normalize.from_stats(*imagenet_stats)]
bs=64
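If you're curious what Normalize.from_stats is unpacking, imagenet_stats is simply a (means, stds) pair with one value per channel:
imagenet_stats # ([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])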
And our DataBlock
pets = DataBlock(blocks=(ImageBlock, CategoryBlock),
                 get_items=get_image_files,
                 splitter=RandomSplitter(),
                 get_y=Pipeline([parent_label, lbl_dict.__getitem__]),
                 item_tfms=item_tfms,
                 batch_tfms=batch_tfms)
Now, to run .summary we need to pass in whatever source our DataBlock expects; in this case that's a path (think of how we build our DataLoaders from the DataBlock)
pets.summary(path)
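.summary also takes a few optional arguments; in recent fastai versions the signature includes bs and show_batch (double-check against your install), so something like this should work too:
pets.summary(path, bs=4, show_batch=True) # build a 4-item batch and display it at the end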
Reading the output, .summary walks through every stage for us: collecting the items, splitting them, building one sample through the type transforms, applying the item transforms, then collating a batch and applying the batch transforms. To make sense of those stages, let's rebuild a similar (slightly simplified) pipeline by hand with the mid-level API:
tfms = [[PILImage.create], [parent_label, Categorize()]]
item_tfms = [ToTensor(), Resize(128)]
batch_tfms = [FlipItem(), RandomResizedCrop(128, min_scale=0.35),
              IntToFloatTensor(), Normalize.from_stats(*imagenet_stats)]
items = get_image_files(path)
split_idx = GrandparentSplitter(valid_name='val')(items)
dsets = Datasets(items, tfms, splits=split_idx)
dls = dsets.dataloaders(after_item=item_tfms, after_batch=batch_tfms, bs=64)
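As a quick sanity check that the grandparent split worked (the exact counts are just whatever ImageWoof ships with):
len(dsets.train), len(dsets.valid)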
We'll want to grab the first item from our training set
x = dsets.train[0]
x
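At this point x is still a raw (input, target) tuple straight out of the type transforms; checking the types should confirm it (we'd expect a PILImage and a TensorCategory here):
type(x[0]), type(x[1])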
Next we'll pass x through the after_item and after_batch transform Pipelines. We can see what each Pipeline contains simply by accessing it:
dls.train.after_item
dls.train.after_batch
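One thing to note: a Pipeline sorts its transforms by each transform's order attribute rather than the order you listed them in. A quick way to see this (order is a standard attribute on fastai Transforms):
for f in dls.train.after_batch:
  print(f.name, f.order)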
And now we can pass our item through each Pipeline one transform at a time, like so
(x[0] holds our input and x[1] holds our y):
for f in dls.train.after_item:
  name = f.name
  x = f(x)
  print(name, x[0])
x = to_device(x, 'cuda') # batch transforms expect data on the device we train on, so move it to the GPU first
for f in dls.train.after_batch:
  name = f.name
  x = f(x)
  print(name, x[0])
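Stepping through the transforms by hand like this is exactly what .summary automates for us. To see the finished product the usual way, we can ask the DataLoader for a fully processed batch:
xb, yb = dls.train.one_batch()
xb.shape, yb.shape # we'd expect something like (torch.Size([64, 3, 128, 128]), torch.Size([64]))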