Lesson Video:


This article is also a Jupyter Notebook available to be run from the top down. There will be code snippets that you can then run in any environment.

Below are the versions of fastai, fastcore, and wwf currently running at the time of writing this:

  • fastai: 2.1.10
  • fastcore: 1.3.13
  • wwf: 0.0.7

This notebook goes through how to build a Siamese dataset from scratch.

What is a Siamese Problem?

A Siamese problem involves identifying whether two images belong to the same class.

Common use cases:

  • Person identification
  • Small classification sample size

Why Siamese?

Let's think of an example problem:

I own 3 dogs and I want to differentiate between the three of them from a photo, but I only have 5 images of each animal. By our normal standards, we would say that is far too little data to work with. But framed as a Siamese problem, those 15 images give us 120 training pairs (not including augmentation).
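As a sanity check on that count: 15 images yield 120 unordered pairs if an image may be paired with itself, and 105 if self-pairs are excluded. The 3-dogs-by-5-photos numbers below are just the example above, sketched with `itertools`:

```python
from itertools import combinations_with_replacement

# 3 dogs x 5 photos each = 15 images (the example's numbers)
images = [(dog, photo) for dog in range(3) for photo in range(5)]

# Unordered pairs, allowing an image to be paired with itself
pairs = list(combinations_with_replacement(images, 2))
same = sum(1 for a, b in pairs if a[0] == b[0])  # pairs from the same dog

print(len(pairs), same, len(pairs) - same)  # 120 total: 45 same, 75 different
```

Either way of counting, the pair framing turns a tiny dataset into a much larger one.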

Our example will use the PETS dataset. We won't be training, but if you're dealing with this problem, you should have all the tools you need by now.

Installing the library and starting to build the Dataset

from fastai.vision.all import *
src = untar_data(URLs.PETS)/'images'

We'll grab all the file names:

items = get_image_files(src)

And now we can start preparing our dataset. We will be doing everything at the lowest level possible today. First let's make a transform that will open some image from a filename and resize it.

def resized_image(fn:Path, sz=128):
  "Opens an image from `fn` and resizes it to `sz`"
  x = Image.open(fn).convert('RGB').resize((sz,sz))
  return tensor(array(x)).permute(2,0,1).float()/255.

Now let's get two random images (that we know are different)

img1 = resized_image(items[0], 448)
img2 = resized_image(items[1], 448)

Now we need some way of viewing our image, along with a title. Let's make a TitledImage class:

class TitledImage(fastuple):
  def show(self, ctx=None, **kwargs): show_titled_image(self, ctx=ctx, **kwargs)
TitledImage(img1, 'Test').show()

Now let's make something similar for a pair of images (our Siamese)

class SiameseImage(fastuple):
  def show(self, ctx=None, **kwargs):
    im1, im2, is_same = self
    return show_image(torch.cat([im1,im2], dim=2), title=is_same, ctx=ctx, **kwargs)

Let's look at two examples (which look remarkably similar to that image earlier):

SiameseImage(img1, img1, True).show(figsize=(7,7));
SiameseImage(img1, img2, False).show(figsize=(7,7));

SiamesePair

Now we need some transform to generate our Siamese dataset. We'll want it to take in a list of items and labels:

class SiamesePair(Transform):
  "A transform to generate Siamese data"
  def __init__(self, items, labels):
    self.items, self.labels, self.assoc = items,labels,self
    sortlbl = sorted(enumerate(labels), key=itemgetter(1))
    self.clsmap = {k:L(v).itemgot(0) for k,v in itertools.groupby(sortlbl, key=itemgetter(1))}
    self.idxs = range_of(self.items)
  
  def encodes(self,i):
    "x: tuple of `i`th image and a random image from same or different class; y: True if same class"
    othercls = self.clsmap[self.labels[i]] if random.random()>0.5 else self.idxs
    otherit = random.choice(othercls)
    same = tensor([self.labels[otherit]==self.labels[i]]).int()
    return SiameseImage(self.items[i], self.items[otherit], same)
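The trickiest line in `__init__` is the `clsmap` construction, which groups item indices by their label. Here is that same `sorted` + `groupby` idiom in plain Python on a toy label list (the labels here are hypothetical, and a plain list stands in for fastcore's `L`):

```python
import itertools
from operator import itemgetter

labels = ['cat', 'dog', 'cat', 'bird', 'dog']  # toy labels for illustration

# Pair each index with its label, then sort by label so groupby can cluster them
sortlbl = sorted(enumerate(labels), key=itemgetter(1))
clsmap = {k: [i for i, _ in v]
          for k, v in itertools.groupby(sortlbl, key=itemgetter(1))}

print(clsmap)  # {'bird': [3], 'cat': [0, 2], 'dog': [1, 4]}
```

With this map, `encodes` can flip a coin: half the time it draws the partner image from the same class's index list, half the time from all indices.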

We are going to want some labels to be used, so let's grab some:

labeller = RegexLabeller(pat = r'/([^/]+)_\d+\.jpg$')
labels = items.map(labeller)
labels[:5], len(labels)
((#5) ['Bengal','german_shorthaired','great_pyrenees','japanese_chin','english_setter'],
 7390)
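`RegexLabeller` simply applies a regex to each filename and returns the captured group. The same extraction can be sketched with the `re` module directly (the paths below are made up for illustration, and the dot is escaped so it matches a literal `.`):

```python
import re

pat = r'/([^/]+)_\d+\.jpg$'  # capture everything between the last '/' and '_<digits>.jpg'

fnames = ['/data/images/Bengal_101.jpg',
          '/data/images/german_shorthaired_3.jpg']  # hypothetical paths
labels = [re.search(pat, f).group(1) for f in fnames]

print(labels)  # ['Bengal', 'german_shorthaired']
```

Note that `[^/]+` is greedy, so multi-word breed names like `german_shorthaired` are captured whole; only the trailing `_<digits>.jpg` is stripped.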

Now we can build our SiamesePair transform

sp = SiamesePair(items, labels)

Let's look at a few bits

sp.clsmap
{'Abyssinian': (#200) [23,69,89,90,103,127,233,244,288,307...],
 'Bengal': (#200) [0,5,61,77,194,214,234,337,424,457...],
 'Birman': (#200) [16,78,108,158,216,239,321,397,406,482...],
 'Bombay': (#200) [39,41,62,63,84,97,177,187,224,281...],
 'British_Shorthair': (#200) [24,82,92,130,143,156,195,219,283,295...],
 'Egyptian_Mau': (#200) [35,109,183,185,210,259,280,323,355,372...],
 'Maine_Coon': (#200) [44,53,131,135,140,172,246,269,309,333...],
 'Persian': (#200) [99,123,125,126,138,145,168,197,218,223...],
 'Ragdoll': (#200) [50,112,160,240,274,331,351,395,412,450...],
 'Russian_Blue': (#200) [32,68,137,139,141,142,189,250,251,297...],
 'Siamese': (#200) [119,193,322,343,369,370,375,386,552,627...],
 'Sphynx': (#200) [46,75,115,117,174,199,211,231,243,335...],
 'american_bulldog': (#200) [34,59,107,133,169,176,190,241,278,340...],
 'american_pit_bull_terrier': (#200) [17,29,124,191,202,206,235,236,248,285...],
 'basset_hound': (#200) [14,36,149,184,227,255,264,266,276,299...],
 'beagle': (#200) [70,100,101,120,167,173,203,212,262,385...],
 'boxer': (#200) [58,73,161,188,196,222,228,254,275,301...],
 'chihuahua': (#200) [18,40,48,57,118,150,245,286,510,511...],
 'english_cocker_spaniel': (#200) [12,33,45,55,93,116,157,198,229,449...],
 'english_setter': (#200) [4,7,71,146,208,221,252,289,328,330...],
 'german_shorthaired': (#200) [1,102,106,166,178,213,277,290,306,367...],
 'great_pyrenees': (#200) [2,204,238,256,320,373,381,417,525,528...],
 'havanese': (#200) [13,43,114,209,215,242,271,315,366,371...],
 'japanese_chin': (#200) [3,38,42,72,86,136,159,298,308,311...],
 'keeshond': (#200) [11,20,91,110,132,342,348,383,438,444...],
 'leonberger': (#200) [6,19,28,31,54,65,104,151,171,179...],
 'miniature_pinscher': (#200) [15,21,25,52,147,163,270,338,358,359...],
 'newfoundland': (#200) [37,66,83,105,170,186,226,232,247,293...],
 'pomeranian': (#200) [8,47,51,64,122,134,144,181,253,267...],
 'pug': (#200) [49,113,129,192,303,346,382,423,545,574...],
 'saint_bernard': (#200) [27,98,257,265,279,380,413,428,435,439...],
 'samoyed': (#200) [56,60,85,95,111,155,237,287,313,332...],
 'scottish_terrier': (#199) [10,76,80,94,175,249,268,284,422,445...],
 'shiba_inu': (#200) [9,30,79,153,292,310,379,459,478,480...],
 'staffordshire_bull_terrier': (#191) [26,74,81,121,148,165,182,205,272,296...],
 'wheaten_terrier': (#200) [22,67,128,154,162,164,200,207,217,230...],
 'yorkshire_terrier': (#200) [87,88,96,152,201,263,273,362,404,430...]}
sp.labels
(#7390) ['Bengal','german_shorthaired','great_pyrenees','japanese_chin','english_setter','Bengal','leonberger','english_setter','pomeranian','shiba_inu'...]

Now finally, we can build our Pipeline

Bringing it to a DataLoader

First we'll want to make a Transform out of that resized_image function we had

OpenAndResize = Transform(resized_image)

And now that we have all the pieces together, let's build a Pipeline:

pipe = Pipeline([sp, OpenAndResize])
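It may look odd that `OpenAndResize` can be applied to the whole `SiameseImage` tuple, label included. The reason is that `resized_image` annotates its argument as `fn:Path`, so fastai's `Transform` only dispatches it onto the `Path` elements and passes everything else through. A rough stand-in for that type-dispatch behaviour (the function name here is hypothetical, not fastai API):

```python
from pathlib import Path

def open_and_resize_like(item):
    # Only Path elements get "opened"; labels pass through untouched,
    # mimicking how Transform dispatches on the fn:Path annotation
    return f'opened {item.name}' if isinstance(item, Path) else item

sample = (Path('a.jpg'), Path('b.jpg'), 1)
result = tuple(open_and_resize_like(x) for x in sample)

print(result)  # ('opened a.jpg', 'opened b.jpg', 1)
```

This is why the integer label survives the pipeline while both filenames become image tensors.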

And take a look at its first set:

x,y,z = pipe(0)
x.shape, y.shape, z
(torch.Size([3, 128, 128]),
 torch.Size([3, 128, 128]),
 tensor([0], dtype=torch.int32))

To turn anything into a DataLoader, we first want it to be a TfmdLists. We can accomplish this by passing in a list of indices and a Pipeline to run through:

tls = TfmdLists(range_of(items), pipe)

And now make our DataLoaders:

dls = tls.dataloaders(bs=16, after_batch=[Normalize.from_stats(*imagenet_stats)])
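`Normalize.from_stats` subtracts the per-channel mean and divides by the standard deviation. A quick back-of-envelope check with the well-known ImageNet statistics (the values are hard-coded here for illustration; in fastai they come from `imagenet_stats`):

```python
# Approximate ImageNet channel statistics (mean, std for R, G, B)
mean = [0.485, 0.456, 0.406]
std  = [0.229, 0.224, 0.225]

# A red-channel pixel value after the /255. scaling in resized_image
px = 0.5
normalized = (px - mean[0]) / std[0]

print(round(normalized, 4))  # 0.0655
```

Because this runs as an `after_batch` transform, it is applied on the GPU to whole batches rather than image by image.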

And we can look at a batch!

batch = dls.one_batch()

Now I did not get the show function working, so let's take a look the very "simple" way:

a,b,c = batch[0][0], batch[1][0], batch[2][0]
a.shape, b.shape, c
(torch.Size([3, 128, 128]),
 torch.Size([3, 128, 128]),
 tensor([1], dtype=torch.int32))
from torchvision import transforms
im1 = transforms.ToPILImage()(batch[0][0]).convert("RGB")
im2 = transforms.ToPILImage()(batch[1][0]).convert("RGB")
display(im1, im2)
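If you want the pair shown side by side without the `show` machinery, the `torch.cat` trick from `SiameseImage.show` can be mimicked with plain PIL pasting. Solid-colour stand-ins are used below, since this sketch doesn't load the real batch:

```python
from PIL import Image

# Hypothetical pair: two solid-colour stand-ins for the decoded batch images
im1 = Image.new('RGB', (128, 128), 'red')
im2 = Image.new('RGB', (128, 128), 'blue')

# Concatenate along the width axis, like torch.cat([im1, im2], dim=2)
pair = Image.new('RGB', (im1.width + im2.width, im1.height))
pair.paste(im1, (0, 0))
pair.paste(im2, (im1.width, 0))

print(pair.size)  # (256, 128)
```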

While the built-in show methods aren't wired up, you now have a DataLoader ready for Siamese networks!

For more information on training a model and predicting, see the official fastai Siamese tutorial here