Now let's grab what we need:
from fastai.tabular.all import *
from tsai.all import *
For our data we'll be using the UCR repository, which has 128 univariate and 30 multivariate datasets. With the framework we can quickly grab any dataset we want:
name = 'StarLightCurves'
We can now grab our train and validation sets by calling get_UCR_data:
X_train, y_train, X_valid, y_valid = get_UCR_data(name, verbose=True, on_disk=True)
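As a quick sanity check (a minimal sketch; the exact numbers depend on the dataset you chose), we can print the shapes that came back. Each X is laid out as samples x variables x timesteps, and each y holds one label per sample:
# Inspect the arrays returned by get_UCR_data
print(X_train.shape, y_train.shape)
print(X_valid.shape, y_valid.shape)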
Since the data is already split into train and test sets in the UCR repository, we're going to merge them and create indices that recover the same split. To save memory, the tsai author figured out a way to use the NumPy arrays directly from disk rather than loading everything into RAM. We'll do so as follows:
X = np.concatenate((X_train, X_valid))
y = np.concatenate((y_train, y_valid))
np.save('./data/UCR/StarLightCurves/X.npy', X)
np.save('./data/UCR/StarLightCurves/y.npy', y)
del X, y
Now we can load them back in and make our splits:
X = np.load('./data/UCR/StarLightCurves/X.npy', mmap_mode='r')
y = np.load('./data/UCR/StarLightCurves/y.npy', mmap_mode='r')
splits = (L(np.arange(len(X_train)), use_list=True),
L(np.arange(len(X_train), len(X)), use_list=True))
splits
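We can sanity-check those indices with nothing but what we already have: the first list should line up with the original training set, and together the two lists should cover every sample exactly once (an illustrative check, not part of the original pipeline):
# The two index lists should partition the concatenated array without overlap
assert len(splits[0]) == len(X_train)
assert len(splits[0]) + len(splits[1]) == len(X)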
Since we used mmap_mode, the data is read directly from disk rather than being loaded into memory. To make and use your own data, it needs to be a three-dimensional array with the following format (see the sketch after this list):
- Samples
- Variables
- Length (or timesteps)
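If your own data happens to be a plain 2D array of shape (samples, timesteps), one way to add the missing variables axis is to insert a channel dimension. This is a minimal sketch with a made-up array (my_array is purely hypothetical):
# A hypothetical univariate dataset: 100 samples, 570 timesteps each
my_array = np.random.randn(100, 570)
# Insert a variables (channel) axis so the shape becomes (samples, 1, timesteps)
my_array_3d = my_array[:, None, :]
my_array_3d.shape  # (100, 1, 570)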
To use this we have a special TSTensor built to handle such data:
t = TSTensor(X)
t
Here we can see we have 60 samples, each with one variable and an overall time length of 570.
We can use this directly in our DataBlock API too, with a TSTensorBlock:
splitter = IndexSplitter(splits[1])
getters = [ItemGetter(0), ItemGetter(1)]
dblock = DataBlock(blocks=(TSTensorBlock, CategoryBlock),
getters=getters,
splitter=splitter)
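IndexSplitter simply takes the validation indices and returns a function that, given the items, hands back (train_idxs, valid_idxs). A quick illustrative check using only the objects we already defined:
# Applying the splitter to a dummy collection reproduces our train/valid split
train_idx, valid_idx = splitter(range(len(X)))
len(train_idx), len(valid_idx)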
Next we make a call to itemify, which zips up our x's and our y's:
src = itemify(X, y)
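Under the hood, itemify is roughly list(zip(X, y)) wrapped in a fastcore L, so each element is an (x, y) pair. A small illustrative peek:
# Each item pairs one series with its label
x0, y0 = src[0]
x0.shape, y0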
And now we can make our DataLoaders:
dls = dblock.dataloaders(src, bs=64, val_bs=128)
dls.show_batch(max_n=3)
And there's our data! Are we still reading from disk? Yes:
dls.dataset[0]
Training our Model
The particular model we're using is the InceptionTime model. To build it we need the number of classes and the number of input variables:
dls.c
inp_vars = dls.dataset[0][0].shape[-2]
inp_vars
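Why shape[-2]? Each item's x is a TSTensor shaped (variables, timesteps), so the second-to-last dimension is the number of variables (channels). An illustrative look at a single item:
# For a univariate dataset this shows a shape like (1, seq_len)
x0, y0 = dls.dataset[0]
x0.shape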
net = InceptionTime(inp_vars, dls.c)
learn = Learner(dls, net, loss_func=CrossEntropyLossFlat(), metrics=accuracy, opt_func=ranger)
learn.lr_find()
And now we can fit!
learn.fit_flat_cos(10, 0.025)