Interactive Training using Python

The Layer class (layer.py) provides the following methods for interactive training. For basic usage of the Python binding, please refer to python.md.

ComputeFeature(self, *srclys)

This method creates and sets up the singa::Layer, records its source layers, and then calls singa::Layer::ComputeFeature(…) for data transformation.
- *srclys: (an arbitrary number of) source layers

ComputeGradient(self)

This method calls singa::Layer::ComputeGradient(…) for gradient computation.

GetParams(self)

This method calls singa::Layer::GetParam() to retrieve parameter values of the layer. Currently, it returns weight and bias. Each parameter is a 2D numpy array.

SetParams(self, *params)

This method sets parameter values of the layer.
- *params: (an arbitrary number of) parameters, each of which is a 2D numpy array. Typically, these are the weight and the bias.
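As a rough illustration of how the two methods pair up, the sketch below copies parameters from one layer to another. `StandInLayer` is a hypothetical pure-Python substitute written only for this example: it is not part of SINGA, it stores nested lists rather than numpy arrays, and it assumes that GetParams() returns a sequence that can be unpacked. Only the method names GetParams and SetParams come from the description above.

```python
import copy


class StandInLayer(object):
    """Hypothetical stand-in that mimics only GetParams/SetParams."""

    def __init__(self, weight, bias):
        self._params = [weight, bias]

    def GetParams(self):
        # Return copies so callers cannot modify the layer by accident.
        return copy.deepcopy(self._params)

    def SetParams(self, *params):
        # Accept an arbitrary number of parameters, like SetParams(self, *params).
        self._params = [copy.deepcopy(p) for p in params]


# Both parameters are 2D (a list of rows); the bias shape is an assumption.
src = StandInLayer(weight=[[0.1, 0.2], [0.3, 0.4]], bias=[[0.0, 0.0]])
dst = StandInLayer(weight=[[0.0, 0.0], [0.0, 0.0]], bias=[[1.0, 1.0]])

# Unpack the values returned by GetParams() into SetParams(*params).
dst.SetParams(*src.GetParams())
```

With a real layer, the same unpacking pattern would apply as long as GetParams() returns the weight and bias in the order SetParams() expects.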

The Dummy class is a subclass of Layer that is provided to fetch input data and/or label information. Specifically, it creates a singa::DummyLayer.

Feed(self, shape, data, aux_data)

This method sets input data and/or auxiliary data such as labels.
- shape: the shape (width and height) of the dataset
- data: input dataset
- aux_data: auxiliary dataset (e.g., labels)

In addition, Dummy class has two subclasses named ImageInput and LabelInput.

The ImageInput class takes three arguments, as follows.

__init__(self, height=None, width=None, nb_channel=1)

Both the ImageInput and LabelInput classes have their own Feed method, which calls the Feed method of the Dummy class.

Feed(self, data)

Example scripts for the interactive training

Two example scripts are provided, train_mnist.py and train_cifar10.py. The former trains an MLP model on the MNIST dataset, and the latter trains a CNN model on the CIFAR10 dataset.

- Assume that nn is a neural network model, i.e., a list of layers. Currently, these examples consider sequential models only. Example MLP and CNN models are shown below.
- The load_dataset() method loads the input data and the corresponding labels, each of which is a 2D numpy array. For example, loading the MNIST dataset returns x: [60000 x 784] and y: [60000 x 1]. Loading the CIFAR10 dataset returns x: [10000 x 3072] and y: [10000 x 1].
- sgd is an Updater instance. Please see python.md and model.py for more details.
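The column counts quoted for x follow from flattening each image into a single row. A minimal sketch of that arithmetic, using a hypothetical helper `flattened_size` (not part of SINGA) and the image sizes passed to ImageInput in the example models:

```python
def flattened_size(height, width, nb_channel=1):
    """Number of values per sample once an image is flattened into one row."""
    return height * width * nb_channel


mnist_cols = flattened_size(28, 28)       # 784, grayscale 28 x 28 images
cifar10_cols = flattened_size(32, 32, 3)  # 3072, 3-channel 32 x 32 images
```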

Basic steps for the interactive training

- Step 1: Prepare a batch of data and the corresponding label information, and then input the data using the Feed() method.
- Step 2: (a) Transform the data according to the neuralnet (nn) structure using ComputeFeature(). Note that this example considers a sequential model, so it uses a simple loop. (b) Provide the label information so that the loss layer can compute the loss function. (c) Optionally, print out the training performance, e.g., loss and accuracy.
- Step 3: Compute the gradients in the reverse order of the neuralnet (nn) structure using ComputeGradient().
- Step 4: Update the parameters, e.g., weight and bias, of the layers using Update() of the updater.
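For Step 1, the example script walks through the dataset in slices of bsize rows and iterates only over the full batches, so any remainder is skipped. The sketch below shows the row ranges this produces. `batch_ranges` is a hypothetical helper written for illustration, and the sample count is the MNIST size quoted above.

```python
def batch_ranges(nb_samples, bsize):
    """Yield (start, stop) row indices for each full mini-batch."""
    for i in range(nb_samples // bsize):
        yield i * bsize, (i + 1) * bsize


ranges = list(batch_ranges(60000, 64))
nb_batches = len(ranges)          # 937 full batches
leftover = 60000 - ranges[-1][1]  # 32 trailing samples are never visited
```

Each pair can be used directly as a slice, e.g. `x[start:stop, :]`, which is what the script does with `i*bsize` and `(i+1)*bsize`.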

Here is an example script for the interactive training.

```
bsize = 64  # batchsize
disp_freq = 10  # step to show the training accuracy

x, y = load_dataset()

# Integer division: only full batches are processed
for i in range(x.shape[0] // bsize):

    # (Step1) Input data containing "bsize" samples
    xb, yb = x[i*bsize:(i+1)*bsize, :], y[i*bsize:(i+1)*bsize, :]
    nn[0].Feed(xb)
    label.Feed(yb)

    # (Step2-a) Transform data according to the neuralnet (nn) structure
    for h in range(1, len(nn)):
        nn[h].ComputeFeature(nn[h-1])

    # (Step2-b) Provide label to compute loss function
    loss.ComputeFeature(nn[-1], label)

    # (Step2-c) Print out performance, e.g., loss and accuracy
    if (i+1) % disp_freq == 0:
        print '  Step {:>3}: '.format(i+1),
        loss.display()

    # (Step3) Compute gradient in a reverse order
    loss.ComputeGradient()
    for h in range(len(nn)-1, 0, -1):
        nn[h].ComputeGradient()
        # (Step 4) Update parameter
        sgd.Update(i+1, nn[h])
```

<a id="model"></a>
### <a href="#model">Example MLP</a>  

Here is an example MLP model with 5 fully-connected hidden layers.
Please refer to [`python.md`](python.md) and [`layer.py`]() for more details about layer definition. `SGD()` is an updater defined in [`model.py`]().

```
input = ImageInput(28, 28)  # image width and height
label = LabelInput()

nn = []
nn.append(input)
nn.append(Dense(2500, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(2000, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(1500, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(1000, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(500, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(10, init='uniform'))
loss = Loss('softmaxloss')

sgd = SGD(lr=0.001, lr_type='step')
```

<a id="model2"></a>
### <a href="#model2">Example CNN</a>  

Here is an example CNN model with 3 convolution and pooling layers.
Please refer to [`python.md`]() and [`layer.py`]() for more details about layer definition. `SGD()` is an updater defined in [`model.py`]().

```
input = ImageInput(32, 32, 3)  # image width, height, channel
label = LabelInput()

nn = []
nn.append(input)
nn.append(Convolution2D(32, 5, 1, 2, w_std=0.0001, b_lr=2))
nn.append(MaxPooling2D(pool_size=(3,3), stride=2))
nn.append(Activation('relu'))
nn.append(LRN2D(3, alpha=0.00005, beta=0.75))
nn.append(Convolution2D(32, 5, 1, 2, b_lr=2))
nn.append(Activation('relu'))
nn.append(AvgPooling2D(pool_size=(3,3), stride=2))
nn.append(LRN2D(3, alpha=0.00005, beta=0.75))
nn.append(Convolution2D(64, 5, 1, 2))
nn.append(Activation('relu'))
nn.append(AvgPooling2D(pool_size=(3,3), stride=2))
nn.append(Dense(10, w_wd=250, b_lr=2, b_wd=0))
loss = Loss('softmaxloss')

sgd = SGD(decay=0.004, momentum=0.9, lr_type='manual', step=(0,60000,65000), step_lr=(0.001,0.0001,0.00001))
```
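The `step` and `step_lr` tuples given to `SGD` with `lr_type='manual'` suggest a piecewise-constant learning-rate schedule. The sketch below shows one plausible reading, in which `step_lr[k]` applies from training step `step[k]` until the next boundary. This interpretation is an assumption and is not confirmed by this document, and `manual_lr` is a hypothetical helper, not a SINGA function.

```python
import bisect


def manual_lr(step_index, step, step_lr):
    """Return the learning rate assumed to be in effect at `step_index`.

    Assumption: step_lr[k] applies from step[k] up to (not including)
    step[k+1], and the last rate applies for all later steps.
    """
    k = bisect.bisect_right(step, step_index) - 1
    return step_lr[max(k, 0)]


step = (0, 60000, 65000)
step_lr = (0.001, 0.0001, 0.00001)

lr_start = manual_lr(0, step, step_lr)      # 0.001
lr_mid = manual_lr(60000, step, step_lr)    # 0.0001
lr_late = manual_lr(70000, step, step_lr)   # 0.00001
```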