# Interactive Training using Python

---

`Layer` class ([layer.py](layer.py)) has the following methods for interactive training. For the basic usage of the Python binding features, please refer to [python.md](python.md).

**ComputeFeature(self, \*srclys)**

* This method creates and sets up singa::Layer, maintains its source layers, and then calls singa::Layer::ComputeFeature(...) for data transformation.
* `*srclys`: (an arbitrary number of) source layers

**ComputeGradient(self)**

* This method calls singa::Layer::ComputeGradient(...) for gradient computation.

**GetParams(self)**

* This method calls singa::Layer::GetParam() to retrieve the parameter values of the layer. Currently, it returns the weight and bias, each as a 2D numpy array.

**SetParams(self, \*params)**

* This method sets the parameter values of the layer.
* `*params`: (an arbitrary number of) parameters, each of which is a 2D numpy array. Typically, it sets the weight and bias (a minimal snapshot/restore sketch is given after the training steps below).

* * *

`Dummy` class is a subclass of `Layer`, which is provided to fetch input data and/or label information. Specifically, it creates singa::DummyLayer.

**Feed(self, shape, data, aux_data)**

* This method sets input data and/or auxiliary data such as labels.
* `shape`: the shape (width and height) of the dataset
* `data`: input dataset
* `aux_data`: auxiliary dataset (e.g., labels)

In addition, `Dummy` class has two subclasses named `ImageInput` and `LabelInput`.

* `ImageInput` class takes three arguments, as follows.

**\_\_init__(self, height=None, width=None, nb_channel=1)**

* Both `ImageInput` and `LabelInput` classes have their own `Feed` method, which calls `Feed` of the `Dummy` class.

**Feed(self, data)**

* * *

## Example scripts for the interactive training

Two example scripts are provided at [`train_mnist.py`]() and [`train_cifar10.py`](): the former trains an MLP model on the MNIST dataset, and the latter trains a CNN model on the CIFAR10 dataset.

* Assume that `nn` is a neural network model, i.e., a list of layers. Currently, these examples consider sequential models. The example MLP and CNN are shown below.

* `load_dataset()` method loads the input data and the corresponding labels, each of which is a 2D numpy array. For example, loading the MNIST dataset returns x: [60000 x 784] and y: [60000 x 1]; loading the CIFAR10 dataset returns x: [10000 x 3072] and y: [10000 x 1].

* `sgd` is an Updater instance. Please see [`python.md`](python.md) and [`model.py`]() for more details.

#### Basic steps for the interactive training

* Step 1: Prepare a batch of data and the corresponding label information, and then input the data using the `Feed()` method.

* Step 2: (a) Transform the data according to the neural network (nn) structure using `ComputeFeature()`. Note that this example considers a sequential model, so it uses a simple loop. (b) Provide the `label` information to the loss layer to compute the loss function. (c) Users can print out the training performance, e.g., loss and accuracy.

* Step 3: Compute gradients in the reverse order of the neural network (nn) structure using `ComputeGradient()`.

* Step 4: Update the parameters, e.g., weight and bias, of the layers using `Update()` of the updater.
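For illustration, `GetParams()` and `SetParams()` can be paired to snapshot and later restore a layer's parameters during interactive training. The following is a minimal sketch under the API described above; `dense` stands for any parameterized layer (e.g., a `Dense` instance), and the use of `numpy` copies is an assumption, not part of the original scripts.

```
import numpy as np

# snapshot the current parameters (weight and bias, each a 2D numpy array)
weight, bias = dense.GetParams()
snapshot = (np.copy(weight), np.copy(bias))

# ... run more training steps, then roll the layer back to the snapshot
dense.SetParams(*snapshot)
```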
Here is an example script for the interactive training.

```
bsize = 64       # batch size
disp_freq = 10   # step interval to show the training accuracy

x, y = load_dataset()

for i in range(x.shape[0] / bsize):
    # (Step 1) Input data containing "bsize" samples
    xb, yb = x[i*bsize:(i+1)*bsize, :], y[i*bsize:(i+1)*bsize, :]
    nn[0].Feed(xb)
    label.Feed(yb)

    # (Step 2-a) Transform data according to the neuralnet (nn) structure
    for h in range(1, len(nn)):
        nn[h].ComputeFeature(nn[h-1])

    # (Step 2-b) Provide label to compute loss function
    loss.ComputeFeature(nn[-1], label)

    # (Step 2-c) Print out performance, e.g., loss and accuracy
    if (i+1) % disp_freq == 0:
        print ' Step {:>3}: '.format(i+1), loss.display()

    # (Step 3) Compute gradient in a reverse order
    loss.ComputeGradient()
    for h in range(len(nn)-1, 0, -1):
        nn[h].ComputeGradient()
        # (Step 4) Update parameters
        sgd.Update(i+1, nn[h])
```

### Example MLP

Here is an example MLP model with 5 fully-connected hidden layers. Please refer to [`python.md`](python.md) and [`layer.py`]() for more details about the layer definitions. `SGD()` is an updater defined in [`model.py`]().

```
input = ImageInput(28, 28)   # image width and height
label = LabelInput()

nn = []
nn.append(input)
nn.append(Dense(2500, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(2000, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(1500, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(1000, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(500, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(10, init='uniform'))
loss = Loss('softmaxloss')

sgd = SGD(lr=0.001, lr_type='step')
```

### Example CNN

Here is an example CNN model with 3 convolution and pooling layers. Please refer to [`python.md`](python.md) and [`layer.py`]() for more details about the layer definitions. `SGD()` is an updater defined in [`model.py`]().

```
input = ImageInput(32, 32, 3)   # image width, height, channel
label = LabelInput()

nn = []
nn.append(input)
nn.append(Convolution2D(32, 5, 1, 2, w_std=0.0001, b_lr=2))
nn.append(MaxPooling2D(pool_size=(3,3), stride=2))
nn.append(Activation('relu'))
nn.append(LRN2D(3, alpha=0.00005, beta=0.75))
nn.append(Convolution2D(32, 5, 1, 2, b_lr=2))
nn.append(Activation('relu'))
nn.append(AvgPooling2D(pool_size=(3,3), stride=2))
nn.append(LRN2D(3, alpha=0.00005, beta=0.75))
nn.append(Convolution2D(64, 5, 1, 2))
nn.append(Activation('relu'))
nn.append(AvgPooling2D(pool_size=(3,3), stride=2))
nn.append(Dense(10, w_wd=250, b_lr=2, b_wd=0))
loss = Loss('softmaxloss')

sgd = SGD(decay=0.004, momentum=0.9, lr_type='manual',
          step=(0, 60000, 65000), step_lr=(0.001, 0.0001, 0.00001))
```
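After training, the same API can be reused for a forward-only evaluation pass: feed a batch of test data, call `ComputeFeature()` through the layers, and read the metrics from the loss layer, without calling `ComputeGradient()` or `Update()`. The sketch below is a minimal illustration of this idea; `load_test_dataset()` is a hypothetical helper that returns a test split in the same layout as `load_dataset()`.

```
bsize = 100                             # evaluation batch size
x_test, y_test = load_test_dataset()    # hypothetical test-split loader

for i in range(x_test.shape[0] / bsize):
    # feed one batch of test images and labels
    xb = x_test[i*bsize:(i+1)*bsize, :]
    yb = y_test[i*bsize:(i+1)*bsize, :]
    nn[0].Feed(xb)
    label.Feed(yb)

    # forward pass only: no ComputeGradient() and no sgd.Update()
    for h in range(1, len(nn)):
        nn[h].ComputeFeature(nn[h-1])
    loss.ComputeFeature(nn[-1], label)

    # print the loss/accuracy reported by the loss layer for this batch
    print ' Test batch {:>3}: '.format(i+1), loss.display()
```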