# Interactive Training using Python

---

`Layer` class ([layer.py](layer.py)) has the following methods for interactive training. For the basic usage of the Python binding features, please refer to [python.md](python.md).

**ComputeFeature(self, \*srclys)**

* This method creates and sets up singa::Layer, maintains its source layers, and then calls singa::Layer::ComputeFeature(...) for data transformation.
* `*srclys`: (an arbitrary number of) source layers

**ComputeGradient(self)**

* This method calls singa::Layer::ComputeGradient(...) for gradient computation.

**GetParams(self)**

* This method calls singa::Layer::GetParam() to retrieve the parameter values of the layer. Currently, it returns the weight and bias, each as a 2D numpy array.

**SetParams(self, \*params)**

* This method sets the parameter values of the layer.
* `*params`: (an arbitrary number of) parameters, each of which is a 2D numpy array. Typically, it sets the weight and bias (a minimal snapshot/restore sketch is given after the training steps below).

* * *

`Dummy` class is a subclass of `Layer`, which is provided to fetch input data and/or label information. Specifically, it creates singa::DummyLayer.

**Feed(self, shape, data, aux_data)**

* This method sets input data and/or auxiliary data such as labels.
* `shape`: the shape (width and height) of the dataset
* `data`: input dataset
* `aux_data`: auxiliary dataset (e.g., labels)

In addition, `Dummy` class has two subclasses named `ImageInput` and `LabelInput`.

* `ImageInput` class takes three arguments, as follows.

**\_\_init__(self, height=None, width=None, nb_channel=1)**

* Both `ImageInput` and `LabelInput` classes have their own `Feed` method, which calls `Feed` of the `Dummy` class.

**Feed(self, data)**

* * *

## Example scripts for the interactive training

Two example scripts are provided at [`train_mnist.py`]() and [`train_cifar10.py`](): the former trains an MLP model on the MNIST dataset, and the latter trains a CNN model on the CIFAR10 dataset.

* Assume that `nn` is a neural network model, i.e., a list of layers. Currently, these examples consider sequential models. The example MLP and CNN are shown below.

* `load_dataset()` method loads the input data and the corresponding labels, each of which is a 2D numpy array. For example, loading the MNIST dataset returns x: [60000 x 784] and y: [60000 x 1]; loading the CIFAR10 dataset returns x: [10000 x 3072] and y: [10000 x 1].

* `sgd` is an Updater instance. Please see [`python.md`](python.md) and [`model.py`]() for more details.

#### Basic steps for the interactive training

* Step 1: Prepare a batch of data and the corresponding label information, and then input the data using the `Feed()` method.

* Step 2: (a) Transform the data according to the neural network (nn) structure using `ComputeFeature()`. Note that this example considers a sequential model, so it uses a simple loop. (b) Provide the `label` information to the loss layer to compute the loss function. (c) Users can print out the training performance, e.g., loss and accuracy.

* Step 3: Compute gradients in the reverse order of the neural network (nn) structure using `ComputeGradient()`.

* Step 4: Update the parameters, e.g., weight and bias, of the layers using `Update()` of the updater.
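For illustration, `GetParams()` and `SetParams()` can be paired to snapshot and later restore a layer's parameters during interactive training. The following is a minimal sketch under the API described above; `dense` stands for any parameterized layer (e.g., a `Dense` instance), and the use of `numpy` copies is an assumption, not part of the original scripts.

```
import numpy as np

# snapshot the current parameters (weight and bias, each a 2D numpy array)
weight, bias = dense.GetParams()
snapshot = (np.copy(weight), np.copy(bias))

# ... run more training steps, then roll the layer back to the snapshot
dense.SetParams(*snapshot)
```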
Here is an example script for the interactive training.

```
bsize = 64       # batch size
disp_freq = 10   # step interval to show the training accuracy

x, y = load_dataset()

for i in range(x.shape[0] / bsize):
    # (Step 1) Input data containing "bsize" samples
    xb, yb = x[i*bsize:(i+1)*bsize, :], y[i*bsize:(i+1)*bsize, :]
    nn[0].Feed(xb)
    label.Feed(yb)

    # (Step 2-a) Transform data according to the neuralnet (nn) structure
    for h in range(1, len(nn)):
        nn[h].ComputeFeature(nn[h-1])

    # (Step 2-b) Provide label to compute loss function
    loss.ComputeFeature(nn[-1], label)

    # (Step 2-c) Print out performance, e.g., loss and accuracy
    if (i+1) % disp_freq == 0:
        print ' Step {:>3}: '.format(i+1), loss.display()

    # (Step 3) Compute gradient in a reverse order
    loss.ComputeGradient()
    for h in range(len(nn)-1, 0, -1):
        nn[h].ComputeGradient()
        # (Step 4) Update parameters
        sgd.Update(i+1, nn[h])
```

### Example MLP

Here is an example MLP model with 5 fully-connected hidden layers. Please refer to [`python.md`](python.md) and [`layer.py`]() for more details about the layer definitions. `SGD()` is an updater defined in [`model.py`]().

```
input = ImageInput(28, 28)   # image width and height
label = LabelInput()

nn = []
nn.append(input)
nn.append(Dense(2500, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(2000, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(1500, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(1000, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(500, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(10, init='uniform'))
loss = Loss('softmaxloss')

sgd = SGD(lr=0.001, lr_type='step')
```

### Example CNN

Here is an example CNN model with 3 convolution and pooling layers. Please refer to [`python.md`](python.md) and [`layer.py`]() for more details about the layer definitions. `SGD()` is an updater defined in [`model.py`]().

```
input = ImageInput(32, 32, 3)   # image width, height, channel
label = LabelInput()

nn = []
nn.append(input)
nn.append(Convolution2D(32, 5, 1, 2, w_std=0.0001, b_lr=2))
nn.append(MaxPooling2D(pool_size=(3,3), stride=2))
nn.append(Activation('relu'))
nn.append(LRN2D(3, alpha=0.00005, beta=0.75))
nn.append(Convolution2D(32, 5, 1, 2, b_lr=2))
nn.append(Activation('relu'))
nn.append(AvgPooling2D(pool_size=(3,3), stride=2))
nn.append(LRN2D(3, alpha=0.00005, beta=0.75))
nn.append(Convolution2D(64, 5, 1, 2))
nn.append(Activation('relu'))
nn.append(AvgPooling2D(pool_size=(3,3), stride=2))
nn.append(Dense(10, w_wd=250, b_lr=2, b_wd=0))
loss = Loss('softmaxloss')

sgd = SGD(decay=0.004, momentum=0.9, lr_type='manual',
          step=(0, 60000, 65000), step_lr=(0.001, 0.0001, 0.00001))
```
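After training, the same API can be reused for a forward-only evaluation pass: feed a batch of test data, call `ComputeFeature()` through the layers, and read the metrics from the loss layer, without calling `ComputeGradient()` or `Update()`. The sketch below is a minimal illustration of this idea; `load_test_dataset()` is a hypothetical helper that returns a test split in the same layout as `load_dataset()`.

```
bsize = 100                             # evaluation batch size
x_test, y_test = load_test_dataset()    # hypothetical test-split loader

for i in range(x_test.shape[0] / bsize):
    # feed one batch of test images and labels
    xb = x_test[i*bsize:(i+1)*bsize, :]
    yb = y_test[i*bsize:(i+1)*bsize, :]
    nn[0].Feed(xb)
    label.Feed(yb)

    # forward pass only: no ComputeGradient() and no sgd.Update()
    for h in range(1, len(nn)):
        nn[h].ComputeFeature(nn[h-1])
    loss.ComputeFeature(nn[-1], label)

    # print the loss/accuracy reported by the loss layer for this batch
    print ' Test batch {:>3}: '.format(i+1), loss.display()
```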