# CNN Example

---

Convolutional neural network (CNN) is a type of feed-forward artificial neural
network widely used for image and video classification. In this example, we will
use a deep CNN model to do image classification for the
[CIFAR10 dataset](http://www.cs.toronto.edu/~kriz/cifar.html).


## Running instructions

Please refer to the [installation](installation.html) page for
instructions on building SINGA, and the [quick start](quick-start.html)
for instructions on starting zookeeper.

We have provided scripts for preparing the training and test dataset in *examples/cifar10/*.

    # in examples/cifar10
    $ cp Makefile.example Makefile
    $ make download
    $ make create


### Training on CPU

We can start the training by

    ./bin/singa-run.sh -conf examples/cifar10/job.conf

You should see output like

    Record job information to /tmp/singa-log/job-info/job-2-20150817-055601
    Executing : ./singa -conf /xxx/incubator-singa/examples/cifar10/job.conf -singa_conf /xxx/incubator-singa/conf/singa.conf -singa_job 2
    E0817 06:56:18.868259 33849 cluster.cc:51] proc #0 -> 192.168.5.128:49152 (pid = 33849)
    E0817 06:56:18.928452 33871 server.cc:36] Server (group = 0, id = 0) start
    E0817 06:56:18.928469 33872 worker.cc:134] Worker (group = 0, id = 0) start
    E0817 06:57:13.657302 33849 trainer.cc:373] Test step-0, loss : 2.302588, accuracy : 0.077900
    E0817 06:57:17.626708 33849 trainer.cc:373] Train step-0, loss : 2.302578, accuracy : 0.062500
    E0817 06:57:24.142645 33849 trainer.cc:373] Train step-30, loss : 2.302404, accuracy : 0.131250
    E0817 06:57:30.813354 33849 trainer.cc:373] Train step-60, loss : 2.302248, accuracy : 0.156250
    E0817 06:57:37.556655 33849 trainer.cc:373] Train step-90, loss : 2.301849, accuracy : 0.175000
    E0817 06:57:44.971276 33849 trainer.cc:373] Train step-120, loss : 2.301077, accuracy : 0.137500
    E0817 06:57:51.801949 33849 trainer.cc:373] Train step-150, loss : 2.300410, accuracy : 0.135417
    E0817 06:57:58.682281 33849 trainer.cc:373] Train step-180, loss : 2.300067, accuracy : 0.127083
    E0817 06:58:05.578366 33849 trainer.cc:373] Train step-210, loss : 2.300143, accuracy : 0.154167
    E0817 06:58:12.518497 33849 trainer.cc:373] Train step-240, loss : 2.295912, accuracy : 0.185417

After training some steps (depends on the setting) or the job is
finished, SINGA will [checkpoint](checkpoint.html) the model parameters.

### Training on GPU

Since version 0.2, we can train CNN models on GPU using cuDNN. Please refer to
the [GPU page](gpu.html) for details on compiling SINGA with GPU and cuDNN.
The configuration file is similar to that for CPU training, except that the
cuDNN layers are used and the GPU device is configured.

    ./bin/singa-run.sh -conf examples/cifar10/cudnn.conf

### Training using Python script

The python helpers coming with SINGA 0.2 make it easy to configure a training
job. For example the *job.conf* is replaced with a simple python script
*mnist_mlp.py* which has about 30 lines of code following the [Keras API](http://keras.io/).

      # on CPU
    ./bin/singa-run.sh -exec tool/python/examples/cifar10_cnn.py
      # on GPU
    ./bin/singa-run.sh -exec tool/python/examples/cifar10_cnn_cudnn.py

## Details

To train a model in SINGA, you need to prepare the datasets,
and a job configuration which specifies the neural net structure, training
algorithm (BP or CD), SGD update algorithm (e.g. Adagrad),
number of training/test steps, etc.

### Data preparation

Before using SINGA, you need to write a program to convert the dataset
into a format that SINGA can read. Please refer to the
[Data Preparation](data.html#example---cifar-dataset) to get details about
preparing this CIFAR10 dataset.

### Neural net

Figure 1 shows the net structure of the CNN model we used in this example, which is
set following [Alex](https://code.google.com/p/cuda-convnet/source/browse/trunk/example-layers/layers-18pct.cfg.)
The dashed circle represents one feature transformation stage, which generally
has four layers as shown in the figure. Sometimes the rectifier layer and normalization layer
are omitted or swapped in one stage. For this example, there are 3 such stages.

Next we follow the guide in [neural net page](neural-net.html)
and [layer page](layer.html) to write the neural net configuration.

<div style = "text-align: center">
<img src = "../_static/images/example-cnn.png" style = "width: 200px"> <br/>
<strong>Figure 1 - Net structure of the CNN example.</strong></img>
</div>

* We configure an input layer to read the training/testing records from a disk file.

        layer{
          name: "data"
          type: kRecordInput
          store_conf {
            backend: "kvfile"
            path: "examples/cifar10/train_data.bin"
            mean_file: "examples/cifar10/image_mean.bin"
            batchsize: 64
            random_skip: 5000
            shape: 3
            shape: 32
            shape: 32
           }
           exclude: kTest  # exclude this layer for the testing net
        }
        layer{
          name: "data"
          type: kRecordInput
          store_conf {
            backend: "kvfile"
            path: "examples/cifar10/test_data.bin"
            mean_file: "examples/cifar10/image_mean.bin"
            batchsize: 100
            shape: 3
            shape: 32
            shape: 32
           }
         exclude: kTrain # exclude this layer for the training net
        }


* We configure layers for the feature transformation as follows
(all layers are built-in layers in SINGA; hyper-parameters of these layers are set according to
[Alex's setting](https://code.google.com/p/cuda-convnet/source/browse/trunk/example-layers/layers-18pct.cfg)).

        layer {
          name: "conv1"
          type: kConvolution
          srclayers: "data"
          convolution_conf {... }
          ...
        }
        layer {
          name: "pool1"
          type: kPooling
          srclayers: "conv1"
          pooling_conf {... }
        }
        layer {
          name: "relu1"
          type: kReLU
          srclayers:"pool1"
        }
        layer {
          name: "norm1"
          type: kLRN
          lrn_conf {... }
          srclayers:"relu1"
        }

  The configurations for another 2 stages are omitted here.

* There is an [inner product layer](layer.html#innerproductlayer)
after the 3 transformation stages, which is
configured with 10 output units, i.e., the number of total labels. The weight
matrix Param is configured with a large weight decay scale to reduce the over-fitting.

        layer {
          name: "ip1"
          type: kInnerProduct
          srclayers:"pool3"
          innerproduct_conf {
            num_output: 10
          }
          param {
            name: "w4"
            wd_scale:250
            ...
          }
          param {
            name: "b4"
            ...
          }
        }

* The last layer is a [Softmax loss layer](layer.html#softmaxloss)

        layer{
          name: "loss"
          type: kSoftmaxLoss
          softmaxloss_conf{ topk:1 }
          srclayers:"ip1"
          srclayers: "data"
        }

### Updater

The [normal SGD updater](updater.html#updater) is selected.
The learning rate is changed like going down stairs, and is configured using the
[kFixedStep](updater.html#kfixedstep) type.

        updater{
          type: kSGD
          weight_decay:0.004
          learning_rate {
            type: kFixedStep
            fixedstep_conf:{
              step:0             # lr for step 0-60000 is 0.001
              step:60000         # lr for step 60000-65000 is 0.0001
              step:65000         # lr for step 650000- is 0.00001
              step_lr:0.001
              step_lr:0.0001
              step_lr:0.00001
            }
          }
        }

### TrainOneBatch algorithm

The CNN model is a feed forward model, thus should be configured to use the
[Back-propagation algorithm](train-one-batch.html#back-propagation).

    train_one_batch {
      alg: kBP
    }

### Cluster setting

The following configuration set a single worker and server for training.
[Training frameworks](frameworks.html) page introduces configurations of a couple of distributed
training frameworks.

    cluster {
      nworker_groups: 1
      nserver_groups: 1
    }