Apache SINGA
A distributed deep learning platform.
singa::Worker Class Reference (abstract)

The Worker class which runs the training algorithm.

#include <worker.h>

Inheritance diagram for singa::Worker:
singa::BPWorker

Public Member Functions

 Worker (int thread_id, int group_id, int worker_id)
 
void Setup (const ModelProto &model, shared_ptr< NeuralNet > train_net)
 
void set_test_net (shared_ptr< NeuralNet > test_net)
 
void set_validation_net (shared_ptr< NeuralNet > val_net)
 
void Stop ()
 
int Put (shared_ptr< Param > param, int step)
 
int Get (shared_ptr< Param > param, int step)
 
int Update (shared_ptr< Param > param, int step)
 
int Collect (shared_ptr< Param > param, int step)
 
int CollectAll (shared_ptr< NeuralNet > net, int step)
 
void RunOneBatch (int step, Metric *perf=nullptr)
 Checks whether validation/testing is due first, then calls TrainOneBatch. The perf argument collects performance for the whole neural net.
 
virtual void TrainOneBatch (int step)=0
 Train one mini-batch.
 
virtual void TestOneBatch (shared_ptr< NeuralNet > net, int step, Phase phase)=0
 Test/validate one mini-batch.
 
void Test (shared_ptr< NeuralNet > net, int nsteps, const string &prefix)
 Test the performance of the learned model on the validation or test dataset.
 
virtual void Run ()
 Main function of Worker.
 
const bool DisplayNow (const int step) const
 Check whether it is time to display training info, e.g., loss and precision.
 
const bool DisplayDebugInfo (const int step) const
 
const void DisplayPerformance (const Metric &perf, const string &prefix)
 
const bool StopNow (const int step) const
 Return true if the stop condition is satisfied, e.g., the maximum number of steps has been reached.
 
const bool CheckpointNow (const int step) const
 Check whether it is time to checkpoint.
 
const bool TestNow (const int step) const
 Check whether it is time to test.
 
const bool ValidateNow (const int step)
 Check whether it is time to validate.
 
void ReceiveBlobs (shared_ptr< NeuralNet > net)
 Start training from scratch.
 
void SendBlob ()
 

Protected Attributes

int thread_id_
 
int group_id_
 
int worker_id_
 
int step_
 
ModelProto modelproto_
 
shared_ptr< NeuralNet > train_net_
 
shared_ptr< NeuralNet > test_net_
 
shared_ptr< NeuralNet > validation_net_
 
shared_ptr< Dealer > layer_dealer_
 
shared_ptr< Dealer > param_dealer_
 
Poller layer_poller_
 
Poller param_poller_
 

Detailed Description

The Worker class which runs the training algorithm.

The first worker group will initialize parameters of the Net, and put them into the distributed memory/table.

Member Function Documentation

const bool singa::Worker::CheckpointNow ( const int  step) const
inline

Check whether it is time to checkpoint.

Parameters
step    number of times ::Train() has been called.
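
The periodic checks on this page (CheckpointNow, TestNow, ValidateNow, DisplayNow) all ask whether the step counter has reached a configured boundary. The sketch below shows one plausible shape for such a predicate. The helper name IsPeriodicStep and the parameters after and frequency are assumptions made for illustration; this page does not show which ModelProto fields the real methods read.

```cpp
// Hypothetical helper, not part of SINGA's API.
// Returns true when `step` lands on a periodic boundary.
//   after     - assumed first step at which the action may fire
//   frequency - assumed interval between actions; <= 0 disables the action
inline bool IsPeriodicStep(int step, int after, int frequency) {
  if (frequency <= 0) return false;  // action disabled
  if (step < after) return false;    // too early to fire
  return (step - after) % frequency == 0;
}
```

With after = 0 and frequency = 50, the predicate fires at steps 0, 50, 100, and so on. Treating a non-positive frequency as "disabled" is a design choice of this sketch, not something the page confirms.
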
const bool singa::Worker::DisplayNow ( const int  step) const
inline

Check whether it is time to display training info, e.g., loss and precision.

The source comment also carries text for a separate declaration, void Pull(zsock_t* pull, shared_ptr<NeuralNet> net);, described as: pull data from layers resident on other nodes due to model partition.

void singa::Worker::ReceiveBlobs ( shared_ptr< NeuralNet >  net)

Start training from scratch.

Sets up the training/test/validation neural nets, then calls Run(). The source comment also lists two declarations: void Start(ModelProto model); and, marked as a TODO for resuming from a snapshot, void Resume();

virtual void singa::Worker::Run ( )
virtual

Main function of Worker.

  1. Train the neural net step by step; test/validation is done periodically.
  2. TODO: communicate with others, e.g., ZooKeeper, after every step.
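
The training loop described above can be sketched roughly as follows. LoopSketch is a hypothetical stand-in, not the real singa::Worker: the stop condition is reduced to an assumed max_steps limit, and RunOneBatch only records the step it was given.

```cpp
#include <vector>

// Hypothetical stand-in for a worker; the method names mirror this page,
// but every body here is an illustrative assumption.
class LoopSketch {
 public:
  explicit LoopSketch(int max_steps) : max_steps_(max_steps) {}

  // Assumed stop condition: a fixed maximum number of steps.
  bool StopNow(int step) const { return step >= max_steps_; }

  // Placeholder for validate/test/train work; records the step only.
  void RunOneBatch(int step) { seen_steps_.push_back(step); }

  // Main loop: run one batch per step until the stop condition holds.
  void Run() {
    for (step_ = 0; !StopNow(step_); ++step_) {
      RunOneBatch(step_);
    }
  }

  int step() const { return step_; }
  const std::vector<int>& seen_steps() const { return seen_steps_; }

 private:
  int max_steps_;
  int step_ = 0;
  std::vector<int> seen_steps_;
};
```

Constructing LoopSketch with max_steps = 3 and calling Run() processes steps 0, 1, and 2, then leaves the step counter at 3.
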
void singa::Worker::RunOneBatch ( int  step,
Metric *  perf = nullptr 
)

Checks whether validation/testing is due first, then calls TrainOneBatch. The perf argument collects performance for the whole neural net.

Hence, there is no need to collect performance in every thread. Only the main thread passes a non-null perf.
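
A minimal sketch of this ordering, with an optional metric pointer, might look like the following. BatchRunnerSketch, MetricSketch, and the validate_every/test_every intervals are hypothetical names introduced for illustration; the real method works on NeuralNet and Metric objects that are not reproduced here.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for Metric: counts the batches it was given.
struct MetricSketch {
  int batches = 0;
};

// Hypothetical runner illustrating "validate/test first, then train".
class BatchRunnerSketch {
 public:
  BatchRunnerSketch(int validate_every, int test_every)
      : validate_every_(validate_every), test_every_(test_every) {}

  void RunOneBatch(int step, MetricSketch* perf = nullptr) {
    if (IsDue(step, validate_every_)) log_.push_back("validate");
    if (IsDue(step, test_every_)) log_.push_back("test");
    log_.push_back("train");
    // Only callers that pass a non-null pointer collect performance.
    if (perf != nullptr) ++perf->batches;
  }

  const std::vector<std::string>& log() const { return log_; }

 private:
  // Assumed schedule: fire every `every` steps; <= 0 disables the phase.
  static bool IsDue(int step, int every) {
    return every > 0 && step % every == 0;
  }

  int validate_every_;
  int test_every_;
  std::vector<std::string> log_;
};
```

In this sketch the main thread would call RunOneBatch(step, &metric), while other threads call RunOneBatch(step) and skip the bookkeeping.
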

void singa::Worker::Test ( shared_ptr< NeuralNet >  net,
int  nsteps,
const string &  prefix 
)

Test the performance of the learned model on the validation or test dataset.

Test is done by the first group.

Parameters
net      neural network
phase    kValidation or kTest.
const bool singa::Worker::TestNow ( const int  step) const
inline

Check whether it is time to test.

Parameters
step    number of times ::Train() has been called.
virtual void singa::Worker::TrainOneBatch ( int  step)
pure virtual

Train one mini-batch.

Test/Validation is done before training.

Implemented in singa::BPWorker.
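
Because TrainOneBatch is pure virtual, a concrete worker such as singa::BPWorker has to override it. The pattern can be sketched as below. AbstractWorkerSketch and BackPropSketch are hypothetical classes, and the forward/backward body is only an assumption about what a back-propagation style worker might do; it is not taken from SINGA's source.

```cpp
#include <string>
#include <vector>

// Hypothetical abstract base mirroring the pure virtual method on this page.
class AbstractWorkerSketch {
 public:
  virtual ~AbstractWorkerSketch() = default;
  virtual void TrainOneBatch(int step) = 0;
};

// Hypothetical concrete worker; the two phases are illustrative placeholders.
class BackPropSketch : public AbstractWorkerSketch {
 public:
  void TrainOneBatch(int step) override {
    phases_.push_back("forward");   // placeholder for computing activations
    phases_.push_back("backward");  // placeholder for computing gradients
    last_step_ = step;
  }

  int last_step() const { return last_step_; }
  const std::vector<std::string>& phases() const { return phases_; }

 private:
  int last_step_ = -1;
  std::vector<std::string> phases_;
};
```

Code that holds a reference to the abstract base can call TrainOneBatch without knowing which training algorithm the derived class implements.
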

const bool singa::Worker::ValidateNow ( const int  step)
inline

Check whether it is time to validate.

Parameters
step    number of times ::Train() has been called.

The documentation for this class was generated from the following file: