Apache SINGA
A distributed deep learning platform.
ClusterRuntime is a runtime service that manages dynamic configuration and status of the whole cluster.
#include <cluster_rt.h>
Public Member Functions
ClusterRuntime (const std::string &host, int job_id)
ClusterRuntime (const std::string &host, int job_id, int timeout)
bool Init ()
    Initialize the runtime instance.
int RegistProc (const std::string &host_addr, int pid)
    Register the process and get a unique process id.
std::string GetProcHost (int proc_id)
    Translate the process id to its host address.
bool WatchSGroup (int gid, int sid, rt_callback fn, void *ctx)
    Server: watch all workers in a server group; the server is notified when all workers have left.
bool JoinSGroup (int gid, int wid, int s_group)
    Worker: join a server group (i.e. start to read/update these servers).
bool LeaveSGroup (int gid, int wid, int s_group)
    Worker: leave a server group (i.e. finish all its work).
ClusterRuntime is a runtime service that manages dynamic configuration and status of the whole cluster.
It mainly provides the following services:
1) Provide the running status of each server/worker
2) Translate a process id to (hostname:port)
std::string singa::ClusterRuntime::GetProcHost (int proc_id)
Translate the process id to its host address.
bool singa::ClusterRuntime::JoinSGroup (int gid, int wid, int s_group)
Worker: join a server group (i.e. start to read/update these servers).
bool singa::ClusterRuntime::LeaveSGroup (int gid, int wid, int s_group)
Worker: leave a server group (i.e. finish all its work).
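The join/leave/watch contract can be illustrated with a small in-memory stand-in. This is only a sketch under stated assumptions, not the SINGA implementation: the class name FakeClusterRuntime, the helper functions, and the shape of rt_callback (a function pointer taking the user context) are hypothetical, and the real runtime coordinates these calls across processes rather than in a single object.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>
#include <vector>

// ASSUMPTION: the callback is a plain function pointer receiving the context.
typedef void (*rt_callback)(void* ctx);

// Hypothetical in-memory stand-in; NOT the real singa::ClusterRuntime.
class FakeClusterRuntime {
 public:
  // Server side: remember a callback for server group `gid`.
  bool WatchSGroup(int gid, int sid, rt_callback fn, void* ctx) {
    (void)sid;  // the server id is not needed by this sketch
    if (fn == nullptr) return false;
    watchers_[gid].push_back(Watcher{fn, ctx});
    return true;
  }

  // Worker side: worker (gid, wid) starts using server group `s_group`.
  // Returns false if that worker had already joined.
  bool JoinSGroup(int gid, int wid, int s_group) {
    return members_[s_group].insert(std::make_pair(gid, wid)).second;
  }

  // Worker side: worker (gid, wid) is done with server group `s_group`.
  // When the last worker leaves, every watcher of that group is notified.
  bool LeaveSGroup(int gid, int wid, int s_group) {
    auto it = members_.find(s_group);
    if (it == members_.end()) return false;
    if (it->second.erase(std::make_pair(gid, wid)) == 0) return false;
    if (it->second.empty()) {
      for (const Watcher& w : watchers_[s_group]) w.fn(w.ctx);
    }
    return true;
  }

 private:
  struct Watcher {
    rt_callback fn;
    void* ctx;
  };
  std::map<int, std::set<std::pair<int, int>>> members_;  // s_group -> workers
  std::map<int, std::vector<Watcher>> watchers_;          // s_group -> callbacks
};

// Callback used by the demo: increments the int that `ctx` points to.
void CountNotification(void* ctx) { ++*static_cast<int*>(ctx); }

// Two workers join server group 0 and leave one after the other.
// Returns how many times the watcher was notified.
int DemoJoinLeave() {
  FakeClusterRuntime rt;
  int notified = 0;
  bool watching = rt.WatchSGroup(0, 0, CountNotification, &notified);
  assert(watching);
  (void)watching;
  rt.JoinSGroup(0, 0, 0);
  rt.JoinSGroup(0, 1, 0);
  rt.LeaveSGroup(0, 0, 0);  // one worker still present: no notification
  rt.LeaveSGroup(0, 1, 0);  // last worker left: watcher is notified
  return notified;
}
```

In this sketch the watcher fires only on the transition to an empty group, which mirrors the "notified when all workers have left" wording above; how the actual runtime detects that condition is not stated on this page.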
int singa::ClusterRuntime::RegistProc (const std::string &host_addr, int pid)
Register the process and get a unique process id.
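The relationship between RegistProc and GetProcHost can be sketched with a simple in-memory registry. The class FakeProcRegistry, its sequential id scheme, and the empty-string result for unknown ids are assumptions made for illustration; this page does not say how the real runtime allocates ids or reports lookup failures.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical in-memory stand-in; NOT the real singa::ClusterRuntime.
class FakeProcRegistry {
 public:
  // Store the caller's address and OS pid, and hand back a fresh process id.
  // ASSUMPTION: ids are allocated sequentially starting from 0.
  int RegistProc(const std::string& host_addr, int pid) {
    int proc_id = next_id_++;
    procs_[proc_id] = Entry{host_addr, pid};
    return proc_id;
  }

  // Translate a process id back to the address given at registration.
  // ASSUMPTION: an unknown id yields an empty string.
  std::string GetProcHost(int proc_id) const {
    auto it = procs_.find(proc_id);
    return it == procs_.end() ? std::string() : it->second.host_addr;
  }

 private:
  struct Entry {
    std::string host_addr;  // e.g. "hostname:port"
    int pid;                // OS process id supplied by the caller
  };
  int next_id_ = 0;
  std::map<int, Entry> procs_;
};

// Register two processes and look up the second one by its process id.
std::string DemoRegistry() {
  FakeProcRegistry registry;
  int first = registry.RegistProc("node-a:9000", 4242);
  int second = registry.RegistProc("node-b:9000", 4243);
  assert(first != second);  // each registration gets its own id
  (void)first;
  return registry.GetProcHost(second);
}
```

The host names and ports in the demo are placeholders. The point of the sketch is only that the id returned by RegistProc is the key later passed to GetProcHost.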