Apache Slider: YARN-hosted applications¶
NAME¶
slider - YARN-hosted applications
SYNOPSIS¶
Slider enables applications to be dynamically created on a YARN-managed datacenter. The program can be used to create, pause, and shut down an application. It can also be used to list currently active instances, as well as "stopped" instances that exist but are not running.
CONCEPTS¶
- A Slider application is an application packaged to be deployed by Slider. It consists of one or more distributed components.
- A Slider application instance is a Slider application configured to be deployable on a specific YARN cluster, with a specific configuration. An instance can be live -actually running- or stopped. When stopped, all its configuration details and instance-specific data are preserved on HDFS.
- An instance directory is a directory created in HDFS describing the application instance; it records the configuration -both user-specified, application-default and any created dynamically by Slider.
- A user can create an application instance.
- A live instance can be stopped, saving its final state to its application instance state directory. All running components are shut down.
- A stopped instance can be started -its components started on or near the servers where they were previously running.
- A stopped instance can be destroyed.
- Running instances can be listed.
- An instance consists of a set of components.
- The supported component types depend upon the Slider application.
- The count of each component must initially be specified when an application instance is created.
- Users can flex an application instance: adding or removing components dynamically. If the application instance is live, the changes will have immediate effect. If not, the changes will be picked up when the instance is next started.
Invoking Slider¶
slider <ACTION> [<name>] [<OPTIONS>]
COMMON COMMAND-LINE OPTIONS¶
--conf configuration.xml
¶
Configure the Slider client. This allows the filesystem, zookeeper instance and other properties to be picked up from the configuration file, rather than on the command line.
Important: this configuration file is not propagated to the application. It is purely for configuring the client itself.
-D name=value
¶
Define a Hadoop configuration option which overrides any options in the client configuration XML files.
-m, --manager url
¶
URL of the YARN resource manager
--fs filesystem-uri
¶
Use the specific filesystem URI as an argument to the operation.
Instance Naming¶
Application instance names must:
- be at least one character long
- begin with a lower case letter
- contain only characters in the set [a-z0-9_]
All upper case characters are converted to lower case.
Example valid names:
slider1 storm4 hbase_instance accumulo_m1_tserve4
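These rules can be checked with a short regular expression. The sketch below is not part of Slider itself; it lower-cases the input first, mirroring the conversion rule above:

```python
import re

# Valid after lower-casing: starts with a lower-case letter,
# then only [a-z0-9_] characters.
_NAME_RE = re.compile(r"^[a-z][a-z0-9_]*$")

def is_valid_instance_name(name: str) -> bool:
    """Check a candidate Slider instance name against the documented rules."""
    return bool(_NAME_RE.match(name.lower()))
```

For instance, `is_valid_instance_name("4storm")` is false because the name does not begin with a letter, while `"HBase_Instance"` passes once lower-cased.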
Actions¶
COMMANDS
slider build <name>
¶
Build an instance of the given name, with the specific options. It is not started; this can be done later with a start command.
slider create <name>
¶
Build and run an application instance of the given name.
Arguments for build and create¶
--package <uri-to-package>
¶
This defines the Slider application package to be deployed.
--option <name> <value>
¶
Set an application instance option.
Example:
Set an option to be passed into the -site.xml file of the target system:
--option site.dfs.blocksize 128m
Increase the number of YARN containers which must fail before the Slider application instance itself fails.
-O slider.container.failure.threshold 16
--appconf dfspath
¶
A URI path to the configuration directory containing the template application specification. The path must be on a filesystem visible to all nodes in the YARN cluster.
- Only one configuration directory can be specified.
- The contents of the directory will only be read when the application instance is created/built.
Example:
--appconf hdfs://namenode/users/slider/conf/hbase-template
--appconf file://users/accumulo/conf/template
--apphome localpath
¶
A path to the home directory of a pre-installed application. If set when a Slider application instance is created, the instance will run with the binaries pre-installed on the nodes at this location.
Important: this is a path in the local filesystem which must be present on all hosts in the YARN cluster
Example
--apphome /usr/hadoop/hbase
--template <filename>
¶
Filename for the template application instance configuration. This will be merged with -and can overwrite- the built-in configuration options, and can then be overwritten by any command-line --option and --compopt arguments to generate the final application configuration.
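The merge order can be illustrated with plain dictionaries. This is only a sketch of the documented precedence (built-in defaults, then the template file, then command-line options), not Slider's actual implementation; the option values here are made up for the example:

```python
# Illustrative precedence sketch: later sources overwrite earlier ones.
builtin_defaults = {"site.dfs.blocksize": "64m", "env.TIMEOUT": "5000"}
template_options = {"site.dfs.blocksize": "128m"}   # from --template (hypothetical values)
command_line_options = {"env.TIMEOUT": "10000"}     # from --option / --compopt

# Dict unpacking applies entries left to right, so the command-line
# options win over the template, which wins over the defaults.
merged = {**builtin_defaults, **template_options, **command_line_options}
```

Here `merged` ends up with the template's blocksize and the command line's timeout.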
--resources <filename>
¶
Filename for the resources configuration. This will be merged with -and can overwrite- the built-in resource options, and can then be overwritten by any command-line --resopt, --rescompopt and --component arguments to generate the final resource configuration.
--image path
¶
The full path in Hadoop HDFS to a .tar or .tar.gz file containing the binaries needed to run the target application.
Example
--image hdfs://namenode/shared/binaries/hbase-0.96.tar.gz
--component <name> <count>
¶
The desired number of instances of a component
Example
--component worker 16
This just sets the component.instances value of the named component's resource configuration. It is exactly equivalent to:
--rco worker component.instances 16
--compopt <component> <option> <value>
¶
Provide a specific application configuration option for a component
Example
--compopt master env.TIMEOUT 10000
These options are saved into the app_conf.json file; they are not used to configure the YARN resource allocations, which must use the --rco parameter.
Resource component option --rescompopt / --rco¶
--rescompopt <component> <option> <value>
Set any role-specific option, such as its YARN memory requirements.
Example
--rco worker yarn.memory 2048 --rco master yarn.memory max
--zkhosts host1:port1,[host2:port2,host3:port3, ...]
¶
The zookeeper quorum.
Example
--zkhosts zk1:2181,zk2:2181,zk3:4096
If unset, the zookeeper quorum defined in the property slider.zookeeper.quorum is used.
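The quorum string is a comma-separated list of host:port pairs; a minimal parsing sketch (illustrative, not Slider code):

```python
def parse_zk_quorum(quorum: str):
    """Split a ZooKeeper quorum string such as 'zk1:2181,zk2:2181'
    into (host, port) pairs."""
    pairs = []
    for entry in quorum.split(","):
        host, _, port = entry.strip().partition(":")
        pairs.append((host, int(port)))
    return pairs
```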
--zkpath <zookeeper-path>
¶
A path in the zookeeper cluster to create for an application. This is useful for applications which require a path to be created in advance of their deployment. When an application instance is destroyed, this path will be deleted.
--queue <queue name>
¶
The queue to deploy the application to. By default, YARN will pick the queue.
Example
--queue applications
Arguments purely for the create
operation¶
--wait <milliseconds>
¶
The --wait parameter, if provided, specifies the time in milliseconds to wait until the YARN application is actually running. Even after the YARN application has started, there may be some delay for the instance to start up.
--out <filename>
¶
The name of a file to save a YARN application report to as a JSON file. This file will contain the YARN application ID and other information about the submitted application.
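A script that submits with --out can then read the saved report. The sketch below assumes the report is a JSON object with an applicationId field; that field name is an assumption to verify against a report actually produced by --out:

```python
import json

def read_application_id(report_path: str) -> str:
    """Read a saved YARN application report and return its application ID.
    The 'applicationId' field name is an assumption; check a real
    report produced by --out before relying on it."""
    with open(report_path) as f:
        report = json.load(f)
    return report["applicationId"]
```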
Examples
¶
Create an application by providing template and resources.
create hbase1 --template /usr/work/hbase/appConfig.json --resources /usr/work/hbase/resources.json
Create an application by providing template, resources and queue.
create hbase1 --template /usr/work/hbase/appConfig.json --resources /usr/work/hbase/resources.json --queue default
destroy <name>
¶
Destroy a (stopped) application instance.
Important: this deletes all persistent data. Hence, by default, invoking this command does not destroy the app; it prints an appropriate message and asks the app owner to re-invoke the same command with the --force option if they are sure they know what they are doing.
Example
slider destroy --force instance1
exists <name> [--live] [--state status]
¶
Probe the existence of the named Slider application instance. If the --live flag is set, the instance must actually be running; if it is not, an error code is returned.
When the --live flag is unset, the command looks for the application instance to be defined in the filesystem -its operational state is not checked. It will "succeed" if the definition files of the named application instance are found.
Example:
slider exists instance4
Return codes
0 : application instance is running
-1 : application instance exists but is not running
69 : application instance is unknown
Live Tests¶
When the --live flag is set, the application instance must be running or about to run for the probe to succeed. That is, the application is either running (RUNNING) or in any of the states from which an application can start running: NEW, NEW_SAVING, SUBMITTED, ACCEPTED or RUNNING.
An application instance that is FINISHED, FAILED or KILLED is not considered to be live.
Note that probe does not check the liveness of the actually deployed application, merely that the application instance has been deployed
Return codes
0 : application instance is running
-1 : application instance exists but is not running
69 : application instance is unknown
Example:
slider exists instance4 --live
When the --state
flag is set, a specific YARN application state is checked for.
The allowed YARN states are:
- NEW: Application which was just created.
- NEW_SAVING: Application which is being saved.
- SUBMITTED: Application which has been submitted.
- ACCEPTED: Application which has been accepted by the scheduler.
- RUNNING: Application which is currently running.
- FINISHED: Application which finished successfully.
- FAILED: Application which failed.
- KILLED: Application which was terminated by a user or admin.
Example:
slider exists instance4 --state ACCEPTED
Return codes
0 : application instance is running
-1 : application instance exists but is not in the desired state
69 : application instance is unknown
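A caller scripting slider exists can branch on these codes. Note that a -1 process exit status is observed as 255 by the shell and most process APIs. A small interpretation sketch (not part of Slider):

```python
def interpret_exists_code(code: int) -> str:
    """Map the documented 'slider exists' return codes to meanings.
    A -1 exit status appears as 255 to the calling shell."""
    if code == 0:
        return "running"
    if code in (-1, 255):
        return "defined but not running"
    if code == 69:
        return "unknown"
    return "unexpected code %d" % code
```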
flex <name> [--component component count]*
¶
Flex the number of workers in an application instance to the new value. If greater than before, new copies of the component will be requested. If less, component instances will be destroyed.
This operation has a return value of 0 if the change was submitted. Changes are not immediate and depend on the availability of resources in the YARN cluster
It returns -1 if there is no running application instance
Example
slider flex instance1 --component worker 8 --filesystem hdfs://host:port
slider flex instance1 --component master 2 --filesystem hdfs://host:port
install-package --name <name of the package> --package <package file> [--replacepkg]
¶
Install the application package to the default package location for the user under ~/.slider/package/
--name <name of the package>
¶
Name of the package. It may be the same as the name provided in the metainfo.xml. Ensure that the same value is used in the default application package location specified in the default appConfig.json file.
--package <package file>
¶
Location of the package on local disk.
--replacepkg
¶
Optional. Whether to overwrite an already installed package.
Example
slider install-package --name HBASE --package /usr/work/package/hbase/slider-hbase-app-package-0.98.4-hadoop2.zip
slider install-package --name HBASE --package /usr/work/package/hbase/slider-hbase-app-package-0.98.4-hadoop2.zip --replacepkg
kdiag [--keytab <keytab> --principal <principal>] [--out <outfile>] [--keylength <length>] [--secure]
¶
Kerberos diagnostics.
Any information which can be obtained to diagnose Kerberos problems: dumping settings, attempting login from a given keytab, etc, etc.
The purpose here is to have something which can be used to begin to understand why the client is having problems talking to Kerberos; a file which can be attached to support calls.
For an example of the output, see SLIDER-1027
Arguments
- --keytab <keytab> --principal <principal>: list a keytab file to use and the principal to log in as. The file must contain the specific principal.
- --keylength <length>: set the minimum encryption key length as measured in bits. If the JVM does not support this length, the command will fail. The default value is 256, as needed for the AES256 encryption scheme. A JVM without the Java Cryptography Extensions installed does not support a key length of 256 bits: Kerberos will fail unless configured to use an encryption scheme with a shorter key length.
- --secure: fail if the command is not executed on a secure cluster; that is, if the hadoop authentication mechanism of the cluster is "simple".
Although there is a --out outfile option, much of the output can come from the JRE (to stderr) and via log4j (to stdout). To get all the output, it is best to redirect both these output streams to the same file, and omit the --out option.
slider kdiag --keytab zk.service.keytab --principal zookeeper/devix.example.org@REALM > out.txt 2>&1
For extra logging during the operation:
- Set the environment variable HADOOP_JAAS_DEBUG to true: export HADOOP_JAAS_DEBUG=true
- Edit the log4j.properties file for the slider client: log4j.logger.org.apache.hadoop.security=DEBUG
The diagnostics information currently generated is incomplete. Any contributions to this codebase are very welcome.
list [name] [--live] [--state state]
¶
List Slider application instances visible to the user. This includes instances which are on the filesystem.
If no instance name is specified, all instances matching the criteria are listed.
--live indicates that live instances are to be listed: anything RUNNING or awaiting execution (e.g. ACCEPTED) or earlier. --state <state> defines an explicit state for which a record of the cluster must be found in the RM.
The default is to list all application instances, running or not.
If an instance name is given, then that instance must be in the filesystem or the operation will fail with the unknown cluster exit code.
If the instance exists but is not in the --live state, or in any state specified by a --state argument, the operation will return -1.
Example
slider list
slider list instance1
slider list --live
slider list instance1 --live
slider list --state FINISHED
slider list --state KILLED
slider list --state FAILED
slider list instance1 --state FAILED
Important: listings which search for completed instances may succeed while an instance of the same name is running. This is because the operation lists YARN records, and records of completed applications are retained for some time.
That is, if an instance is started and then stopped, then a new instance started, the following two operations may both succeed
slider list instance1 --live
slider list --state FINISHED
registry (--list | --listconf | --getconf <conf> ) [--name <name>] [--servicetype <servicetype>] [--out <filename>] [--verbose]
¶
List registered application instances visible to the user. This is slightly different from the slider list command in that it does not make use of the YARN application list. Instead it communicates with Zookeeper, and works with any application which has registered itself with the "service registry".
The --name <name> option names the registry entry to work with. For slider applications, this is the application instance.
The --user <user> option names the user who owns/deployed the service. It defaults to the current user.
The --servicetype <servicetype> option allows a different service type to be chosen. The default is org-apache-slider.
The --verbose flag triggers more verbose output on the operations.
The --internal flag indicates the configurations to be listed and retrieved are from the "internal" list of configuration data provided for use within a deployed application.
There are two common exit codes, the exact values being documented in Exit Codes:
- If there is no matching service then the operation fails with the EXIT_NOT_FOUND code (77).
- If there are no configurations in a listing, or the named configuration is not found, the command returns the exit code EXIT_NOT_FOUND (77).
Operations:
slider registry --list [--servicetype <servicetype>] [--name <name>] [--verbose] [--user <user>] [--out <filename>]
¶
List all services of the service type and optionally the name.
If --out specifies a file, the output is listed to the file, one entry per line. An empty file means that no entries were found.
slider registry --listconf [--name <name>] [--internal] [--servicetype <servicetype>] [--user <user>] [--out <filename>]
¶
List the configurations exported by a named application.
If --out specifies a file, the output is listed to the file, one entry per line. An empty file means that no entries were found.
slider registry --getconf <configuration> [--format (xml|json|properties)] [--servicetype <servicetype>] [--name <name>] [--dest <path>] [--internal] [--user <user>]
get the configuration¶
Get a named configuration in a chosen format. Default: XML.
--dest <path>: the filename or directory to save a configuration to.
--format (xml|json|properties): defines the output format.
If the --dest argument is set and refers to a directory, the file is saved under that directory, with the filename derived from the name of the configuration requested.
Example:
slider registry --getconf hbase-site.xml --name hbase1 --dest confdir
If confdir exists, this downloads the hbase site configuration to confdir/hbase-site.xml.
If the destination path refers to a file (or does not exist), the specified path is used for the file:
slider registry --getconf hbase-site.xml --name hbase1 --dest configfile.xml
This will download the configuration to the file configfile.xml.
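The destination logic can be sketched as follows; this mirrors the documented directory-vs-file behaviour and is not Slider's own code:

```python
import os

def resolve_dest(dest: str, conf_name: str) -> str:
    """Mimic the documented --dest behaviour: if dest is an existing
    directory, save under it using the configuration name; otherwise
    use dest itself as the target file."""
    if os.path.isdir(dest):
        return os.path.join(dest, conf_name)
    return dest
```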
slider resolve --path <path> [--out <filename>] [--list] [--destdir <dir>]
¶
This command resolves the service record under a path in the registry, or lists all service records directly under a path.
The result can be printed to the console (default) or saved to the filesystem; the means of specifying the destination varies depending on whether a single record or a listing is requested.
Resolve a single entry: slider resolve --path <path> [--out <file> [--destdir <dir>]
¶
The basic slider resolve --path <path> command, without the --list option, will attempt to resolve the service record at the destination. The record may be saved to the file specified with the --out option.
Example: resolve and print the record at /users/hbase/services/org-apache-hbase/instance1
slider resolve --path /users/hbase/services/org-apache-hbase/instance1
Example: resolve the record at /users/hbase/services/org-apache-hbase/instance1
and save it to a file
slider resolve --path /users/hbase/services/org-apache-hbase/instance1 --out hbase.json
If the specified path is not in the registry, or the path exists but there is no service record there, the return code is EXIT_NOT_FOUND, 77.
List all entries and services under a path: slider resolve --path <path> --list
¶
The slider resolve --path <path> --list command will list all service records directly under a path.
All entries will be listed to the console, followed by the individual service records of those entries that contain a service record declaration.
The service records can be saved to a directory, one JSON file per entry. The --destdir option enables this saving of the entries, and identifies the destination directory for them. Each entry will be saved with the entry name suffixed by .json.
It is an error if the path does not exist; the exit code will be EXIT_NOT_FOUND, 77.
It is not an error if the path does exist but there are no records underneath it.
Example: list services under /users/hbase/services/org-apache-hbase/
slider resolve --path /users/hbase/services/org-apache-hbase/ --list
This will list all services deployed under this path. If a service "hbase-1"
had been deployed, it would be printed.
Example: list services under /users/hbase/services/org-apache-hbase/
and
save the results
slider resolve --path /users/hbase/services/org-apache-hbase/ --list --destdir services
This will create a directory services and save service records to it. If a service "hbase-1" was registered under this path, its service record would be saved to the file services/hbase-1.json.
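The saving behaviour of --destdir can be sketched as follows, assuming the records have already been fetched as a name-to-record mapping (this is illustrative, not Slider's implementation):

```python
import json
import os

def save_records(records: dict, destdir: str) -> None:
    """Save each service record as <entry-name>.json under destdir,
    as 'resolve --list --destdir' is described to do."""
    os.makedirs(destdir, exist_ok=True)
    for name, record in records.items():
        with open(os.path.join(destdir, name + ".json"), "w") as f:
            json.dump(record, f)
```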
The current user's base path can be referred to via the "~" prefix:
slider resolve --path ~/services/ --list
This simplifies path creation and testing.
start <name> [--wait milliseconds] [--out <filename>]
¶
Start a stopped application instance, recreating it from its previously saved state.
After the application is launched, if an --out argument specified a file, the "YARN application report" will be saved as a JSON document into the file specified.
Examples:
slider start instance2
slider start instance1 --wait 60000
slider start instance1 --out appreport.json
If the application instance is already running, this command will not affect it.
status <name> [--out <filename>]
¶
Get the status of the named application instance in JSON format. A filename can be used to specify the destination.
Examples:
slider status instance1 --manager host:port
slider status instance2 --manager host:port --out status.json
stop <name> [--force] [--wait time] [--message text]
¶
Stop the application instance. The running application is stopped; its settings are retained in HDFS.
The --wait argument can specify a time in seconds to wait for the application instance to be stopped.
The --force flag causes YARN to be asked directly to terminate the application instance.
The --message argument supplies an optional text message to be used in the request; this will appear in the application's diagnostics in the YARN RM UI.
If an unknown (or already stopped) application instance is named, no error is returned.
Examples
slider stop instance1 --wait 30
slider stop instance2 --force --message "maintenance session"
version
¶
The command slider version prints out information about the compiled Slider application, the version of Apache Hadoop against which it was built, and the version of Hadoop that is currently on its classpath.
Note that this is the client-side Hadoop version, not that running on the server, though that can be obtained via the status operation.
Commands for testing¶
These operations are here primarily for testing.
kill-container <name> --id container-id
¶
Kill a YARN container belonging to the application. This is useful primarily for testing resilience to failures.
Container IDs can be determined from the application instance status JSON document.
am-suicide <name> [--exitcode code] [--message message] [--wait time]
¶
This operation is purely for testing Slider Application Master restart; it triggers an asynchronous self-destruct operation in the AM -an operation that does not make any attempt to cleanly shut down the process.
If the application has not exceeded its restart limit (as set by
slider.yarn.restart.limit
), YARN will attempt to restart the failed application.
Example
slider am-suicide --exitcode 1 --wait 5000 --message "test"
tokens [--source <file>] [--out file] [--keytab <keytab> --principal <principal>]
¶
Lists current delegation tokens or, on a secure cluster, creates new ones.
This is useful for testing the delegation token mechanism offered by Oozie, in which Oozie collects the tokens needed by slider, saves them to a file, then starts slider with the environment variable HADOOP_TOKEN_FILE_LOCATION set to the location of this file. For ease of doing that, the bash command to set the property is printed.
For reference the tokens needed are:
- An HDFS token.
- A YARN client token to interact with the RM.
- If the timeline server is enabled, a timeline server delegation token.
If the --keytab and --principal arguments are supplied, then the credentials will be generated with the named principal logged in from the specified keytab.