[[index][::Go back to Oozie Documentation Index::]]
---+!! Oozie Installation and Configuration
%TOC%
---++ System Requirements
* Unix (tested in Linux and Mac OS X)
* Java 1.6+
* Hadoop
   * [[http://hadoop.apache.org][Apache Hadoop]] (tested with 0.20.2)
   * [[http://developer.yahoo.com/hadoop][Yahoo! Hadoop]] (tested with 0.20.104.2)
* [[http://tomcat.apache.org][Tomcat 6.x]] (tested with 6.0.18)
---++ Download Oozie
Download the latest Oozie distribution from [[http://yahoo.github.com/oozie/downloads]].
Expand the Oozie distribution =tar.gz=.
---++ Environment Setup
Java JRE should be in the =PATH=.
Set the =OOZIE_HOME= environment variable to the directory where the expanded Oozie distribution is located.
*NOTE:* The =OOZIE_HOME= environment variable is only required and used by the Oozie server. It is not used by the
Oozie client.
Add =${OOZIE_HOME}/bin= to the =PATH=.
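The environment setup above can be sketched as follows for a POSIX shell. The install location =/opt/oozie= is a hypothetical example; substitute the directory where the distribution was actually expanded.

```shell
# Hypothetical location of the expanded Oozie distribution; adjust as needed.
export OOZIE_HOME=/opt/oozie

# Make the Oozie scripts (for example addtowar.sh) available on the PATH.
export PATH="${OOZIE_HOME}/bin:${PATH}"

# Warn, without aborting, if no Java runtime can be found on the PATH.
command -v java >/dev/null 2>&1 || echo "WARNING: java not found on PATH" >&2
```

Because =OOZIE_HOME= is read by the Oozie server, these variables need to be set in the environment from which Tomcat is started.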
---++ Oozie WAR Setup
Oozie WAR is bundled without the Hadoop JAR files and without the ExtJS library. The Hadoop JARs are required to run
Oozie. The ExtJS library is optional (only required for the Oozie web-console to work).
The ExtJS library can be downloaded from [[http://www.extjs.com/learn/Ext_Version_Archives][ExtJS 2.2]] (it must be
version 2.2). The ExtJS library is not bundled with Oozie because it uses a different license.
Use the =${OOZIE_HOME}/bin/addtowar.sh= script to add the Hadoop JARs and the ExtJS library to the Oozie WAR file.
Usage:
   Usage  : addtowar
   Options: -inputwar INPUT_OOZIE_WAR
            -outputwar OUTPUT_OOZIE_WAR
            [-hadoop HADOOP_VERSION HADOOP_PATH]
            [-extjs EXTJS_PATH]
            [-jars JARS_PATH] (multiple JAR path separated by ':')
The original Oozie WAR file is at =${OOZIE_HOME}/oozie.war=.
After the Hadoop JARs and the ExtJS library have been added to the =oozie.war= file, Oozie is ready for deployment.
If present, delete any previous =oozie.war= file and =oozie= directory from Tomcat's =webapps/= directory.
Copy the =oozie.war= file (the one that contains the Hadoop JARs and the ExtJS library) to Tomcat's =webapps/=
directory.
*IMPORTANT:* Only one Oozie instance can be deployed per Tomcat instance.
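A minimal sketch of the build and deployment steps, wrapped in a shell function. All paths are hypothetical defaults, the =0.20.2= value passed as =HADOOP_VERSION= is an assumption (use the version you actually run), and whether =EXTJS_PATH= should point to an archive or an expanded directory is not stated here, so check the script's own help output.

```shell
# All paths below are assumptions for illustration; adjust to your system.
OOZIE_HOME="${OOZIE_HOME:-/opt/oozie}"          # expanded Oozie distribution
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop}"       # Hadoop installation
EXTJS_PATH="${EXTJS_PATH:-/opt/ext-2.2}"        # downloaded ExtJS 2.2
CATALINA_HOME="${CATALINA_HOME:-/opt/tomcat}"   # Tomcat installation
NEW_WAR="${NEW_WAR:-/tmp/oozie.war}"            # WAR produced by addtowar.sh

build_and_deploy_war() {
  # Build a WAR that contains the Hadoop JARs and the ExtJS library.
  # The HADOOP_VERSION value 0.20.2 is an assumption.
  "${OOZIE_HOME}/bin/addtowar.sh" \
    -inputwar "${OOZIE_HOME}/oozie.war" \
    -outputwar "${NEW_WAR}" \
    -hadoop 0.20.2 "${HADOOP_HOME}" \
    -extjs "${EXTJS_PATH}" || return 1

  # Remove any previous deployment, then copy the new WAR into Tomcat.
  rm -rf "${CATALINA_HOME}/webapps/oozie" "${CATALINA_HOME}/webapps/oozie.war"
  cp "${NEW_WAR}" "${CATALINA_HOME}/webapps/oozie.war"
}
```

Call =build_and_deploy_war= once the variables point at real locations. The function stops before touching Tomcat if =addtowar.sh= fails.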
---++ Database Setup
Oozie works with HSQL, MySQL and Oracle databases.
If using HSQL, no extra driver is needed: Oozie bundles the HSQL JDBC driver. HSQL is an embedded in-memory database,
so all data is lost when Oozie stops running.
If using MySQL or Oracle, the corresponding JDBC driver JAR files must be in the Oozie classpath (added to the Oozie
WAR or placed in Tomcat's =lib/= directory, which replaced =common/lib= as of Tomcat 6). A database must be created for
Oozie; Oozie creates its tables automatically.
The =bin/addtowar.sh= script has an option =-jars= that can be used to add the Oracle or MySQL JDBC driver JARs to the
Oozie WAR file.
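As a sketch of the =-jars= option, the function below adds JDBC driver JARs to a new WAR. The driver path and the output WAR location are hypothetical; the actual JAR file name depends on the driver you downloaded.

```shell
# Hypothetical paths; adjust to your system.
OOZIE_HOME="${OOZIE_HOME:-/opt/oozie}"
JDBC_JARS="${JDBC_JARS:-/opt/jdbc/mysql-connector-java.jar}"
NEW_WAR="${NEW_WAR:-/tmp/oozie-with-jdbc.war}"

add_jdbc_driver() {
  # -jars takes one or more JAR paths separated by ':'.
  "${OOZIE_HOME}/bin/addtowar.sh" \
    -inputwar "${OOZIE_HOME}/oozie.war" \
    -outputwar "${NEW_WAR}" \
    -jars "${JDBC_JARS}"
}
```

Since the usage text lists =-hadoop=, =-extjs= and =-jars= as independent options, they can presumably be combined in a single invocation; that is an assumption, not something stated above.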
---++ Oozie Configuration
Oozie configuration is always read from the =${OOZIE_HOME}/conf= directory.
The Oozie configuration is distributed in 3 different files:
* =oozie-site.xml= : Oozie server configuration
* =oozie-log4j.properties= : Oozie logging configuration
* =adminusers.txt= : Oozie admin users list
---+++ Oozie Configuration Properties
All Oozie configuration properties and their default values are defined in the =oozie-default.xml= file.
Oozie resolves configuration property values in the following order:
* If a Java System property is defined, it uses its value
* Else, if the Oozie configuration file (=oozie-site.xml=) contains the property, it uses its value
* Else, it uses the default value documented in the =oozie-default.xml= file
The =OOZIE_CONFIG_FILE= environment variable can be set to name an alternate Oozie configuration file to use instead
of the =oozie-site.xml= file. The alternate file must be in the =${OOZIE_HOME}/conf/= directory.
*NOTE:* The =oozie-default.xml= file found in the =${OOZIE_HOME}/conf/= directory is not used by Oozie; it is there
for reference purposes only.
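The resolution order can be exercised from the shell that starts Tomcat. This is a sketch under two assumptions: that =OOZIE_CONFIG_FILE= takes a bare file name resolved against =${OOZIE_HOME}/conf/=, and that the file name and URL shown are placeholders. Tomcat's =catalina.sh= passes =CATALINA_OPTS= to the JVM, which is one way to define a Java system property.

```shell
# Select an alternate configuration file. The file name is hypothetical, and
# passing a bare name (rather than a full path) is an assumption.
export OOZIE_CONFIG_FILE=oozie-site-staging.xml

# A Java system property takes precedence over the value in the configuration
# file, which in turn takes precedence over the built-in default.
# The URL below is an example value.
export CATALINA_OPTS="${CATALINA_OPTS:-} -Doozie.base.url=http://oozie.example.com:8080/oozie"
```

With these settings, =oozie.base.url= would resolve to the =-D= value even if the selected configuration file defines a different one.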
---+++ Logging Configuration
By default Oozie logs to the =${OOZIE_HOME}/logs/= directory in 4 different files:
* oozie.log: web services log streaming works from this log
* oozie-ops.log: messages for Admin/Operations to monitor
* oozie-instrumentation.log: instrumentation data, every 60 seconds (configurable)
* oozie-audit.log: audit messages, workflow job changes
Oozie log configuration is defined in the =oozie-log4j.properties= file.
If the Oozie log configuration file changes, Oozie reloads the new settings dynamically.
The =OOZIE_LOG4J_FILE= environment variable can be set to indicate an alternate Oozie logging configuration file. The
alternate file must be in the =${OOZIE_HOME}/conf/= directory.
The =OOZIE_LOG4J_RELOAD= environment variable can be set to specify the logging configuration reload interval in
seconds. The default value is 10 seconds.
---+++ Oozie Authentication Configuration
Oozie can work with the Hadoop 20 with Security distribution, which supports Kerberos authentication.
Oozie authentication is configured using the following configuration properties (default values shown):
oozie.services.ext=org.apache.oozie.service.HadoopAccessorService
oozie.service.HadoopAccessorService.kerberos.enabled=false
oozie.service.HadoopAccessorService.keytab.file=${user.home}/oozie.keytab
oozie.service.HadoopAccessorService.kerberos.principal=${user.name}/localhost@LOCALHOST
The above default values are for a Hadoop 0.20.2 distribution without Kerberos authentication.
To use a Hadoop 20 with Security distribution (regardless of using Simple or Kerberos authentication) the following
property must be set:
oozie.services.ext=org.apache.oozie.service.KerberosHadoopAccessorService
To enable Kerberos authentication, the following property must be set:
oozie.service.HadoopAccessorService.kerberos.enabled=true
When using Kerberos authentication, the following properties must be set to the correct values (default values shown):
oozie.service.HadoopAccessorService.keytab.file=${user.home}/oozie.keytab
oozie.service.HadoopAccessorService.kerberos.principal=${user.name}/localhost@LOCALHOST
*IMPORTANT:* When using Oozie with a Hadoop 20 with Security distribution, the Oozie user in Hadoop must be configured
as a proxy user.
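The properties above could be collected into a configuration fragment like the one written by this sketch. It assumes that =oozie-site.xml= uses the Hadoop-style =configuration= / =property= / =name= / =value= layout; the keytab path and the principal are hypothetical values. The fragment is written to a scratch file so that an existing =oozie-site.xml= is not overwritten.

```shell
# Write a sample fragment to a scratch file; merge it by hand into
# ${OOZIE_HOME}/conf/oozie-site.xml. The quoted 'EOF' keeps the shell from
# expanding anything inside the here-document.
SNIPPET="${TMPDIR:-/tmp}/oozie-site-kerberos-snippet.xml"
cat > "${SNIPPET}" <<'EOF'
<configuration>
  <property>
    <name>oozie.services.ext</name>
    <value>org.apache.oozie.service.KerberosHadoopAccessorService</value>
  </property>
  <property>
    <name>oozie.service.HadoopAccessorService.kerberos.enabled</name>
    <value>true</value>
  </property>
  <property>
    <!-- Hypothetical keytab location -->
    <name>oozie.service.HadoopAccessorService.keytab.file</name>
    <value>/etc/oozie/oozie.keytab</value>
  </property>
  <property>
    <!-- Hypothetical principal -->
    <name>oozie.service.HadoopAccessorService.kerberos.principal</name>
    <value>oozie/oozie.example.com@EXAMPLE.COM</value>
  </property>
</configuration>
EOF
echo "Wrote ${SNIPPET}"
```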
---+++ User Authorization Configuration
Oozie has a basic authorization model:
* Users have read access to all jobs
* Users have write access to their own jobs
* Users have write access to jobs for groups the users belong to
* Users have read access to admin operations
* Admin users have write access to all jobs
* Admin users have write access to admin operations
If security is disabled, all users are admin users.
Oozie security is set via the following configuration property (default value shown):
oozie.service.AuthorizationService.security.enabled=false
If security is enabled, the admin users are read from the =${OOZIE_HOME}/conf/adminusers.txt= file:
* One user name per line
* Empty lines and lines starting with '#' are ignored
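A sketch of an admin users file and of the two parsing rules listed above. The user names are hypothetical, and the file is written to a scratch location; copy it to =${OOZIE_HOME}/conf/adminusers.txt= to use it. The =grep= filter only approximates how Oozie reads the file (for instance, it does not trim whitespace).

```shell
# Scratch location for the sample file (an assumption for illustration).
ADMIN_FILE="${TMPDIR:-/tmp}/adminusers.txt"

# One user name per line; 'alice' and 'bob' are hypothetical users.
cat > "${ADMIN_FILE}" <<'EOF'
# Oozie admin users
alice

bob
EOF

# List the effective admin users: skip empty lines and lines starting with '#'.
grep -v -e '^#' -e '^$' "${ADMIN_FILE}"
```

The file only takes effect when =oozie.service.AuthorizationService.security.enabled= is set to =true=.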
---+++ Database Configuration
The SQL database used by Oozie is configured using the following configuration properties (default values shown):
oozie.db.schema.name=oozie
oozie.db.schema.create=true
oozie.service.StoreService.jdbc.driver=org.hsqldb.jdbcDriver
oozie.service.StoreService.jdbc.url=jdbc:hsqldb:mem:${oozie.db.schema.name}
oozie.service.StoreService.jdbc.username=sa
oozie.service.StoreService.jdbc.password=
oozie.service.StoreService.pool.max.active.conn=10
The default values are for HSQLDB, an embedded in-memory database bundled with Oozie.
To use MySQL or Oracle the corresponding JDBC driver JAR must be in the classpath (or added to the Oozie WAR file).
*NOTE:* If the =oozie.db.schema.create= property is set to =true= (the default), the Oozie tables are created
automatically if they are not found in the database when Oozie starts. In a production system this option should
be set to =false= once the database tables have been created.
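For MySQL, the properties above might be overridden as in the following sketch. It assumes the Hadoop-style =configuration= / =property= layout for =oozie-site.xml=, the MySQL Connector/J driver class =com.mysql.jdbc.Driver=, and a local MySQL server on the default port; the user name and password are hypothetical. The fragment is written to a scratch file to be merged by hand.

```shell
# Write a sample MySQL fragment to a scratch file. The quoted 'EOF' prevents
# the shell from expanding ${oozie.db.schema.name}, which Oozie resolves itself.
SNIPPET="${TMPDIR:-/tmp}/oozie-site-mysql-snippet.xml"
cat > "${SNIPPET}" <<'EOF'
<configuration>
  <property>
    <name>oozie.service.StoreService.jdbc.driver</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <!-- Assumes MySQL on localhost, default port 3306 -->
    <name>oozie.service.StoreService.jdbc.url</name>
    <value>jdbc:mysql://localhost:3306/${oozie.db.schema.name}</value>
  </property>
  <property>
    <!-- Hypothetical database user -->
    <name>oozie.service.StoreService.jdbc.username</name>
    <value>oozie</value>
  </property>
  <property>
    <!-- Hypothetical password; replace it -->
    <name>oozie.service.StoreService.jdbc.password</name>
    <value>change-me</value>
  </property>
</configuration>
EOF
echo "Wrote ${SNIPPET}"
```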
---+++ Oozie System ID Configuration
Oozie has a system ID that is used to generate the Oozie temporary runtime directory, the workflow job IDs, and the
workflow action IDs.
Two Oozie systems running with the same ID will not conflict, but giving each system a different ID makes it easier
to identify the resources each one created or used when troubleshooting. The system ID is set with the following
property (default value shown):
oozie.system.id=oozie-${user.name}
---+++ Oozie URL Configuration
Oozie needs to know its public base URL, in the form =SCHEME://HOST:PORT/CONTEXT=. This base URL is used to
create callback URLs for actions and the console URL for Oozie workflow jobs (default value shown):
oozie.base.url=http://localhost:8080/oozie
---+++ Fine Tuning an Oozie Server
Refer to the [[oozie-default.xml][oozie-default.xml.txt]] for details.
---++ Starting and Stopping Oozie
Use the standard Tomcat commands to start and stop Oozie.
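A sketch of the standard Tomcat commands, wrapped in two helper functions. The =CATALINA_HOME= default is a hypothetical location; =startup.sh= and =shutdown.sh= are the scripts shipped in Tomcat's =bin/= directory.

```shell
# Hypothetical Tomcat location; adjust to your system.
CATALINA_HOME="${CATALINA_HOME:-/opt/tomcat}"

# Start Tomcat, and with it the deployed Oozie web application.
start_oozie() { "${CATALINA_HOME}/bin/startup.sh"; }

# Stop Tomcat, and with it Oozie.
stop_oozie() { "${CATALINA_HOME}/bin/shutdown.sh"; }
```

=OOZIE_HOME= and any other Oozie environment variables should already be exported in the shell that calls =start_oozie=.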
---++ Oozie Command Line Installation
Copy and expand the =oozie-client= TAR.GZ file bundled with the distribution. Add the =bin/= directory to the =PATH=.
Refer to the [[DG_CommandLineTool][Command Line Interface Utilities]] document for a full reference of the =oozie=
command line tool.
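The client installation can be sketched as a shell function. The archive name and the target directory are hypothetical (the actual file name depends on the distribution), and the function searches for the =bin/= directory because the archive's internal layout is not described here.

```shell
# Hypothetical archive name and target directory; adjust to your system.
CLIENT_TGZ="${CLIENT_TGZ:-/opt/oozie/oozie-client.tar.gz}"
CLIENT_DIR="${CLIENT_DIR:-/opt/oozie-client}"

install_oozie_client() {
  mkdir -p "${CLIENT_DIR}" || return 1
  tar -xzf "${CLIENT_TGZ}" -C "${CLIENT_DIR}" || return 1

  # Locate the bin/ directory wherever the archive placed it.
  client_bin=$(find "${CLIENT_DIR}" -type d -name bin | head -n 1)
  [ -n "${client_bin}" ] || return 1

  # Make the oozie command available in the current shell.
  export PATH="${client_bin}:${PATH}"
}
```

The =PATH= change lasts only for the current shell session; add the same =export= line to a shell profile to make it permanent.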