---+FalconCLI
FalconCLI is an interface between the user and Falcon. It is a command line utility provided by Falcon. FalconCLI supports Entity Management, Instance Management and Admin operations. FalconCLI uses a set of web services to interact with Falcon.
---++Common CLI Options
---+++Falcon URL
The optional -url option specifies the URL of the Falcon system to run the command against. If it is not given, the URL is picked from the system environment variable FALCON_URL. If FALCON_URL is not set, it is picked from the client.properties file. If the option is not provided and the URL is not set in FALCON_URL or client.properties either, Falcon CLI will fail.
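The lookup order above can be sketched as a small shell function. This only illustrates the precedence and is not Falcon's actual implementation. The function name =resolve_falcon_url= is hypothetical, and the property key =falcon.url= is an assumption about how client.properties stores the URL.

```shell
# Sketch of the lookup order described above; not Falcon's actual code.
# Assumption: client.properties stores the URL under a key named falcon.url.
resolve_falcon_url() {
    cli_url="$1"      # value passed with -url, may be empty
    props_file="$2"   # path to client.properties, may not exist
    if [ -n "$cli_url" ]; then
        # 1. An explicit -url option wins.
        printf '%s\n' "$cli_url"
    elif [ -n "${FALCON_URL:-}" ]; then
        # 2. Otherwise fall back to the FALCON_URL environment variable.
        printf '%s\n' "$FALCON_URL"
    elif [ -f "$props_file" ] && grep -q '^falcon\.url=' "$props_file"; then
        # 3. Otherwise read the value from client.properties.
        sed -n 's/^falcon\.url=//p' "$props_file" | head -n 1
    else
        # 4. Nothing configured: fail, as the CLI does.
        echo "Falcon URL not configured" >&2
        return 1
    fi
}
```

Calling the function with an empty first argument and a missing properties file returns a non-zero status unless FALCON_URL is set.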
---+++Proxy user support
The -doAs option allows the current user to impersonate other users when interacting with the Falcon system. The current user must be configured as a proxyuser in the Falcon system. The proxyuser configuration may restrict from
which hosts a user may impersonate users, as well as users of which groups can be impersonated.
Proxyuser support described here.
---+++Debug Mode
If you export FALCON_DEBUG=true then the Falcon CLI will output the Web Services API details used by any commands you execute. This is useful for debugging purposes or to see how the Falcon CLI works with the WS API.
Alternately, you can specify '-debug' through the CLI arguments to get the debug statements.
Example:
$FALCON_HOME/bin/falcon entity -submit -type cluster -file /cluster/definition.xml -debug
---++Entity Management Operations
---+++Submit
The submit option is used to set up an entity definition.
Example:
$FALCON_HOME/bin/falcon entity -submit -type cluster -file /cluster/definition.xml
Note: The -url option in the above and all subsequent commands is optional. If not mentioned, it will be picked from the FALCON_URL environment variable or, if that is not set, from the client.properties file. If the URL is not provided in any of these places, Falcon CLI will fail.
---+++Schedule
Once submitted, an entity can be scheduled using the schedule option. Only process and feed entities can be scheduled.
Usage:
$FALCON_HOME/bin/falcon entity -type [process|feed] -name <> -schedule
Optional Arg : -skipDryRun. When this argument is specified, Falcon skips oozie dryrun.
Example:
$FALCON_HOME/bin/falcon entity -type process -name sampleProcess -schedule
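Submit and schedule are usually run one after the other. The following is a dry-run sketch that only prints the two commands and enforces the rule that only process and feed entities can be scheduled. The function name =submit_and_schedule= and the fallback install path =/opt/falcon= are hypothetical, and the sketch assumes the submit form shown for clusters also applies to process and feed definitions.

```shell
# Dry-run sketch: prints the commands instead of running them.
# /opt/falcon is a hypothetical fallback used when FALCON_HOME is unset.
FALCON_BIN="${FALCON_HOME:-/opt/falcon}/bin/falcon"

submit_and_schedule() {
    type="$1"   # entity type: process or feed
    name="$2"   # entity name as given in the definition file
    file="$3"   # path to the entity definition XML
    case "$type" in
        process|feed) ;;
        *)
            echo "only process and feed entities can be scheduled" >&2
            return 1
            ;;
    esac
    echo "$FALCON_BIN entity -submit -type $type -file $file"
    echo "$FALCON_BIN entity -type $type -name $name -schedule"
}
```

For example, =submit_and_schedule process sampleProcess /process/definition.xml= prints the submit command followed by the schedule command, while passing =cluster= as the type is rejected.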
---+++Suspend
Suspend on an entity results in suspension of the oozie bundle that was scheduled earlier through the schedule function. No further instances are executed on a suspended entity. Only schedulable entities (process/feed) can be suspended.
Usage:
$FALCON_HOME/bin/falcon entity -type [feed|process] -name <> -suspend
---+++Resume
Puts a suspended process/feed back to active, which in turn resumes applicable oozie bundle.
Usage:
$FALCON_HOME/bin/falcon entity -type [feed|process] -name <> -resume
---+++Delete
Delete removes the submitted entity definition for the specified entity and puts it into the archive.
Usage:
$FALCON_HOME/bin/falcon entity -type [cluster|feed|process] -name <> -delete
---+++List
Entities of a particular type can be listed with list sub-command.
Usage:
$FALCON_HOME/bin/falcon entity -list
Optional Args : -fields <>
-type <<[cluster|feed|process],[cluster|feed|process]>>
-nameseq <> -tagkeys <>
-filterBy <> -tags <>
-orderBy <> -sortOrder <> -offset 0 -numResults 10
Optional params described here.
---+++Summary
Summary of entities of a particular type and a cluster will be listed. Entity summary has N most recent instances of entity.
Usage:
$FALCON_HOME/bin/falcon entity -type [feed|process] -summary
Optional Args : -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'" -fields <>
-filterBy <> -tags <>
-orderBy <> -sortOrder <> -offset 0 -numResults 10 -numInstances 7
Optional params described here.
---+++Update
The update operation allows an already submitted/scheduled entity to be updated. Cluster update is currently not allowed.
Usage:
$FALCON_HOME/bin/falcon entity -type [feed|process] -name <> -update -file <>
Optional Arg : -skipDryRun. When this argument is specified, Falcon skips oozie dryrun.
Example:
$FALCON_HOME/bin/falcon entity -type process -name HourlyReportsGenerator -update -file /process/definition.xml
---+++Touch
Touch is a force update operation that allows an already submitted/scheduled entity to be updated.
Usage:
$FALCON_HOME/bin/falcon entity -type [feed|process] -name <> -touch
Optional Arg : -skipDryRun. When this argument is specified, Falcon skips oozie dryrun.
---+++Status
Status returns the current status of the entity.
Usage:
$FALCON_HOME/bin/falcon entity -type [cluster|feed|process] -name <> -status
---+++Dependency
The dependency option lists all the entities on which the specified entity depends. For example, for a feed, dependency returns the cluster name, and for a process it returns all the input feeds, output feeds and cluster names.
Usage:
$FALCON_HOME/bin/falcon entity -type [cluster|feed|process] -name <> -dependency
---+++Definition
Definition option returns the entity definition submitted earlier during submit step.
Usage:
$FALCON_HOME/bin/falcon entity -type [cluster|feed|process] -name <> -definition
---+++Lookup
The lookup option tells you which feed a given path belongs to. This can be useful in several scenarios. For example, you would generally want a single definition for common feeds such as metadata that share the same location; otherwise problems can arise (different retention durations can result in surprises for one team). To check whether there are multiple definitions of the same metadata, pick an instance of it and run it through the lookup command as shown below.
Usage:
$FALCON_HOME/bin/falcon entity -type feed -lookup -path /data/projects/my-hourly/2014/10/10/23/
If you have multiple feeds with location as /data/projects/my-hourly/${YEAR}/${MONTH}/${DAY}/${HOUR} then this command will return all of them.
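To make the matching idea concrete, the sketch below turns a feed location template into a shell glob and checks a path against it. It illustrates the concept only and is not how Falcon performs the lookup. The function names are hypothetical, and only the four variables shown above are handled.

```shell
# Converts a location template such as .../${YEAR}/${MONTH}/${DAY}/${HOUR}
# into a glob pattern. Illustration only; not Falcon's lookup logic.
template_to_glob() {
    printf '%s\n' "$1" | sed \
        -e 's/\${YEAR}/[0-9][0-9][0-9][0-9]/g' \
        -e 's/\${MONTH}/[0-9][0-9]/g' \
        -e 's/\${DAY}/[0-9][0-9]/g' \
        -e 's/\${HOUR}/[0-9][0-9]/g'
}

# Succeeds if the path (second argument) is an instance of the template
# (first argument). A trailing slash on the path is ignored.
path_matches() {
    glob=$(template_to_glob "$1")
    path=${2%/}
    case "$path" in
        $glob) return 0 ;;
        *) return 1 ;;
    esac
}
```

With the template from the text, =path_matches '/data/projects/my-hourly/${YEAR}/${MONTH}/${DAY}/${HOUR}' /data/projects/my-hourly/2014/10/10/23/= succeeds. Every feed whose template matches in this way would be a candidate result for the lookup.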
---+++SLAAlert
Since: 0.8
This command lists all the feed instances which have missed their SLA and are still not available. If a feed instance missed
its SLA but is now available, it will not be reported in the results. The purpose of this API is alerting, and hence it
doesn't return feed instances which missed their SLA but are available, as they don't require any action.
* Currently SLA monitoring is supported only for feeds.
* Option end is optional and will default to the current time if missing.
* Option name is optional; if provided, only instances of that feed will be considered.
Usage:
*Example 1*
*$FALCON_HOME/bin/falcon entity -type feed -start 2014-09-05T00:00Z -slaAlert -end 2016-05-03T00:00Z -colo local*
name: out, type: FEED, cluster: local, instanceTime: 2015-09-26T11:59Z, tags: Missed SLA High
name: out, type: FEED, cluster: local, instanceTime: 2015-09-26T12:00Z, tags: Missed SLA High
name: out, type: FEED, cluster: local, instanceTime: 2015-09-26T12:01Z, tags: Missed SLA High
name: out, type: FEED, cluster: local, instanceTime: 2015-09-26T12:02Z, tags: Missed SLA High
name: out, type: FEED, cluster: local, instanceTime: 2015-09-26T12:03Z, tags: Missed SLA High
name: out, type: FEED, cluster: local, instanceTime: 2015-09-26T12:04Z, tags: Missed SLA High
name: out, type: FEED, cluster: local, instanceTime: 2015-09-26T12:05Z, tags: Missed SLA High
name: out, type: FEED, cluster: local, instanceTime: 2015-09-26T12:06Z, tags: Missed SLA High
name: out, type: FEED, cluster: local, instanceTime: 2015-09-26T12:07Z, tags: Missed SLA High
name: out, type: FEED, cluster: local, instanceTime: 2015-09-26T12:08Z, tags: Missed SLA Low
Response: default/Success!
Request Id: default/216978070@qtp-830047511-4 - f5a6c129-ab42-4feb-a2bf-c3baed356248
*Example 2*
*$FALCON_HOME/bin/falcon entity -type feed -start 2014-09-05T00:00Z -slaAlert -end 2016-05-03T00:00Z -colo local -name in*
name: in, type: FEED, cluster: local, instanceTime: 2015-09-26T06:00Z, tags: Missed SLA High
Response: default/Success!
Request Id: default/1580107885@qtp-830047511-7 - f16cbc51-5070-4551-ad25-28f75e5e4cf2
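Since the purpose of this command is alerting, it can help to condense the output into a count per tag. The sketch below assumes the output has exactly the line format shown in the examples above; the function name =summarize_sla_alerts= is hypothetical.

```shell
# Reads slaAlert output on stdin and prints "<tag>: <count>" per tag.
# Lines that do not start with "name: " (Response, Request Id) are ignored.
summarize_sla_alerts() {
    awk -F'tags: ' '
        /^name: / { count[$2]++ }
        END { for (tag in count) print tag ": " count[tag] }
    ' | sort
}
```

The function reads from standard input, so the output of the slaAlert command can be piped into it directly.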
---++Instance Management Options
---+++Kill
The kill sub-command is used to kill all the instances of the specified process whose nominal time is between the given start time and end time.
Note:
1. The start time and end time need to be specified in TZ format.
Example: 01 Jan 2012 01:00 => 2012-01-01T01:00Z
2. Process name is a compulsory parameter for each instance management command.
Usage:
$FALCON_HOME/bin/falcon instance -type <> -name <> -kill -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'"
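The time format required by -start and -end is easy to get wrong. The helpers below are a sketch for producing and checking such values in a shell script. Both function names are hypothetical, and the check only validates the shape of the string, not whether the date exists.

```shell
# Current time in UTC, formatted the way -start and -end expect.
falcon_now() {
    date -u +%Y-%m-%dT%H:%MZ
}

# Succeeds only if the argument looks like yyyy-MM-ddTHH:mmZ.
# This checks the shape only, not whether the date itself is valid.
is_falcon_time() {
    case "$1" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T[0-9][0-9]:[0-9][0-9]Z)
            return 0 ;;
        *)
            return 1 ;;
    esac
}
```

For instance, =is_falcon_time 2012-01-01T01:00Z= succeeds, while =is_falcon_time "01 Jan 2012 01:00"= fails.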
---+++Suspend
Suspend is used to suspend an instance or instances of the given process. This option pauses the parent workflow in the state it was in when this command was executed.
Usage:
$FALCON_HOME/bin/falcon instance -type <> -name <> -suspend -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'"
---+++Continue
Continue option is used to continue the failed workflow instance. This option is valid only for process instances in terminal state, i.e. KILLED or FAILED.
Usage:
$FALCON_HOME/bin/falcon instance -type <> -name <> -continue -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'"
---+++Rerun
The rerun option is used to rerun instances of a given process. On issuing a rerun, by default the execution resumes from the last failed node in the workflow. This option is valid only for process instances in terminal state, i.e. SUCCEEDED, KILLED or FAILED.
To forcefully rerun the entire workflow, pass -force along with -rerun.
Additionally, you can specify properties to override via a properties file.
Usage:
$FALCON_HOME/bin/falcon instance -type <> -name <> -rerun -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'" [-force] [-file <>]
---+++Resume
Resume option is used to resume any instance that is in suspended state.
Usage:
$FALCON_HOME/bin/falcon instance -type <> -name <> -resume -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'"
---+++Status
The status option can be used to get the status of a single instance or multiple instances. If the instance is not yet materialized but is within the process validity range, WAITING is returned as the state. The instance time is returned along with the status. The log location gives the oozie workflow url.
If the instance is in WAITING state, missing dependencies are listed.
The job urls are populated for all actions of user workflow and non-succeeded actions of the main-workflow. The user then need not go to the underlying scheduler to get the job urls when needed to debug an issue in the job.
Example : Suppose a process has 3 instances: one has succeeded, one is in running state and the other one is waiting. The expected output is:
{"status":"SUCCEEDED","message":"getStatus is successful","instances":[{"instance":"2012-05-07T05:02Z","status":"SUCCEEDED","logFile":"http://oozie-dashboard-url"},{"instance":"2012-05-07T05:07Z","status":"RUNNING","logFile":"http://oozie-dashboard-url"}, {"instance":"2010-01-02T11:05Z","status":"WAITING"}]}
Usage:
$FALCON_HOME/bin/falcon instance -type <> -name <> -status
Optional Args : -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'" -colo <>
-filterBy <> -lifecycle <>
-orderBy field -sortOrder <> -offset 0 -numResults 10
Optional params described here.
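When the status result is captured in the JSON shape shown in the example above, the per-instance states can be pulled out with standard text tools. This is a rough sketch: it relies on plain-text matching and on the exact key order of the example, the function name =list_instance_status= is hypothetical, and a real JSON parser would be more robust.

```shell
# Prints "<instance time> <status>" for each instance in the JSON on stdin.
# Plain-text matching, so it relies on the key order shown in the example.
list_instance_status() {
    grep -o '"instance":"[^"]*","status":"[^"]*"' |
        sed 's/"instance":"\([^"]*\)","status":"\([^"]*\)"/\1 \2/'
}
```

The top-level "status" field of the response is not matched, because the pattern requires an "instance" key immediately before it.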
---+++List
The list option can be used to get a single instance or multiple instances. If the instance is not yet materialized but is within the process validity range, WAITING is returned as the state. The instance time is also returned. The log location gives the oozie workflow url.
If the instance is in WAITING state, missing dependencies are listed.
Example : Suppose a process has 3 instances: one has succeeded, one is in running state and the other one is waiting. The expected output is:
{"status":"SUCCEEDED","message":"getStatus is successful","instances":[{"instance":"2012-05-07T05:02Z","status":"SUCCEEDED","logFile":"http://oozie-dashboard-url"},{"instance":"2012-05-07T05:07Z","status":"RUNNING","logFile":"http://oozie-dashboard-url"}, {"instance":"2010-01-02T11:05Z","status":"WAITING"}]}
Usage:
$FALCON_HOME/bin/falcon instance -type <> -name <> -list
Optional Args : -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'"
-colo <> -lifecycle <>
-filterBy <> -orderBy field -sortOrder <> -offset 0 -numResults 10
Optional params described here.
---+++Summary
Summary option via CLI can be used to get the consolidated status of the instances between the specified time period.
Each status along with the corresponding instance count are listed for each of the applicable colos.
The unscheduled instances between the specified time period are included as UNSCHEDULED in the output to provide more clarity.
Example : Suppose a process has 3 instances: one has succeeded, one is in running state and the other one is waiting. The expected output is:
{"status":"SUCCEEDED","message":"getSummary is successful", "instancesSummary":[{"cluster": <>, "map":[{"SUCCEEDED":"1"}, {"WAITING":"1"}, {"RUNNING":"1"}]}]}
Usage:
$FALCON_HOME/bin/falcon instance -type <> -name <> -summary
Optional Args : -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'" -colo <>
-filterBy <> -lifecycle <>
-orderBy field -sortOrder <>
Optional params described here.
---+++Running
Running option provides all the running instances of the mentioned process.
Usage:
$FALCON_HOME/bin/falcon instance -type <> -name <> -running
Optional Args : -colo <> -lifecycle <>
-filterBy <> -orderBy <> -sortOrder <> -offset 0 -numResults 10
Optional params described here.
---+++FeedInstanceListing
Get falcon feed instance availability.
Usage:
$FALCON_HOME/bin/falcon instance -type feed -name <> -listing
Optional Args : -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'"
-colo <>
Optional params described here.
---+++Logs
Get logs for instance actions
Usage:
$FALCON_HOME/bin/falcon instance -type <> -name <> -logs
Optional Args : -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'" -runid <>
-colo <> -lifecycle <>
-filterBy <> -orderBy field -sortOrder <> -offset 0 -numResults 10
Optional params described here.
---+++LifeCycle
Describes the list of life cycles of an entity. For a feed it can be replication/retention, and for a process it can be execution.
This can be used with instance management options. Default values are replication for feed and execution for process.
Usage:
$FALCON_HOME/bin/falcon instance -type <> -name <> -status -lifecycle <> -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'"
---+++Triage
Given a feed/process instance, this command traces its ancestors to find which of them have failed. It is useful if
a lot of instances are failing in a pipeline, as it finds the root cause of the pipeline being stuck.
Usage:
$FALCON_HOME/bin/falcon instance -triage -type <> -name <> -start "yyyy-MM-dd'T'HH:mm'Z'"
---+++Params
Displays the workflow params of a given instance. The start time is treated as the nominal time of that instance, and the end time is not considered.
Usage:
$FALCON_HOME/bin/falcon instance -type <> -name <> -params -start "yyyy-MM-dd'T'HH:mm'Z'"
---+++Dependency
Displays the instances which are dependent on the given instance. For example, for a given process instance it will
list all the input feed instances (if any) and the output feed instances (if any).
An example use case of this command is as follows:
Suppose you find out that the data in a feed instance was incorrect and you need to figure out which all process instances
consumed this feed instance so that you can reprocess them after correcting the feed instance. You can give the feed instance
and it will tell you which process instance produced this feed and which all process instances consumed this feed.
NOTE:
1. instanceTime must be a valid instanceTime, e.g. the instanceTime of a feed should be in its validity range on applicable clusters,
and it should be in the range of instances produced by the producer process (if any).
2. For processes with inputs like latest() which vary with time, the results are not guaranteed to be correct.
Usage:
$FALCON_HOME/bin/falcon instance -type <> -name <> -dependency -instanceTime "yyyy-MM-dd'T'HH:mm'Z'"
For example:
$FALCON_HOME/bin/falcon instance -dependency -type feed -name out -instanceTime 2014-12-15T00:00Z
name: producer, type: PROCESS, cluster: local, instanceTime: 2014-12-15T00:00Z, tags: Output
name: consumer, type: PROCESS, cluster: local, instanceTime: 2014-12-15T00:03Z, tags: Input
name: consumer, type: PROCESS, cluster: local, instanceTime: 2014-12-15T00:04Z, tags: Input
name: consumer, type: PROCESS, cluster: local, instanceTime: 2014-12-15T00:02Z, tags: Input
name: consumer, type: PROCESS, cluster: local, instanceTime: 2014-12-15T00:05Z, tags: Input
Response: default/Success!
Request Id: default/1125035965@qtp-503156953-7 - 447be0ad-1d38-4dce-b438-20f3de69b172
Optional params described here.
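For the reprocessing use case described above, the consuming process instances can be extracted from the command output. The sketch below assumes the line format shown in the example output; the function name =consumer_instances= is hypothetical.

```shell
# Reads instance dependency output on stdin and prints, in sorted order,
# the instance times of process instances that consumed the feed instance
# (lines tagged "Input").
consumer_instances() {
    awk -F', ' '
        $2 == "type: PROCESS" && $5 == "tags: Input" {
            sub(/^instanceTime: /, "", $4)
            print $4
        }
    ' | sort
}
```

The producer line (tagged Output) and the Response and Request Id lines are skipped, leaving only the instances that may need to be reprocessed.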
---++ Metadata Lineage Options
---+++Lineage
Returns the relationship between processes and feeds in a given pipeline in dot format.
You can use the output to view a graphical representation of the DAG using an online Graphviz viewer.
Usage:
$FALCON_HOME/bin/falcon metadata -lineage -pipeline my-pipeline
pipeline is a mandatory option.
---+++ Vertex
Get the vertex with the specified id.
Usage:
$FALCON_HOME/bin/falcon metadata -vertex -id <>
Example:
$FALCON_HOME/bin/falcon metadata -vertex -id 4
---+++ Vertices
Get all vertices for a key index given the specified value.
Usage:
$FALCON_HOME/bin/falcon metadata -vertices -key <> -value <>
Example:
$FALCON_HOME/bin/falcon metadata -vertices -key type -value feed-instance
---+++ Vertex Edges
Get the adjacent vertices or edges of the vertex with the specified direction.
Usage:
$FALCON_HOME/bin/falcon metadata -edges -id <> -direction <>
Example:
$FALCON_HOME/bin/falcon metadata -edges -id 4 -direction both
$FALCON_HOME/bin/falcon metadata -edges -id 4 -direction inE
---+++ Edge
Get the edge with the specified id.
Usage:
$FALCON_HOME/bin/falcon metadata -edge -id <>
Example:
$FALCON_HOME/bin/falcon metadata -edge -id Q9n-Q-5g
---++ Metadata Discovery Options
---+++ List
Lists all dimensions of the given type. If the user provides the optional param cluster, only the dimensions related to that cluster are listed.
Usage:
$FALCON_HOME/bin/falcon metadata -list -type [cluster_entity|feed_entity|process_entity|user|colo|tags|groups|pipelines]
Optional Args : -cluster <>
Example:
$FALCON_HOME/bin/falcon metadata -list -type process_entity -cluster primary-cluster
$FALCON_HOME/bin/falcon metadata -list -type tags
---+++ Relations
List all dimensions related to specified Dimension identified by dimension-type and dimension-name.
Usage:
$FALCON_HOME/bin/falcon metadata -relations -type [cluster_entity|feed_entity|process_entity|user|colo|tags|groups|pipelines] -name <>
Example:
$FALCON_HOME/bin/falcon metadata -relations -type process_entity -name sample-process
---++Admin Options
---+++Help
Usage:
$FALCON_HOME/bin/falcon admin -help
---+++Version
Version returns the current version of Falcon installed.
Usage:
$FALCON_HOME/bin/falcon admin -version
---+++Status
Status returns the current state of Falcon (running or stopped).
Usage:
$FALCON_HOME/bin/falcon admin -status
---++ Recipe Options
---+++ Submit Recipe
Submit the specified recipe.
Usage:
$FALCON_HOME/bin/falcon recipe -name
Name of the recipe. The user should have defined -template.xml and .properties in the path specified by falcon.recipe.path in the client.properties file. The falcon.home path is used if it is not specified in the client.properties file.
If it is not specified in the client.properties file and the files cannot be found at falcon.home either, Falcon CLI will fail.
Optional Args : -tool
Falcon provides a base tool that recipes can override. If this option is not specified, the default recipe tool,
RecipeTool, is used. This option is required if the user defines their own recipe tool class.
Example:
$FALCON_HOME/bin/falcon recipe -name hdfs-replication