ManifoldCF Scripting Language
Overview
The ManifoldCF scripting language allows symbolic communication with the ManifoldCF API Service in order to define connections and jobs, and perform crawls. The language provides support for JSON-like hierarchical documents, as well as the ability to construct properly encoded REST URLs. It also has support for simple control flow and error handling.
How to use the script interpreter
The ManifoldCF script interpreter can be used in two ways - either as a real-time shell (executing a script as it is typed), or interpreting a script file. The main class of the interpreter is org.apache.manifoldcf.scriptengine.ScriptParser, and the two ways of invoking it are:
java -cp ... org.apache.manifoldcf.scriptengine.ScriptParser
or:
java -cp ... org.apache.manifoldcf.scriptengine.ScriptParser <script_file> <arg1> ... <argN>
If you choose to invoke ScriptParser in interactive mode, simply type your script one line at a time. Any errors will be reported immediately, and the ScriptParser will accordingly exit. You can also type ^Z to terminate the script.
If you use ScriptParser with a scripting file, that file will be read and interpreted. The arguments you provide will be loaded into an array of strings, which is accessible from your script as the variable named __args__.
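As a minimal sketch of a script file that uses its arguments, the loop below prints each entry of __args__ in turn. The loop variable i is illustrative, and the sketch assumes __args__ behaves like any other array variable (subscripts starting at 0 and a __size__ attribute).

```
# Print every command-line argument passed to the script.
set i = 0;
while i < __args__.__size__ do
  print __args__[i];
  set i = i + 1;
;
```

Saved under a hypothetical name such as printargs.mcf, the script would be passed as the <script_file> argument, followed by the values to print.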
Running the script interpreter by hand
When you build ManifoldCF, the required dependent jars for the scripting language are copied to dist/script-engine/lib. You can run the interpreter in interactive mode by typing:
cd dist\script-engine
run-script.bat <args>
Or, on Linux:
cd dist/script-engine
run-script.sh <args>
You will need to set the environment variable ENGINE_HOME to point at the dist/script-engine directory beforehand, so that the scripts can locate the appropriate jars.
Running the script interpreter using Ant
You can also start the script interpreter with all the correct required jars using Ant. Simply type the following:
ant run-script-interpreter
This will start the script interpreter in interactive mode only.
Running the script interpreter using Maven
You can also run the script interpreter using Maven. The commands are:
cd framework/script-engine
mvn exec:exec
This, once again, will start the interpreter in interactive mode.
Script language syntax
A ManifoldCF script is not sensitive to whitespace or indenting. All comments begin with a '#' character and end with the end of that line. Unquoted tokens can include alphanumeric characters, plus '_', '$', and '@'. Numeric tokens always begin with a number ('0'-'9'), and are considered floating-point if they include a decimal point ('.'). Otherwise they are integers. String tokens can be quoted with either a double quote ('"') or a single quote, and within strings characters can be escaped with a preceding backslash ('\').
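The lexical rules above can be illustrated with a short sketch. The variable names are made up for the example; only the token forms matter.

```
# A comment runs to the end of the line.
set count = 42;                  # integer token: no decimal point
set ratio = 4.5;                 # floating-point token: contains '.'
set greeting = "say \"hi\"";     # double-quoted string with escaped quotes
set other = 'single quoted';     # single quotes work as well
print greeting;
```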
A ManifoldCF script has a syntax that is readily described with a BNF grammar. See below.
program:
    --> statements

statements:
    --> statement1 ... statementN

statement:
    --> 'set' expression '=' expression ';'
    --> 'print' expression ';'
    --> 'if' expression 'then' statements ['else' statements] ';'
    --> 'while' expression 'do' statements ';'
    --> 'break' ';'
    --> 'error' expression ';'
    --> 'insert' expression 'into' expression ['at' expression] ';'
    --> 'remove' expression 'from' expression ';'
    --> 'wait' expression ';'
    --> 'GET' expression '=' expression ';'
    --> 'PUT' expression '=' expression 'to' expression ';'
    --> 'POST' expression '=' expression 'to' expression ';'
    --> 'DELETE' expression '=' expression ';'

expression:
    --> '(' expression ')'
    --> expression '&&' expression
    --> expression '||' expression
    --> '!' expression
    --> expression '&' expression
    --> expression '|' expression
    --> expression '==' expression
    --> expression '!=' expression
    --> expression '>=' expression
    --> expression '<=' expression
    --> expression '>' expression
    --> expression '<' expression
    --> expression '+' expression
    --> expression '-' expression
    --> expression '*' expression
    --> expression '/' expression
    --> '-' expression
    --> '[' [expression [',' expression ...]] ']'
    --> '{' [expression [',' expression ...]] '}'
    --> '<<' expression ':' expression ':' [expression '=' expression [',' expression '=' expression ...]] ':' [expression [',' expression ...]] '>>'
    --> expression '[' expression ']'
    --> expression '.' token
    --> token
    --> string
    --> number
    --> 'true' | 'false'
    --> 'null'
    --> 'new' newexpression
    --> 'isnull' expression

newexpression:
    --> 'url' expression
    --> 'connectionname' expression
    --> 'configuration'
    --> 'configurationnode' expression
    --> 'array'
    --> 'dictionary'
    --> 'queryarg' expression ['=' expression]
Script language variables
Variables in the ManifoldCF scripting language determine the behavior of all aspects of expression evaluation, with the exception of operator precedence. In particular, every canonical variable has the ability to support arbitrary attributes (which are named properties of the variable), subscripts (children which are accessed by a numeric subscript), and all other operations, such as '+' or '=='. Not all kinds of variable instance will in fact support all such features. Should you try to use a feature with a variable that does not support it, you will receive a ScriptException telling you what you did wrong.
Since the actual operation details are bound to the variable, for binary operations the left-hand variable typically determines what actually takes place. For example:
print 3+7;
[java] 10
print "3"+7;
[java] 37
There is, of course, a way to cast a variable to a different type. For example:
print "3".__int__+7;
[java] 10
Here, we are using the built-in attribute __int__ to obtain the integer equivalent of the original string variable "3". See the following table for a list of some of the standard attributes and their meanings:
Attribute name | Meaning |
---|---|
__script__ | Returns the script code that would create this variable |
__string__ | Returns the string value of the variable, if any |
__int__ | Returns the integer value of the variable, if any |
__float__ | Returns the floating-point value of the variable, if any |
__boolean__ | Returns the boolean value of the variable, if any |
__size__ | Typically returns the number of subscript children |
__type__ | Returns the 'type' of the variable |
__value__ | Returns the 'value' of the variable |
__dict__ | Returns a dictionary equivalent of the variable |
__OK__ | Returns a boolean 'true' if the variable was "OK", false otherwise |
__NOTFOUND__ | Returns a boolean 'true' if the variable was "NOTFOUND", false otherwise |
__CREATED__ | Returns a boolean 'true' if the variable was "CREATED", false otherwise |
Obviously, only some variables will support each of the standard attributes. You will receive a script exception if you try to obtain a non-existent attribute for a variable.
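A brief sketch of the conversion attributes in use follows. The variable names are illustrative, and the comments describe the behavior implied by the "3"+7 and "3".__int__+7 examples above rather than output captured from a real run.

```
set s = "12";
set n = 12;
print s.__int__ + 1;       # left-hand side is an integer, so this is addition
print n.__string__ + 1;    # left-hand side is a string, so this is concatenation
set myarray = [1, 2, 3];
print myarray.__size__;    # number of subscript children
```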
Integers
Integer variable types are created by non-quoted numeric values that do not have a '.' in them. For example, the token '4' will create an integer variable type with a value of 4.
The operations supported for this variable type, and their meanings, are listed in the table below:
Operation | Meaning | Example |
---|---|---|
binary + | Addition, yielding an integer | 4+7 |
binary - | Subtraction, yielding an integer | 7-4 |
binary * | Multiplication, yielding an integer | 7*4 |
binary / | Division, yielding an integer | 7/4 |
unary - | Negation, yielding an integer | -4 |
binary == | Equality comparison, yielding a boolean | 7 == 4 |
binary != | Inequality comparison, yielding a boolean | 7 != 4 |
binary >= | Greater or equals comparison, yielding a boolean | 7 >= 4 |
binary <= | Less or equals comparison, yielding a boolean | 7 <= 4 |
binary > | Greater comparison, yielding a boolean | 7 > 4 |
binary < | Less comparison, yielding a boolean | 7 < 4 |
binary & | Bitwise AND, yielding an integer | 7 & 4 |
binary \| | Bitwise OR, yielding an integer | 7 \| 4 |
unary ! | Bitwise NOT, yielding an integer | ! 7 |
In addition, the standard attributes __script__, __string__, __int__, and __float__ are supported by integer types.
Strings
String variable types are created by quoted sequences of characters. For example, the token '"hello world"' will create a string variable type with an (unquoted) value of "hello world".
The operations supported for this variable type, and their meanings, are listed in the table below:
Operation | Meaning | Example |
---|---|---|
binary + | Concatenation, yielding a string | "hi" + "there" |
binary == | Equality comparison, yielding a boolean | "hi" == "there" |
binary != | Inequality comparison, yielding a boolean | "hi" != "there" |
In addition, the standard attributes __script__, __string__, __int__, and __float__ are supported by string types.
Floating-point numbers
Float variable types are created by non-quoted numeric values that have a '.' in them. For example, the token '4.1' will create a float variable type with a value of 4.1.
The operations supported for this variable type, and their meanings, are listed in the table below:
Operation | Meaning | Example |
---|---|---|
binary + | Addition, yielding a float | 4.0+7.0 |
binary - | Subtraction, yielding a float | 7.0-4.0 |
binary * | Multiplication, yielding a float | 7.0*4.0 |
binary / | Division, yielding a float | 7.0/4.0 |
unary - | Negation, yielding a float | -4.0 |
binary == | Equality comparison, yielding a boolean | 7.0 == 4.0 |
binary != | Inequality comparison, yielding a boolean | 7.0 != 4.0 |
binary >= | Greater or equals comparison, yielding a boolean | 7.0 >= 4.0 |
binary <= | Less or equals comparison, yielding a boolean | 7.0 <= 4.0 |
binary > | Greater comparison, yielding a boolean | 7.0 > 4.0 |
binary < | Less comparison, yielding a boolean | 7.0 < 4.0 |
In addition, the standard attributes __script__, __string__, __int__, and __float__ are supported by float types.
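Because the operand types decide which kind of arithmetic takes place, the same division can be written three ways. This is a sketch based on the integer and float operation tables; the variable n is illustrative.

```
print 7 / 4;               # two integers: division yielding an integer
print 7.0 / 4.0;           # two floats: division yielding a float
set n = 7;
print n.__float__ / 4.0;   # convert the integer first to get float division
```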
Booleans
Boolean variable types are created by the keywords true and false. For example, the code 'true' will create a boolean variable type with a value of "true".
The operations supported for this variable type, and their meanings, are listed in the table below:
Operation | Meaning | Example |
---|---|---|
binary == | Equality comparison, yielding a boolean | true == false |
binary != | Inequality comparison, yielding a boolean | true != false |
binary && | AND logical operation, yielding a boolean | true && false |
binary \|\| | OR logical operation, yielding a boolean | true \|\| false |
binary & | AND logical operation, yielding a boolean | true & false |
binary \| | OR logical operation, yielding a boolean | true \| false |
unary ! | NOT logical operation, yielding a boolean | ! true |
In addition, the standard attributes __script__ and __boolean__ are supported by boolean types.
Arrays
Array variable types are created by an initializer of the form [ [expression [, expression ...]] ]. For example, the script code '[3, 4]' will create an array variable type with two values, the integer "3" and the integer "4".
The operations supported for this variable type, and their meanings, are listed in the table below:
Operation | Meaning | Example |
---|---|---|
subscript [] | Find the specified subscript variable, yielding the variable | [3,4] [0] |
In addition, the standard attributes __script__ and __size__ are supported by array types, as well as the insert and remove statements.
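The following sketch combines the array initializer with the insert and remove statements. It uses the 'remove' keyword given in the grammar, and it assumes that the first expression of remove is a subscript, as in the statement table's example; the variable name myarray is illustrative.

```
set myarray = [3, 4];
insert 5 into myarray;         # no 'at' clause: append at the end
insert 2 into myarray at 0;    # insert at the front
remove 0 from myarray;         # remove the element at subscript 0
print myarray.__size__;
print myarray[0];
```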
Dictionaries
Dictionary variable types are created using the "new" operator, e.g. new dictionary.
The operations supported for this variable type, and their meanings, are listed in the table below:
Operation | Meaning | Example |
---|---|---|
subscript [] | Find the specified key, yielding the keyed variable | mydict ["keyname"] |
In addition, the standard attributes __script__ and __size__ are supported by dictionary types.
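A minimal sketch of dictionary use is shown below. It assumes that a subscripted dictionary entry can be the target of a set statement, which the text above does not state explicitly; the key and value are illustrative.

```
set mydict = new dictionary;
set mydict["keyname"] = "some value";   # assumption: subscripts are assignable
print mydict["keyname"];
print mydict.__size__;
```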
Configurations
Configuration variables contain the equivalent of the JSON used to communicate with the ManifoldCF API. They can be created using an initializer of the form { [expression [, expression ...]] }. For example, the script code '{ << "outputconnector" : "" : : << "description" : "Solr" : : >>, << "class_name" : "org.apache.manifoldcf.agents.output.solr.SolrConnector" : : >> >> }' would create a configuration variable equivalent to one that might be returned from the ManifoldCF API if it was queried for the output connectors registered by the system.
The operations supported for this variable type, and their meanings are listed in the table below:
Operation | Meaning | Example |
---|---|---|
subscript [] | Find the specified child configuration node variable, yielding the variable | myconfig [0] |
binary + | Append a configuration child node variable to the list | myconfig + << "something" : "somethingvalue" : : >> |
In addition, the standard attributes __script__, __dict__, and __size__ are supported by configuration variable types, as well as the insert and remove statements.
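As a sketch of the operations in the table, the script below starts from an empty configuration, appends one child node with '+', and reads it back. The node type and value are the placeholder strings from the table, and assigning the result of '+' back to the variable is a cautious choice, not a documented requirement.

```
set myconfig = new configuration;
set myconfig = myconfig + << "something" : "somethingvalue" : : >>;
print myconfig.__size__;
print myconfig[0].__script__;   # script code that would recreate the child node
```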
Configuration nodes
Configuration node variable types are children of configuration variable types or configuration node variable types. They have several components, as listed below:
- A type
- A value
- Attributes, described as a set of name/value pairs
- Children, which must be configuration node variable types
Configuration node variable types can be created using an initializer of the form << expression : expression : [expression = expression [, expression = expression ...]] : [expression [, expression ... ]] >>. The first expression represents the type of the node. The second is the node's value. The series of '=' expressions represents attribute names and values. The last series represents the children of the node. For example, the script code '<< "description" : "Solr" : : >>' represents a node of type 'description' with a value of 'Solr', with no attributes or children.
The operations supported for this variable type, and their meanings are listed in the table below:
Operation | Meaning | Example |
---|---|---|
subscript [] | Find the specified child configuration node variable, yielding the variable | myconfig [0] |
binary + | Append a configuration child node variable to the list | myconfig + << "something" : "somethingvalue" : : >> |
In addition, the standard attributes __script__, __string__, __size__, __type__, __dict__ and __value__ are supported by configuration node variable types, as well as the insert and remove statements.
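To show all four components together, the sketch below builds a node with one attribute and one child, then reads parts of it back. The attribute name "mode" and its value are hypothetical and are not taken from the ManifoldCF API.

```
set node = << "outputconnector" : "" : "mode" = "example" : << "description" : "Solr" : : >> >>;
print node.__type__;         # the node's type
print node.__size__;         # number of child nodes
print node[0].__value__;     # value of the first child
```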
URLs
URL variable types exist to take care of the details of URL encoding while assembling the REST URLs needed to describe objects in ManifoldCF's REST API. A URL variable type can be created using a 'new' operation of the form new url expression, where the expression is the already-encoded root path. For example, the script code 'new url "http://localhost:8345/mcf-api-service/json"' would create a URL variable type with the root path "http://localhost:8345/mcf-api-service/json".
The operations supported for this variable type, and their meanings are listed in the table below:
Operation | Meaning | Example |
---|---|---|
binary == | Equals comparison, yielding a boolean | url1 == url2 |
binary != | Non-equals comparison, yielding a boolean | url1 != url2 |
binary + | Append and encode another path or query argument element, yielding a URL | url1 + "repositoryconnections" |
In addition, the standard attributes __script__ and __string__ are supported by URL variable types.
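A short sketch of building a URL from the root path follows. The variable names are illustrative; the path element is the one used in the table above.

```
set base = new url "http://localhost:8345/mcf-api-service/json";
set connections = base + "repositoryconnections";   # appended and encoded
if connections != base then
  print connections.__string__;
;
```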
Query Arguments
Query Argument variable types exist to take care of the details of URL encoding while assembling the query arguments of a REST URL for ManifoldCF's REST API. A Query Argument variable type can be created using a 'new' operation of the form new queryarg expression [= expression], where the first expression is the query argument name, and the second optional expression is the query argument value. For example, the script code 'new queryarg "report" = "simple"' would create a Query Argument variable type representing the query argument "report=simple". To add query arguments to a URL, simply add them using the '+' operator, for example "set urlvar = urlvar + new queryarg 'report' = 'simple';" .
The operations supported for this variable type, and their meanings are listed in the table below:
Operation | Meaning | Example |
---|---|---|
binary == | Equals comparison, yielding a boolean | arg1 == arg2 |
binary != | Non-equals comparison, yielding a boolean | arg1 != arg2 |
In addition, the standard attributes __script__ and __string__ are supported by Query Argument variable types.
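The sketch below appends a query argument to a URL variable and prints the result. The argument is the "report=simple" example from the text, and the variable names are illustrative.

```
set urlvar = new url "http://localhost:8345/mcf-api-service/json";
set urlvar = urlvar + "repositoryconnections";
set urlvar = urlvar + new queryarg "report" = "simple";
print urlvar.__string__;
```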
Connection names
Connection name variable types exist to perform the extra URL encoding needed for ManifoldCF's REST API. Connection names must be specially encoded so that they do not contain slash characters ('/'). Connection name variable types take care of this encoding.
You can create a connection name variable type using the following syntax: new connectionname expression, where the expression is the name of the connection.
The operations supported for this variable type, and their meanings are listed in the table below:
Operation | Meaning | Example |
---|---|---|
binary == | Equals comparison, yielding a boolean | cn1 == cn2 |
binary != | Non-equals comparison, yielding a boolean | cn1 != cn2 |
In addition, the standard attributes __script__ and __string__ are supported by connection name variable types.
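As a sketch of the intended use, a connection name can be appended to a URL so that a slash inside the name is not mistaken for a path separator. The connection name here is made up, and the sketch assumes that the URL '+' operator accepts a connection name variable as a path element.

```
set base = new url "http://localhost:8345/mcf-api-service/json";
# "My/Connection" is a hypothetical name containing a slash.
set connurl = base + "repositoryconnections" + new connectionname "My/Connection";
print connurl.__string__;
```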
Results
Result variable types capture the result of a GET, PUT, POST, or DELETE statement. They consist of two parts:
- A result code
- A result configuration value
There is no way to directly create a result variable type, nor does it support any operations. However, the standard attributes __script__, __string__, __value__, __OK__, __NOTFOUND__, and __CREATED__ are all supported by result variable types.
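A typical pattern, sketched below, is to test the result code before using the result's configuration value. The URL is the example used elsewhere in this document, and the error message text is illustrative.

```
GET result = new url "http://localhost:8345/mcf-api-service/json/repositoryconnections";
if result.__OK__ then
  # __value__ is the configuration part; __script__ renders it as script code.
  print result.__value__.__script__;
else
  error "GET failed: " + result.__string__;
;
```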
Statements
The statements available to a ManifoldCF script programmer are designed to support interaction with the ManifoldCF API. Thus, there is support for all four HTTP verbs, as well as basic variable setting and control flow. The table below describes each statement type:
Statement | Meaning | Example |
---|---|---|
'set' expression '=' expression ';' | Sets the variable described by the first expression with the value computed for the second | set myvar = 4 + 5; |
'print' expression ';' | Prints the string value of the expression to stdout | print "hello world"; |
'if' expression 'then' statements ['else' statements] ';' | If the boolean value of the expression is 'true', executes the first set of statements, otherwise executes the (optional) second set | if true then print "hello"; else print "there"; ; |
'while' expression 'do' statements ';' | While expression is true, execute the specified statements, and repeat | while count > 0 do set count = count - 1; ; |
'break' ';' | Exits from the nearest enclosing while loop | while true do break; ; |
'error' expression ';' | Aborts the script with a script exception based on the string value of the expression | error "bad stuff"; |
'wait' expression ';' | Waits the number of milliseconds corresponding to the integer value of the expression | wait 1000; |
'insert' expression 'into' expression ['at' expression] ';' | Inserts the first expression into the second variable expression, either at the end or optionally at the position specified by the third expression | insert 4 into myarray at 0 ; |
'remove' expression 'from' expression ';' | Removes the element described by the first expression from the second expression | remove 0 from myarray ; |
'GET' expression '=' expression ';' | Perform an HTTP GET from the URL specified in the second expression capturing the result in the first expression | GET result = new url "http://localhost:8345/mcf-api-service/json/repositoryconnections" ; |
'DELETE' expression '=' expression ';' | Perform an HTTP DELETE on the URL specified in the second expression capturing the result in the first expression | DELETE result = myurl ; |
'PUT' expression '=' expression 'to' expression ';' | Perform an HTTP PUT of the second expression to the URL specified in the third expression capturing the result in the first expression | PUT result = configurationObject to myurl ; |
'POST' expression '=' expression 'to' expression ';' | Perform an HTTP POST of the second expression to the URL specified in the third expression capturing the result in the first expression | POST result = configurationObject to myurl ; |
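Several of these statements can be combined into a simple retry loop, sketched below. It assumes that an unsuccessful request produces a result that is not OK instead of aborting the script, which this document does not confirm; the retry count and delay are arbitrary.

```
set attempts = 0;
while attempts < 5 do
  GET result = new url "http://localhost:8345/mcf-api-service/json/repositoryconnections";
  if result.__OK__ then
    break;                      # leave the loop on the first success
  ;
  set attempts = attempts + 1;
  wait 1000;                    # pause one second between attempts
;
if attempts == 5 then
  error "API did not return OK after 5 attempts";
;
```

Note the closing ';' after each if and while body: the grammar requires it in addition to the ';' that ends each inner statement.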