Apache Jackrabbit : Frm RemoteOperations

Remote operations

This document describes, in generic terms, the operations exposed by the Oak Remote Interface. It does not rely on concepts from any transport protocol. The intention is to use the operations described here as the backend for multiple transport protocols.

The operations are defined in a stateless way. This means that the description of every operation contains the definition of every needed input parameter, and it will not rely on a particular state of the system or on a particular sequence of operations to be performed before the current one.

JSON is sometimes used in this document to represent data structures. JSON has been chosen only for clarity; the physical representation of data structures depends on the transport protocol implementing the operations.

Read a revision

Before a client can perform any operation on the repository, it has to read a revision. A revision represents the state of the repository at a point in time. It guarantees stable reads and data consistency when performing write operations.

Required parameters:

  • User credentials.

Returned values:

  • Revision.

Read a tree

This operation allows the client to read a subtree from the repository. To minimize the amount of data sent from the server, the client can specify glob filters on node and property names.

Required parameters:

  • User credentials.
  • Revision.
  • Root path of the tree.

Optional parameters:

  • Depth. How deep the returned tree should be. Defaults to 0.
  • Property filters. A list of globs to filter property names. Defaults to ['*'].
  • Node filters. A list of globs to filter node names. Defaults to ['*'].
  • Binary threshold. A binary property is inlined inside the response only if its size in bytes is less than this threshold. See the section about serializing binary properties below. Defaults to 0, meaning that binary values are never inlined.
  • Children start index. The index of the first child to return. Defaults to 0, meaning that children are returned starting from the beginning.
  • Children count. The maximum number of children to return. Defaults to -1, meaning that all children are returned.

Returned values:

  • Tree.

The tree is represented as a recursive data structure. Every node has the following information.

  • A dictionary of properties, mapping property names to property values. The property value is also given an explicit type.
  • A dictionary of child nodes, mapping a child name to a tree. This field defines the recursive relation from the current tree to the sub-trees representing its children.

A tree returned by this operation may be represented by the following JSON object.

{
  "properties": {
    "foo": {"type": "string", "value": "bar"},
    "baz": {"type": "uri", "value": "http://acme.com"},
    "qux": {"type": "booleans", "value": [true, false, true]}
  },
  "children": {
    "bob": null,
    "sue": null
  }
}

The snippet above represents a tree with depth zero. The root of the tree has three properties: a string property named "foo", a URI property named "baz" and a multi-value boolean property named "qux". The root of the tree also has two children named "bob" and "sue". Since including the children would make the depth of the tree exceed the requested depth of zero, the special empty tree value null is used to end the recursion in the tree representation.
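The depth cut-off can be illustrated with a minimal Python sketch. It assumes a hypothetical in-memory node shaped like the JSON object above (a "properties" dictionary and a "children" dictionary); the function name serialize_tree is not part of this specification.

```python
from typing import Optional


def serialize_tree(node: dict, depth: int) -> Optional[dict]:
    """Return a copy of `node` truncated at `depth` levels below the root.

    `node` is a hypothetical in-memory node of the form
    {"properties": {...}, "children": {name: node}}.
    Children that would exceed the requested depth become None,
    which plays the role of the JSON null in the example above.
    """
    if depth < 0:
        return None
    return {
        "properties": dict(node["properties"]),
        "children": {
            name: serialize_tree(child, depth - 1)
            for name, child in node["children"].items()
        },
    }


leaf = {"properties": {}, "children": {}}
repo = {
    "properties": {"foo": {"type": "string", "value": "bar"}},
    "children": {
        "bob": {"properties": {}, "children": {"kid": leaf}},
        "sue": leaf,
    },
}

shallow = serialize_tree(repo, 0)  # children are listed but not expanded
deeper = serialize_tree(repo, 1)   # children are expanded, grandchildren are not
```

With depth 0 the names "bob" and "sue" are still listed, so a client can discover them and request each subtree separately.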

The caller has two ways to filter out children. The first is by providing one or more node filters. The second is by providing a children start index, a children count, or both. When all these mechanisms are used in the same call, the filters must be applied in the following order:

  • If the children start index is provided, the list of children is trimmed to exclude every child before the provided start index.
  • If the children count is provided, the list of children is trimmed to include at most the number of children specified by the children count.
  • If node filters are provided, they are applied to the resulting list to filter it further.
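The order of these steps matters, because trimming happens before name filtering. The following Python sketch shows one possible reading of it. The function name select_children is hypothetical, and the glob syntax is assumed to match Python's fnmatch module, since this document does not define the glob dialect.

```python
from fnmatch import fnmatchcase


def select_children(names, start=0, count=-1, node_filters=("*",)):
    """Apply start index, then count, then glob filters, in that order."""
    selected = list(names)[start:]          # 1. drop children before the start index
    if count >= 0:                          # 2. -1 means "no limit"
        selected = selected[:count]
    return [                                # 3. keep names matching any glob
        name for name in selected
        if any(fnmatchcase(name, glob) for glob in node_filters)
    ]


children = ["alice", "bob", "bill", "sue", "ben"]
page = select_children(children, start=1, count=3, node_filters=["b*"])
```

Here page contains "bob" and "bill" only. The child "ben" matches the glob but is dropped, because the list is cut to three entries before the filter runs. A page can therefore contain fewer entries than the children count even when more matching children exist.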

The simple property types supported during serialization are string, binary, binary ID, long, double, date, boolean, name, path, reference, weak reference, URI and decimal. The multi-value property types are strings, binaries, binary IDs, longs, doubles, dates, booleans, names, paths, references, weak references, URIs and decimals.

Serializing binary properties

In addition to the binary and binaries types, this specification introduces two virtual types for the sake of performance. These types are binary ID and binary IDs, which represent references to binary objects stored in the repository. A binary property in the repository can be serialized either as a binary or as a binary ID, depending on the size of the binary value and on the binary threshold specified by the client.

In the case of a simple binary property, if the size of its value is less than the binary threshold parameter, the value will be serialized with a binary type and inlined inside the response. If the size of its value is greater than or equal to the binary threshold parameter, the value will be serialized with a binary ID type instead.

In the case of a multi-value binary property, if the sum of the sizes of its values is less than the binary threshold parameter, every value will be serialized with a binary type and inlined inside the response. If the sum of the sizes of its values is greater than or equal to the binary threshold parameter, every value will be serialized with a binary ID type instead.
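The two rules above can be sketched in Python as follows. The type labels "binaryId" and "binaryIds", the base64 encoding of inlined content, and the function names are assumptions made for illustration; this document does not define how binary values are encoded on the wire.

```python
import base64


def _inline(value: bytes) -> str:
    # Assumption: inlined binary content is base64-encoded text.
    return base64.b64encode(value).decode("ascii")


def serialize_binary(value: bytes, binary_id: str, threshold: int = 0) -> dict:
    """Single-value property: inline only if strictly smaller than the threshold."""
    if len(value) < threshold:
        return {"type": "binary", "value": _inline(value)}
    return {"type": "binaryId", "value": binary_id}


def serialize_binaries(values, binary_ids, threshold: int = 0) -> dict:
    """Multi-value property: the decision is based on the sum of all sizes."""
    if sum(len(v) for v in values) < threshold:
        return {"type": "binaries", "value": [_inline(v) for v in values]}
    return {"type": "binaryIds", "value": list(binary_ids)}


small = serialize_binary(b"abc", "id-1", threshold=4)   # 3 < 4, inlined
equal = serialize_binary(b"abc", "id-1", threshold=3)   # 3 >= 3, referenced
multi = serialize_binaries([b"ab", b"cd"], ["id-2", "id-3"], threshold=4)
```

With the default threshold of 0 no size is smaller than the threshold, so every binary value is returned as a reference.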

TODO - Define basic serialization rules for property values. Decimal values should probably always be serialized as strings, since their high precision could prevent the client from fitting a decimal value into a fixed-precision numeric type. References and weak references should probably be represented as opaque string objects. The main problem with reference types would be writing a value consistently. Should every node inside a tree representation have an "id" property that can be used as a value for a reference property?
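The concern about decimal precision can be demonstrated with a short Python sketch. It only illustrates the problem raised in the TODO above and is not a decided serialization rule; the "decimal" type label follows the property type names used in this document.

```python
import json
from decimal import Decimal

original = Decimal("0.1234567890123456789012345678901234567890")

# Serialized as a string, the value survives a JSON round trip unchanged.
as_text = json.dumps({"type": "decimal", "value": str(original)})
restored = Decimal(json.loads(as_text)["value"])

# Forced through a double-precision float, the extra digits are lost.
through_float = Decimal(repr(float(original)))
```

After the round trip restored equals original, while through_float does not.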

Write content

Writing to the repository consists of defining a set of operations that represent the changes to perform on the content. The set of operations represents a transition from one valid state of the repository to another. If the set of operations would bring the repository to an invalid state, they should all be rolled back.

If the operations can be applied successfully, the client will receive another revision that represents the state of the repository where those changes are applied.

Required parameters:

  • User credentials.
  • Revision.
  • Operations.

Returned values:

  • Revision.

The changes to perform are represented as an ordered list of operations. The operations should be applied in the order provided by the client.

An operation can have one of the following types:

  • Add: add a non-existing node at a given path and initialize it with the given set of properties. The operation fails if the parent node of the added one doesn't exist or if it's not possible to initialize the new node with the provided set of properties. The operation is also invalid if a node at the given path already exists.
  • Remove: remove an existing node at a given path and all of its children. The operation fails if there is no node at the given path or if it is impossible for any other reason to remove the node.
  • Set: create or overwrite a property with the given name, type and value in a node at a given path. The operation fails if there isn't any node at the given path or if it is invalid to create a property with the given name, type and value for any other reason.
  • Unset: remove a property with the given name from a node at a given path. The operation fails if there isn't any node at the given path or if the node doesn't have a property with the given name.
  • Copy: copy a tree at a given path to a given destination path in the repository. The operation fails if the source tree doesn't exist, if a node already exists at the given destination path, or if it's impossible to copy the tree to the given location for any other reason.
  • Move: move a tree at a given path to a given destination path in the repository. The operation fails if the source tree doesn't exist, if a node already exists at the given destination path, or if it's impossible to move the tree to the given destination for any other reason.
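The all-or-nothing behaviour of a write can be sketched in Python against a hypothetical in-memory tree. The sketch covers only add, remove, set and unset. The names apply_operations and OperationError are illustrative, and the set operation reads the property type from a hypothetical "propertyType" key so that it does not collide with the operation type.

```python
import copy


class OperationError(Exception):
    """Raised when an operation cannot be applied."""


def _find(root, parts):
    node = root
    for name in parts:
        node = node["children"].get(name)
        if node is None:
            return None
    return node


def _apply_one(root, op):
    parts = [part for part in op["path"].split("/") if part]
    kind = op["type"]
    if kind in ("add", "remove"):
        if not parts:
            raise OperationError("the root node cannot be added or removed")
        parent, name = _find(root, parts[:-1]), parts[-1]
        if parent is None:
            raise OperationError("parent of %s does not exist" % op["path"])
        exists = name in parent["children"]
        if kind == "add":
            if exists:
                raise OperationError("%s already exists" % op["path"])
            parent["children"][name] = {
                "properties": dict(op.get("properties", {})),
                "children": {},
            }
        elif exists:
            del parent["children"][name]  # removes the whole subtree
        else:
            raise OperationError("%s does not exist" % op["path"])
    elif kind in ("set", "unset"):
        node = _find(root, parts)
        if node is None:
            raise OperationError("%s does not exist" % op["path"])
        if kind == "set":
            node["properties"][op["name"]] = {
                "type": op["propertyType"],  # hypothetical key, see lead-in
                "value": op["value"],
            }
        elif op["name"] in node["properties"]:
            del node["properties"][op["name"]]
        else:
            raise OperationError("no property %s" % op["name"])
    else:
        raise OperationError("unsupported operation type %s" % kind)


def apply_operations(root, operations):
    """Apply operations in order to a copy; the input tree is never modified."""
    working = copy.deepcopy(root)
    for op in operations:
        _apply_one(working, op)
    return working
```

Because the operations run against a deep copy, a failure in any operation leaves the original tree untouched, which stands in for the rollback described above. A real implementation would return a new revision rather than a new tree.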

Add operation

The add operation is represented by a data structure containing the following information:

  • Path. The path where the node should be added.
  • Properties. The set of properties that will be used to initialize the new node. The properties are represented as a dictionary mapping property names to property values. Every property value must have an explicit type. The properties dictionary may be empty.

An add operation may be represented by the following JSON object.

{
  "type": "add",
  "path": "/a/b/c",
  "properties": {
    "foo": {"type": "string", "value": "bar"},
    "baz": {"type": "uri", "value": "http://acme.com"},
    "qux": {"type": "booleans", "value": [true, false, true]}
  }
}

Remove operation

The remove operation is represented by a data structure containing the following information:

  • Path. The path of the node to be removed.

A remove operation may be represented by the following JSON object.

{
  "type": "remove",
  "path": "/a/b/c"
}

Set operation

The set operation is represented by a data structure containing the following information:

  • Path. The path of the node that needs to be modified.
  • Property name. The name of the property that should be created or overwritten.
  • Property type. The type of the property.
  • Value. The value of the property.

A set operation may be represented by the following JSON object.

{
  "type": "set",
  "path": "/a/b/c",
  "name": "foo",
  "propertyType": "string",
  "value": "bar"
}

Unset operation

The unset operation is represented by a data structure containing the following information:

  • Path. The path of the node that needs to be modified.
  • Property name. The name of the property that should be removed.

An unset operation may be represented by the following JSON object.

{
  "type": "unset",
  "path": "/a/b/c",
  "name": "foo"
}

Read binary objects

A binary object can be read by passing a reference to it to the server. A binary object can be read in its entirety or in chunks. A reference to a binary object can be obtained by reading content from the repository or by creating a new binary object.

Required parameters:

  • User credentials.
  • Binary ID.

Optional parameters:

  • Start index. The index in the binary object where to start reading. Defaults to 0.
  • Number of bytes. The maximum number of bytes to read from the binary. The operation can still return fewer bytes if the binary does not contain enough data. Defaults to -1, meaning that every byte after the start index is returned.

Return values:

  • Number of bytes returned.
  • Binary content.
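The chunked read semantics can be sketched in Python as follows. The function name read_binary is hypothetical and the binary object is modelled as an in-memory bytes value. Rejecting negative values other than -1 is an assumption, since this document does not say how they should be handled.

```python
def read_binary(content: bytes, start: int = 0, count: int = -1):
    """Return (number of bytes returned, chunk) for the requested range."""
    if start < 0 or count < -1:
        raise ValueError("start must be >= 0 and count must be >= -1")
    # -1 means "everything after the start index".
    end = len(content) if count == -1 else start + count
    chunk = content[start:end]
    return len(chunk), chunk


blob = b"hello world"
whole = read_binary(blob)            # the entire object
tail = read_binary(blob, 6)          # from index 6 to the end
short = read_binary(blob, 6, 100)    # fewer bytes than requested are available
```

A client can read a large object in fixed-size chunks by advancing the start index by the number of bytes returned until that number is zero.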

Write binary objects

Binary objects can be written to the repository using this operation. When a binary is written, the client receives a binary ID that can be used as a value for binary properties when writing content in the repository.

Required parameters:

  • User credentials.
  • Number of bytes to write.
  • Binary content.

Return values:

  • Binary ID.

Query content

Content in the repository can be queried using this operation. A server may support different query languages.

Required parameters:

  • User credentials.
  • Revision.
  • Query string. The query string can contain placeholders that are substituted with the values provided in the bindings parameter.
  • Query type. The query type identifies the syntax used in the query string.
  • Limit. The maximum number of results to include in the response.

Optional parameters:

  • Offset. The number of rows to skip when providing results. Defaults to 0.

Return values:

  • Size. The number of rows, if known. Otherwise, -1.
  • Column names. The names of the columns included in the result.
  • Results. An array of objects, where each object maps every column name to the corresponding value.
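The shape of the returned values and the effect of limit and offset can be sketched in Python. The function name build_result and the lowercase keys are hypothetical. The sketch assumes that the size is the total number of matching rows rather than the number of rows in the returned page, which this document does not state explicitly.

```python
def build_result(columns, rows, limit, offset=0):
    """Assemble a query response from already-computed rows.

    `rows` is a list of tuples whose items line up with `columns`.
    """
    if limit < 0 or offset < 0:
        raise ValueError("limit and offset must not be negative")
    page = rows[offset:offset + limit]
    return {
        "size": len(rows),  # assumption: total row count, not page size
        "columns": list(columns),
        "results": [dict(zip(columns, row)) for row in page],
    }


rows = [("/a", "A"), ("/b", "B"), ("/c", "C")]
response = build_result(["path", "title"], rows, limit=2, offset=1)
```

In this example response["results"] holds the rows for "/b" and "/c", while the first row is skipped by the offset.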