apache tinkerpop logo

TinkerPop Upgrade Information

This document helps users of TinkerPop to understand the changes that come with each software release. It outlines new features, how to resolve breaking changes and other information specific to a release. This document is useful to end-users who are building applications on TinkerPop, but it is equally useful to TinkerPop providers, who build libraries and other systems on the the core APIs and protocols that TinkerPop exposes.

These providers include:

TinkerPop 3.1.0

gremlin gangster

A 187 On The Undercover Gremlinz

TinkerPop 3.1.0

Release Date: November 16, 2015

Please see the changelog for a complete list of all the modifications that are part of this release.

Additional upgrade information can be found here:

Upgrading for Users

Shading Jackson

The Jackson library is now shaded to gremlin-shaded, which will allow Jackson to version independently without breaking compatibility with dependent libraries or with those who depend on TinkerPop. The downside is that if a library depends on TinkerPop and uses the Jackson classes, those classes will no longer exist with the standard Jackson package naming. They will have to shifted as follows:

  • org.objenesis becomes org.apache.tinkerpop.shaded.objenesis

  • com.esotericsoftware.minlog becomes org.apache.tinkerpop.shaded.minlog

  • com.fasterxml.jackson becomes org.apache.tinkerpop.shaded.jackson

PartitionStrategy and VertexProperty

PartitionStrategy now supports partitioning within VertexProperty. The Graph needs to be able to support meta-properties for this feature to work.

Gremlin Server and Epoll

Gremlin Server provides a configuration option to turn on support for Netty native transport on Linux, which has been shown to help improve performance.

Rebindings Deprecated

The notion of "rebindings" has been deprecated in favor of the term "aliases". Alias is a better and more intuitive term than rebindings which should make it easier for newcomers to understand what they are for.

Configurable Driver Channelizer

The Gremlin Driver now allows the Channerlizer to be supplied as a configuration, which means that custom implementations may be supplied.

GraphSON and Strict Option

The GraphMLReader now has a strict option on the Builder so that if a data type for a value is invalid in some way, GraphMLReader will simply skip that problem value. In that way, it is a bit more forgiving than before especially with empty data.

Transaction.close() Default Behavior

The default behavior of Transaction.close() is to rollback the transaction. This is in contrast to previous versions where the default behavior was commit. Using rollback as the default should be thought of as a like a safer approach to closing where a user must now explicitly call commit() to persist their mutations.

See TINKERPOP3-805 for more information.

ThreadLocal Transaction Settings

The Transaction.onReadWrite() and Transaction.onClose() settings now need to be set for each thread (if another behavior than the default is desired). For gremlin-server users that may be changing these settings via scripts. If the settings are changed for a sessionless request they will now only apply to that one request. If the settings are changed for an in-session request they will now only apply to all future requests made in the scope of that session.

Hadoop-Gremlin

  • Hadoop1 is no longer supported. Hadoop2 is now the only supported Hadoop version in TinkerPop.

  • Spark and Giraph have been split out of Hadoop-Gremlin into their own respective packages (Spark-Gremlin and Giraph-Gremlin).

  • The directory where application jars are stored in HDFS is now hadoop-gremlin-3.1.0-incubating-libs.

    • This versioning is important so that cross-version TinkerPop use does not cause jar conflicts.

See link:https://issues.apache.org/jira/browse/TINKERPOP3-616

Spark-Gremlin

  • Providers that wish to reuse a graphRDD can leverage the new PersistedInputRDD and PersistedOutputRDD.

    • This allows the graphRDD to avoid serialization into HDFS for reuse. Be sure to enabled persisted SparkContext (see documentation).

See link:https://issues.apache.org/jira/browse/TINKERPOP3-868, link:https://issues.apache.org/jira/browse/TINKERPOP3-925

TinkerGraph Serialization

TinkerGraph is serializable over Gryo, which means that it can shipped over the wire from Gremlin Server. This feature can be useful when working with remote subgraphs.

Deprecation in TinkerGraph

The public static String configurations have been renamed. The old public static variables have been deprecated. If the deprecated variables were being used, then convert to the replacements as soon as possible.

Deprecation in Gremlin-Groovy

The closure wrappers classes GFunction, GSupplier, GConsumer have been deprecated. In Groovy, a closure can be specified using as Function and thus, these wrappers are not needed. Also, the GremlinExecutor.promoteBindings() method which was previously deprecated has been removed.

Gephi Traversal Visualization

The process for visualizing a traversal has been simplified. There is no longer a need to "name" steps that will represent visualization points for Gephi. It is possible to just "configure" a visualTraversal in the console:

gremlin> :remote config visualTraversal graph vg

which creates a special TraversalSource from graph called vg. The traversals created from vg can be used to :submit to Gephi.

Alterations to GraphTraversal

There were a number of changes to GraphTraversal. Many of the changes came by way of deprecation, but some semantics have changed as well:

  • ConjunctionStrategy has been renamed to ConnectiveStrategy (no other behaviors changed).

  • ConjunctionP has been renamed to ConnectiveP (no other behaviors changed).

  • DedupBijectionStrategy has been renamed (and made more effective) as FilterRankingStrategy.

  • The GraphTraversal mutation API has change significantly with all previous methods being supported but deprecated.

    • The general pattern used now is addE('knows').from(select('a')).to(select('b')).property('weight',1.0).

  • The GraphTraversal sack API has changed with all previous methods being supported but deprecated.

    • The old sack(mult,'weight') is now sack(mult).by('weight').

  • GroupStep has been redesigned such that there is now only a key- and value-traversal. No more reduce-traversal.

    • The previous group()-methods have been renamed to groupV3d0(). To immediately upgrade, rename all your group()-calls to groupV3d0().

    • To migrate to the new group()-methods, what was group().by('age').by(outE()).by(sum(local)) is now group().by('age').by(outE().sum()).

  • There was a bug in fold(), where if a bulked traverser was provided, the traverser was only represented once.

    • This bug fix might cause a breaking change to a user query if the non-bulk behavior was being counted on. If so, used dedup() prior to fold().

  • Both GraphTraversal().mapKeys() and GraphTraversal.mapValues() has been deprecated.

    • Use select(keys) and select(columns). However, note that select() will not unroll the keys/values. Thus, mapKeys()select(keys).unfold().

  • The data type of Operator enums will now always be the highest common data type of the two given numbers, rather than the data type of the first number, as it’s been before.

Aliasing Remotes in the Console

The :remote command in Gremlin Console has a new alias configuration option. This alias option allows specification of a set of key/value alias/binding pairs to apply to the remote. In this way, it becomes possible to refer to a variable on the server as something other than what it is referred to for purpose of the submitted script. For example once a :remote is created, this command:

:remote alias x g

would allow "g" on the server to be referred to as "x".

:> x.E().label().groupCount()

Upgrading for Providers

Important
It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation.

All providers should be aware that Jackson is now shaded to gremlin-shaded and could represent breaking change if there was usage of the dependency by way of TinkerPop, a direct dependency to Jackson may be required on the provider’s side.

Graph System Providers

GraphStep Alterations
  • GraphStep is no longer in sideEffect-package, but now in map-package as traversals support mid-traversal V().

  • Traversals now support mid-traversal V()-steps. Graph system providers should ensure that a mid-traversal V() can leverage any suitable index.

See link:https://issues.apache.org/jira/browse/TINKERPOP3-762

Decomposition of AbstractTransaction

The AbstractTransaction class has been abstracted into two different classes supporting two different modes of operation: AbstractThreadLocalTransaction and AbstractThreadedTransaction, where the former should be used when supporting ThreadLocal transactions and the latter for threaded transactions. Of course, providers may still choose to build their own implementation on AbstractTransaction itself or simply implement the Transaction interface.

The AbstractTransaction gains the following methods to potentially implement (though default implementations are supplied in AbstractThreadLocalTransaction and AbstractThreadedTransaction):

  • doReadWrite that should execute the read-write consumer.

  • doClose that should execute the close consumer.

Transaction.close() Default Behavior

The default behavior for Transaction.close() is to rollback the transaction and is enforced by tests, which previously asserted the opposite (i.e. commit on close). These tests have been renamed to suite the new semantics:

  • shouldCommitOnCloseByDefault became shouldCommitOnCloseWhenConfigured

  • shouldRollbackOnCloseWhenConfigured became shouldRollbackOnCloseByDefault

If these tests were referenced in an OptOut, then their names should be updated.

Graph Traversal Updates

There were numerous changes to the GraphTraversal API. Nearly all changes are backwards compatible with respective "deprecated" annotations. Please review the respective updates specified in the "Graph System Users" section.

  • GraphStep is no longer in sideEffect package. Now in map package.

  • Make sure mid-traversal GraphStep calls are folding HasContainers in for index-lookups.

  • Think about copying TinkerGraphStepStrategyTest for your implementation so you know folding is happening correctly.

Element Removal

Element.Exceptions.elementAlreadyRemoved has been deprecated and test enforcement for consistency have been removed. Providers are free to deal with deleted elements as they see fit.

VendorOptimizationStrategy Rename

The VendorOptimizationStrategy has been renamed to ProviderOptimizationStrategy. This renaming is consistent with revised terminology for what were formerly referred to as "vendors".

GraphComputer Updates

GraphComputer.configure(String key, Object value) is now a method (with default implementation). This allows the user to specify engine-specific parameters to the underlying OLAP system. These parameters are not intended to be cross engine supported. Moreover, if there are not parameters that can be altered (beyond the standard GraphComputer methods), then the provider’s GraphComputer implementation should simply return and do nothing.

Driver Providers

Aliases Parameter

The "rebindings" argument to the "standard" OpProcessor has been renamed to "aliases". While "rebindings" is still supported it is recommended that the upgrade to "aliases" be made as soon as possible as support will be removed in the future. Gremlin Server will not accept both parameters at the same time - a request must contain either one parameter or the other if either is supplied.

ThreadLocal Transaction Settings

If a driver configures the Transaction.onReadWrite() or Transaction.onClose() settings, note that these settings no longer apply to all future requests. If the settings are changed for a sessionless request they will only apply to that one request. If the settings are changed from an in-session request they will only apply to all future requests made in the scope of that session.

TinkerPop 3.0.0

gremlin hindu

A Gremlin Rāga in 7/16 Time

TinkerPop 3.0.2

Release Date: October 19, 2015

Please see the changelog for a complete list of all the modifications that are part of this release.

Upgrading for Users

BulkLoaderVertexProgram (BLVP)

BulkLoaderVertexProgram now supports arbitrary inputs (i addition to HadoopGraph, which was already supported in version 3.0.1-incubating). It can now also read from any TP3 enabled graph, like TinkerGraph or Neo4jGraph.

TinkerGraph

TinkerGraph can now be configured to support persistence, where TinkerGraph will try to load a graph from a specified location and calls to close() will save the graph data to that location.

Gremlin Driver and Server

There were a number of fixes to gremlin-driver that prevent protocol desynchronization when talking to Gremlin Server.

On the Gremlin Server side, Websocket sub-protocol introduces a new "close" operation to explicitly close sessions. Prior to this change, sessions were closed in a more passive fashion (i.e. session timeout). There were also so bug fixes around the protocol as it pertained to third-party drivers (e.g. python) using JSON for authentication.

Upgrading for Providers

Graph Driver Providers

Gremlin Server close Operation

It is important to note that this feature of the sub-protocol applies to the SessionOpProcessor (i.e. for session-based requests). Prior to this change, there was no way to explicitly close a session. Sessions would get closed by the server after timeout of activity. This new "op" gives drivers the ability to close the session explicitly and as needed.

TinkerPop 3.0.1

Release Date: September 2, 2015

Please see the changelog for a complete list of all the modifications that are part of this release.

Upgrading for Users

Gremlin Server

Gremlin Server now supports a SASL-based (Simple Authentication and Security Layer) authentication model and a default SimpleAuthenticator which implements the PLAIN SASL mechanism (i.e. plain text) to authenticate requests. This gives Gremlin Server some basic security capabilities, especially when combined with its built-in SSL feature.

There have also been changes in how global variable bindings in Gremlin Server are established via initialization scripts. The initialization scripts now allow for a Map of values that can be returned from those scripts. That Map will be used to set global bindings for the server. See this sample script for an example.

Neo4j

Problems related to using :install to get the Neo4j plugin operating in Gremlin Console on Windows have been resolved.

Upgrading for Providers

Graph System Providers

GraphFactoryClass Annotation

Providers can consider the use of the new GraphFactoryClass annotation to specify the factory class that GraphFactory will use to open a new Graph instance. This is an optional feature and will generally help implementations that have an interface extending Graph. If that is the case, then this annotation can be used in the following fashion:

@GraphFactory(MyGraphFactory.class)
public interface MyGraph extends Graph{
}

MyGraphFactory must contain the static open method that is normally expected by GraphFactory.

GraphProvider.Descriptor Annotation

There was a change that affected providers who implemented GraphComputer related tests such as the ProcessComputerSuite. If the provider runs those tests, then edit the GraphProvider implementation for those suites to include the GraphProvider.Descriptor annotation as follows:

@GraphProvider.Descriptor(computer = GiraphGraphComputer.class)
public final class HadoopGiraphGraphProvider extends HadoopGraphProvider {

    public GraphTraversalSource traversal(final Graph graph) {
        return GraphTraversalSource.build().engine(ComputerTraversalEngine.build().computer(GiraphGraphComputer.class)).create(graph);
    }
}

See: TINKERPOP3-690 for more information.

Semantics of Transaction.close()

There were some adjustments to the test suite with respect to how Transaction.close() was being validated. For most providers, this will generally mean checking OptOut annotations for test renaming problems. The error that occurs when running the test suite should make it apparent that a test name is incorrect in an OptOut if there are issues there.

See: TINKERPOP3-764 for more information.

Graph Driver Providers

Authentication

Gremlin Server now supports SASL-based authentication. By default, Gremlin Server is not configured with authentication turned on and authentication is not required, so existing drivers should still work without any additional change. Drivers should however consider implementing this feature as it is likely that many users will want the security capabilities that it provides.