General

What is JCR?

JCR is the acronym for the Content Repository for Java™ technology API (JSR 170), a standard interface for accessing content repositories.

What is a content repository?

A content repository is an information management system that provides various services for storing, accessing, and managing content. In addition to a hierarchically structured storage system, common services of a content repository are versioning, access control, full text searching, and event monitoring.

A content repository is not a content management system (CMS), although most existing CMSs contain a custom content repository implementation of varying completeness. A CMS uses a content repository as an underlying component for presentation, business logic, and other features.

What is Jackrabbit?

Apache Jackrabbit is a fully featured content repository that implements the entire JCR API. The Jackrabbit project was started when Day Software, the JSR-170 specification lead, licensed their initial implementation of the JCR reference implementation. Since then, the Jackrabbit codebase has been used for the official reference implementation (RI) and technology compatibility kit (TCK) released along with the final JCR API.

Building Jackrabbit

How do I build the Jackrabbit sources?

See the Building Jackrabbit section of the Jackrabbit documentation for detailed build instructions.

Why does the Maven build fail with the message "You must define currentVersion in your POM."?

You are most probably running Maven from the wrong directory. Maven expects to find the file project.xml in the current directory (unless the -d, -p, or -f option is given). Please check that you are in the correct directory and try running Maven again.

Why does the Maven build fail with the message "java.net.ConnectException: Connection timed out: connect"?

This error message can appear when one of the Maven repositories used for downloading Jackrabbit dependencies is not available. This can happen if your network connection is broken or if the repository server is down. Please check your network connection or wait a while for the repository to come back online.

Using Jackrabbit

How do I do X with JCR/Jackrabbit?

See the JCR specification, the JCR API documentation, or the Examples page on the Jackrabbit wiki for information on how to perform various operations using the JCR API.

For Jackrabbit features (like access control and node type management) not covered by the JCR API, see the Examples page on the wiki, the Jackrabbit javadocs, or contact the Jackrabbit mailing list.

How do I use transactions with JCR?

See the mailing list announcement for a simple example on using the JTA support in Jackrabbit.

For a more complete explanation of the transaction features, please see section 8.1 Transactions of the JCR specification.

How do I create new workspaces in Jackrabbit?

The JCR API does not contain features for creating or managing workspaces, so you need to use Jackrabbit-specific functionality for creating new workspaces.

You can create a new workspace either manually or programmatically. The manual way is to create a new workspace directory within the repository home directory and to place a new workspace.xml configuration file in that directory. You can use the configuration file of an existing workspace as an example; just remember to change the name of the workspace in the <Workspace name="..."> tag. See the Configuring Jackrabbit page for configuration details. Note also that you need to restart the repository instance to access the new workspace.
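The manual steps can also be scripted. The following is a minimal Java sketch, not Jackrabbit functionality: the class WorkspaceDirectorySketch, its method, and the workspacesRoot parameter are illustrative names. It assumes that the existing workspace directories sit side by side under one directory, that workspace.xml is UTF-8 encoded, and that name is the first attribute of the <Workspace> tag. Check these assumptions against your own repository layout before using anything like it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper, not part of Jackrabbit. */
final class WorkspaceDirectorySketch {

    // Assumption: name is the first attribute of the <Workspace> tag.
    private static final Pattern NAME_ATTRIBUTE =
            Pattern.compile("<Workspace\\s+name=\"[^\"]*\"");

    /**
     * Copies workspacesRoot/existingName/workspace.xml to
     * workspacesRoot/newName/workspace.xml, rewriting only the workspace name.
     */
    static Path createWorkspaceDirectory(Path workspacesRoot, String existingName, String newName)
            throws IOException {
        // Reject names that would break the XML attribute or the directory path.
        if (newName.isEmpty() || newName.matches(".*[\"<>&/\\\\].*")) {
            throw new IllegalArgumentException("Unsupported workspace name: " + newName);
        }
        Path template = workspacesRoot.resolve(existingName).resolve("workspace.xml");
        String xml = new String(Files.readAllBytes(template), StandardCharsets.UTF_8);

        Matcher matcher = NAME_ATTRIBUTE.matcher(xml);
        if (!matcher.find()) {
            throw new IOException("No <Workspace name=\"...\"> tag found in " + template);
        }
        String renamed = matcher.replaceFirst(
                Matcher.quoteReplacement("<Workspace name=\"" + newName + "\""));

        Path newDirectory = workspacesRoot.resolve(newName);
        Files.createDirectory(newDirectory); // throws if the directory already exists
        Path config = newDirectory.resolve("workspace.xml");
        Files.write(config, renamed.getBytes(StandardCharsets.UTF_8));
        return config;
    }
}
```

As with the purely manual procedure, the repository instance has to be restarted before the new workspace becomes accessible.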

The programmatic way is to acquire a Workspace instance using the normal JCR API and to cast the instance to the Jackrabbit WorkspaceImpl class. You can then use the WorkspaceImpl.createWorkspace(String) method to create new workspaces.

How do I delete a workspace in Jackrabbit?

There is currently no programmatic way to delete workspaces. You can delete a workspace by manually removing the workspace directory when the repository instance is not running.
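If you prefer to script the removal, the sketch below deletes a workspace directory recursively using only the standard Java file API. The class and method names are illustrative and not part of Jackrabbit. The check for a workspace.xml file is a safety assumption added here to avoid deleting an unrelated directory. Run it only while the repository instance is stopped.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustrative helper, not part of Jackrabbit. */
final class WorkspaceRemovalSketch {

    /** Deletes the given workspace directory and everything below it. */
    static void deleteWorkspaceDirectory(Path workspaceDirectory) throws IOException {
        // Safety check (an assumption of this sketch): only touch directories
        // that contain a workspace.xml file.
        if (!Files.isRegularFile(workspaceDirectory.resolve("workspace.xml"))) {
            throw new IllegalArgumentException("Not a workspace directory: " + workspaceDirectory);
        }
        List<Path> toDelete;
        try (Stream<Path> paths = Files.walk(workspaceDirectory)) {
            // Reverse order lists every file and subdirectory before its parent.
            toDelete = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path path : toDelete) {
            Files.delete(path);
        }
    }
}
```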

Access control

How do I use LDAP, Kerberos, or some other authentication mechanism with Jackrabbit?

Jackrabbit uses the Java Authentication and Authorization Service (JAAS) for authenticating users. You should be able to use any JAAS LoginModule implementation (e.g. the LoginModules in the com.sun.security.auth.module package) for authentication. See the JAAS documentation for configuration instructions.
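As an illustration, a JAAS configuration can be supplied programmatically instead of through a login configuration file. The sketch below registers the JDK's com.sun.security.auth.module.LdapLoginModule with its userProvider, userFilter, and useSSL options. The class name LdapJaasConfigSketch, the application name "Jackrabbit", and the LDAP URL and filter passed in are assumptions for illustration. The application name in particular has to match whatever name your repository uses for its JAAS lookup.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;

/** Illustrative JAAS configuration, not part of Jackrabbit. */
final class LdapJaasConfigSketch extends Configuration {

    // Assumed application name; adjust it to the name your repository uses.
    static final String APP_NAME = "Jackrabbit";

    private final AppConfigurationEntry[] entries;

    LdapJaasConfigSketch(String ldapUrl, String userFilter) {
        Map<String, String> options = new HashMap<>();
        options.put("userProvider", ldapUrl);   // where to search for users
        options.put("userFilter", userFilter);  // how to find the user's entry
        options.put("useSSL", "false");         // plain LDAP in this sketch
        entries = new AppConfigurationEntry[] {
            new AppConfigurationEntry(
                    "com.sun.security.auth.module.LdapLoginModule",
                    LoginModuleControlFlag.REQUIRED,
                    options)
        };
    }

    @Override
    public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
        // JAAS expects null when no entries exist for the requested name.
        return APP_NAME.equals(name) ? entries.clone() : null;
    }

    /** Makes this configuration the JVM-wide JAAS configuration. */
    static void install(String ldapUrl, String userFilter) {
        Configuration.setConfiguration(new LdapJaasConfigSketch(ldapUrl, userFilter));
    }
}
```

A call such as LdapJaasConfigSketch.install("ldap://ldap.example.com/ou=people,dc=example,dc=com", "(uid={USERNAME})") would have to run before the repository performs its first login. The host name and filter shown here are placeholders.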

How do I manage the access rights of authenticated users?

The current Jackrabbit SimpleAccessManager class only supports three access levels: anonymous, normal, and system. Anonymous users have read access while normal and system users have full read-write access. You need to implement a custom AccessManager class to get more fine-grained access control.
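The three-level policy can be modelled in a few lines of plain Java. The sketch below is not the Jackrabbit AccessManager interface, and all names in it are hypothetical. It only illustrates the rule described above, plus one possible finer-grained rule (write access limited to a subtree) of the kind a custom implementation might enforce.

```java
/** Hypothetical model of the access rules; not the Jackrabbit AccessManager API. */
final class AccessLevelSketch {

    enum Level { ANONYMOUS, NORMAL, SYSTEM }

    enum Action { READ, WRITE }

    /** Simple policy: everyone may read, only normal and system users may write. */
    static boolean isGranted(Level level, Action action) {
        return action == Action.READ || level != Level.ANONYMOUS;
    }

    /**
     * Finer-grained variant (an assumption for illustration): normal users may
     * write only at or below writableRoot, system users may write anywhere.
     */
    static boolean isGranted(Level level, Action action, String path, String writableRoot) {
        if (!isGranted(level, action)) {
            return false;
        }
        if (action == Action.READ || level == Level.SYSTEM) {
            return true;
        }
        // Match the root itself or a descendant, but not a sibling such as "/contentx".
        return path.equals(writableRoot) || path.startsWith(writableRoot + "/");
    }
}
```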

Persistence managers

What is a persistence manager?

A persistence manager (PM) is an internal Jackrabbit component that handles the persistent storage of content nodes and properties. Each workspace of a Jackrabbit content repository uses a separate persistence manager to store the content in that workspace. The Jackrabbit version handler also uses a separate persistence manager.

The persistence manager sits at the very bottom layer of the Jackrabbit system architecture. Reliability, integrity, and performance of the PM are crucial to the overall stability and performance of the repository. If, for example, the data that a PM relies on can be changed through external means, the integrity of the repository is at risk (think of referential integrity and node references).

In practice, a persistence manager is any Java class that implements the PersistenceManager interface and the associated behavioural contracts. Jackrabbit contains a set of built-in persistence manager classes that cover most of the deployment needs. There are also a few contributed persistence managers that give additional flexibility.

What is a Jackrabbit file system?

A Jackrabbit file system (FS) is an internal component that implements standard file system operations on top of some underlying storage mechanism (a normal file system, a database, a WebDAV server, or a custom file format). A file system component is any Java class that implements the FileSystem interface and the associated behavioural contracts. File systems are used in Jackrabbit both as subcomponents of the persistence managers and for general storage needs (for example to store the full text indexes).

Can I use a persistence manager to access an existing data source?

No. The persistence manager interface was never intended to be a general SPI that you could implement in order to integrate external data sources with proprietary formats (e.g. a customer database). The reason why we abstracted the PM interface was to leave room for future performance optimizations that would not affect the rest of the implementation (e.g. by storing the raw data in a b-tree based database instead of individual files).

How "smart" should a persistence manager be?

A persistence manager should not be intelligent, i.e. it should not interpret the content it is managing. The only thing it should care about is to efficiently, consistently, and reliably store and read the content encapsulated in the passed NodeState and PropertyState objects. Though it might be feasible to write a custom persistence manager to represent existing legacy data in a level-1 (read-only) repository, I don't think the same is possible for a level-2 repository and I certainly would not recommend it.
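To make the idea of a "dumb" store concrete, here is a minimal sketch in plain Java. It is not the Jackrabbit PersistenceManager interface: the class name, the use of string identifiers, and the representation of serialized states as byte arrays are all assumptions for illustration. The store never looks inside the data it is given. It only validates the batch, copies the bytes, and hands back exactly what was stored.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical content-agnostic store; not the Jackrabbit PersistenceManager API. */
final class OpaqueStateStoreSketch {

    private final Map<String, byte[]> states = new HashMap<>();

    /**
     * Stores a batch of serialized states. The whole batch is checked first,
     * so an invalid entry leaves the store unchanged.
     */
    synchronized void storeAll(Map<String, byte[]> changes) {
        for (Map.Entry<String, byte[]> change : changes.entrySet()) {
            if (change.getKey() == null || change.getValue() == null) {
                throw new IllegalArgumentException("null id or state in batch");
            }
        }
        for (Map.Entry<String, byte[]> change : changes.entrySet()) {
            // Copy the bytes so later changes by the caller cannot alter stored data.
            states.put(change.getKey(), change.getValue().clone());
        }
    }

    /** Returns a copy of the stored bytes, or null if nothing is stored under id. */
    synchronized byte[] load(String id) {
        byte[] data = states.get(id);
        return data == null ? null : data.clone();
    }
}
```

A real persistence manager would write to durable storage rather than to a map, but the division of labour is the same: interpretation of the content stays in the layers above.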

What persistence managers are available?

The list below gives the status, pros, and cons of each currently available persistence manager.

SimpleDbPersistenceManager (and subclasses thereof)
  Status: mature
  Pros:
  • Jackrabbit's default persistence manager
  • JDBC-based persistence supporting a wide range of RDBMSs
  • Zero deployment, schema is created automatically
  • Transactional
  Cons:
  • Uses a simple non-normalized schema and a binary serialization format, which might not appeal to relational data modeling fans

BerkeleyDBPersistenceManager
  Status: mature?
  Pros:
  • B-tree-based persistence (BerkeleyDB JE)
  • Zero deployment
  • Transactional
  Cons:
  • Uses binary serialization format
  • Licensing issues

ObjectPersistenceManager
  Status: mature
  Pros:
  • File system based persistence
  • Easy to configure
  Cons:
  • Uses binary serialization format
  • If the JVM process is killed the repository might turn inconsistent
  • Not transactional

XMLPersistenceManager
  Status: mature
  Pros:
  • File system based persistence
  • Uses XML serialization format
  • Easy to configure
  Cons:
  • If the JVM process is killed the repository might turn inconsistent
  • Poor performance
  • Not transactional

ORM persistence manager
  Status: experimental & unfinished
  Pros:
  • ORM-based persistence
  • Transactional
  Cons:
  • Complex to configure and set up
  • Still being maintained?

What Jackrabbit file systems are available?

The list below gives the status, pros, and cons of each currently available Jackrabbit file system.

LocalFileSystem
  Status: mature
  Pros: none listed
  Cons:
  • Slow on Windows boxes

DbFileSystem
  Status: mature
  Pros:
  • JDBC-based file system supporting a wide range of RDBMSs
  • Zero deployment, schema is created automatically
  Cons:
  • Slower than native file systems

CQFS file system
  Status: mature
  Pros:
  • Fast on Windows boxes
  Cons:
  • Undocumented configuration options
  • Proprietary binary format
  • Not open source

Which persistence manager and file systems should I use?

The answer depends on your priorities. If you want to store your data in an RDBMS, use SimpleDbPersistenceManager and either LocalFileSystem or DbFileSystem. If you want to store your data in an accessible format (just in case or for manual debugging), you might want to try the XMLPersistenceManager and the LocalFileSystem. If you use Windows and performance is a must, you might want to try the ObjectPersistenceManager and the proprietary CQFS.