
Validation

Overview

The artifact org.apache.jackrabbit.vault:vault-validation provides both an API for validating FileVault packages and an SPI for implementing validators. In addition, this JAR contains a set of ready-to-use validators.

This validation framework is intended to be used as

  1. a dependency for custom validators (SPI)
  2. a library for build tools which want to run validation on FileVault packages (API and implementation)

Validators

Settings

Every validator registered in the system (both default and external validators) can be adjusted via settings. Settings have a common section (which applies to every validator) and also support validator-specific options.

Element | Description | Default Value
--- | --- | ---
defaultSeverity | Each validation message has a severity. The default validation message severity of each validator can be influenced with this parameter. If a validator emits different types of validation messages the other types can be influenced via options. | error
isDisabled | A boolean flag defining whether validator is disabled or not. | false
options | A map (i.e. keys and values) of validator specific options. The supported options for the validators are outlined below. | empty

Each set of validator settings applies to one specific validator ID.
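To make the structure concrete, the following Java sketch models the three settings elements described above and looks them up by validator ID. The class `ValidatorSettingsSketch` and its nested `Settings` record are hypothetical stand-ins written for illustration, not the types shipped in vault-validation. Only the element names, their documented defaults and the validator IDs are taken from this page. Falling back to the defaults for an unconfigured ID is an assumption of the sketch.

```java
import java.util.Map;

final class ValidatorSettingsSketch {

    // Hypothetical stand-in for one validator's settings (NOT the vault-validation class).
    record Settings(String defaultSeverity, boolean isDisabled, Map<String, String> options) {

        // Documented defaults: severity "error", not disabled, empty options.
        static Settings defaults() {
            return new Settings("error", false, Map.of());
        }
    }

    // Settings are keyed by validator ID. Assumption of this sketch:
    // IDs without explicit settings use the documented defaults.
    static Settings lookup(Map<String, Settings> settingsById, String validatorId) {
        return settingsById.getOrDefault(validatorId, Settings.defaults());
    }

    public static void main(String[] args) {
        Map<String, Settings> settingsById = Map.of(
            // Raise the severity of one message type via a validator-specific option.
            "jackrabbit-filter",
            new Settings("error", false, Map.of("severityForUncoveredAncestorNodes", "error")),
            // Switch another validator off entirely.
            "jackrabbit-oakindex",
            new Settings("warn", true, Map.of()));

        System.out.println(lookup(settingsById, "jackrabbit-filter").options());
        System.out.println(lookup(settingsById, "jackrabbit-oakindex").isDisabled());
        System.out.println(lookup(settingsById, "jackrabbit-properties").defaultSeverity());
    }
}
```

How such settings are actually passed in depends on the calling tool, for example the configuration of a build plugin.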

Incremental Execution

It is possible to run validation only on a subset of the files contained in the package. Currently the filevault-package-maven-plugin leverages this with its goal validate-files when running inside Eclipse with m2e. To prevent some validators from emitting false positives, they are less strict when executed in incremental mode. The Standard Validators listing below outlines, per validator, how incremental executions differ from full executions.

Standard Validators

ID: jackrabbit-filter
Description: Checks the validity of the filter.xml (according to a predefined XML schema). In addition, it checks that every docview XML node is contained in the filter. It also makes sure that all ancestors of each filter root are either known/valid roots or are contained in the package dependencies. For ancestor nodes which are not covered by a filter, at least a warn is emitted. It also makes sure that the pattern values for includes/excludes as well as the root values of each filter entry are valid. Orphaned filter rules (i.e. ones that are not necessary) lead to validation issues as well.
Options:
  • severityForUncoveredAncestorNodes: severity of validation messages for uncovered ancestor nodes (default = warn).
  • severityForUndefinedFilterRootAncestors: severity of validation messages in case the filter root ancestors are neither covered by package dependencies nor one of the validRoots (default = error for package type application, warn for all other package types). The deprecated option severityForUncoveredFilterRootAncestors should no longer be used.
  • validRoots: comma-separated list of valid roots (default = "/,/libs,/apps,/etc,/var,/tmp,/content")
Incremental Execution Limitations: Orphaned filter rules are not checked.

ID: jackrabbit-properties
Description: Checks for validity of the properties.xml.
Options: none
Incremental Execution Limitations: none

ID: jackrabbit-dependencies
Description: Checks for overlapping filter roots of the referenced package dependencies as well as for valid package dependency references (i.e. references which can be resolved).
Options:
  • severityForUnresolvedDependencies: severity of validation messages for unresolved dependencies (default = warn)
Incremental Execution Limitations: none

ID: jackrabbit-docviewparser
Description: Checks if all docview files in the package are compliant with the (extended) Document View Format. This involves checking for XML validity as well as checking for known property types and their valid string serializations.
Options:
  • allowUndeclaredPrefixInFileName (since 3.6.0): if set to false, prefixes used only in the encoded docview XML file name must be declared as namespaces in the XML itself; otherwise a missing namespace declaration is ignored (default = true since 3.6.2, false in older versions)
Incremental Execution Limitations: none

ID: jackrabbit-emptyelements
Description: Checks for empty elements within DocView files (used for ordering purposes, compare with (extended) Document View Format) which are included in the filter with import=replace, as those are actually not replaced.
Options: none
Incremental Execution Limitations: none

ID: jackrabbit-mergelimitations
Description: Checks for the limitations of import mode=merge outlined in JCRVLT-255.
Options: none
Incremental Execution Limitations: none

ID: jackrabbit-oakindex
Description: Checks if the package (potentially) modifies/creates an Oak index definition. This is done by evaluating both the filter.xml for potential matches as well as the actual content for nodes with jcr:primaryType oak:indexDefinition.
Options: none
Incremental Execution Limitations: none

ID: jackrabbit-packagetype
Description: Checks if the package type is correctly set for this package, i.e. is compliant with all rules outlined at Package Types.
Options:
  • jcrInstallerNodePathRegex: the regular expression which all JCR paths of OSGi bundles and configurations within packages must match (default=/([^/]*/){0,4}?(install|config)[\./].*). This should match the paths being picked up by JCR Installer. Paths of OSGi configurations based on sling:OsgiConfig nodes are tested against this pattern as well.
  • additionalJcrInstallerFileNodePathRegex: the regular expression which the JCR paths of all file-based OSGi bundles and configurations within packages must match in addition to jcrInstallerNodePathRegex (default=.*\.(config|cfg|cfg\.json|jar)). This should match the paths being picked up by JCR Installer. OSGi configurations based on sling:OsgiConfig nodes are not tested against this pattern.
  • legacyTypeSeverity: the severity of the validation message for package type mixed (default = warn).
  • noTypeSeverity: the severity of the validation message when package type is not set at all (default = warn).
  • prohibitMutableContent: boolean flag determining whether package type content or mixed (mutable content) leads to a validation message with severity error (default = false). Useful when used with Oak Composite NodeStore.
  • prohibitImmutableContent: boolean flag determining whether package type app, container or mixed (immutable content) leads to a validation message with severity error (default = false). Useful when used with Oak Composite NodeStore.
  • allowComplexFilterRulesInApplicationPackages: boolean flag determining whether complex rules (containing includes/excludes) are allowed in application content packages (default = false).
  • allowInstallHooksInApplicationPackages: boolean flag determining whether install hooks are allowed in application content packages (default = false).
  • immutableRootNodeNames: comma-separated list of immutable root node names (default = "apps,libs")
Incremental Execution Limitations: none

ID: jackrabbit-nodetypes
Description: Checks if all non-empty elements within DocView files have the mandatory property jcr:primaryType set and follow the node type definition of their given type.
Options:
  • cnds: A URI pointing to one or multiple CNDs (separated by ,) which define the additional namespaces and node types used apart from the default ones defined in JCR 2.0 and the ones defined in the package's metadata. If a URI points to a JAR, the validator leverages all node types mentioned in the Sling-Nodetypes manifest header. Apart from the standard protocols, the scheme tccl can be used to reference names from the Thread's context class loader. In the Maven plugin context this is the plugin classloader.
  • defaultNodeType: the node type in expanded or qualified form which is used for unknown ancestor nodes which are not given otherwise (default = nt:folder). Note: Using the default is conservative but the safest approach. It may lead to many issues because nt:folder is heavily restricted. In general you cannot know with which type the parent node already exists in the repository, and FileVault itself created nt:folder nodes as intermediates for a long time, so this is the safest option. If you are sure that all intermediate nodes have the correct type, you should use a type with no restrictions (nt:unstructured).
  • severityForDefaultNodeTypeViolations: the severity of issues emitted due to violations against the default node type (for implicit ancestor nodes, default = WARN).
  • severityForUnknownNodetypes: the severity of issues emitted due to an unknown primary/mixin type set on a node (default = WARN).
  • validNameSpaces: list of namespaces that are known to be valid. Syntax: prefix1=http://uri1,prefix2=http://uri2,....
Incremental Execution Limitations: Child node validity, mandatory properties and mandatory child nodes are not checked as they might not be fully visible.

ID: jackrabbit-accesscontrol
Description: Checks that access control list nodes (primary type rep:ACL, rep:CugPolicy and rep:PrincipalPolicy) are only used when the package property acHandling is set to something other than ignore or clear, and that in this case there is at least one access control list node.
Options: none
Incremental Execution Limitations: The validation message for the case that no access control list node is found although acHandling is set to anything but ignore or clear is suppressed.

ID: jackrabbit-duplicateuuid
Description: Checks that every value of property jcr:uuid is unique. Compare with Referenceable Nodes.
Options: none
Incremental Execution Limitations: might emit false negatives (i.e. not detect duplicates)

ID: jackrabbit-overlappingfilter
Description: Checks that filters of two distinct content packages don't overlap. In order for this validator to work, the conflicting packages must be included as subpackages or embeds (even nested ones) of the same container package.
Options:
  • severityForOverlappingSingleNodePatterns: severity of validation messages for package filters which overlap but only affect a single node and not a full subtree (default = WARN). This pattern is used frequently to enforce a certain ancestor node type and therefore is often acceptable (still, the properties set through multiple packages on this single node should always be the same).
Incremental Execution Limitations: does not work
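The two default regular expressions of the jackrabbit-packagetype options jcrInstallerNodePathRegex and additionalJcrInstallerFileNodePathRegex are easier to judge against concrete paths. The following Java sketch copies both defaults from the options above and tests a few sample paths. The class name, the sample paths and the choice of whole-string matching via `Matcher.matches()` are assumptions of this sketch, not a statement about how the validator applies the patterns internally.

```java
import java.util.regex.Pattern;

final class InstallerPathCheck {

    // Default of jcrInstallerNodePathRegex, copied from the option description.
    static final Pattern NODE_PATH =
        Pattern.compile("/([^/]*/){0,4}?(install|config)[\\./].*");

    // Default of additionalJcrInstallerFileNodePathRegex, copied from the option description.
    static final Pattern FILE_NODE_PATH =
        Pattern.compile(".*\\.(config|cfg|cfg\\.json|jar)");

    // Assumption of this sketch: the whole path has to match the pattern.
    static boolean matchesNodePath(String path) {
        return NODE_PATH.matcher(path).matches();
    }

    // File-based bundles and configurations must match both patterns.
    static boolean matchesFileNodePath(String path) {
        return matchesNodePath(path) && FILE_NODE_PATH.matcher(path).matches();
    }

    public static void main(String[] args) {
        String[] samplePaths = {
            "/apps/myapp/install/bundle.jar",
            "/apps/myapp/config.author/com.example.Foo.cfg.json",
            "/apps/myapp/config/com.example.Foo.xml",
            "/apps/a/b/c/d/install/bundle.jar",
            "/content/site/en"
        };
        for (String path : samplePaths) {
            System.out.printf("%-55s node=%b file=%b%n",
                path, matchesNodePath(path), matchesFileNodePath(path));
        }
    }
}
```

With the defaults, an install or config folder (optionally followed by a run mode suffix such as .author) is accepted only if at most four path segments precede it, which is why the path with five leading segments is rejected.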

Custom Validators

The SPI for implementing custom validators is provided in this package. Each validator is created by a ValidatorFactory, which must be registered via the ServiceLoader mechanism.

The SPI is exported from the artifact org.apache.jackrabbit.vault:vault-validation as well.

The validator returned by the ValidatorFactory implements one of the following types from package org.apache.jackrabbit.vault.validation.spi:

Validator Class | Description | Scope | Called from another validator
--- | --- | --- | ---
DocumentViewXmlValidator | Called for each node serialized into a DocView element | jcr_root | no
NodePathValidator | Called for each node path contained in the package (even for ones not listed in the filter.xml) | jcr_root | no
JcrPathValidator | Called for each file path contained in the package | jcr_root | no
GenericJcrDataValidator | Called for all serialized nodes which are not DocViewXml | jcr_root | no
FilterValidator | Called for the vault/filter.xml file | META-INF | yes (jackrabbit-filter)
PropertiesValidator | Called for the vault/properties.xml file | META-INF | yes (jackrabbit-properties)
GenericMetaInfDataValidator | Called for all META-INF files (even vault/filter.xml and vault/properties.xml). In general, prefer the higher-level validators (i.e. FilterValidator or PropertiesValidator) if possible | META-INF | no
MetaInfFilePathValidator | Called for each file path contained in the package below META-INF | META-INF | no
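The interplay between a factory and the validator it creates can be sketched as follows. To stay runnable without the vault-validation JAR, the sketch declares its own heavily simplified interfaces. `Message`, `NodePathValidator` and `ValidatorFactory` here are hypothetical stand-ins that only mirror the roles of the SPI types listed above, and the real signatures differ. The validator ID `example-notmp` and the rule itself are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class CustomValidatorSketch {

    // Simplified stand-in for a validation message (severity plus text).
    record Message(String severity, String text) {}

    // Stand-in for a validator that is called once per node path in the package.
    interface NodePathValidator {
        Collection<Message> validate(String nodePath);
    }

    // Stand-in for the factory: it exposes the validator ID used in the settings
    // and creates the validator with the configured default severity.
    interface ValidatorFactory {
        String getId();
        NodePathValidator createValidator(String defaultSeverity);
    }

    // Made-up rule: report every node at or below /tmp.
    static final class NoTmpContentFactory implements ValidatorFactory {
        @Override
        public String getId() {
            return "example-notmp";
        }

        @Override
        public NodePathValidator createValidator(String defaultSeverity) {
            return nodePath -> {
                List<Message> messages = new ArrayList<>();
                if (nodePath.equals("/tmp") || nodePath.startsWith("/tmp/")) {
                    messages.add(new Message(defaultSeverity, "Node below /tmp: " + nodePath));
                }
                return messages;
            };
        }
    }

    public static void main(String[] args) {
        ValidatorFactory factory = new NoTmpContentFactory();
        NodePathValidator validator = factory.createValidator("error");
        for (String path : List.of("/apps/myapp", "/tmp/cache/entry")) {
            System.out.println(factory.getId() + " " + path + " -> " + validator.validate(path));
        }
    }
}
```

A real implementation would implement the SPI interfaces from the vault-validation artifact instead and would be discovered through a ServiceLoader provider configuration file rather than being instantiated directly.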

3rd Party Validators

Name | Description | Link
--- | --- | ---
Sling Repoinit Validator | Validates Sling Repoinit statements | https://sling.apache.org/documentation/bundles/repository-initialization.html#filevault-validator
AEM Cloud Validator | Prevents invalid usage patterns for AEM as a Cloud Service | https://github.com/Netcentric/aem-cloud-validator
AEM Replication Metadata Validator | Enforces correct replication metadata for certain nodes | https://github.com/Netcentric/aem-replication-metadata-validator
AEM Content Classification Validator | Validates usage of nodes according to AEM Content Classification | https://github.com/Netcentric/aem-classification/tree/master/aem-classification-validator

Please raise a PR to get other 3rd party validators listed above.

Validation API

The API for calling validation on specific files is provided in package org.apache.jackrabbit.vault.validation.

First you need one instance of ValidationExecutorFactory. For each new ValidationContext (i.e. new package context) you create a new ValidationExecutor via ValidationExecutorFactory.createValidationExecutor(...). For each file you then call either

  • ValidationExecutor.validateJcrRoot(...) for input streams referring to files which are supposed to end up in the repository or
  • ValidationExecutor.validateMetaInf(...) for input streams representing metaInf data of the FileVault package
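The per-file decision between the two entry points can be sketched as a small routing step. The nested `Executor` interface below is a hypothetical stand-in for ValidationExecutor that keeps only the two method names mentioned above. The real methods take input streams and further arguments, so treat the signatures here as assumptions. The routing rule (top-level folder jcr_root versus META-INF) follows the scopes listed for the validator types.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class ValidationRoutingSketch {

    // Hypothetical stand-in for the two ValidationExecutor entry points.
    interface Executor {
        void validateJcrRoot(Path pathBelowJcrRoot);
        void validateMetaInf(Path pathBelowMetaInf);
    }

    // Routes one file (path relative to the package root) to the matching entry point.
    // Returns false for files that live outside jcr_root and META-INF.
    static boolean route(Path fileInPackage, Executor executor) {
        int count = fileInPackage.getNameCount();
        if (count < 2) {
            return false; // not a file below a top-level folder
        }
        String topLevelFolder = fileInPackage.getName(0).toString();
        Path relativePath = fileInPackage.subpath(1, count);
        switch (topLevelFolder) {
            case "jcr_root":
                executor.validateJcrRoot(relativePath);
                return true;
            case "META-INF":
                executor.validateMetaInf(relativePath);
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        // Recording executor used only to show which entry point gets called.
        Executor recorder = new Executor() {
            @Override
            public void validateJcrRoot(Path path) {
                calls.add("validateJcrRoot: " + path);
            }

            @Override
            public void validateMetaInf(Path path) {
                calls.add("validateMetaInf: " + path);
            }
        };
        for (String file : List.of(
                "jcr_root/apps/myapp/.content.xml",
                "META-INF/vault/filter.xml",
                "README.md")) {
            route(Path.of(file), recorder);
        }
        calls.forEach(System.out::println);
    }
}
```

In a real integration, one executor would be created per package context via ValidationExecutorFactory.createValidationExecutor(...), and each routed call would pass the file's input stream.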

The Validation API is currently used by the FileVault Package Maven Plugin.