Abdera2 - Atom

The Feed Object Model (FOM) is the set of objects you will use to interact with Atom Documents. It contains classes such as "Feed", "Entry" and "Person", all of which are modeled closely after the various elements and documents defined by the two Atom specifications. Refer to the Javadocs for the Feed Object Model for specific details on each class.

Extensions to the Atom format within Abdera can be either dynamic or static. Dynamic extensions use a generic API for setting and getting the elements and attributes of the extension.

For instance, suppose we have an extension element called "mylist" within an Atom entry, holding the values "one", "two", and so on.

We can add this element to an existing Atom Entry element using the dynamic extension API:

Or you can implement a static extension by implementing a class that extends the ExtensibleElementWrapper class, implementing an ExtensionFactory for it and registering the ExtensionFactory with Abdera:

foo/MyListExtensionFactory.java:

META-INF/services/org.apache.abdera2.factory.ExtensionFactory:

    foo.MyListExtensionFactory

Abdera will automatically locate the ExtensionFactory using the configuration file in the META-INF services path, making it simple to use the static extension:

When parsing an Atom document, the parser uses a set of default configuration options that will adequately cover most application use cases. There are, however, times when the parsing options need to be adjusted. The ParserOptions class can be used to tweak the operation of the parser in a number of important ways.

Abdera will, by default, attempt to automatically detect the character set used in an XML document. It does so by looking at the XML prolog, the Byte Order Mark, or the first few bytes of the document. The process works reasonably well for the overwhelming majority of cases, but it does incur a small performance cost. The autodetection algorithm can be disabled by calling options.setAutodetectCharset(false). This only has an effect when parsing an InputStream.

A separate option allows you to manually set the character set the parser should use when decoding an InputStream.

Abdera is capable of parsing InputStreams that have been compressed using the GZIP or Deflate algorithms (typically used as HTTP transfer encodings). setCompressionCodecs can be used to specify which encodings have been applied.

By default, Abdera will throw a parse exception if any characters not allowed in XML are detected. After calling setFilterRestrictedCharacters(true), the parser will automatically filter out invalid XML characters. With filtering enabled, Abdera will, by default, replace each such character with an empty string. Alternatively, you can use setFilterRestrictedCharacterReplacement to specify a replacement character.

There are a number of named character entities allowed by HTML and XHTML that are not supported in XML without a DTD. However, it is not uncommon to find these entities being used without a DTD. Abdera will, by default, automatically handle these entities by replacing them with the appropriate character equivalent. To disable automatic entity resolution, call setResolveEntities(false). Doing so will cause Abdera to return an error whenever a named character entity is used. When setResolveEntities is true, registerEntity can be used to register a new custom named entity reference.
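The restricted-character filtering described above can be illustrated with a short sketch that uses only the JDK. It assumes that "restricted" means any code point outside the XML 1.0 Char production; the class and method names are illustrative and are not part of the Abdera API, and Abdera's own rules may differ in detail.

```java
public class RestrictedCharFilter {

    // True if the code point is allowed by the XML 1.0 "Char" production.
    public static boolean isAllowed(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Replaces every disallowed code point with the given replacement.
    // An empty replacement simply drops the character.
    public static String filter(String input, String replacement) {
        StringBuilder sb = new StringBuilder(input.length());
        input.codePoints().forEach(cp -> {
            if (isAllowed(cp)) {
                sb.appendCodePoint(cp);
            } else {
                sb.append(replacement);
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // NUL and vertical tab are not legal XML 1.0 characters.
        String dirty = "a" + (char) 0x00 + "b" + (char) 0x0B + "c";
        System.out.println(filter(dirty, ""));
        System.out.println(filter(dirty, "?"));
    }
}
```

Working on code points rather than chars means an unpaired surrogate is treated as a restricted character, while a valid surrogate pair is kept intact.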

A ParseFilter is used to filter the stream of parse events. In the example below, only the elements added to the ParseFilter will be parsed and added to the Feed Object Model instance. All other elements will be silently ignored. For large documents, the resulting savings in CPU and memory costs can be significant.

Using a ParseFilter:

    doc = parser.parse(in,base,options);

There are three basic types of ParseFilters:

A whitelist filter: only elements and attributes listed in the filter will be parsed.
A blacklist filter: elements and attributes listed in the filter will be ignored.
A compound filter: allows multiple parse filters to be applied.

Developers can also create their own ParseFilter instances by implementing the ParseFilter interface, or by extending the AbstractParseFilter or AbstractSetParseFilter abstract base classes:

MyCustomParseFilter.java

Using a CompoundParseFilter, a developer can apply multiple ParseFilters at once.

There are four forms of CompoundParseFilter that can be created using static methods on the CompoundParseFilter class:

Accepts the element or attribute only if it is acceptable to all contained ParseFilters.
Accepts the element or attribute if it is acceptable to any of the contained ParseFilters.
Accepts the element or attribute only if it is unacceptable to all contained ParseFilters.
Accepts the element or attribute if it is unacceptable to any of the contained ParseFilters.

Note that the unacceptableTo* variants will accept an element or attribute based on a negative result. This is particularly useful when building blacklist-based filters, where an item is only acceptable if it does not meet an explicitly stated condition.
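The accept and reject logic of the four compound forms can be sketched with plain JDK predicates over element names. This is only a model of the decision rules described above, not the Abdera ParseFilter API: the class name and all method names here are hypothetical, chosen to mirror the wording of the text.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import javax.xml.namespace.QName;

public class ParseFilterSketch {
    public static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    // Whitelist: accept only the listed names.
    public static Predicate<QName> whiteList(QName... names) {
        Set<QName> allowed = new HashSet<>(Arrays.asList(names));
        return allowed::contains;
    }

    // Blacklist: accept everything except the listed names.
    public static Predicate<QName> blackList(QName... names) {
        return whiteList(names).negate();
    }

    // Accept only if every contained filter accepts.
    public static Predicate<QName> acceptableToAll(List<Predicate<QName>> filters) {
        return name -> filters.stream().allMatch(f -> f.test(name));
    }

    // Accept if at least one contained filter accepts.
    public static Predicate<QName> acceptableToAny(List<Predicate<QName>> filters) {
        return name -> filters.stream().anyMatch(f -> f.test(name));
    }

    // Accept only if every contained filter rejects.
    public static Predicate<QName> unacceptableToAll(List<Predicate<QName>> filters) {
        return name -> filters.stream().noneMatch(f -> f.test(name));
    }

    // Accept if at least one contained filter rejects.
    public static Predicate<QName> unacceptableToAny(List<Predicate<QName>> filters) {
        return name -> !filters.stream().allMatch(f -> f.test(name));
    }

    public static void main(String[] args) {
        QName title = new QName(ATOM_NS, "title");
        QName content = new QName(ATOM_NS, "content");

        Predicate<QName> structure = whiteList(title, content);
        Predicate<QName> noContent = blackList(content);
        Predicate<QName> both = acceptableToAll(Arrays.asList(structure, noContent));

        System.out.println(both.test(title));   // accepted by both filters
        System.out.println(both.test(content)); // rejected by the blacklist
    }
}
```

Modeling each filter as a predicate makes the relationship between the forms easy to see: the "unacceptable to all" form is the negation of "acceptable to any", and "unacceptable to any" is the negation of "acceptable to all".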

Abdera uses a flexible mechanism for serializing Atom documents to a Java OutputStream or Writer. A developer can use the default serializer or select an alternative Abdera writer implementation.

Using the default serializer:

The default serializer will output valid but unformatted XML; there will be no line breaks or indents.

Using the "Named Writer" mechanism, it is possible to select alternative serializers. Abdera ships with three alternative serializers: PrettyXML, Activity Streams and JSON. Developers can implement additional serializers by implementing the Writer interface.

Note: In Abdera2, the NamedWriter interface used in Abdera 1.x has been removed. Named Writers are now implemented using the base Writer interface with an org.apache.abdera2.common.anno.Name annotation.

The PrettyXML Writer will output formatted XML containing linebreaks and indents:

The Activity Streams Writer will output the Atom Document using the JSON Activity Streams format:

The org.apache.abdera.writer.StreamWriter interface was added to Abdera after the release of 0.3.0. It provides an alternative means of writing out Atom documents using a streaming interface that avoids the need to build up a complex, in-memory object model. It is well suited for applications that need to quickly produce potentially large Atom documents.

    ")
      .writeSubtitle("Foo")
      .writeAuthor("James", null, null)
      .writeUpdatedNow()
      .writeLink("http://example.org/foo")
      .writeLink("http://example.org/bar","self")
      .writeCategory("foo")
      .writeCategory("bar")
      .writeLogo("logo")
      .writeIcon("icon")
      .writeGenerator("1.0", "http://example.org", "foo")
      .flush();

    for (int n = 0; n < 100; n++) {
      out.startEntry()
           .writeId("http://example.org/" + n)
           .writeTitle("Entry #" + n)
           .writeUpdatedNow()
           .writePublishedNow()
           .writeEditedNow()
           .writeSummary("This is text summary")
           .writeAuthor("James", null, null)
           .writeContributor("Joe", null, null)
           .startContent("application/xml")
             .startElement(new QName("a","b","c"))
               .startElement(new QName("x","y","z"))
                 .writeElementText("This is a test")
                 .startElement("a")
                   .writeElementText("foo")
                 .endElement()
                 .startElement("b","bar")
                   .writeAttribute("foo", DateTimes.now())
                   .writeAttribute("bar", 123L)
                   .writeElementText(123.123)
                 .endElement()
               .endElement()
             .endElement()
           .endContent()
         .endEntry()
         .flush();
    }

    out.endFeed()
       .endDocument()
       .flush();
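The same streaming idea, writing each element as it is produced instead of building an object model first, can be sketched with the JDK's StAX XMLStreamWriter. This is a stand-in for illustration only and does not use the Abdera StreamWriter interface; the class name, method names and feed values are assumptions.

```java
import java.io.StringWriter;
import java.time.Instant;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class StreamingFeedSketch {
    public static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    // Streams a small Atom feed with the given number of entries.
    public static String writeFeed(int entries, Instant updated) {
        try {
            StringWriter sink = new StringWriter();
            XMLStreamWriter out =
                XMLOutputFactory.newInstance().createXMLStreamWriter(sink);
            out.writeStartDocument();
            out.writeStartElement("feed");
            out.writeDefaultNamespace(ATOM_NS);
            writeText(out, "id", "http://example.org");
            writeText(out, "title", "Example feed");
            writeText(out, "updated", updated.toString());
            for (int n = 0; n < entries; n++) {
                out.writeStartElement("entry");
                writeText(out, "id", "http://example.org/" + n);
                writeText(out, "title", "Entry #" + n);
                writeText(out, "updated", updated.toString());
                out.writeEndElement();
                // Each entry can be flushed as soon as it is complete.
                out.flush();
            }
            out.writeEndElement();
            out.writeEndDocument();
            out.close();
            return sink.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Writes <name>text</name>, escaping the text as needed.
    private static void writeText(XMLStreamWriter out, String name, String text)
            throws XMLStreamException {
        out.writeStartElement(name);
        out.writeCharacters(text);
        out.writeEndElement();
    }

    public static void main(String[] args) {
        System.out.println(writeFeed(2, Instant.now()));
    }
}
```

Because nothing is retained after an entry is written, memory use stays roughly constant regardless of how many entries the feed contains, which is the property that makes a streaming writer attractive for large documents.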

Atom allows for a broad range of text and content options, and the choices can often be confusing. Text constructs such as atom:title, atom:rights, atom:subtitle and atom:summary can contain plain text, escaped HTML or XHTML markup. The atom:content element can contain plain text, escaped HTML, XHTML markup, arbitrary XML markup, any arbitrary text-based format, Base64-encoded binary data or referenced external content. Abdera provides methods for dealing with these options.

Text Content Options:

    entry.setTitle("A text title");
    entry.setTitleAsHtml("<p>An HTML title</p>");
    entry.setTitleAsXhtml("<p>An XHTML title</p>");

Resulting atom:title elements:

    <title type="text">A text title</title>
    <title type="html">&lt;p&gt;An HTML title&lt;/p&gt;</title>
    <title type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <p>An XHTML title</p>
      </div>
    </title>

Getting the text value:

Content options:

    // Plain text
    entry.setContent("A text title");

    // Escaped HTML
    entry.setContentAsHtml("<p>An HTML title</p>");

    // XHTML markup
    entry.setContentAsXhtml("<p>An XHTML title</p>");

    // Arbitrary XML (parsed)
    entry.setContent("",Content.Type.XML);

    // Arbitrary XML
    QName qname = new QName("foo");
    Element element = abdera.getFactory().newElement(qname);
    entry.setContent(element);

    // Base64-encoded binary from an InputStream
    InputStream in = ...;
    entry.setContent(in,"image/png");

    // Base64-encoded from a Java Activation Framework DataHandler
    DataHandler dh = ...;
    entry.setContent(dh,"image/png");

    // Content-by-reference using atom:content/@src
    IRI iri = new IRI("http://example.org");
    entry.setContent(iri,"image/png");

The resulting atom:content elements:

    A text title
    <p>An HTML title</p>
    An XHTML title
    {base64}
    {base64}
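The {base64} placeholders stand for the Base64 text form of the binary payload. As a minimal sketch of that encoding step, using only java.util.Base64 from the JDK (the class and method names are illustrative and not part of Abdera):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64ContentSketch {

    // Encodes raw bytes into the text that would sit inside the element.
    public static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Decodes element text back to bytes. The MIME decoder ignores line
    // breaks and other characters outside the Base64 alphabet, which is
    // convenient for content that was wrapped or indented.
    public static byte[] decode(String text) {
        return Base64.getMimeDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);
        String text = encode(payload);
        System.out.println(text);
        System.out.println(new String(decode(text), StandardCharsets.UTF_8));
    }
}
```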

The Atom format requires that all dates and times be formatted to match the date-time construct from RFC 3339. The basic format is YYYY-MM-DD'T'HH:mm:ss.ms'Z', where 'Z' is either the literal value 'Z' or a timezone offset in the form +HH:mm or -HH:mm. Examples: 2007-10-31T12:11:12.123Z and 2007-10-31T12:11:12.123-08:00. Abdera2 uses the Joda-Time library for working with timestamps and provides a number of useful utility methods in the org.apache.abdera2.common.date.DateTimes class for working with dates.
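The two example timestamps can be produced and parsed with the JDK's java.time classes, used here purely as a stand-in for Joda-Time so that the sketch has no external dependencies. The class and method names are illustrative. Note that ISO_OFFSET_DATE_TIME is slightly more lenient than RFC 3339 when parsing, so this is an approximation rather than a strict validator.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class Rfc3339Sketch {

    // Formats a timestamp in the RFC 3339 style used by Atom.
    public static String format(OffsetDateTime time) {
        return DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(time);
    }

    // Parses a timestamp with either a literal Z or a numeric offset.
    public static OffsetDateTime parse(String text) {
        return OffsetDateTime.parse(text, DateTimeFormatter.ISO_OFFSET_DATE_TIME);
    }

    public static void main(String[] args) {
        OffsetDateTime utc =
            OffsetDateTime.of(2007, 10, 31, 12, 11, 12, 123_000_000, ZoneOffset.UTC);
        OffsetDateTime pacific =
            OffsetDateTime.of(2007, 10, 31, 12, 11, 12, 123_000_000, ZoneOffset.ofHours(-8));
        System.out.println(format(utc));
        System.out.println(format(pacific));
    }
}
```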

    // The current date and time in the default timezone
    org.joda.time.DateTime dt = DateTimes.now();

    // Setting the value of the updated element on an entry
    entry.setUpdated(dt);

    // alternatively, if setting to the current time,
    // you can use the shortcut setUpdatedNow() method
    entry.setUpdatedNow();

The Joda-Time DateTime class provides a host of additional benefits and features relative to the old Abdera 1.x AtomDate class, including date arithmetic operations.

For instance, to indicate that an entry was updated five minutes prior to the current date and time, you can use the following:

    entry.setUpdated(DateTimes.now().minusMinutes(5));

The DateTime class has also been integrated with the Abdera2 Selector Framework and the Guava library's Predicate API to enable powerful filtering options.

For instance, if you have an Atom feed and want to retrieve a listing of only the Entry objects whose updated timestamps fall within a given range, you can call:
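The range check itself can be sketched independently of Abdera using a JDK Predicate and java.time.Instant. This sketch does not use the Abdera2 Selector Framework or Guava; the Item class is a hypothetical stand-in for an Atom entry, and the range is assumed to be inclusive at both ends.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class UpdatedRangeSketch {

    // Hypothetical stand-in for an Atom entry: an id plus an updated timestamp.
    public static final class Item {
        private final String id;
        private final Instant updated;

        public Item(String id, Instant updated) {
            this.id = id;
            this.updated = updated;
        }

        public String id() { return id; }
        public Instant updated() { return updated; }
    }

    // Accepts items whose updated timestamp lies in [from, to].
    public static Predicate<Item> updatedBetween(Instant from, Instant to) {
        return item -> !item.updated().isBefore(from) && !item.updated().isAfter(to);
    }

    // Returns the ids of the matching items, preserving feed order.
    public static List<String> idsUpdatedBetween(List<Item> items, Instant from, Instant to) {
        return items.stream()
                    .filter(updatedBetween(from, to))
                    .map(Item::id)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
            new Item("a", Instant.parse("2007-10-30T00:00:00Z")),
            new Item("b", Instant.parse("2007-10-31T12:00:00Z")),
            new Item("c", Instant.parse("2007-11-02T00:00:00Z")));
        System.out.println(idsUpdatedBetween(items,
            Instant.parse("2007-10-31T00:00:00Z"),
            Instant.parse("2007-11-01T00:00:00Z")));
    }
}
```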

The Atom format explicitly allows the Atom Date Construct to be reused by extensions. This means you can create your own extension elements that use the same syntax rules as the atom:updated, atom:published and app:edited elements. Such extensions can use the dynamic and static extension APIs:

Adding a Date Construct Extension using the dynamic API:

Note that the addDateExtension method on the Abdera Entry class is new in Abdera2.

More complex extensions of the Date Construct can be implemented statically by extending the DateTimeWrapper abstract class:

Atom defines the notion of a Person Construct to represent people and entities. A Person Construct consists of a required name, an optional email address and an optional URI.

The Atom format explicitly allows the Atom Person Construct to be reused by extensions. This means you can create your own extension elements that use the same syntax rules as the atom:author and atom:contributor elements. Such extensions can use the dynamic and static extension APIs:

The resulting extension element:

    John Doe
    john.doe@example.org
    http://example.org/~jdoe

Static Person Construct extensions can be implemented by extending the org.apache.abdera2.model.PersonWrapper abstract class.

Atom link elements are similar in design to the link tag used in HTML and XHTML. They can be added to feed, entry and source objects.

Adding a link to an Atom Entry:

The rel attribute specifies the meaning of the link. The value of rel can either be a simple name or an IRI. Simple names MUST be registered with IANA. Note that each of the values in the IANA registry has a full IRI equivalent, e.g., the value "http://www.iana.org/assignments/relation/alternate" is equivalent to the simple name "alternate". Any rel attribute value that is not registered MUST be an IRI.
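The equivalence between simple names and their IANA IRIs can be sketched as a small normalization step. This is an illustrative helper, not an Abdera method; it assumes that any value without a colon is a simple name and compares the resulting strings exactly, without any further IRI normalization.

```java
public class LinkRelSketch {
    public static final String IANA_BASE = "http://www.iana.org/assignments/relation/";

    // Expands a simple name such as "alternate" to its full IRI form.
    // Values that already contain a colon are treated as IRIs and returned as-is.
    public static String toIri(String rel) {
        return rel.indexOf(':') < 0 ? IANA_BASE + rel : rel;
    }

    // Two rel values denote the same relation if their IRI forms match.
    public static boolean sameRelation(String a, String b) {
        return toIri(a).equals(toIri(b));
    }

    public static void main(String[] args) {
        System.out.println(toIri("alternate"));
        System.out.println(sameRelation("alternate", IANA_BASE + "alternate"));
    }
}
```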

Abdera supports digital signatures and encryption of Atom documents.

Initialize the Signing Key:

Prepare the entry to sign:

    is markup");
    entry.addAuthor("James");
    entry.addLink("http://www.example.org");

Prepare the signing options:

Sign the entry:

Verifying the signed entry:

Note that the signer() and verifier() methods on the org.apache.abdera2.security.Security class are new within Abdera2 and return Guava Function objects that wrap the signing and verification logic.
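The idea of wrapping signing and verification logic in function objects can be sketched with the JDK alone. This is not the Abdera2 Security API and it does not produce an XML Digital Signature: it signs raw bytes with java.security.Signature and uses java.util.function.Function in place of Guava's Function. All class and method names are illustrative assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.function.Function;

public class SignVerifySketch {

    // Generates a throwaway RSA key pair for the demonstration.
    public static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns a function that maps document bytes to a signature.
    public static Function<byte[], byte[]> signer(PrivateKey key) {
        return data -> {
            try {
                Signature s = Signature.getInstance("SHA256withRSA");
                s.initSign(key);
                s.update(data);
                return s.sign();
            } catch (GeneralSecurityException e) {
                throw new IllegalStateException(e);
            }
        };
    }

    // Returns a function that checks document bytes against a signature.
    public static Function<byte[], Boolean> verifier(PublicKey key, byte[] signature) {
        return data -> {
            try {
                Signature s = Signature.getInstance("SHA256withRSA");
                s.initVerify(key);
                s.update(data);
                return s.verify(signature);
            } catch (GeneralSecurityException e) {
                throw new IllegalStateException(e);
            }
        };
    }

    public static void main(String[] args) {
        KeyPair pair = newKeyPair();
        byte[] doc = "<entry/>".getBytes(StandardCharsets.UTF_8);
        byte[] sig = signer(pair.getPrivate()).apply(doc);
        System.out.println(verifier(pair.getPublic(), sig).apply(doc));
    }
}
```

Exposing the operations as functions means they can be composed with other transformations or applied across a collection of documents without the caller handling the underlying crypto objects.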

Any valid Java crypto provider can be used. In these examples, we are using the Bouncy Castle provider.

Prepare the provider and the encryption key:

Prepare the entry to encrypt:

    is markup");
    entry.addAuthor("James");
    entry.addLink("http://www.example.org");

Prepare the encryption options:

Encrypt the document:

    enc = absec.encryptor(options)
               .apply(entry.getDocument());

Decrypting the document:

    ent = absec.decryptor(options)
               .apply(enc);

Note that the encryptor() and decryptor() methods on the org.apache.abdera2.security.Security class are new in Abdera2 and return Guava Function objects that wrap the encryption and decryption logic.
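A matching encrypt and decrypt round trip can be sketched with the JDK's default crypto provider. This is not the Abdera2 Security API and it does not implement XML Encryption: it applies AES-GCM to raw bytes, uses java.util.function.Function in place of Guava's Function, and relies on the default provider rather than Bouncy Castle. The class name, method names and the layout of the output (IV followed by ciphertext) are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.function.Function;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class EncryptDecryptSketch {
    private static final int IV_BYTES = 12;
    private static final int TAG_BITS = 128;

    // Generates a fresh 128-bit AES key.
    public static SecretKey newKey() {
        try {
            KeyGenerator gen = KeyGenerator.getInstance("AES");
            gen.init(128);
            return gen.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns a function producing IV || ciphertext for the given bytes.
    public static Function<byte[], byte[]> encryptor(SecretKey key) {
        return plain -> {
            try {
                byte[] iv = new byte[IV_BYTES];
                new SecureRandom().nextBytes(iv);
                Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
                c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
                byte[] body = c.doFinal(plain);
                byte[] out = Arrays.copyOf(iv, iv.length + body.length);
                System.arraycopy(body, 0, out, iv.length, body.length);
                return out;
            } catch (GeneralSecurityException e) {
                throw new IllegalStateException(e);
            }
        };
    }

    // Returns a function that reverses encryptor(): reads the IV, then decrypts.
    public static Function<byte[], byte[]> decryptor(SecretKey key) {
        return sealed -> {
            try {
                Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
                c.init(Cipher.DECRYPT_MODE, key,
                       new GCMParameterSpec(TAG_BITS, sealed, 0, IV_BYTES));
                return c.doFinal(sealed, IV_BYTES, sealed.length - IV_BYTES);
            } catch (GeneralSecurityException e) {
                throw new IllegalStateException(e);
            }
        };
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] doc = "<entry/>".getBytes(StandardCharsets.UTF_8);
        byte[] enc = encryptor(key).apply(doc);
        System.out.println(new String(decryptor(key).apply(enc), StandardCharsets.UTF_8));
    }
}
```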