Any23 0.5.0 Release Notes Fixes * Fixed wrong conversion of a generic XML file to RDF. [131] * Fixed usage of 'base' tag when resolving relative URIs in RDFa. [75] * Fixed error parsing Turtle data. [87] * Fixed issue with escaping in NQuads parser. [126] * Fixed XML DTD validation attempt. [95] * Fixed concurrent modification exception in ExtractionContentBlocker filter. [86] * Fixed mime type detection of direct input when source contains blank chars. [83, 90] * Fixed reporting when producing no triples. [79] * Fixed any23-service packaging, added profile for excluding embedded dependencies. [113] Enhancements * Improved extraction report: added list of activated extractors. [89] * Improved extraction of HTML link element. [133] * Added XPath HTML extractor. [124] * Added HRecipe Microformat extractor. [103] * Added plugin support for Any23. [111] * Implemented HTML Scraper Plugin. [123] * Upgraded to Sesame 2.4.0. [136] * Upgraded to Jetty 8.0.0 [138] * Upgraded maven-site-plugin. [85] * Added flags to exclude metadata triples [134] * Added removal of CSS related triples. [135] * Improved overall documentation. [130] * Overall POM refactoring. [125] ========================================================================== Any23 0.4.0 Release Notes * The any23-service module has been separated from the any23-core module, the Ant build system has been dropped. [Issue 44] * Added support for HTML metadata (RDFa / Microformats) validation and correction (validator). [Issue 77] * Added flag to disable the nesting relationship property enrichment. [Issue 67] * Improved coverage of Microformats tests. [Issue 65] * Improved documentation. [Issue 44] * Various code consolidation. [Issues 68, 69, 70, 71, 72, 73, 74, 77] ========================================================================== Any23 0.3.0 Release Notes * Added detection and enrichment of nested microformats. [Issue #61] * Added detection and support of N-Quads as input and output format. [Issue #7] * General Improvements in RDFa extraction. [Issue #12, Issue #14] * Added support of Turtle embedded in HTML script tag. [Issue #62] * Improvement in encoding support. [Issue #43] * Improvement in Core API. [Issue #27] * Improved support for Species Microformat. [Issue #63] * General Code prettification. ========================================================================== Any23 0.2.2 Release Notes * Fixed dependency management on Maven. A second level dependency of Xerces introduced a conflict on the java.xml.transform API causing wrong XSLT transformations within RDFa extractor. ========================================================================== Any23 0.2.1 Release Notes * Major applyFix on Tika configuration management. This applyFix solves the auto detection of the main Semantic Web related formats. ========================================================================== Any23 0.2 Release Notes ============ Introduction ============ This release features a redesigned API and incorporating enhancements and bug fixes that have accumulated since the 0.1 release. Apart from some new or changed dependencies on the underlying libraries, this version comes with an improved unit test coverage and other features like the automatic charset encoding detection and an improved documentation. Maven build system has been introduced. ================================== Summary of major changes since 0.1 ================================== * Redesigned Java API - Input from string, stream, file, or URI - Allow choosing which extractors to use - Report origin of triples (document/extractor) to client processors - Various processors/serializers for extracted triples * Added flexible command-line tool for easy testing * Vastly improved website and documentation * Media type and encoding detection via Apache Tika * Switched RDF library from Jena to Sesame * Added Maven build * Better RDF extraction from Microformats * Extractors now come with an example file to document typical in- and output * Major refactoring * Lots and lots of bugfixes ================= Supported formats ================= * RDF/XML * Notation3 and Turtle * N-Triples * RDFa Various microformats, see http://sindice.com/developers/microformat on Sindice Microformats support. =================== Dependency Upgrade =================== CyberNeko Html parser has been upgraded to 1.9.14. Apache Tika 0.3 has been replaced with 0.6, with the new support for the automatic encoding detection. EOF