                       ----------------
                       Apache Tika 1.10
                       ----------------

~~ Licensed to the Apache Software Foundation (ASF) under one or more
~~ contributor license agreements.  See the NOTICE file distributed with
~~ this work for additional information regarding copyright ownership.
~~ The ASF licenses this file to You under the Apache License, Version 2.0
~~ (the "License"); you may not use this file except in compliance with
~~ the License.  You may obtain a copy of the License at
~~
~~     http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License.

Apache Tika 1.10

   The most notable changes in Tika 1.10 over the previous release are:

     * Tika Config XML can now be used to create composite detectors,
       and exclude detectors that DefaultDetector would otherwise
       have used. This brings support in-line with Parsers.
       ({{{http://issues.apache.org/jira/browse/TIKA-1702}TIKA-1702}}).

     * Reverted to legacy sort order of parsers that was
       mistakenly reversed in Tika 1.9
       ({{{http://issues.apache.org/jira/browse/TIKA-1689}TIKA-1689}}).

     * Upgrade to POI 3.13-beta1
       ({{{http://issues.apache.org/jira/browse/TIKA-1667}TIKA-1667}}).

     * Upgrade to PDFBox 1.8.10
       ({{{http://issues.apache.org/jira/browse/TIKA-1588}TIKA-1588}}).

     * MimeTypes now tries to find a registered type with and
       without parameters
       ({{{http://issues.apache.org/jira/browse/TIKA-1692}TIKA-1692}}).

     * Added more robust error handling for encoding detection
       of .MSG files
       ({{{http://issues.apache.org/jira/browse/TIKA-1238}TIKA-1238}}).

     * Fixed bug in Tika's use of the Jackcess parser that
       prevented reading of v97 Access files
       ({{{http://issues.apache.org/jira/browse/TIKA-1681}TIKA-1681}}).

     * Upgrade xerial.org's sqlite-jdbc to 3.8.10.1.
       NOTE: as of Tika 1.9, this jar is "provided." Make sure
       to upgrade your provided jar!
       ({{{http://issues.apache.org/jira/browse/TIKA-1687}TIKA-1687}}).

     * Add header/footer extraction to xls (via Aeham Abushwashi)
       ({{{http://issues.apache.org/jira/browse/TIKA-1400}TIKA-1400}}).

     * Drop the source file name from the embedded file path in
       RecursiveParserWrapper's "X-TIKA:embedded_resource_path"
       ({{{http://issues.apache.org/jira/browse/TIKA-1673}TIKA-1673}}).

     * Upgraded to Java 7
       ({{{http://issues.apache.org/jira/browse/TIKA-1536}TIKA-1536}}).

     * Non-standards compliant emails are now correctly detected
       as message/rfc822
       ({{{http://issues.apache.org/jira/browse/TIKA-1602}TIKA-1602}}).

     * Added parser for MS Access files via Jackcess. Many thanks
       to Health Market Science, Brian O'Neill and James Ahlborn
       for relicensing Jackcess to Apache v2!
       ({{{http://issues.apache.org/jira/browse/TIKA-1601}TIKA-1601}}).

     * GDALParser now correctly sets "nitf" as a supported MediaType
       ({{{http://issues.apache.org/jira/browse/TIKA-1664}TIKA-1664}}).

     * Added DigestingParser to calculate digest hashes
       and record them in metadata. Integrated with
       tika-app and tika-server
       ({{{http://issues.apache.org/jira/browse/TIKA-1663}TIKA-1663}}).

     * Fixed ZipContainerDetector to detect all IPA files
       ({{{http://issues.apache.org/jira/browse/TIKA-1659}TIKA-1659}}).

   The following people have contributed to Tika 1.10 by submitting or
   commenting on the issues resolved in this release:

     * Aashish Chaudhary

     * Adam Estrada

     * Albert L.

     * Alessandro De Angelis

     * Andrew Jackson

     * Ann Burgess

     * Bin Hawking

     * Bob Paulin

     * Chris A. Mattmann

     * Chris Wilson

     * Daniel Bonniot de Ruisselet

     * David Warren

     * Filip Bednárik

     * Giuseppe Totaro

     * Jeremy B. Merrill

     * Johannes Mockenhaupt

     * Joseph North

     * Ken Krugler

     * Lewis John McGibbney

     * Markus Jelsma

     * Michael McCandless

     * Namrata Malarout

     * Nick Burch

     * Niels

     * Paul Ramirez

     * Paul Tunison

     * Rami Shomali

     * Ray Gauss II

     * Sergey Beryozkin

     * Tim Allison

     * Tyler Palsulich

     * jefferyyuan

   See {{http://s.apache.org/EQ2}} for more details on these contributions.