                       ----------------
                       Apache Tika 2.2.0
                       ----------------

~~ Licensed to the Apache Software Foundation (ASF) under one or more
~~ contributor license agreements.  See the NOTICE file distributed with
~~ this work for additional information regarding copyright ownership.
~~ The ASF licenses this file to You under the Apache License, Version 2.0
~~ (the "License"); you may not use this file except in compliance with
~~ the License.  You may obtain a copy of the License at
~~
~~     http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License.

Apache Tika 2.2.0

   The most notable changes in Tika 2.2.0 over the previous release are:

     * Add support for OneNote files downloaded from O365
       ({{{http://issues.apache.org/jira/browse/TIKA-3446}TIKA-3446}}).

     * Fix logic bug in PipesServer that prevented concatenation of content
       from attachments
       ({{{http://issues.apache.org/jira/browse/TIKA-3609}TIKA-3609}}).

     * Improve extraction of embedded files from MSOffice files created by
       non-Microsoft tools
       ({{{http://issues.apache.org/jira/browse/TIKA-3526}TIKA-3526}}).

     * Add back the ability to ignore load errors in TikaConfig
       ({{{http://issues.apache.org/jira/browse/TIKA-3575}TIKA-3575}}).

     * Make SecureContentHandler and other parameters configurable in
       AutoDetectParser programmatically and via tika-config.xml
       ({{{http://issues.apache.org/jira/browse/TIKA-3594}TIKA-3594}}).

     * Fix default logging in tika-app in batch mode
       ({{{http://issues.apache.org/jira/browse/TIKA-3589}TIKA-3589}}).

     * Fix bug that prevented specifying a config with the long --config=
       option in tika-app in batch mode
       ({{{http://issues.apache.org/jira/browse/TIKA-3589}TIKA-3589}}).

     * Fix thread starvation after numerous restarts in PipesClient
       ({{{http://issues.apache.org/jira/browse/TIKA-3588}TIKA-3588}}).

     * Fix race condition when starting multiple forked servers on multiple
       ports ({{{http://issues.apache.org/jira/browse/TIKA-3586}TIKA-3586}}).

     * Add a per-task timeout that can be configured via headers for
       tika-server's legacy endpoints /tika and /rmeta. Note that this
       timeout cannot be greater than taskTimeoutMillis
       ({{{http://issues.apache.org/jira/browse/TIKA-3582}TIKA-3582}}).

     * Add metadata item for whether or not a PDF has a collection/is a
       Portfolio PDF
       ({{{http://issues.apache.org/jira/browse/TIKA-3579}TIKA-3579}}).

     * Add detection of ESRI Layer files
       ({{{http://issues.apache.org/jira/browse/TIKA-3570}TIKA-3570}}).

     * Add detection of JPEG XL, MARC, ICC profiles, and NES-ROM file types
       ({{{http://issues.apache.org/jira/browse/TIKA-3562}TIKA-3562}} and
       {{{http://issues.apache.org/jira/browse/TIKA-3563}TIKA-3563}}).

     * Remove duplicate "subject" metadata keys that were intended for
       backwards compatibility with 1.x only
       ({{{http://issues.apache.org/jira/browse/TIKA-3564}TIKA-3564}}).

     * Fix Open Office mime types to be subclasses of application/zip and
       no longer require OPCPackageDetector-last ordering of zip detectors
       ({{{http://issues.apache.org/jira/browse/TIKA-3556}TIKA-3556}}).

     * Improve robustness and features of the httpfetcher
       ({{{http://issues.apache.org/jira/browse/TIKA-3543}TIKA-3543}}).

     * Add optional fetch ranges to FetchEmitTuple to allow range fetching
       from, e.g., http or s3
       ({{{http://issues.apache.org/jira/browse/TIKA-3542}TIKA-3542}}).

   The following people have contributed to Tika 2.2.0 by submitting or
   commenting on the issues resolved in this release:

     * Abha

     * Andreas Hubold

     * August Valera

     * César Soto Valero

     * dataminer.accolade

     * David Brosius

     * Laura Delmaestro

     * Lewis John McGibbney

     * Luís Filipe Nassif

     * Robin Schimpf

     * Sebastian Nagel

     * Tim Allison

   See {{https://s.apache.org/0pfp7}} for more details on these contributions.