TIKA-431: Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly.
Make the text and HTML parsers return the character encoding as a charset parameter in the Content-Type metadata field.
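
For illustration only, here is a minimal sketch of what "character encoding as a charset parameter" looks like at the string level. It uses plain Java with no Tika dependency. The class name `ContentTypeCharset` and both helper methods are hypothetical and are not part of Tika's API or of this change. The underlying idea is that the HTTP Content-Encoding header describes content codings such as gzip, while the character set belongs in the Content-Type value.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper, not Tika code: shows the "type/subtype; charset=X" shape.
public class ContentTypeCharset {

    /** Appends a charset parameter to a bare media type such as "text/html". */
    public static String withCharset(String mediaType, Charset charset) {
        return mediaType + "; charset=" + charset.name();
    }

    /** Returns the charset parameter of a Content-Type value, if one is present. */
    public static Optional<String> charsetOf(String contentType) {
        // Simplification: assumes no ';' appears inside a quoted parameter value.
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq <= 0) {
                continue; // not a name=value pair
            }
            // Parameter names are case-insensitive.
            String name = param.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            if (name.equals("charset")) {
                String value = param.substring(eq + 1).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String contentType = withCharset("text/html", StandardCharsets.UTF_8);
        System.out.println(contentType);
        System.out.println(charsetOf(contentType).orElse("unknown"));
    }
}
```

A caller reading the metadata would then take the encoding from the Content-Type value rather than from a Content-Encoding field.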