org.apache.any23.extractor.html
Class EncodingTest
java.lang.Object
org.apache.any23.extractor.html.EncodingTest
public class EncodingTest
extends Object
Test class that verifies how the HTMLDocument parser behaves in encoding corner cases.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
EncodingTest
public EncodingTest()
testEncodingHTML_ISO_8859_1
public void testEncodingHTML_ISO_8859_1()
testEncodingHTML_UTF_8
public void testEncodingHTML_UTF_8()
testEncodingHTML_UTF_8_DeclarationAfterTitle
public void testEncodingHTML_UTF_8_DeclarationAfterTitle()
Known issue: NekoHTML does not auto-detect the encoding; it relies
on an explicitly declared encoding (via the XML declaration or an
HTTP-Equiv meta header). If the meta header comes *after* the
title element, NekoHTML does not apply the declared encoding
to the title.
This test therefore expects the title not to be recognized.
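The effect behind this known issue can be illustrated with plain JDK calls, without NekoHTML. The sketch below is not the test's actual code: the class TitleDecodingSketch, the method decodeTitle, and the sample title are hypothetical, and ISO-8859-1 as the fallback charset is an assumption made only for illustration. It shows that UTF-8 title bytes decoded before the declared encoding takes effect no longer match the intended title.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of Any23 or NekoHTML.
class TitleDecodingSketch {

    // Decodes the raw bytes of a title element with the given charset.
    static String decodeTitle(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Sample title containing a non-ASCII character (o with diaeresis).
        String expected = "M\u00F6ller";
        byte[] raw = expected.getBytes(StandardCharsets.UTF_8);

        // Declaration seen before the title: the bytes are decoded as UTF-8.
        String declaredFirst = decodeTitle(raw, StandardCharsets.UTF_8);

        // Declaration seen after the title: the title was already decoded
        // with a fallback charset (ISO-8859-1 is assumed here).
        String declaredLate = decodeTitle(raw, StandardCharsets.ISO_8859_1);

        System.out.println(expected.equals(declaredFirst)); // prints "true"
        System.out.println(expected.equals(declaredLate));  // prints "false"
    }
}
```

With the assumed fallback, the two UTF-8 bytes of the non-ASCII character are read as two separate characters, so the decoded title is one character longer than the original and an equality check against the expected title fails.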
testEncodingXHTML_ISO_8859_1
public void testEncodingXHTML_ISO_8859_1()
testEncodingXHTML_UTF_8
public void testEncodingXHTML_UTF_8()
Copyright © 2010-2012 The Apache Software Foundation. All Rights Reserved.