org.apache.any23.extractor.html
Class EncodingTest

java.lang.Object
  extended by org.apache.any23.extractor.html.EncodingTest

public class EncodingTest
extends Object

Test class that verifies the behavior of the HTMLDocument parser in encoding corner cases.


Constructor Summary
EncodingTest()
           
 
Method Summary
 void testEncodingHTML_ISO_8859_1()
           
 void testEncodingHTML_UTF_8_DeclarationAfterTitle()
          Known issue: NekoHTML does not auto-detect the encoding, but relies on the explicitly specified encoding (via XML declaration or HTTP-Equiv meta header).
 void testEncodingHTML_UTF_8()
           
 void testEncodingXHTML_ISO_8859_1()
           
 void testEncodingXHTML_UTF_8()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

EncodingTest

public EncodingTest()

Method Detail

testEncodingHTML_ISO_8859_1

public void testEncodingHTML_ISO_8859_1()

testEncodingHTML_UTF_8

public void testEncodingHTML_UTF_8()

testEncodingHTML_UTF_8_DeclarationAfterTitle

public void testEncodingHTML_UTF_8_DeclarationAfterTitle()
Known issue: NekoHTML does not auto-detect the encoding, but relies on the explicitly specified encoding (via XML declaration or HTTP-Equiv meta header). If the meta header comes *after* the title element, NekoHTML does not apply the declared encoding to the title. This test therefore expects the title not to be recognized.
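
The effect described above can be illustrated without NekoHTML. The following is a minimal sketch using only the JDK. It does not reproduce the actual test or the parser's internals. The class name TitleEncodingSketch, the helper decodeTitle, and the sample title are hypothetical, and ISO-8859-1 is only assumed as a stand-in for whatever fallback charset is in effect before the declaration is read.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of Any23 or NekoHTML.
final class TitleEncodingSketch {

    // Decodes the raw title bytes with whichever charset is in effect
    // at the moment the title element is read.
    static String decodeTitle(byte[] rawTitle, Charset charsetInEffect) {
        return new String(rawTitle, charsetInEffect);
    }

    public static void main(String[] args) {
        // Sample title containing one non-ASCII character (o with diaeresis).
        String title = "M\u00F6ller";
        byte[] raw = title.getBytes(StandardCharsets.UTF_8);

        // Declaration seen before the title: the bytes are decoded as UTF-8.
        String declaredFirst = decodeTitle(raw, StandardCharsets.UTF_8);

        // Declaration seen after the title: the title has already been
        // decoded with the fallback charset (assumed ISO-8859-1 here).
        String declaredLate = decodeTitle(raw, StandardCharsets.ISO_8859_1);

        System.out.println(title.equals(declaredFirst)); // true
        System.out.println(title.equals(declaredLate));  // false
        System.out.println(declaredLate.length());       // 7
    }
}
```

The two UTF-8 bytes of the non-ASCII character are read as two separate characters under the fallback charset, so the decoded title no longer matches the expected string.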


testEncodingXHTML_ISO_8859_1

public void testEncodingXHTML_ISO_8859_1()

testEncodingXHTML_UTF_8

public void testEncodingXHTML_UTF_8()


Copyright © 2010-2012 The Apache Software Foundation. All Rights Reserved.