EncodingTest (Apache Any23 :: Core 0.7.0-incubating-SNAPSHOT Test API)

org.apache.any23.extractor.html
Class EncodingTest

java.lang.Object
  org.apache.any23.extractor.html.EncodingTest

public class EncodingTest
extends Object

Test class to ensure behaviors of HTMLDocument parser with encoding corner cases.

Constructor Summary
`EncodingTest()`

Method Summary
`void`	`testEncodingHTML_ISO_8859_1()`
`void`	`testEncodingHTML_UTF_8_DeclarationAfterTitle()` Known issue: NekoHTML does not auto-detect the encoding, but relies on the explicitly specified encoding (via XML declaration or HTTP-Equiv meta header).
`void`	`testEncodingHTML_UTF_8()`
`void`	`testEncodingXHTML_ISO_8859_1()`
`void`	`testEncodingXHTML_UTF_8()`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

EncodingTest

public EncodingTest()

Method Detail

testEncodingHTML_ISO_8859_1

public void testEncodingHTML_ISO_8859_1()

testEncodingHTML_UTF_8

public void testEncodingHTML_UTF_8()

testEncodingHTML_UTF_8_DeclarationAfterTitle

public void testEncodingHTML_UTF_8_DeclarationAfterTitle()

Known issue: NekoHTML does not auto-detect the encoding; it relies on an explicitly declared encoding (via XML declaration or HTTP-Equiv meta header). If the meta header comes *after* the title element, NekoHTML does not apply the declared encoding to the title. This test therefore expects the title not to be recognized.
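
The effect behind this known issue can be illustrated with the JDK alone. The sketch below is not the Any23 test and does not call NekoHTML; the class name `EncodingMismatchSketch`, the method `decode`, and the sample title are hypothetical. It assumes that a parser which has not yet seen the declaration falls back to a single-byte charset such as ISO-8859-1, and shows that UTF-8 bytes decoded that way no longer match the original title.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchSketch {

    // Hypothetical helper: decodes raw bytes with the given charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Hypothetical title containing one non-ASCII character (o with diaeresis).
        String title = "M\u00F6ller";
        byte[] utf8Bytes = title.getBytes(StandardCharsets.UTF_8);

        // Decoding with the declared encoding recovers the title.
        String declared = decode(utf8Bytes, StandardCharsets.UTF_8);
        // Decoding with an assumed fallback charset does not.
        String fallback = decode(utf8Bytes, StandardCharsets.ISO_8859_1);

        System.out.println(declared.equals(title)); // prints "true"
        System.out.println(fallback.equals(title)); // prints "false"
    }
}
```

With the fallback charset, the two UTF-8 bytes of the non-ASCII character are read as two separate characters, so the decoded title is one character longer than the original and fails an equality check.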

testEncodingXHTML_ISO_8859_1

public void testEncodingXHTML_ISO_8859_1()

testEncodingXHTML_UTF_8

public void testEncodingXHTML_UTF_8()