org.apache.any23.extractor.html
Class TagSoupParserTest

java.lang.Object
  extended by org.apache.any23.extractor.html.TagSoupParserTest

public class TagSoupParserTest
extends Object

Reference test class for the TagSoupParser.

Author:
Davide Palmisano (dpalmisano@gmail.com), Michele Mostarda (michele.mostarda@gmail.com)

Constructor Summary
TagSoupParserTest()
           
 
Method Summary
 void tearDown()
           
 void testEmptySpanElements()
          Test related to issue 78; disabled until the underlying NekoHTML bug is fixed.
 void testExplicitEncodingBehavior()
           
 void testImplicitEncodingBehavior()
          Tests the NekoHTML parser without forcing it to use a specific encoding charset.
 void testParseSimpleHTML()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TagSoupParserTest

public TagSoupParserTest()

Method Detail

tearDown

public void tearDown()
              throws org.openrdf.repository.RepositoryException
Throws:
org.openrdf.repository.RepositoryException

testParseSimpleHTML

public void testParseSimpleHTML()
                         throws IOException
Throws:
IOException

testExplicitEncodingBehavior

public void testExplicitEncodingBehavior()
                                  throws IOException,
                                         org.apache.any23.extractor.ExtractionException,
                                         org.openrdf.repository.RepositoryException
Throws:
IOException
org.apache.any23.extractor.ExtractionException
org.openrdf.repository.RepositoryException

testImplicitEncodingBehavior

public void testImplicitEncodingBehavior()
                                  throws IOException,
                                         org.apache.any23.extractor.ExtractionException,
                                         org.openrdf.repository.RepositoryException
Tests the NekoHTML parser without forcing it to use a specific encoding charset. This test may fail if something changes in the NekoHTML library, because it relies on the library's auto-detection of the encoding.

Throws:
IOException
org.apache.any23.extractor.ExtractionException
org.openrdf.repository.RepositoryException
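
The difference between the explicit and implicit encoding tests can be illustrated with a small, self-contained sketch. It uses only the Java standard library and does not call TagSoupParser or NekoHTML. The class EncodingSketch and its methods decodeExplicit and decodeImplicit are hypothetical names introduced for illustration, and the fallback heuristic in decodeImplicit is an assumption standing in for a real auto-detection step, not NekoHTML's actual algorithm.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Any23.
class EncodingSketch {

    // Explicit case: the caller states which charset the bytes are in.
    static String decodeExplicit(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    // Implicit case (assumed heuristic, for illustration only):
    // try strict UTF-8 first and fall back to ISO-8859-1 if the bytes
    // are not valid UTF-8. A real detector may behave differently.
    static String decodeImplicit(byte[] raw) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw))
                    .toString();
        } catch (CharacterCodingException e) {
            return new String(raw, StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) {
        String text = "M\u00f6ller"; // contains one non-ASCII character
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);

        // Correct explicit charset: the text round-trips.
        System.out.println(decodeExplicit(utf8, StandardCharsets.UTF_8).equals(text));

        // Wrong explicit charset: the non-ASCII character is corrupted.
        System.out.println(decodeExplicit(utf8, StandardCharsets.ISO_8859_1).equals(text));

        // Heuristic detection recovers the text for both inputs here,
        // but the result depends entirely on the heuristic in use.
        System.out.println(decodeImplicit(utf8).equals(text));
        System.out.println(decodeImplicit(latin1).equals(text));
    }
}
```

A test of the implicit path depends on whatever heuristic the parser library applies, which is why a change in that library can alter the result even when the test input stays the same.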

testEmptySpanElements

public void testEmptySpanElements()
                           throws IOException
Test related to issue 78; disabled until the underlying NekoHTML bug is fixed.

Throws:
IOException


Copyright © 2010-2012 The Apache Software Foundation. All Rights Reserved.