org.apache.any23.extractor.html
Class AbstractExtractorTestCase

java.lang.Object
  extended by org.apache.any23.extractor.html.AbstractExtractorTestCase
Direct Known Subclasses:
AbstractRDFaExtractorTestCase, AdrExtractorTest, CSVExtractorTest, HCalendarExtractorTest, HCardExtractorTest, HeadLinkExtractorTest, HListingExtractorTest, HRecipeExtractorTest, HResumeExtractorTest, HReviewExtractorTest, HTMLMetaExtractorTest, LicenseExtractorTest, MicrodataExtractorTest, RDFMergerTest, SpeciesExtractorTest, TitleExtractorTest, TurtleHTMLExtractorTest, XFNExtractorTest

public abstract class AbstractExtractorTestCase
extends Object

Abstract base class for writing extractor-specific test cases.


Field Summary
protected static org.openrdf.model.URI baseURI
          Base test document.
 
Constructor Summary
AbstractExtractorTestCase()
          Constructor.
 
Method Summary
protected  void assertContains(org.openrdf.model.Resource s, org.openrdf.model.URI p, String l)
          Asserts that the model contains the statement (s p l) where l is a literal.
protected  void assertContains(org.openrdf.model.Resource s, org.openrdf.model.URI p, String l, String lang)
          Asserts that the model contains the statement (s p l) where l is a language-tagged literal.
protected  void assertContains(org.openrdf.model.Resource s, org.openrdf.model.URI p, org.openrdf.model.Value o)
          Asserts that the extracted triples contain the pattern (s p o).
protected  void assertContains(org.openrdf.model.Statement statement)
          Checks that a statement is contained in the extracted model.
protected  void assertContains(org.openrdf.model.URI p, org.openrdf.model.Resource o)
          Asserts that the extracted triples contain the pattern (_ p o).
protected  void assertContains(org.openrdf.model.URI p, String o)
          Asserts that the extracted triples contain the pattern (_ p o).
 void assertContainsModel(org.openrdf.model.Statement[] statements)
          Verifies that the current model contains all the given statements.
 void assertContainsModel(String modelResource)
          Verifies that the current model contains all the statements declared in the specified model resource.
protected  void assertError(org.apache.any23.extractor.ErrorReporter.ErrorLevel level, String errorRegex)
          Asserts that an error has been produced by the processed Extractor.
protected  void assertExtracts(String resource)
          Performs data extraction over the content of a resource and asserts that the extraction was correct.
protected  void assertModelEmpty()
          Asserts that the model contains no statements.
protected  void assertModelNotEmpty()
          Asserts that the model contains at least one statement.
protected  void assertNotContains(org.openrdf.model.Resource s, org.openrdf.model.URI p, org.openrdf.model.Literal o)
          Asserts that the model does not contain the pattern (s p o).
protected  void assertNotContains(org.openrdf.model.Resource s, org.openrdf.model.URI p, org.openrdf.model.Resource o)
          Asserts that the extracted triples do not contain the pattern (s p o).
protected  void assertNotContains(org.openrdf.model.Resource s, org.openrdf.model.URI p, String o)
          Asserts that the extracted triples do not contain the pattern (s p o).
protected  void assertNotContains(org.openrdf.model.URI p, org.openrdf.model.Resource o)
          Asserts that the extracted triples do not contain the pattern (_ p o).
protected  void assertNotFound(org.openrdf.model.Resource s, org.openrdf.model.URI p)
          Asserts that the given pattern (s p _) is not present.
protected  void assertStatementsSize(org.openrdf.model.Resource s, org.openrdf.model.URI p, org.openrdf.model.Value o, int expected)
          Asserts that the number of statements matching the pattern (s p o) equals the expected count.
protected  void assertStatementsSize(org.openrdf.model.URI p, String o, int expected)
          Asserts that the number of statements matching the pattern (_ p o) equals the expected count.
protected  void assertStatementsSize(org.openrdf.model.URI p, org.openrdf.model.Value o, int expected)
          Asserts that the number of statements matching the pattern (_ p o) equals the expected count.
protected  List<org.openrdf.model.Statement> dumpAsListOfStatements()
          Dumps the list of statements contained in the extracted model.
protected  String dumpHumanReadableTriples()
           
protected  String dumpModelToNQuads()
          Dumps the extracted model in NQuads format.
protected  String dumpModelToRDFXML()
          Dumps the extracted model in RDFXML format.
protected  String dumpModelToTurtle()
          Dumps the extracted model in Turtle format.
protected  void extract(String resource)
          Applies the extractor provided by getExtractorFactory() to the specified resource.
protected  org.openrdf.model.Resource findExactlyOneBlankSubject(org.openrdf.model.URI p, org.openrdf.model.Value o)
          Returns the blank subject matching the pattern (_:b p o); exactly one match is expected.
protected  org.openrdf.model.Value findExactlyOneObject(org.openrdf.model.Resource s, org.openrdf.model.URI p)
          Returns the object matching the pattern (s p _); exactly one match is expected.
protected  org.openrdf.model.Value findObject(org.openrdf.model.Resource s, org.openrdf.model.URI p)
          Finds the object matching the pattern (s p _), asserting that exactly one result is found.
protected  String findObjectAsLiteral(org.openrdf.model.Resource s, org.openrdf.model.URI p)
          Finds the literal object matching the pattern (s p _), asserting that exactly one result is found.
protected  org.openrdf.model.Resource findObjectAsResource(org.openrdf.model.Resource s, org.openrdf.model.URI p)
          Finds the resource object matching the pattern (s p _), asserting that exactly one result is found.
protected  List<org.openrdf.model.Value> findObjects(org.openrdf.model.Resource s, org.openrdf.model.URI p)
          Returns all the objects matching the pattern (s p _).
protected  List<org.openrdf.model.Resource> findSubjects(org.openrdf.model.URI p, org.openrdf.model.Value o)
          Returns all the subjects matching the pattern (s? p o).
protected  org.openrdf.repository.RepositoryConnection getConnection()
           
protected  Collection<org.apache.any23.extractor.ErrorReporter.Error> getErrors(String extractorName)
          Returns the list of errors raised by a given extractor.
protected abstract  org.apache.any23.extractor.ExtractorFactory<?> getExtractorFactory()
           
protected  org.apache.any23.extractor.SingleDocumentExtractionReport getReport()
           
protected  org.openrdf.repository.RepositoryResult<org.openrdf.model.Statement> getStatements(org.openrdf.model.Resource s, org.openrdf.model.URI p, org.openrdf.model.Value o)
          Returns all statements matching the pattern (s p o).
protected  int getStatementsSize(org.openrdf.model.Resource s, org.openrdf.model.URI p, org.openrdf.model.Value o)
          Counts all statements matching the pattern (s p o).
 void setUp()
          Test case initialization.
 void tearDown()
          Test case resources release.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

baseURI

protected static org.openrdf.model.URI baseURI
Base test document.

Constructor Detail

AbstractExtractorTestCase

public AbstractExtractorTestCase()
Constructor.

Method Detail

getExtractorFactory

protected abstract org.apache.any23.extractor.ExtractorFactory<?> getExtractorFactory()
Returns:
the factory of the extractor to be tested.

setUp

public void setUp()
           throws Exception
Test case initialization.

Throws:
Exception

tearDown

public void tearDown()
              throws org.openrdf.repository.RepositoryException
Test case resources release.

Throws:
org.openrdf.repository.RepositoryException

getConnection

protected org.openrdf.repository.RepositoryConnection getConnection()
Returns:
the connection to the memory repository.

getReport

protected org.apache.any23.extractor.SingleDocumentExtractionReport getReport()
Returns:
the last generated report.

getErrors

protected Collection<org.apache.any23.extractor.ErrorReporter.Error> getErrors(String extractorName)
Returns the list of errors raised by a given extractor.

Parameters:
extractorName - name of the extractor.
Returns:
collection of errors.

extract

protected void extract(String resource)
                throws org.apache.any23.extractor.ExtractionException,
                       IOException
Applies the extractor provided by getExtractorFactory() to the specified resource.

Parameters:
resource - resource name.
Throws:
org.apache.any23.extractor.ExtractionException
IOException

assertExtracts

protected void assertExtracts(String resource)
Performs data extraction over the content of a resource and asserts that the extraction was correct.

Parameters:
resource - resource name.

assertContains

protected void assertContains(org.openrdf.model.URI p,
                              org.openrdf.model.Resource o)
                       throws org.openrdf.repository.RepositoryException
Asserts that the extracted triples contain the pattern (_ p o).

Parameters:
p - predicate
o - object.
Throws:
org.openrdf.repository.RepositoryException

assertContains

protected void assertContains(org.openrdf.model.URI p,
                              String o)
                       throws org.openrdf.repository.RepositoryException
Asserts that the extracted triples contain the pattern (_ p o).

Parameters:
p - predicate
o - object.
Throws:
org.openrdf.repository.RepositoryException

assertNotContains

protected void assertNotContains(org.openrdf.model.URI p,
                                 org.openrdf.model.Resource o)
                          throws org.openrdf.repository.RepositoryException
Asserts that the extracted triples do not contain the pattern (_ p o).

Parameters:
p - predicate
o - object.
Throws:
org.openrdf.repository.RepositoryException

assertContains

protected void assertContains(org.openrdf.model.Resource s,
                              org.openrdf.model.URI p,
                              org.openrdf.model.Value o)
                       throws org.openrdf.repository.RepositoryException
Asserts that the extracted triples contain the pattern (s p o).

Parameters:
s - subject.
p - predicate.
o - object.
Throws:
org.openrdf.repository.RepositoryException

assertNotContains

protected void assertNotContains(org.openrdf.model.Resource s,
                                 org.openrdf.model.URI p,
                                 String o)
                          throws org.openrdf.repository.RepositoryException
Asserts that the extracted triples do not contain the pattern (s p o).

Parameters:
s - subject.
p - predicate.
o - object.
Throws:
org.openrdf.repository.RepositoryException

assertNotContains

protected void assertNotContains(org.openrdf.model.Resource s,
                                 org.openrdf.model.URI p,
                                 org.openrdf.model.Resource o)
                          throws org.openrdf.repository.RepositoryException
Asserts that the extracted triples do not contain the pattern (s p o).

Parameters:
s - subject.
p - predicate.
o - object.
Throws:
org.openrdf.repository.RepositoryException

assertModelNotEmpty

protected void assertModelNotEmpty()
                            throws org.openrdf.repository.RepositoryException
Asserts that the model contains at least one statement.

Throws:
org.openrdf.repository.RepositoryException

assertNotContains

protected void assertNotContains(org.openrdf.model.Resource s,
                                 org.openrdf.model.URI p,
                                 org.openrdf.model.Literal o)
                          throws org.openrdf.repository.RepositoryException
Asserts that the model does not contain the pattern (s p o).

Parameters:
s - subject.
p - predicate.
o - object.
Throws:
org.openrdf.repository.RepositoryException

assertModelEmpty

protected void assertModelEmpty()
                         throws org.openrdf.repository.RepositoryException
Asserts that the model contains no statements.

Throws:
org.openrdf.repository.RepositoryException

assertError

protected void assertError(org.apache.any23.extractor.ErrorReporter.ErrorLevel level,
                           String errorRegex)
Asserts that an error has been produced by the processed Extractor.

Parameters:
level - expected error level
errorRegex - regex matching the expected human readable error message.
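
The check described above can be pictured with a minimal, self-contained sketch. It assumes that the regex must match the whole error message (as String.matches does) and that the level must be equal; the actual matching rule used by assertError is not stated in this page. The ErrorRegexSketch and ReportedError names, and the use of plain strings for levels, are hypothetical stand-ins for ErrorReporter.Error and ErrorReporter.ErrorLevel.

```java
import java.util.List;

/** Hypothetical sketch: looks for an error with a given level and a message matching a regex. */
final class ErrorRegexSketch {

    /** Hypothetical stand-in for a reported error: a level name plus a human readable message. */
    static final class ReportedError {
        final String level;
        final String message;

        ReportedError(String level, String message) {
            this.level = level;
            this.message = message;
        }
    }

    /**
     * Returns true if at least one error has the given level and a message
     * that matches errorRegex in full (assumption: whole-string matching).
     */
    static boolean hasError(List<ReportedError> errors, String level, String errorRegex) {
        for (ReportedError e : errors) {
            if (e.level.equals(level) && e.message.matches(errorRegex)) {
                return true;
            }
        }
        return false;
    }
}
```

Under the whole-string assumption, a pattern that should match anywhere in the message needs leading and trailing `.*`.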

assertContainsModel

public void assertContainsModel(org.openrdf.model.Statement[] statements)
                         throws org.openrdf.repository.RepositoryException
Verifies that the current model contains all the given statements.

Parameters:
statements - array of statements to be verified.
Throws:
org.openrdf.repository.RepositoryException

assertContainsModel

public void assertContainsModel(String modelResource)
                         throws org.openrdf.rio.RDFHandlerException,
                                IOException,
                                org.openrdf.rio.RDFParseException,
                                org.openrdf.repository.RepositoryException
Verifies that the current model contains all the statements declared in the specified model resource.

Parameters:
modelResource - the resource containing the model.
Throws:
org.openrdf.rio.RDFHandlerException
IOException
org.openrdf.rio.RDFParseException
org.openrdf.repository.RepositoryException

assertStatementsSize

protected void assertStatementsSize(org.openrdf.model.Resource s,
                                    org.openrdf.model.URI p,
                                    org.openrdf.model.Value o,
                                    int expected)
                             throws org.openrdf.repository.RepositoryException
Asserts that the number of statements matching the pattern (s p o) equals the expected count.

Parameters:
s - subject.
p - predicate.
o - object.
expected - expected matches.
Throws:
org.openrdf.repository.RepositoryException

assertStatementsSize

protected void assertStatementsSize(org.openrdf.model.URI p,
                                    org.openrdf.model.Value o,
                                    int expected)
                             throws org.openrdf.repository.RepositoryException
Asserts that the number of statements matching the pattern (_ p o) equals the expected count.

Parameters:
p - predicate.
o - object.
expected - expected matches.
Throws:
org.openrdf.repository.RepositoryException

assertStatementsSize

protected void assertStatementsSize(org.openrdf.model.URI p,
                                    String o,
                                    int expected)
                             throws org.openrdf.repository.RepositoryException
Asserts that the number of statements matching the pattern (_ p o) equals the expected count.

Parameters:
p - predicate.
o - object.
expected - expected matches.
Throws:
org.openrdf.repository.RepositoryException

assertNotFound

protected void assertNotFound(org.openrdf.model.Resource s,
                              org.openrdf.model.URI p)
                       throws org.openrdf.repository.RepositoryException
Asserts that the given pattern (s p _) is not present.

Parameters:
s - subject.
p - predicate.
Throws:
org.openrdf.repository.RepositoryException

findExactlyOneBlankSubject

protected org.openrdf.model.Resource findExactlyOneBlankSubject(org.openrdf.model.URI p,
                                                                org.openrdf.model.Value o)
                                                         throws org.openrdf.repository.RepositoryException
Returns the blank subject matching the pattern (_:b p o); exactly one match is expected.

Parameters:
p - predicate.
o - object.
Returns:
the matching blank subject.
Throws:
org.openrdf.repository.RepositoryException

findExactlyOneObject

protected org.openrdf.model.Value findExactlyOneObject(org.openrdf.model.Resource s,
                                                       org.openrdf.model.URI p)
                                                throws org.openrdf.repository.RepositoryException
Returns the object matching the pattern (s p _); exactly one match is expected.

Parameters:
s - subject.
p - predicate.
Returns:
the matching object.
Throws:
org.openrdf.repository.RepositoryException
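
The "exactly one match" contract shared by the findExactlyOne* and findObject* methods can be sketched as follows. This is only an illustration of the idea, assuming the check fails with an AssertionError when zero or several matches are found; the ExactlyOneSketch class and its exactlyOne helper are hypothetical and not part of this class.

```java
import java.util.List;

/** Hypothetical sketch of an "exactly one result" check over a list of matches. */
final class ExactlyOneSketch {

    /** Returns the single element of matches, or fails if there are zero or several. */
    static <T> T exactlyOne(List<T> matches) {
        if (matches.size() != 1) {
            throw new AssertionError(
                    "Expected exactly one match but found " + matches.size());
        }
        return matches.get(0);
    }
}
```

Failing on an empty result as well as on multiple results keeps a test from silently passing when an extractor emits duplicate or missing statements.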

findSubjects

protected List<org.openrdf.model.Resource> findSubjects(org.openrdf.model.URI p,
                                                        org.openrdf.model.Value o)
                                                 throws org.openrdf.repository.RepositoryException
Returns all the subjects matching the pattern (s? p o).

Parameters:
p - predicate.
o - object.
Returns:
list of matching subjects.
Throws:
org.openrdf.repository.RepositoryException

findObjects

protected List<org.openrdf.model.Value> findObjects(org.openrdf.model.Resource s,
                                                    org.openrdf.model.URI p)
                                             throws org.openrdf.repository.RepositoryException
Returns all the objects matching the pattern (s p _).

Parameters:
s - subject.
p - predicate.
Returns:
list of matching objects.
Throws:
org.openrdf.repository.RepositoryException

findObject

protected org.openrdf.model.Value findObject(org.openrdf.model.Resource s,
                                             org.openrdf.model.URI p)
                                      throws org.openrdf.repository.RepositoryException
Finds the object matching the pattern (s p _), asserting that exactly one result is found.

Parameters:
s - subject.
p - predicate
Returns:
matching object.
Throws:
org.openrdf.repository.RepositoryException

findObjectAsResource

protected org.openrdf.model.Resource findObjectAsResource(org.openrdf.model.Resource s,
                                                          org.openrdf.model.URI p)
                                                   throws org.openrdf.repository.RepositoryException
Finds the resource object matching the pattern (s p _), asserting that exactly one result is found.

Parameters:
s - subject.
p - predicate.
Returns:
matching object.
Throws:
org.openrdf.repository.RepositoryException

findObjectAsLiteral

protected String findObjectAsLiteral(org.openrdf.model.Resource s,
                                     org.openrdf.model.URI p)
                              throws org.openrdf.repository.RepositoryException
Finds the literal object matching the pattern (s p _), asserting that exactly one result is found.

Parameters:
s - subject.
p - predicate.
Returns:
matching object.
Throws:
org.openrdf.repository.RepositoryException

dumpModelToTurtle

protected String dumpModelToTurtle()
                            throws org.openrdf.repository.RepositoryException
Dumps the extracted model in Turtle format.

Returns:
a string containing the model in Turtle.
Throws:
org.openrdf.repository.RepositoryException

dumpModelToNQuads

protected String dumpModelToNQuads()
                            throws org.openrdf.repository.RepositoryException
Dumps the extracted model in NQuads format.

Returns:
a string containing the model in NQuads.
Throws:
org.openrdf.repository.RepositoryException

dumpModelToRDFXML

protected String dumpModelToRDFXML()
                            throws org.openrdf.repository.RepositoryException
Dumps the extracted model in RDFXML format.

Returns:
a string containing the model in RDFXML.
Throws:
org.openrdf.repository.RepositoryException

dumpAsListOfStatements

protected List<org.openrdf.model.Statement> dumpAsListOfStatements()
                                                            throws org.openrdf.repository.RepositoryException
Dumps the list of statements contained in the extracted model.

Returns:
list of extracted statements.
Throws:
org.openrdf.repository.RepositoryException

dumpHumanReadableTriples

protected String dumpHumanReadableTriples()
                                   throws org.openrdf.repository.RepositoryException
Returns:
string containing human readable statements.
Throws:
org.openrdf.repository.RepositoryException

assertContains

protected void assertContains(org.openrdf.model.Statement statement)
                       throws org.openrdf.repository.RepositoryException
Checks that a statement is contained in the extracted model. Any blank nodes declared in the statement are replaced with _ (wildcard) patterns before matching.

Parameters:
statement - the statement to look for.
Throws:
org.openrdf.repository.RepositoryException
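
Replacing blank nodes with wildcards matters because blank node labels are not stable across extractions, so comparing them literally would make tests brittle. The following sketch illustrates the idea under simplifying assumptions: terms are plain strings, a term starting with "_:" is treated as a blank node, and each statement is a three-element array. The BlankNodeWildcardSketch class and its methods are hypothetical, not the Any23 implementation.

```java
import java.util.List;

/** Hypothetical sketch: blank nodes in the expected statement become wildcards. */
final class BlankNodeWildcardSketch {

    /** Assumption for this sketch: blank node terms are written with a "_:" prefix. */
    static boolean isBlank(String term) {
        return term.startsWith("_:");
    }

    /** Maps a blank node to the wildcard (null); other terms are kept as they are. */
    static String toPattern(String term) {
        return isBlank(term) ? null : term;
    }

    /** A null pattern matches any term; otherwise the terms must be equal. */
    static boolean termMatches(String pattern, String actual) {
        return pattern == null || pattern.equals(actual);
    }

    /** Checks whether the model (rows of {s, p, o}) contains the expected statement. */
    static boolean contains(List<String[]> model, String s, String p, String o) {
        String subjectPattern = toPattern(s);
        String objectPattern = toPattern(o);
        for (String[] t : model) {
            if (termMatches(subjectPattern, t[0])
                    && p.equals(t[1])
                    && termMatches(objectPattern, t[2])) {
                return true;
            }
        }
        return false;
    }
}
```

With this rule, an expected statement whose subject is `_:x` matches a model statement whose subject is `_:node1`, as long as predicate and object agree.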

assertContains

protected void assertContains(org.openrdf.model.Resource s,
                              org.openrdf.model.URI p,
                              String l)
                       throws org.openrdf.repository.RepositoryException
Asserts that the model contains the statement (s p l) where l is a literal.

Parameters:
s - subject.
p - predicate.
l - literal content.
Throws:
org.openrdf.repository.RepositoryException

assertContains

protected void assertContains(org.openrdf.model.Resource s,
                              org.openrdf.model.URI p,
                              String l,
                              String lang)
                       throws org.openrdf.repository.RepositoryException
Asserts that the model contains the statement (s p l) where l is a language-tagged literal.

Parameters:
s - subject.
p - predicate.
l - literal content.
lang - literal language.
Throws:
org.openrdf.repository.RepositoryException

getStatements

protected org.openrdf.repository.RepositoryResult<org.openrdf.model.Statement> getStatements(org.openrdf.model.Resource s,
                                                                                             org.openrdf.model.URI p,
                                                                                             org.openrdf.model.Value o)
                                                                                      throws org.openrdf.repository.RepositoryException
Returns all statements matching the pattern (s p o).

Parameters:
s - subject.
p - predicate.
o - object.
Returns:
list of statements.
Throws:
org.openrdf.repository.RepositoryException

getStatementsSize

protected int getStatementsSize(org.openrdf.model.Resource s,
                                org.openrdf.model.URI p,
                                org.openrdf.model.Value o)
                         throws org.openrdf.repository.RepositoryException
Counts all statements matching the pattern (s p o).

Parameters:
s - subject.
p - predicate.
o - object.
Returns:
number of matches.
Throws:
org.openrdf.repository.RepositoryException
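
As an illustration of the (s p o) pattern convention used throughout this class, the following minimal sketch models a statement store in plain Java. It assumes that a null term plays the role of the _ wildcard, which is how Sesame's RepositoryConnection.getStatements treats null arguments. The Triple and PatternSketch classes and the string-valued terms are hypothetical simplifications, not part of Any23 or Sesame.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an RDF statement; terms are plain strings here. */
final class Triple {
    final String s;
    final String p;
    final String o;

    Triple(String s, String p, String o) {
        this.s = s;
        this.p = p;
        this.o = o;
    }
}

/** Hypothetical in-memory model illustrating (s p o) pattern matching. */
final class PatternSketch {
    private final List<Triple> triples = new ArrayList<>();

    void add(String s, String p, String o) {
        triples.add(new Triple(s, p, o));
    }

    /** A null pattern term matches anything; a non-null term must be equal. */
    private static boolean matches(String pattern, String actual) {
        return pattern == null || pattern.equals(actual);
    }

    /** Returns all statements matching the pattern (s p o). */
    List<Triple> getStatements(String s, String p, String o) {
        List<Triple> result = new ArrayList<>();
        for (Triple t : triples) {
            if (matches(s, t.s) && matches(p, t.p) && matches(o, t.o)) {
                result.add(t);
            }
        }
        return result;
    }

    /** Counts all statements matching the pattern (s p o). */
    int getStatementsSize(String s, String p, String o) {
        return getStatements(s, p, o).size();
    }
}
```

In this sketch, the pattern (_ p o) corresponds to calling getStatementsSize(null, p, o), and (s p _) to getStatementsSize(s, p, null).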


Copyright © 2010-2012 The Apache Software Foundation. All Rights Reserved.