MicrodataExtractor (Apache Any23 :: Core 0.7.0-incubating-SNAPSHOT API)

org.apache.any23.extractor.microdata
Class MicrodataExtractor

java.lang.Object
  org.apache.any23.extractor.microdata.MicrodataExtractor

All Implemented Interfaces: Extractor<Document>, Extractor.TagSoupDOMExtractor

public class MicrodataExtractor
extends Object
implements Extractor.TagSoupDOMExtractor

Default implementation of the Microdata extractor, based on Extractor.TagSoupDOMExtractor.

Author: Michele Mostarda (mostarda@fbk.eu), Davide Palmisano (dpalmisano@gmail.com)

Nested Class Summary

Nested classes/interfaces inherited from interface org.apache.any23.extractor.Extractor
`Extractor.BlindExtractor, Extractor.ContentExtractor, Extractor.TagSoupDOMExtractor`

Field Summary
`static ExtractorFactory<MicrodataExtractor>`	`factory`

Constructor Summary
`MicrodataExtractor()`

Method Summary
`ExtractorDescription`	`getDescription()` Returns an `ExtractorDescription` of this extractor.
`void`	`run(ExtractionParameters extractionParameters, ExtractionContext extractionContext, Document in, ExtractionResult out)` This extraction performs the Microdata to RDF conversion algorithm.

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

factory

public static final ExtractorFactory<MicrodataExtractor> factory

Constructor Detail

MicrodataExtractor

public MicrodataExtractor()

Method Detail

getDescription

public ExtractorDescription getDescription()

Description copied from interface: Extractor

Returns an ExtractorDescription of this extractor.

Specified by: getDescription in interface Extractor<Document>

Returns: the object representing the extractor description.

run

public void run(ExtractionParameters extractionParameters,
                ExtractionContext extractionContext,
                Document in,
                ExtractionResult out)
         throws IOException,
                ExtractionException

This extraction performs the Microdata to RDF conversion algorithm. It departs slightly from the specification algorithm: steps 5.2.1, 5.2.2, 5.2.3 and 5.2.4 are skipped when step 5.2.6 does not detect any Microdata.

Specified by: run in interface Extractor<Document>

Parameters:
    extractionParameters - the parameters to be applied during the extraction.
    extractionContext - the document context.
    in - the extractor input data.
    out - the collector for the extracted data.
Throws:
    IOException - on error while reading from the input stream.
    ExtractionException - on other errors, such as parse errors.
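
The modification described for run can be pictured as a "detect first, set up later" pattern. The following is a minimal sketch of that idea using only the JDK DOM API. It is not the Any23 implementation: the class DetectFirstSketch, its methods and the setupRuns counter are hypothetical names introduced for illustration, the setup step merely stands in for the specification actions that run skips, and the sketch assumes that a top-level item is an element carrying itemscope but not itemprop.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical illustration only; none of these names belong to Any23.
class DetectFirstSketch {

    // Counts how often the stand-in setup work was actually performed.
    static int setupRuns = 0;

    // Parses a well-formed XHTML string into a DOM Document.
    static Document parse(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            xhtml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed", e);
        }
    }

    // Assumption: a top-level item has itemscope and no itemprop.
    static List<Element> findTopLevelItems(Document doc) {
        List<Element> items = new ArrayList<>();
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element element = (Element) all.item(i);
            if (element.hasAttribute("itemscope")
                    && !element.hasAttribute("itemprop")) {
                items.add(element);
            }
        }
        return items;
    }

    // Runs detection first and performs the setup only when items exist.
    // Returns the number of top-level items found.
    static int extract(Document doc) {
        List<Element> items = findTopLevelItems(doc);
        if (items.isEmpty()) {
            return 0; // nothing detected: skip the setup entirely
        }
        setupRuns++; // stand-in for the preparatory steps
        return items.size();
    }
}
```

With this ordering, a document without Microdata costs a single DOM scan and produces no output, while a document that contains items pays for the setup exactly once.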