HeadLinkExtractor (Apache Any23 :: Core 0.7.0-incubating-SNAPSHOT API)

org.apache.any23.extractor.html
Class HeadLinkExtractor

java.lang.Object
  org.apache.any23.extractor.html.HeadLinkExtractor

All Implemented Interfaces: Extractor<Document>, Extractor.TagSoupDOMExtractor

public class HeadLinkExtractor
extends Object
implements Extractor.TagSoupDOMExtractor

This Extractor.TagSoupDOMExtractor implementation retrieves the LINK elements declared in the HEAD section of an HTML page.
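
To make the idea concrete, the following is a minimal sketch of collecting LINK elements from the HEAD of a DOM tree using only the JDK's `org.w3c.dom` API. It is not the Any23 implementation: the class `HeadLinkSketch` and its methods `parse` and `headLinks` are hypothetical names, the JDK XML parser stands in for the TagSoup-produced DOM, and the sketch assumes well-formed markup with lowercase element names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class HeadLinkSketch {

    /** Parses well-formed markup into a DOM tree (assumption: stand-in for the TagSoup DOM). */
    public static Document parse(String xhtml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xhtml)));
    }

    /** Returns one {rel, href} pair for each link element nested under a head element. */
    public static List<String[]> headLinks(Document doc) {
        List<String[]> result = new ArrayList<>();
        NodeList heads = doc.getElementsByTagName("head");
        for (int i = 0; i < heads.getLength(); i++) {
            // Search only below head, so link elements elsewhere in the page are skipped.
            NodeList links = ((Element) heads.item(i)).getElementsByTagName("link");
            for (int j = 0; j < links.getLength(); j++) {
                Element link = (Element) links.item(j);
                result.add(new String[] { link.getAttribute("rel"), link.getAttribute("href") });
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><head>"
            + "<title>t</title>"
            + "<link rel=\"alternate\" href=\"http://example.org/feed.rdf\"/>"
            + "<link rel=\"stylesheet\" href=\"style.css\"/>"
            + "</head><body><link rel=\"ignored\" href=\"x\"/></body></html>";
        for (String[] link : headLinks(parse(page))) {
            System.out.println(link[0] + " -> " + link[1]);
        }
    }
}
```

Running `main` lists the two links declared in the head and skips the one placed in the body. How the real extractor maps each link to output data is not described on this page.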

Nested Class Summary

Nested classes/interfaces inherited from interface org.apache.any23.extractor.Extractor
`Extractor.BlindExtractor, Extractor.ContentExtractor, Extractor.TagSoupDOMExtractor`

Field Summary
`static ExtractorFactory<HeadLinkExtractor>`	`factory`

Constructor Summary
`HeadLinkExtractor()`

Method Summary
`ExtractorDescription`	`getDescription()` Returns an `ExtractorDescription` of this extractor.
`void`	`run(ExtractionParameters extractionParameters, ExtractionContext extractionContext, Document in, ExtractionResult out)` Executes the extractor.

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

factory

public static final ExtractorFactory<HeadLinkExtractor> factory

Constructor Detail

HeadLinkExtractor

public HeadLinkExtractor()

Method Detail

run

public void run(ExtractionParameters extractionParameters,
                ExtractionContext extractionContext,
                Document in,
                ExtractionResult out)
         throws IOException,
                ExtractionException

Description copied from interface: Extractor

Executes the extractor. This method is invoked only once per instance; extractors are not reusable.

Specified by: run in interface Extractor<Document>

Parameters:
  extractionParameters - the parameters to be applied during the extraction.
  extractionContext - the document context.
  in - the extractor input data.
  out - the collector for the extracted data.
Throws:
  IOException - on error while reading from the input stream.
  ExtractionException - on other errors, such as parse errors.
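
The single-use contract stated above implies that a caller needs a fresh instance for each document. The sketch below illustrates that pattern with stand-in types only: `SingleUseSketch`, `OneShotTask`, and the `Supplier`-based factory are hypothetical and are not the Any23 `Extractor` or `ExtractorFactory` API. The guard that rejects a second call is an illustration of the contract, not documented library behavior.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

public class SingleUseSketch {

    /** Hypothetical stand-in for a one-shot extractor; NOT the Any23 interface. */
    public static class OneShotTask {
        private final AtomicBoolean used = new AtomicBoolean(false);

        public String run(String input) {
            // The flag flips from false to true exactly once; later calls fail fast.
            if (!used.compareAndSet(false, true)) {
                throw new IllegalStateException("already run; create a new instance");
            }
            return input.toUpperCase();
        }
    }

    public static void main(String[] args) {
        // Assumption: a factory hands out a new instance for every document.
        Supplier<OneShotTask> factory = OneShotTask::new;
        for (String doc : new String[] {"first", "second"}) {
            System.out.println(factory.get().run(doc));
        }
    }
}
```

Obtaining a new instance per input keeps any state accumulated during one run from leaking into the next.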

getDescription

public ExtractorDescription getDescription()