Interface HtmlParseFilter

All Superinterfaces:
Configurable, Pluggable
All Known Implementing Classes:
CCParseFilter, HTMLLanguageParser, JSParseFilter, RelTagParser

public interface HtmlParseFilter
extends Pluggable, Configurable

Extension point for DOM-based HTML parsers. It lets plugins add metadata to, or otherwise modify, the parse of an HTML page. Every plugin found that implements this extension point is run in sequence on the parse.
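
The sequential behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use the real Nutch types: `FilterChainSketch`, `SimpleHtmlFilter`, `TitleFilter`, and `LinkCountFilter` are hypothetical names invented for this example, a plain `Map` stands in for `ParseResult`, and the `Content` and `HTMLMetaTags` arguments are omitted. Only the `org.w3c.dom` and `javax.xml.parsers` classes come from the standard JDK.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Stand-in sketch only: none of the types declared here are Nutch classes.
class FilterChainSketch {

    /** Hypothetical, simplified stand-in for HtmlParseFilter. */
    interface SimpleHtmlFilter {
        Map<String, String> filter(Map<String, String> metadata, DocumentFragment doc);
    }

    /** Copies the text of the first title element into the metadata. */
    static class TitleFilter implements SimpleHtmlFilter {
        public Map<String, String> filter(Map<String, String> metadata, DocumentFragment doc) {
            Element title = findFirst(doc, "title");
            if (title != null) {
                metadata.put("title", title.getTextContent());
            }
            return metadata;
        }
    }

    /** Records how many anchor elements the DOM tree contains. */
    static class LinkCountFilter implements SimpleHtmlFilter {
        public Map<String, String> filter(Map<String, String> metadata, DocumentFragment doc) {
            metadata.put("linkCount", String.valueOf(countElements(doc, "a")));
            return metadata;
        }
    }

    /** Depth-first search for the first descendant element with the given tag name. */
    static Element findFirst(Node node, String tag) {
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element && tag.equalsIgnoreCase(c.getNodeName())) {
                return (Element) c;
            }
            Element hit = findFirst(c, tag);
            if (hit != null) {
                return hit;
            }
        }
        return null;
    }

    /** Counts all descendant elements with the given tag name. */
    static int countElements(Node node, String tag) {
        int n = 0;
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element && tag.equalsIgnoreCase(c.getNodeName())) {
                n++;
            }
            n += countElements(c, tag);
        }
        return n;
    }

    /** Runs every filter in order; each one receives the previous filter's result. */
    static Map<String, String> runFilters(List<SimpleHtmlFilter> filters, DocumentFragment doc) {
        Map<String, String> metadata = new LinkedHashMap<>();
        for (SimpleHtmlFilter f : filters) {
            metadata = f.filter(metadata, doc);
        }
        return metadata;
    }

    /** Builds a tiny DOM fragment: html containing one title and two anchors. */
    static DocumentFragment sampleDom() throws ParserConfigurationException {
        Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        DocumentFragment frag = d.createDocumentFragment();
        Element html = d.createElement("html");
        Element title = d.createElement("title");
        title.setTextContent("Example");
        html.appendChild(title);
        html.appendChild(d.createElement("a"));
        html.appendChild(d.createElement("a"));
        frag.appendChild(html);
        return frag;
    }

    static Map<String, String> runDemo() {
        List<SimpleHtmlFilter> filters = new ArrayList<>();
        filters.add(new TitleFilter());
        filters.add(new LinkCountFilter());
        try {
            return runFilters(filters, sampleDom());
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints {title=Example, linkCount=2}
    }
}
```

Because each filter receives the result left by the one before it, later filters can see or overwrite earlier metadata. A real plugin would implement `filter(Content, ParseResult, HTMLMetaTags, DocumentFragment)` instead and return the `ParseResult`.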

Field Summary
static String X_POINT_ID
          The name of the extension point.
Method Summary
 ParseResult filter(Content content, ParseResult parseResult, HTMLMetaTags metaTags, DocumentFragment doc)
          Adds metadata or otherwise modifies a parse of HTML content, given the DOM tree of a page.
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf

Field Detail


static final String X_POINT_ID
The name of the extension point.

Method Detail


ParseResult filter(Content content,
                   ParseResult parseResult,
                   HTMLMetaTags metaTags,
                   DocumentFragment doc)
Adds metadata or otherwise modifies a parse of HTML content, given the DOM tree of a page.

Copyright © 2006 The Apache Software Foundation