org.apache.nutch.parse
Class MetaTagsParser
java.lang.Object
org.apache.nutch.parse.MetaTagsParser
All Implemented Interfaces: Configurable, HtmlParseFilter, Pluggable
public class MetaTagsParser
extends Object
implements HtmlParseFilter
Parses HTML meta tags (keywords, description) and stores them in the parse metadata under
keys prefixed with 'metatag.', so that they can be indexed by the index-metadata plugin.
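The following is a minimal sketch of the 'metatag.' key convention described above, written against plain JDK collections rather than the Nutch API. The class name MetaTagPrefixSketch, the method toParseMetadata, the lower-casing of tag names, and the use of a Map in place of Nutch's parse metadata are all assumptions made for illustration; this is not the plugin's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the Nutch implementation.
class MetaTagPrefixSketch {

    // Prefix named in the class description.
    static final String PREFIX = "metatag.";

    /**
     * Copies the wanted meta tags into a new map whose keys are
     * "metatag." + the lower-cased tag name (lower-casing is an assumption).
     */
    static Map<String, String> toParseMetadata(Map<String, String> metaTags,
                                               Set<String> wanted) {
        Map<String, String> parseMeta = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : metaTags.entrySet()) {
            String name = e.getKey().toLowerCase(Locale.ROOT);
            if (wanted.contains(name)) {
                parseMeta.put(PREFIX + name, e.getValue());
            }
        }
        return parseMeta;
    }

    public static void main(String[] args) {
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("Keywords", "nutch, crawler");
        tags.put("description", "An example page");
        tags.put("viewport", "width=device-width"); // not wanted, so skipped
        System.out.println(toParseMetadata(tags, Set.of("keywords", "description")));
    }
}
```

An indexing step would then look up entries by their prefixed key, for example metatag.description.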
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
MetaTagsParser
public MetaTagsParser()
setConf
public void setConf(Configuration conf)
Specified by: setConf in interface Configurable
getConf
public Configuration getConf()
Specified by: getConf in interface Configurable
filter
public ParseResult filter(Content content,
                          ParseResult parseResult,
                          HTMLMetaTags metaTags,
                          DocumentFragment doc)
Description copied from interface: HtmlParseFilter
Adds metadata or otherwise modifies a parse of HTML content, given
the DOM tree of a page.
Specified by: filter in interface HtmlParseFilter
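To illustrate what "given the DOM tree of a page" can mean in practice, the sketch below walks a DocumentFragment and collects the name/content pairs of its meta elements, using only the JDK's org.w3c.dom and javax.xml.parsers packages. The class MetaDomWalkSketch and its methods collectMeta and parseFragment are hypothetical names; the sketch does not use the Nutch Content, ParseResult, or HTMLMetaTags types, and it assumes well-formed XHTML input, whereas a real crawl would hand the filter a DOM built by an HTML parser.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration only; not the Nutch implementation.
class MetaDomWalkSketch {

    /** Recursively collects name -> content for every meta element that has a name attribute. */
    static void collectMeta(Node node, Map<String, String> out) {
        if (node.getNodeType() == Node.ELEMENT_NODE
                && "meta".equalsIgnoreCase(node.getNodeName())) {
            Element meta = (Element) node;
            String name = meta.getAttribute("name"); // "" when the attribute is absent
            if (!name.isEmpty()) {
                out.put(name, meta.getAttribute("content"));
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collectMeta(child, out);
        }
    }

    /** Parses well-formed XHTML and moves its root element into a DocumentFragment. */
    static DocumentFragment parseFragment(String xhtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            DocumentFragment fragment = doc.createDocumentFragment();
            fragment.appendChild(doc.getDocumentElement());
            return fragment;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample markup", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><head>"
                + "<meta name=\"keywords\" content=\"nutch, crawler\"/>"
                + "<meta name=\"description\" content=\"An example page\"/>"
                + "</head><body/></html>";
        Map<String, String> found = new LinkedHashMap<>();
        collectMeta(parseFragment(page), found);
        System.out.println(found);
    }
}
```

A filter in this style would typically copy the collected pairs into the parse metadata and return the same ParseResult it was given; that step is omitted here because it depends on the Nutch types.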
Copyright © 2012 The Apache Software Foundation