public interface Parser
A parser for content generated by a Protocol
implementation. This interface is implemented by extensions. Nutch's core
contains no page parsing code.
Field Summary | |
---|---|
static String | X_POINT_ID: The name of the extension point. |
Method Summary | |
---|---|
ParseResult | getParse(Content c): This method parses the given content and returns a map of <key, parse> pairs. |
Methods inherited from interface org.apache.hadoop.conf.Configurable |
---|
getConf, setConf |
Field Detail |
---|
static final String X_POINT_ID
Method Detail |
---|
ParseResult getParse(Content c)
Parses the given content and returns a map of <key, parse> pairs. Each Parse instance is persisted under its key.
Note: A meta-redirect should be followed only when it comes from the original URL. For example, assume the fetcher is in parsing mode and is currently processing foo.bar.com/redirect.html. If this URL contains a meta-redirect to another URL, the fetcher should follow the redirect only if the map contains an entry of the form <"foo.bar.com/redirect.html", Parse with a ParseStatus indicating the redirect>.
Parameters:
c - Content to be parsed
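The meta-redirect rule in the note can be illustrated with a minimal, self-contained sketch. It deliberately does not use the Nutch classes: `ParseOutcome`, `MetaRedirectSketch`, `redirectToFollow`, and the target URLs are hypothetical stand-ins invented for illustration, and a plain `Map<String, ParseOutcome>` takes the place of the <key, parse> map. The point is only that a caller looks up the entry keyed by the URL it is processing and ignores redirects reported under any other key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Parse together with its ParseStatus; not a Nutch class.
final class ParseOutcome {
    final boolean redirect; // true when the status indicates a meta-redirect
    final String target;    // redirect target, or null when there is none

    ParseOutcome(boolean redirect, String target) {
        this.redirect = redirect;
        this.target = target;
    }
}

final class MetaRedirectSketch {
    // Returns the redirect target only if the entry keyed by the original URL
    // reports a redirect. Entries stored under any other key are ignored.
    static String redirectToFollow(String originalUrl, Map<String, ParseOutcome> parses) {
        ParseOutcome outcome = parses.get(originalUrl);
        if (outcome != null && outcome.redirect) {
            return outcome.target;
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, ParseOutcome> parses = new HashMap<>();
        // Entry for the URL currently being processed (assumed example data).
        parses.put("foo.bar.com/redirect.html",
                new ParseOutcome(true, "foo.bar.com/target.html"));
        // Entry for a different key in the same map (assumed example data).
        parses.put("foo.bar.com/embedded.html",
                new ParseOutcome(true, "foo.bar.com/other.html"));

        // Redirect reported under the original URL: it may be followed.
        System.out.println(redirectToFollow("foo.bar.com/redirect.html", parses));
        // No entry under this URL, so no redirect is followed.
        System.out.println(redirectToFollow("foo.bar.com/index.html", parses));
    }
}
```

In real code the lookup would go through the ParseResult returned by getParse(Content c), and the redirect flag would come from the ParseStatus of the matching Parse. The exact accessors are left out here rather than guessed.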