Package | Description |
---|---|
org.apache.nutch.parse | |
org.apache.nutch.parse.ext | |
org.apache.nutch.parse.feed | |
org.apache.nutch.parse.html | An HTML document parsing plugin. |
org.apache.nutch.parse.js | |
org.apache.nutch.parse.swf | |
org.apache.nutch.parse.tika | |
org.apache.nutch.parse.zip | |

Modifier and Type | Method and Description |
---|---|
Parser | ParserFactory.getParserById(String id) Returns the Parser instance whose extension ID is id. |
Parser[] | ParserFactory.getParsers(String contentType, String url) Returns an array of Parsers for the given content type. |

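
The two lookup styles listed above (by extension ID and by content type) can be illustrated with a small, self-contained registry. This is only a sketch under stated assumptions: `ParserRegistrySketch` and `SketchParser` are hypothetical stand-ins rather than the Nutch `ParserFactory` and `Parser` types, the IDs are made up, and using `url` solely in the error message is an assumption, not documented behavior.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Illustrative registry only. SketchParser and ParserRegistrySketch are
 * hypothetical stand-ins, not the Nutch Parser or ParserFactory classes.
 */
public class ParserRegistrySketch {

    /** Hypothetical stand-in for a parser plugin. */
    public static final class SketchParser {
        private final String id;
        private final String contentType;

        public SketchParser(String id, String contentType) {
            this.id = id;
            this.contentType = contentType;
        }

        public String id() { return id; }
        public String contentType() { return contentType; }
    }

    // Insertion order is kept so lookups return parsers in a stable order.
    private final Map<String, SketchParser> parsersById = new LinkedHashMap<>();

    public void register(SketchParser parser) {
        parsersById.put(parser.id(), parser);
    }

    /** Lookup by extension ID; fails loudly when the ID is unknown. */
    public SketchParser getParserById(String id) {
        SketchParser parser = parsersById.get(id);
        if (parser == null) {
            throw new IllegalArgumentException("No parser registered with id " + id);
        }
        return parser;
    }

    /** Lookup by content type; url is only used in the error message (assumption). */
    public SketchParser[] getParsers(String contentType, String url) {
        List<SketchParser> matches = new ArrayList<>();
        for (SketchParser parser : parsersById.values()) {
            if (parser.contentType().equals(contentType)) {
                matches.add(parser);
            }
        }
        if (matches.isEmpty()) {
            throw new IllegalArgumentException(
                "No parser for content type " + contentType + " (url: " + url + ")");
        }
        return matches.toArray(new SketchParser[0]);
    }

    public static void main(String[] args) {
        ParserRegistrySketch registry = new ParserRegistrySketch();
        // Made-up IDs and content types, for illustration only.
        registry.register(new SketchParser("sketch-html", "text/html"));
        registry.register(new SketchParser("sketch-generic", "text/html"));
        registry.register(new SketchParser("sketch-zip", "application/zip"));

        System.out.println(registry.getParserById("sketch-zip").contentType());
        System.out.println(registry.getParsers("text/html", "http://example.com/").length);
    }
}
```

Returning an array from the content-type lookup mirrors the `Parser[]` return type listed above: several parsers may be registered for one content type, and the caller decides which to try.
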
Modifier and Type | Class and Description |
---|---|
class | ExtParser A wrapper that invokes an external command to do the actual parsing. |

Modifier and Type | Class and Description |
---|---|
class | FeedParser |

Modifier and Type | Class and Description |
---|---|
class | HtmlParser |

Modifier and Type | Class and Description |
---|---|
class | JSParseFilter A heuristic link extractor for JavaScript files and code snippets. |

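
As a rough illustration of what heuristic link extraction from JavaScript can look like, the sketch below scans for quoted string literals and keeps those that resemble URLs or absolute paths. It is an assumption-laden example, not the JSParseFilter implementation: `JsLinkSketch`, its patterns, and the "looks like a URL" rule are all hypothetical, and escaped quotes inside literals are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of heuristic link extraction; not the Nutch JSParseFilter. */
public class JsLinkSketch {

    // A double- or single-quoted literal on one line (escaped quotes are not handled).
    private static final Pattern STRING_LITERAL =
        Pattern.compile("\"([^\"\\r\\n]*)\"|'([^'\\r\\n]*)'");

    // Heuristic (assumption): http(s) URLs or paths starting with '/', without whitespace.
    private static final Pattern LOOKS_LIKE_URL =
        Pattern.compile("(?:https?://|/)\\S+");

    /** Returns the string literals in the script that look like links, in source order. */
    public static List<String> extractLinks(String script) {
        List<String> links = new ArrayList<>();
        Matcher m = STRING_LITERAL.matcher(script);
        while (m.find()) {
            // Exactly one of the two groups is set, depending on the quote style.
            String literal = m.group(1) != null ? m.group(1) : m.group(2);
            if (LOOKS_LIKE_URL.matcher(literal).matches()) {
                links.add(literal);
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String script = "var a = \"http://example.com/a.html\"; "
            + "window.location = '/b/c.html'; var s = \"hello world\";";
        for (String link : extractLinks(script)) {
            System.out.println(link);
        }
    }
}
```

A heuristic like this trades precision for coverage: it never executes the script, so it can miss links built by string concatenation and can report literals that only look like paths.
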
Modifier and Type | Class and Description |
---|---|
class | SWFParser Parser for Flash SWF files. |

Modifier and Type | Class and Description |
---|---|
class | TikaParser Wrapper for Tika parsers. |

Modifier and Type | Class and Description |
---|---|
class | ZipParser ZipParser class, based on the MSPowerPointParser class by Stephan Strittmatter. |

Copyright © 2014 The Apache Software Foundation