Modifier and Type | Field and Description |
---|---|
static String | CHECK_BLOCKING - Property name. |
static String | CHECK_ROBOTS - Property name. |
static String | X_POINT_ID - The name of the extension point. |

Modifier and Type | Method and Description |
---|---|
ProtocolOutput | getProtocolOutput(org.apache.hadoop.io.Text url, CrawlDatum datum) - Returns the Content for a fetchlist entry. |
crawlercommons.robots.BaseRobotRules | getRobotRules(org.apache.hadoop.io.Text url, CrawlDatum datum) - Retrieve robot rules applicable for this url. |

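The two methods above are typically used together: a caller can consult the robot rules for a URL before asking for its content. The following is a minimal sketch of that call order, not the library's own code. To stay self-contained it uses simplified stand-in types (`RobotRules`, `Datum`, `Output`, `Protocol`, `FakeProtocol`) and a helper `fetchIfAllowed`, all of which are hypothetical; only the two method names and their parameter order come from the summary above, and `isAllowed` mirrors the method of that name on `crawlercommons.robots.BaseRobotRules`. Plain `String` stands in for `org.apache.hadoop.io.Text`.

```java
import java.util.Optional;

public class ProtocolSketch {

    /** Simplified stand-in for crawlercommons.robots.BaseRobotRules (assumption). */
    interface RobotRules {
        boolean isAllowed(String url);
    }

    /** Simplified stand-in for CrawlDatum; carries no data in this sketch. */
    static final class Datum { }

    /** Simplified stand-in for ProtocolOutput; holds only the fetched content. */
    static final class Output {
        final String content;

        Output(String content) {
            this.content = content;
        }
    }

    /** Mirrors the two methods listed in the method summary. */
    interface Protocol {
        Output getProtocolOutput(String url, Datum datum);

        RobotRules getRobotRules(String url, Datum datum);
    }

    /**
     * Hypothetical helper: fetches the URL only when the robot rules allow it.
     * Returns an empty Optional for disallowed URLs.
     */
    static Optional<Output> fetchIfAllowed(Protocol protocol, String url, Datum datum) {
        RobotRules rules = protocol.getRobotRules(url, datum);
        if (!rules.isAllowed(url)) {
            return Optional.empty();
        }
        return Optional.of(protocol.getProtocolOutput(url, datum));
    }

    /** Toy implementation for illustration: disallows any URL containing "/private/". */
    static final class FakeProtocol implements Protocol {
        @Override
        public Output getProtocolOutput(String url, Datum datum) {
            return new Output("content of " + url);
        }

        @Override
        public RobotRules getRobotRules(String url, Datum datum) {
            return candidate -> !candidate.contains("/private/");
        }
    }
}
```

With the real types, the same pattern would pass a `Text` URL and a `CrawlDatum` to both calls; how a particular protocol plugin treats the `CHECK_ROBOTS` and `CHECK_BLOCKING` properties is not specified here and is left out of the sketch.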
static final String X_POINT_ID
static final String CHECK_BLOCKING
static final String CHECK_ROBOTS
ProtocolOutput getProtocolOutput(org.apache.hadoop.io.Text url, CrawlDatum datum)
Returns the Content for a fetchlist entry.
crawlercommons.robots.BaseRobotRules getRobotRules(org.apache.hadoop.io.Text url, CrawlDatum datum)
Retrieve robot rules applicable for this url.
Parameters:
url - url to check
datum - page datum
Copyright © 2014 The Apache Software Foundation