public class FtpRobotRulesParser extends RobotRulesParser
Extends the RobotRulesParser class and contains the FTP-specific implementation for obtaining the robots file.

Modifier and Type | Field and Description |
---|---|
static org.slf4j.Logger | LOG |

Fields inherited from class RobotRulesParser: agentNames, CACHE, EMPTY_RULES, FORBID_ALL_RULES

Constructor and Description |
---|
FtpRobotRulesParser(org.apache.hadoop.conf.Configuration conf) |
Modifier and Type | Method and Description |
---|---|
crawlercommons.robots.BaseRobotRules | getRobotRulesSet(Protocol ftp, URL url) For hosts whose robots rules have not yet been cached, sends an FTP request to the host corresponding to the URL passed, gets the robots file, parses the rules and caches the rules object to avoid re-work in future. |

Methods inherited from class RobotRulesParser: getConf, getRobotRulesSet, main, parseRules, setConf
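
The fetch-once, cache-per-host behaviour described for getRobotRulesSet can be illustrated with a minimal, self-contained sketch. It does not use the Nutch classes: `RobotsCacheSketch`, `getRules`, `getFetchCount`, the `url` helper and the `protocol:host` cache key are all hypothetical names and assumptions chosen for illustration, and the `fetchAndParse` function stands in for the FTP request and the parsing step.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch of per-host rule caching; this is not the Nutch source.
final class RobotsCacheSketch<R> {
    // Parsed rules keyed by "protocol:host" (the key format is an assumption).
    private final Map<String, R> cache = new ConcurrentHashMap<>();
    // Stand-in for "send an FTP request, get the robots file, parse the rules".
    private final Function<URL, R> fetchAndParse;
    // Counts how many times the fetch step actually ran.
    private final AtomicInteger fetchCount = new AtomicInteger();

    RobotsCacheSketch(Function<URL, R> fetchAndParse) {
        this.fetchAndParse = fetchAndParse;
    }

    // Returns cached rules for the URL's host, fetching and caching them on a miss.
    R getRules(URL url) {
        String key = url.getProtocol().toLowerCase(Locale.ROOT)
                + ":" + url.getHost().toLowerCase(Locale.ROOT);
        return cache.computeIfAbsent(key, k -> {
            fetchCount.incrementAndGet();
            return fetchAndParse.apply(url);
        });
    }

    int getFetchCount() {
        return fetchCount.get();
    }

    // Convenience helper that builds a URL without a checked exception.
    static URL url(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

In this sketch, two calls to `getRules` with URLs on the same FTP host run the fetch function only once; a URL on a different host triggers a second fetch.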
public FtpRobotRulesParser(org.apache.hadoop.conf.Configuration conf)
public crawlercommons.robots.BaseRobotRules getRobotRulesSet(Protocol ftp, URL url)
For hosts whose robots rules have not yet been cached, sends an FTP request to the host corresponding to the URL passed, gets the robots file, parses the rules and caches the rules object to avoid re-work in future.
getRobotRulesSet in class RobotRulesParser
Parameters:
ftp - The Protocol object
url - URL
Returns:
BaseRobotRules object for the rules

Copyright © 2014 The Apache Software Foundation