org.apache.nutch.protocol.sftp
Class Sftp

java.lang.Object
  extended by org.apache.nutch.protocol.sftp.Sftp
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, FieldPluggable, Pluggable, Protocol

public class Sftp
extends Object
implements Protocol

This class uses the JSch library to fetch content over the SFTP protocol.


Field Summary
 
Fields inherited from interface org.apache.nutch.protocol.Protocol
CHECK_BLOCKING, CHECK_ROBOTS, X_POINT_ID
 
Constructor Summary
Sftp()
           
 
Method Summary
 org.apache.hadoop.conf.Configuration getConf()
          Get the Configuration object
 Collection<WebPage.Field> getFields()
           
 ProtocolOutput getProtocolOutput(String url, WebPage page)
           
 crawlercommons.robots.BaseRobotRules getRobotRules(String url, WebPage page)
          Retrieve the robot rules applicable to this URL.
 void setConf(org.apache.hadoop.conf.Configuration arg0)
          Set the Configuration object
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Sftp

public Sftp()

Method Detail

getProtocolOutput

public ProtocolOutput getProtocolOutput(String url,
                                        WebPage page)
Specified by:
getProtocolOutput in interface Protocol
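
This page does not say how the url argument is interpreted. As a rough illustration only, the sketch below assumes URLs of the common form sftp://user@host:port/path and shows how such a string breaks down into the parts an SFTP client typically needs. SftpUrlParts, its fields, and the fallback to port 22 are hypothetical names and choices made for this example; they are not part of Nutch or JSch, and the actual Sftp class may handle URLs differently.

```java
import java.net.URI;

/**
 * Hypothetical helper (not part of Nutch): splits an sftp:// URL into
 * the pieces an SFTP client usually needs to open a connection.
 */
final class SftpUrlParts {
    /** Standard SSH/SFTP port, assumed here when the URL names none. */
    static final int DEFAULT_PORT = 22;

    final String user; // raw user-info part of the URL; null when absent
    final String host;
    final int port;
    final String path;

    private SftpUrlParts(String user, String host, int port, String path) {
        this.user = user;
        this.host = host;
        this.port = port;
        this.path = path;
    }

    /** Parses the URL; throws IllegalArgumentException if it is not a usable sftp URL. */
    static SftpUrlParts parse(String url) {
        URI uri = URI.create(url);
        if (!"sftp".equalsIgnoreCase(uri.getScheme())) {
            throw new IllegalArgumentException("Not an sftp URL: " + url);
        }
        if (uri.getHost() == null) {
            throw new IllegalArgumentException("No host in URL: " + url);
        }
        // URI.getPort() returns -1 when the URL carries no explicit port.
        int port = uri.getPort() == -1 ? DEFAULT_PORT : uri.getPort();
        String rawPath = uri.getPath();
        String path = (rawPath == null || rawPath.isEmpty()) ? "/" : rawPath;
        return new SftpUrlParts(uri.getUserInfo(), uri.getHost(), port, path);
    }

    public static void main(String[] args) {
        SftpUrlParts p = parse("sftp://crawler@example.org:2222/pub/docs/index.txt");
        System.out.println(p.user + " " + p.host + " " + p.port + " " + p.path);
    }
}
```

A caller would pass the same URL string to getProtocolOutput together with the WebPage being fetched; the helper above only illustrates the shape of that string.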

getConf

public org.apache.hadoop.conf.Configuration getConf()
Get the Configuration object

Specified by:
getConf in interface org.apache.hadoop.conf.Configurable

setConf

public void setConf(org.apache.hadoop.conf.Configuration arg0)
Set the Configuration object

Specified by:
setConf in interface org.apache.hadoop.conf.Configurable

getRobotRules

public crawlercommons.robots.BaseRobotRules getRobotRules(String url,
                                                          WebPage page)
Description copied from interface: Protocol
Retrieve the robot rules applicable to this URL.

Specified by:
getRobotRules in interface Protocol
Parameters:
url - URL to check
Returns:
robot rules (either specific to this URL or the defaults); never null

getFields

public Collection<WebPage.Field> getFields()
Specified by:
getFields in interface FieldPluggable


Copyright © 2013 The Apache Software Foundation