public class Http extends HttpBase
Modifier and Type | Field and Description |
---|---|
static org.slf4j.Logger | LOG |
Fields inherited from class HttpBase: accept, acceptLanguage, BUFFER_SIZE, maxContent, maxCrawlDelay, proxyHost, proxyPort, RESPONSE_TIME, responseTime, timeout, useHttp11, useProxy, userAgent
Fields inherited from interface Protocol: CHECK_BLOCKING, CHECK_ROBOTS, X_POINT_ID
Constructor and Description |
---|
Http(): Constructs this plugin. |
Modifier and Type | Method and Description |
---|---|
protected Response | getResponse(URL url, CrawlDatum datum, boolean redirect): Fetches the url with a configured HTTP client and gets the response. |
static void | main(String[] args): Main method. |
void | setConf(org.apache.hadoop.conf.Configuration conf): Reads the configuration from the Nutch configuration files and sets the configuration. |
Methods inherited from class HttpBase: getAccept, getAcceptLanguage, getConf, getMaxContent, getProtocolOutput, getProxyHost, getProxyPort, getRobotRules, getTimeout, getUseHttp11, getUserAgent, logConf, main, processDeflateEncoded, processGzipEncoded, useProxy
public void setConf(org.apache.hadoop.conf.Configuration conf)
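The configure-then-read pattern behind setConf can be sketched roughly as follows. This is only an illustration, not the plugin's code: `java.util.Properties` stands in for the Hadoop `Configuration` object so the example runs without Nutch on the classpath, the class name `FetchSettingsSketch` is hypothetical, and the property keys and default values are assumptions that should be checked against your own Nutch configuration files.

```java
import java.util.Properties;

// Hypothetical stand-in for a protocol plugin that caches settings when configured.
public class FetchSettingsSketch {
    private int timeout;     // milliseconds
    private int maxContent;  // bytes

    // Reads settings once and caches them in fields, falling back to defaults.
    // The keys and defaults below are assumptions for illustration only.
    public void configure(Properties conf) {
        timeout = Integer.parseInt(conf.getProperty("http.timeout", "10000"));
        maxContent = Integer.parseInt(conf.getProperty("http.content.limit", "65536"));
    }

    public int getTimeout() {
        return timeout;
    }

    public int getMaxContent() {
        return maxContent;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("http.timeout", "2500"); // override one value, keep the other default

        FetchSettingsSketch settings = new FetchSettingsSketch();
        settings.configure(conf);
        System.out.println("timeout=" + settings.getTimeout()
                + " maxContent=" + settings.getMaxContent());
    }
}
```

Caching the values in fields at configuration time keeps later accessor calls such as getTimeout and getMaxContent cheap, which matters when they are consulted on every fetch.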
public static void main(String[] args) throws Exception
Parameters:
args - Command line arguments
Throws:
Exception
protected Response getResponse(URL url, CrawlDatum datum, boolean redirect) throws ProtocolException, IOException
Fetches the url with a configured HTTP client and gets the response.
Specified by:
getResponse in class HttpBase
Parameters:
url - URL to be fetched
datum - Crawl data
redirect - Follow redirects if and only if true
Throws:
ProtocolException
IOException
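To make the redirect parameter concrete, here is a minimal sketch of how an "if and only if true" redirect flag could be mapped onto a plain JDK client. It does not show this plugin's implementation, which the page does not describe. The class `RedirectFlagSketch`, the method `prepare`, and the timeout argument are hypothetical names introduced for illustration; only the `java.net` calls are real JDK API.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;

// Hypothetical helper: configures a JDK connection from a redirect flag.
public class RedirectFlagSketch {

    // Prepares, but does not open, a connection that follows redirects
    // if and only if the redirect flag is true.
    public static HttpURLConnection prepare(URL url, boolean redirect, int timeoutMillis)
            throws IOException {
        // openConnection() only creates the connection object; no network I/O happens here.
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setInstanceFollowRedirects(redirect);
        conn.setConnectTimeout(timeoutMillis);
        conn.setReadTimeout(timeoutMillis);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        URL url = URI.create("http://example.com/").toURL();
        HttpURLConnection conn = prepare(url, false, 10_000);
        System.out.println("follow redirects: " + conn.getInstanceFollowRedirects());
    }
}
```

With the flag set to false, a 3xx status would be handed back to the caller unchanged, which lets a crawler record the redirect target and decide for itself whether to fetch it.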
Copyright © 2014 The Apache Software Foundation