public class DefaultWebCrawler
extends edu.uci.ics.crawler4j.crawler.WebCrawler

WebCrawler implementation.

Constructor and Description |
---|
DefaultWebCrawler() |

Modifier and Type | Method and Description |
---|---|
boolean | shouldVisit(edu.uci.ics.crawler4j.crawler.Page referringPage, edu.uci.ics.crawler4j.url.WebURL url): Override this method to specify whether the given URL should be visited or not. |
void | visit(edu.uci.ics.crawler4j.crawler.Page page): Override this method to implement the single page processing logic. |

Methods inherited from class edu.uci.ics.crawler4j.crawler.WebCrawler:
getMyController, getMyId, getMyLocalData, getThread, handlePageStatusCode, handleUrlBeforeProcess, init, isNotWaitingForNewURLs, onBeforeExit, onContentFetchError, onContentFetchError, onPageBiggerThanMaxSize, onParseError, onRedirectedStatusCode, onStart, onUnexpectedStatusCode, onUnhandledException, run, setThread, shouldFollowLinksIn

public boolean shouldVisit(edu.uci.ics.crawler4j.crawler.Page referringPage, edu.uci.ics.crawler4j.url.WebURL url)
Overrides: shouldVisit in class edu.uci.ics.crawler4j.crawler.WebCrawler
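
This page does not say which URLs DefaultWebCrawler accepts, so the following is only a sketch of the kind of decision a shouldVisit override typically makes: stay under one URL prefix and skip static assets. The class UrlFilterSketch, the method isCrawlable, the extension list, and the example.com prefix are all hypothetical names chosen for illustration; none of them belong to DefaultWebCrawler or crawler4j. The logic is kept free of crawler4j types so that it runs on its own.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of DefaultWebCrawler or crawler4j.
class UrlFilterSketch {

    // Assumed set of static-asset extensions to skip; adjust for the target site.
    private static final Pattern SKIPPED_EXTENSIONS =
            Pattern.compile(".*\\.(css|js|gif|jpe?g|png|mp3|mp4|zip|gz)$");

    // Returns true when href starts with allowedPrefix (case-insensitively)
    // and does not end in one of the skipped extensions.
    static boolean isCrawlable(String href, String allowedPrefix) {
        if (href == null || allowedPrefix == null) {
            return false;
        }
        String normalized = href.toLowerCase(Locale.ROOT);
        return normalized.startsWith(allowedPrefix.toLowerCase(Locale.ROOT))
                && !SKIPPED_EXTENSIONS.matcher(normalized).matches();
    }

    public static void main(String[] args) {
        String prefix = "https://www.example.com/"; // assumed seed prefix
        System.out.println(isCrawlable("https://www.example.com/docs/index.html", prefix)); // true
        System.out.println(isCrawlable("https://www.example.com/logo.PNG", prefix));        // false
        System.out.println(isCrawlable("https://other.example.org/", prefix));              // false
    }
}
```

A subclass overriding shouldVisit could delegate to such a helper by passing url.getURL() as the first argument. Note that the extension check only looks at the end of the string, so a URL such as style.css?v=2 would still be accepted by this sketch.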

public void visit(edu.uci.ics.crawler4j.crawler.Page page)
Overrides: visit in class edu.uci.ics.crawler4j.crawler.WebCrawler

Copyright © 2010–2019 The Apache Software Foundation. All rights reserved.