A

actionURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
The form's action URI
activate() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Activate the connection.
activities - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
 
activities - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
 
ACTIVITY_FETCH - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
ACTIVITY_LOGON_END - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
ACTIVITY_LOGON_START - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
ACTIVITY_ROBOTSPARSE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
addAgent(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
Add a user-agent.
addAllow(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
Add an allow.
addAuthPage(String, Pattern, String, Pattern, String, Pattern, String, Pattern) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
Add an auth page
addCookie(Cookie) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
 
addData(IVersionActivity, String, IThrottledConnection) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
Add a data entry into the cache.
addDisallow(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
Add a disallow.
addElement(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
 
addHeader(String, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.DataSession
 
addPageParameter(String, String, Pattern, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
Add a page parameter
addParameter(String, Pattern, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Add parameter
addRule(WebcrawlerConnector.CanonicalizationPolicy) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicies
 
addSeedDocuments(ISeedingActivity, DocumentSpecification, long, long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Queue "seed" documents.
addToPool(ThrottledFetcher.ThrottledConnection) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
Put a connection into the pool.
allows - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
 
amt - Variable in exception org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WaitException
 
applyFormOverrides(LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLFormHandler
 
applyOverrides(LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
 
ATTR_BINREGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
The bin regular expression
ATTR_DOMAIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Domain/realm part of credentials (if any)
ATTR_INSENSITIVE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Whether the match is case insensitive
ATTR_MATCHREGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Form name or link target regexp for authentication page
ATTR_NAMEREGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Authentication parameter name regexp
ATTR_PASSWORD - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Password part of credentials
ATTR_TRUSTEVERYTHING - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
"Trust everything" attribute - replacing truststore if set to 'true'
ATTR_TRUSTSTORE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Trust store section of authentication record
ATTR_TYPE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Type of security
ATTR_URLREGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Regexp for access control node
ATTR_USERNAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Username part of credentials
ATTR_VALUE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
The value attribute (used for maxconnections and maxkbpersecond)
ATTRVALUE_BASIC - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Type value for basic authentication
ATTRVALUE_FORM - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Authentication page type: Form
ATTRVALUE_LINK - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Authentication page type: Link
ATTRVALUE_NTLM - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Type value for NTLM authentication
ATTRVALUE_REDIRECTION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Authentication page type: Redirection
ATTRVALUE_SESSION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Type value for session-based authentication
authentication - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
The credential
authentication - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Authentication
AuthenticationCredentials - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes immutable classes which represent authentication information for all kinds of authentication.
available() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Get available.
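The addAgent(String), addAllow(String), and addDisallow(String) entries above suggest that a robots.txt record collects user-agent names together with allow and disallow path prefixes. The following is a minimal sketch of that idea, not the RobotsManager.Record implementation: the class name RobotsRecordSketch, the appliesTo and isAllowed methods, and the rule "longest matching prefix wins, allow wins ties" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for a robots.txt record; not ManifoldCF code. */
public class RobotsRecordSketch {
    private final List<String> agents = new ArrayList<>();
    private final List<String> allows = new ArrayList<>();
    private final List<String> disallows = new ArrayList<>();

    public void addAgent(String agent) { agents.add(agent.toLowerCase(Locale.ROOT)); }
    public void addAllow(String prefix) { allows.add(prefix); }
    public void addDisallow(String prefix) { disallows.add(prefix); }

    /** True if this record applies to the given user agent ("*" matches every agent). */
    public boolean appliesTo(String userAgent) {
        String ua = userAgent.toLowerCase(Locale.ROOT);
        for (String agent : agents) {
            if (agent.equals("*") || ua.contains(agent)) {
                return true;
            }
        }
        return false;
    }

    /** Assumed rule: the longest matching prefix decides, and allow wins ties. */
    public boolean isAllowed(String path) {
        return longestMatch(allows, path) >= longestMatch(disallows, path);
    }

    /** Length of the longest non-empty prefix that matches, or -1 if none does. */
    private static int longestMatch(List<String> prefixes, String path) {
        int best = -1;
        for (String prefix : prefixes) {
            if (!prefix.isEmpty() && path.startsWith(prefix)) {
                best = Math.max(best, prefix.length());
            }
        }
        return best;
    }
}
```

With a disallow of "/private" and an allow of "/private/open", this sketch permits "/private/open/a" and rejects "/private/x". A path that matches no rule is permitted.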

B

BasicParseState - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class represents the basic, outermost parse state.
BasicParseState() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
BASICPARSESTATE_IN_ATTR_LOOKING_FOR_VALUE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
BASICPARSESTATE_IN_ATTR_NAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
BASICPARSESTATE_IN_ATTR_VALUE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
BASICPARSESTATE_IN_COMMENT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
BASICPARSESTATE_IN_DOUBLE_QUOTES_ATTR_VALUE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
BASICPARSESTATE_IN_END_TAG_NAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
BASICPARSESTATE_IN_SINGLE_QUOTES_ATTR_VALUE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
BASICPARSESTATE_IN_TAG_NAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
BASICPARSESTATE_IN_TAG_SAW_SLASH - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
BASICPARSESTATE_IN_UNQUOTED_ATTR_VALUE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
BASICPARSESTATE_NORMAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
BASICPARSESTATE_SAWCOMMENTDASH - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
BASICPARSESTATE_SAWDASH - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
BASICPARSESTATE_SAWEXCLAMATION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
BASICPARSESTATE_SAWLEFTBRACKET - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
BASICPARSESTATE_SAWSECONDCOMMENTDASH - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
basicRead(byte[], int, int, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Basic read, which uses the server object to throttle activity.
beginFetch(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Begin the fetch process.
beginFetch() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottleBin
Note the start of a fetch operation for a bin.
beginFetch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Begin the fetch process.
beginRead(int, double) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottleBin
Note the start of an individual byte read of a specified size.
beginRead(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Begin a read operation, from within a stream
beginTag(String, String, String, Attributes) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
 
beginTag(String, String, String, Attributes) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedItemContextClass
 
beginTag(String, String, String, Attributes) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
Handle the tag beginning to set the correct second-level parsing context
beginTag(String, String, String, Attributes) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
 
beginTag(String, String, String, Attributes) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFItemContextClass
 
beginTag(String, String, String, Attributes) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
 
beginTag(String, String, String, Attributes) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSContextClass
 
beginTag(String, String, String, Attributes) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
 
beginTag(String, String, String, Attributes) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
 
beginTag(String, String, String, Attributes) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
 
binName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
This is the bin name which this connection pool belongs to
binName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottleBin
This is the bin name which this throttle belongs to.
booleanToString(boolean) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Convert a boolean to a boolean string.
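The BASICPARSESTATE_* constants above, together with fields such as currentState and currentTagNameBuffer listed elsewhere in this index, point to a parser that consumes one character at a time and switches between numbered states. The sketch below shows that general technique in a much reduced form. The class TagNameScanner, its four states, and the scan helper are hypothetical; it only collects opening tag names and makes no attempt to handle comments, attributes, or a '>' inside a quoted attribute value.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical character-at-a-time scanner; not the BasicParseState implementation. */
public class TagNameScanner {
    private static final int STATE_NORMAL = 0;
    private static final int STATE_SAW_LEFT_BRACKET = 1;
    private static final int STATE_IN_TAG_NAME = 2;
    private static final int STATE_SKIP_TO_CLOSE = 3;

    private int currentState = STATE_NORMAL;
    private final StringBuilder currentTagNameBuffer = new StringBuilder();
    private final List<String> tagNames = new ArrayList<>();

    /** Feed one character and advance the state machine. */
    public void dealWithCharacter(char c) {
        switch (currentState) {
            case STATE_NORMAL:
                if (c == '<') {
                    currentState = STATE_SAW_LEFT_BRACKET;
                }
                break;
            case STATE_SAW_LEFT_BRACKET:
                if (Character.isLetter(c)) {
                    // Start of an opening tag name.
                    currentTagNameBuffer.setLength(0);
                    currentTagNameBuffer.append(Character.toLowerCase(c));
                    currentState = STATE_IN_TAG_NAME;
                } else {
                    // End tag, comment, or declaration: ignore it up to the next '>'.
                    currentState = (c == '>') ? STATE_NORMAL : STATE_SKIP_TO_CLOSE;
                }
                break;
            case STATE_IN_TAG_NAME:
                if (Character.isLetterOrDigit(c)) {
                    currentTagNameBuffer.append(Character.toLowerCase(c));
                } else {
                    // Any other character ends the name.
                    tagNames.add(currentTagNameBuffer.toString());
                    currentState = (c == '>') ? STATE_NORMAL : STATE_SKIP_TO_CLOSE;
                }
                break;
            case STATE_SKIP_TO_CLOSE:
                if (c == '>') {
                    currentState = STATE_NORMAL;
                }
                break;
            default:
                throw new IllegalStateException("Unknown state " + currentState);
        }
    }

    /** Convenience helper: scan a whole string and return the opening tag names in order. */
    public static List<String> scan(String html) {
        TagNameScanner scanner = new TagNameScanner();
        for (int i = 0; i < html.length(); i++) {
            scanner.dealWithCharacter(html.charAt(i));
        }
        return scanner.tagNames;
    }
}
```

Keeping all state in fields, rather than in the call stack, is what lets a parser of this kind be fed input in arbitrary chunks.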

C

cache - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
This is where we keep data around between the getDocumentVersions() phase and the processDocuments() phase.
cacheData - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
 
cacheKeys - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
 
cacheKeys - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
 
cacheKeys - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
 
calculateDocumentEvents(INamingActivity, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Calculate events that should be associated with a document.
canonicalizationPolicies - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
Canonicalization policies
canRemoveAspSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
canRemoveBvSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
canRemoveJavaSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
canRemovePhpSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
canReorder() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
check() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Check status of connection.
checkFetchAllowed(String, String, long, String, IVersionActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Read robots.txt data from the cache or from the database.
checkFetchAllowed(String, String, String, int, PageCredentials, IKeystoreManager, String, String[], long, String, IVersionActivity, int, String, int, String, String, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Check robots to see if fetch is allowed.
checkIfValidFeed() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
Check if feed was valid
checkMatch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
clearThreadContext() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Clear out any state information specific to a given thread.
client - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection.ExecuteMethodThread
 
clientHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.SocketCreateThread
 
clientPort - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.SocketCreateThread
 
close() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Close the connection.
close() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Close the connection.
close() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Close.
commentField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
commentURLField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
compileList(ArrayList, ArrayList) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Compile all regexp entries in the passed in list, and add them to the output list.
connectionBinArray - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The connection has resolved pointers to the ConnectionBin structures that manage pool maximums.
connectionBins - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
This is the static pool of ConnectionBin objects, keyed by bin name.
connectionTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Connection timeout, milliseconds.
connectionWait - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
This object is what we synchronize on when we are waiting on a connection to free up for this bin.
connManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The http connection manager.
contentType - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
The content-type header value
contextDescription - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
 
cookieList - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieSet
 
CookieManager - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class manages the database table into which we write cookies.
CookieManager(IThreadContext, IDBInterface) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Constructor.
cookieManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
The cookie manager used by this instance
CookieManager.CookiesCacheClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Cache class for cookies.
CookieManager.CookiesCacheClass() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesCacheClass
 
CookieManager.CookiesDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is the object description for a session key object.
CookieManager.CookiesDescription(String, StringSet) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
 
CookieManager.CookiesExecutor - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is the executor object for locating cookie session objects.
CookieManager.CookiesExecutor(CookieManager, CookieManager.CookiesDescription) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
Constructor.
CookieManager.DynamicCookieSet - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is a set of cookies, built dynamically.
CookieManager.DynamicCookieSet() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
 
cookies - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
 
cookiesCacheClass - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
CookieSet - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class represents a bunch of cookies
CookieSet(Cookie[]) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieSet
 
countConnections() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
Count connections that are in use.
create(ICacheDescription[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
Create a set of new objects to operate on and cache.
create(ICacheDescription[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
Create a set of new objects to operate on and cache.
create(ICacheDescription[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
Create a set of new objects to operate on and cache.
createSocket(String, int, InetAddress, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WebSecureSocketFactory
 
createSocket(String, int, InetAddress, int, HttpConnectionParams) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WebSecureSocketFactory
 
createSocket(String, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WebSecureSocketFactory
 
createSocket(Socket, String, int, boolean) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WebSecureSocketFactory
 
CredentialsDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class describes credential information pulled from a configuration.
CredentialsDescription(ConfigParams) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
Constructor.
credentialsDescription - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
The credentials description
CredentialsDescription.BasicCredential - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Basic type credentials
CredentialsDescription.BasicCredential(String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
Constructor
CredentialsDescription.CredentialsItem - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class representing an individual credential item.
CredentialsDescription.CredentialsItem(Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
Constructor.
CredentialsDescription.LoginParameterIterator - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
LoginParameter iterator
CredentialsDescription.LoginParameterIterator(Map, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
Constructor
CredentialsDescription.NTLMCredential - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
NTLM-style credentials
CredentialsDescription.NTLMCredential(String, String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
Constructor
CredentialsDescription.SessionCredential - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Session credentials
CredentialsDescription.SessionCredential(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
Constructor
CredentialsDescription.SessionCredentialItem - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Session credential helper class
CredentialsDescription.SessionCredentialItem(String, Pattern, String, Pattern, String, Pattern, String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Constructor
CredentialsDescription.SessionCredentialParameter - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Session credential parameter class
CredentialsDescription.SessionCredentialParameter(String, Pattern, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
 
credentialsObject - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
 
criticalSectionName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
 
criticalSectionName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
 
criticalSectionName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
 
currentAttrMap - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
currentAttrName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
currentAttrNameBuffer - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
currentFormData - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLFormHandler
 
currentIndex - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
 
currentOne - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
 
currentState - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
currentTagName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
currentTagNameBuffer - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
currentValueBuffer - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
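The canRemoveJavaSession(), canRemovePhpSession(), and canReorder() entries above indicate that a canonicalization policy can strip session identifiers from a URL and reorder its query arguments. A minimal sketch of what such a step might look like follows. The class CanonicalizeSketch and its canonicalize method are hypothetical, and it assumes that the Java session marker is a ";jsessionid=" path parameter at the end of the path and that the PHP marker is a "PHPSESSID" query argument. Fragments and the ASP and BV session forms are not handled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Hypothetical URL canonicalizer; not the WebcrawlerConnector implementation. */
public class CanonicalizeSketch {
    public static String canonicalize(String url, boolean reorder,
                                      boolean removeJavaSession, boolean removePhpSession) {
        // Split the URL into the part before '?' and the query string.
        int q = url.indexOf('?');
        String path = (q < 0) ? url : url.substring(0, q);
        String query = (q < 0) ? "" : url.substring(q + 1);

        if (removeJavaSession) {
            // Assumption: the session id is a trailing ";jsessionid=..." path parameter.
            int s = path.toLowerCase(Locale.ROOT).indexOf(";jsessionid=");
            if (s >= 0) {
                path = path.substring(0, s);
            }
        }

        List<String> args = new ArrayList<>();
        for (String arg : query.split("&")) {
            if (arg.isEmpty()) {
                continue;
            }
            if (removePhpSession && arg.startsWith("PHPSESSID=")) {
                continue;
            }
            args.add(arg);
        }
        if (reorder) {
            // Sorting gives one stable form for URLs that differ only in argument order.
            Collections.sort(args);
        }
        return args.isEmpty() ? path : path + "?" + String.join("&", args);
    }
}
```

The point of such a step is that two URLs naming the same page end up with the same string, so the crawler does not fetch the page twice.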

D

data - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
The cache file for the data
DataCache - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class is a cache of a specific URL's data.
DataCache() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
Constructor.
DataCache.DocumentData - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class represents everything we need to know about a document that's getting passed from the getDocumentVersions() phase to the processDocuments() phase.
DataCache.DocumentData(File, int, String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
Constructor.
dataFileFolder - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
 
dataRecorder - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
 
dataSession - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Hack added to record all access data from current crawler
dataSession - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
 
dealWithCharacter(char) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
Deal with a character.
DEFAULT_BUNDLE_NAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
DEFAULT_PATH_NAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
deinstall() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Uninstall the manager.
deinstall() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
Uninstall the manager.
deinstall() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Uninstall the manager.
deinstall(IThreadContext) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Uninstall the connector.
deleteData(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
Delete specified item of data.
destroy() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Destroy the connection forever
disallows - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
 
discardField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
disconnect() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Close the connection.
discoveredFormData - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLFormHandler
 
dnsCacheClass - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
 
DNSManager - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class manages the database table into which we write DNS entries for hosts.
DNSManager(IThreadContext, IDBInterface) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
Constructor.
dnsManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
The DNS manager currently used by this instance
DNSManager.DNSCacheClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Cache class for DNS entries.
DNSManager.DNSCacheClass() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSCacheClass
 
DNSManager.DNSInfo - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is a cached data item.
DNSManager.DNSInfo(String, String, long, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
Constructor.
DNSManager.HostDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is the object description for a DNS host object.
DNSManager.HostDescription(String, StringSet) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
 
DNSManager.HostExecutor - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is the executor object for locating DNS host objects.
DNSManager.HostExecutor(DNSManager, DNSManager.HostDescription) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
Constructor.
doCanonicalization(WebcrawlerConnector.DocumentURLFilter, WebURL) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Code to canonicalize a URL.
documentIdentifier - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
 
documentIdentifier - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
 
documentName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.DataSession
 
documentNumber - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.DataRecorder
 
documentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
The document identifier
documentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
The document URI
documentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
The document identifier
documentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
The document identifier
documentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSContextClass
The document identifier
documentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
The document identifier
doesPathMatch(String, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Check if path matches specification
doesPathMatch(String, int, String, int) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Recursive method for matching specification to path.
domain - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
 
domainField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
domainSpecifiedField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
doneFetch(IVersionActivity) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Done with the fetch.
doneFetch(IVersionActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Done with the fetch.
dr - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.DataSession
 

E

ELEMENTCATEGORY_FIXEDEXCLUSIVE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
 
ELEMENTCATEGORY_FIXEDINCLUSIVE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
 
ELEMENTCATEGORY_FREEFORM - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
 
elementList - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
The set of elements
elementList - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
 
endFetch() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottleBin
Note the end of a fetch operation.
endHeader() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.DataSession
 
endRead(int, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottleBin
Note the end of an individual read from the server.
endRead(int, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
End a read operation, from within a stream
endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
 
endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
Handle the tag ending
endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
 
endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFItemContextClass
Convert the individual sub-fields of the item context into their final forms
endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
 
endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSContextClass
 
endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
Convert the individual sub-fields of the item context into their final forms
endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
 
endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
Convert the individual sub-fields of the item context into their final forms
equals(Object) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.AuthenticationCredentials
Compare against another object
equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
 
equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
Compare against another object
equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
Compare against another object
equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
Compare against another object
equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
 
equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
 
equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
 
equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
 
equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WebSecureSocketFactory
 
estimateInProgress - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottleBin
Flag indicating whether rate estimation is in progress
estimateValid - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottleBin
Flag indicating whether a rate estimate is needed
exception - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection.ExecuteMethodThread
 
excludeIndexPatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
The ArrayList of index exclude patterns
excludePatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
The ArrayList of exclude patterns
execute() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
Perform the desired operation.
execute() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
Perform the desired operation.
execute() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
Perform the desired operation.
executeFetch(String, String, String, int, int, boolean, String, FormData, LoginCookies, String, int, String, String, String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Execute the fetch and get the return code.
executeFetch(String, String, String, int, int, boolean, String, FormData, LoginCookies, String, int, String, String, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Execute the fetch and get the return code.
executeMethod - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection.ExecuteMethodThread
 
exists(ICacheDescription, Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
Notify the implementing class of the existence of a cached version of the object.
exists(ICacheDescription, Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
Notify the implementing class of the existence of a cached version of the object.
exists(ICacheDescription, Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
Notify the implementing class of the existence of a cached version of the object.
existsInPool(ThrottledFetcher.ThrottledConnection) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
Check if a connection exists in the pool already.
expiration - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
 
expiration - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
 
expirationDateField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
expirationField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
 
expirationField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
 
extractContentType(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
extractEncoding(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
extractLinks(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Code to extract links from an already-fetched document.
extractMimeType(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 

F

FETCH_BAD_URI - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
 
FETCH_CIRCULAR_REDIRECT - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
 
FETCH_INTERRUPTED - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
 
FETCH_IO_ERROR - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
 
FETCH_LOGIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
FETCH_NOT_TRIED - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
 
FETCH_ROBOTS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
FETCH_SEQUENCE_ERROR - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
 
FETCH_STANDARD - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
FETCH_UNKNOWN_ERROR - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
 
fetchCounter - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The current bytes in the current fetch
fetchMethod - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The method object
fetchType - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The kind of fetch we are doing
filter - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
 
findConnection(int, ThrottledFetcher.ConnectionBin[], String, String, int, PageCredentials, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
This method is called only when there is no existing connection yet identified that can be used for contacting the server and port specified.
findHTMLForm(String, LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Find matching HTML form data, if present.
findHTMLLinkURI(String, LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Find HTML link URI, if present, making sure specified preference is matched.
findLoginParameters(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
For a given login page, specific information may need to be submitted to the server to properly log in.
findLoginParameters(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.SequenceCredentials
For a given login page, specific information may need to be submitted to the server to properly log in.
findMatch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicies
 
findMetadata(DocumentSpecification) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Read a document specification to yield a map of name/value pairs for metadata
findNextOne() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
Find next one
findPreferredRedirectionURI(String, LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Find a preferred redirection URI, if it exists
findRedirectionURI(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Find a redirection URI, if it exists
finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
firstChunkLock - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottleBin
This object is used to gate access while the first chunk is being read
flushIdleConnections(long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
Flush any idle connections.
flushIdleConnections() - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
Flush connections that have timed out from inactivity.
flushIdleConnections(long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Do periodic bookkeeping.
FormData - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes the form data gleaned from an HTML page.
FormDataAccumulator - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class accumulates form data and allows overrides
FormDataAccumulator(String, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
 
FormDataAccumulator.FormItemIterator - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Iterator over FormItems
FormDataAccumulator.FormItemIterator(ArrayList) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
 
FormDataElement - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes individual form data elements, for form submission.
FormItem - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class provides an individual data item
FormItem(String, String, int, boolean) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
 
formNamePattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
The form name pattern, or null if no form is expected
formNamePattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLFormHandler
 
formNameRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
The form name regexp
FormParseState - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class interprets the tag stream generated by the BasicParseState class, and keeps track of the form tags.
FormParseState(IHTMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
formParseState - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
FORMPARSESTATE_IN_FORM - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
FORMPARSESTATE_IN_OPTION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
FORMPARSESTATE_IN_SELECT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
FORMPARSESTATE_IN_TEXTAREA - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
FORMPARSESTATE_NORMAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
fqdn - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
 
fqdnField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
 
freePool - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
This map contains ThrottledConnection objects that are in the pool, and are not in use.
from - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
The email address for this connector instance

G

getAcls(DocumentSpecification) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Get the forced ACL from the document specification.
getActionURI() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
Get the full action URI for this form.
getActionURI() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
Get the full action URI for this form.
getActivitiesList() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Return the list of activities that this connector supports.
getAttributeJavascriptString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getAttributeJavascriptString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getAttributeJavascriptString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getAttributeString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getAttributeString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getAttributeString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getBinName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
Get the bin name.
getBinName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottleBin
Get the bin name.
getBinNames(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Get the bin name string for a document identifier.
getBodyJavascriptString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getBodyJavascriptString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getBodyJavascriptString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getBodyString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getBodyString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getBodyString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getCanonicalizationPolicies() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
Get canonicalization policies
getClassName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesCacheClass
Get the name of the object class.
getClassName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSCacheClass
Get the name of the object class.
getClassName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsCacheClass
Get the name of the object class.
getConnection(String, String, int, PageCredentials, IKeystoreManager, ThrottleDescription, String[], int) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
Obtain a connection to specified protocol, server, and port.
getConnectorModel() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Tell the world what model this connector uses for getDocumentIdentifiers().
getContentType() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
Get the contentType
getContentType(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
Get the content type.
getCookie(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
 
getCookie(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieSet
 
getCookie(int) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginCookies
Get the cookie name
getCookieCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
 
getCookieCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieSet
 
getCookieCount() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginCookies
Get the cookie count
getCookiesCacheKey(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Construct a global key which represents an individual session.
getCredential() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
Get credential type
getCriticalSectionName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
 
getCriticalSectionName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
 
getCriticalSectionName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
 
getData() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
Get the data
getData(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
Fetch binary data entry from the cache.
getDataLength(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
Fetch binary data length.
getDNSKey(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
Construct a key which represents an individual host name.
getDocumentVersions(String[], String[], IVersionActivity, DocumentSpecification, int, boolean) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Get document versions given an array of document identifiers.
getElementIterator() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
Iterate over the active form data elements.
getElementIterator() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
Iterate over the active form data elements.
getElementName() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataElement
Get the element name
getElementName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
Get the element name
getElementValue() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataElement
Get the element value
getElementValue() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
Get the element value
getEnabled() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
 
getException() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.SocketCreateThread
 
getException() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection.ExecuteMethodThread
 
getExpirationTime() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
Get the expiration time.
getExpirationTime() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
Get expiration
getFormData() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLFormHandler
 
getFormNamePattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Get the form name pattern.
getFormNamePattern() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
Get the form name pattern.
getFQDN() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
Get the FQDN
getHost() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
getHostName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
Get the host name
getHostName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
 
getHostName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
 
getIPAddress() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
Get the IP address
getLastFetchCookies() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Get the last fetch cookies.
getLastFetchCookies() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Get the last fetch cookies.
getLastFetchTime() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
Get the last fetch time.
getLimitedResponseBody(int, String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Get limited response as a string.
getLimitedResponseBody(int, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Get limited response as a string.
getMaxDocumentRequest() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Get the maximum number of documents to amalgamate together into one batch, for this connector.
getMaxLRUCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesCacheClass
Get the maximum LRU count of the object class.
getMaxLRUCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSCacheClass
Get the maximum LRU count of the object class.
getMaxLRUCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsCacheClass
Get the maximum LRU count of the object class.
getMaxOpenConnections(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
Given a bin name, find the max open connections to use for that bin.
getMaxOpenConnections() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
Get maximum open connections.
getMinimumMillisecondsPerByte(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
Look up minimum milliseconds per byte for a bin.
getMinimumMillisecondsPerByte() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
Get minimum milliseconds per byte.
getMinimumMillisecondsPerFetch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
Look up minimum milliseconds for a fetch for a bin.
getMinimumMillisecondsPerFetch() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
Get minimum milliseconds per fetch
getName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
 
getNamePattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
 
getObjectClass() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
Get the object class for an object.
getObjectClass() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
Get the object class for an object.
getObjectClass() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
Get the object class for an object.
getObjectKeys() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
Get the cache keys for an object (which may or may not exist yet in the cache).
getObjectKeys() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
Get the cache keys for an object (which may or may not exist yet in the cache).
getObjectKeys() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
Get the cache keys for an object (which may or may not exist yet in the cache).
getPageCredential(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
Given a URL, find the right PageCredentials object to use.
getPageCredential(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Get the page credentials for a given document identifier (URL)
getParameter(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Get the actual parameter
getParameterCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Get the parameter count
getParameterCount() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
Get the number of parameters.
getParameterNamePattern(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Get the name of the i'th parameter.
getParameterNamePattern(int) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
Get the name of the i'th parameter.
getParameterValue(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Get the desired value of the i'th parameter.
getParameterValue(int) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
Get the desired value of the i'th parameter.
getPath() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
getPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
Get the pattern.
getPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Get the pattern
getPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
Get the pattern.
getPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription.TrustsItem
Get the pattern.
getPoolConnection() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
Grab a connection from the current pool.
getPort() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
getPreferredLinkPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Get the preferred link pattern.
getPreferredLinkPattern() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
Get the preferred link pattern.
getPreferredRedirectionPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Get the preferred redirection pattern.
getPreferredRedirectionPattern() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
Get the preferred redirection pattern.
getRawQuery() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
getReferralURI() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
Get the referral URI
getReferralURI(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
Get the referral URI.
getRelationshipTypes() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Return the list of relationship types that this connector recognizes.
getResponse() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection.ExecuteMethodThread
 
getResponseBodyStream() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Get the response input stream.
getResponseBodyStream() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Get the response input stream.
getResponseCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
Get the response code
getResponseCode(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
Get the response code.
getResponseCode() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Get the HTTP response code.
getResponseCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Get the HTTP response code.
getResponseHeader(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Get a specified response header, if it exists.
getResponseHeader(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Get a specified response header, if it exists.
getResponseHeaders() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Get response headers
getResponseHeaders() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Get response headers
getResult() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.SocketCreateThread
 
getResults() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
Get the result.
getResults() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
Get the result.
getResults() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
Get the result.
getRobotsKey(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Construct a key which represents an individual host name.
getScheme() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
getSequenceCredential(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
Given a URL, find the right SequenceCredentials object to use.
getSequenceCredential(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Get the sequence credentials for a given document identifier (URL)
getSequenceKey() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
Fetch the unique key value for this particular credential.
getSequenceKey() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.SequenceCredentials
Fetch the unique key value for this particular credential.
getSession(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.DataRecorder
 
getSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Start a session
getSessionKey() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
 
getString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getSubmitMethod() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
Get the submit method for this form.
getSubmitMethod() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
Get the submit method for this form.
getTargetURI() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHandler
 
getTrustStore(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription
Given a URL, build the right trust certificate store, or return null if all certs should be accepted.
getTrustStore() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription.TrustsItem
Get keystore
getTrustStore(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Get the trust store for a given document identifier (URL)
getType() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
 
getValue() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
 
getValue() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
 
getVersionString() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
Get whatever contribution to the version string should come from this data.
getWaitAmount() - Method in exception org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WaitException
 
guidField - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
 

H

handleHTML(String, IHTMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Handle document references from HTML
handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.LinkParseState
 
handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.MetaParseState
 
handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
XML handler
handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
The link handler
handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
XML handler
handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
Link handler
handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSContextClass
Link notification interface
handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
XML handler
handleRedirects(String, IRedirectionHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Handle extracting the redirect link from a redirect response.
handleXML(String, IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Handle document references from XML.
hashCode() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.AuthenticationCredentials
Calculate a hash function
hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
 
hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
Calculate a hash function
hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
Calculate a hash function
hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
Calculate a hash function
hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
 
hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
 
hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
 
hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
 
hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WebSecureSocketFactory
 
hasNext() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
Check for next
hasNext() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
 
headerNames - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.DataSession
 
headerValues - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.DataSession
 
host - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.SocketCreateThread
 
hostConfiguration - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection.ExecuteMethodThread
 
hostField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
 
hostField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
 
hostName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
 
hostName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
 
hostName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
 
htmlAttributeDecode(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
Decode an HTML attribute
htmlBodyDecode(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
Decode HTML body text
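
The htmlAttributeDecode and htmlBodyDecode entries above describe decoding of HTML text, and mapChunk (listed under M) maps an entity reference back to a character. The index does not show how BasicParseState implements them. The following is only a minimal sketch of entity decoding in general: the class name EntityDecoder, the small table of named entities, and the decode method are all hypothetical and are not taken from the connector.

```java
import java.util.Map;

// Hypothetical illustration; not the BasicParseState implementation.
public class EntityDecoder {
    // Assumed subset of named entities; a real table would be much larger.
    private static final Map<String, String> NAMED =
        Map.of("amp", "&", "lt", "<", "gt", ">", "quot", "\"", "apos", "'");

    // Replace every recognized "&name;" or "&#number;" reference with its character.
    static String decode(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            int end = (c == '&') ? input.indexOf(';', i + 1) : -1;
            String replacement = (end > i + 1) ? mapChunk(input.substring(i + 1, end)) : null;
            if (replacement != null) {
                out.append(replacement);
                i = end + 1;
            } else {
                // Not a recognized reference: copy the character through unchanged.
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    // Map the text between '&' and ';' to a character, or null if unrecognized.
    static String mapChunk(String chunk) {
        if (chunk.startsWith("#")) {
            try {
                boolean hex = chunk.length() > 1
                    && (chunk.charAt(1) == 'x' || chunk.charAt(1) == 'X');
                int codePoint = hex
                    ? Integer.parseInt(chunk.substring(2), 16)
                    : Integer.parseInt(chunk.substring(1));
                return Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : null;
            } catch (NumberFormatException e) {
                return null;
            }
        }
        return NAMED.get(chunk);
    }
}
```

In this sketch an unrecognized reference such as the "&T;" in "AT&T;" is passed through untouched rather than dropped, which is one reasonable choice for lenient HTML handling.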

I

IDiscoveredLinkHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes the functionality needed by a link extractor to note a discovered link.
IHTMLHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes the functionality needed by an HTML processor in order to handle an HTML document.
IMetaTagHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes the functionality needed by a parser to handle metadata tags.
inactiveTime - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
If not active, this is when it went inactive
includeIndexPatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
The arraylist of index include patterns
includePatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
The arraylist of include patterns
inputStream - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
The stream we are wrapping.
install() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Install the manager.
install() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
Install the manager.
install() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Install the manager.
install(IThreadContext) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Install the connector.
insureWithinLimits(int, ThrottledFetcher.ThrottledConnection) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
Verify that this bin is within limits.
interestingMimeTypeArray - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
This represents a list of the mime types that this connector knows how to extract links from.
interestingMimeTypeMap - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
inUseConnections - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
This is the number of connections in this bin that are signed out and presumably in use
ipaddress - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
 
ipaddressField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
 
IRedirectionHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes the functionality needed by a redirection processor in order to handle a redirection.
isActive - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Is the connection considered "active"?
isAgentMatch(String, boolean) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
See if user-agent matches.
isAllowed(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
See if path is allowed.
isContentInteresting(IFingerprintActivity, String, int, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Code to check if data is interesting, based on response code and content type.
isDataIngestable(IFingerprintActivity, String, WebcrawlerConnector.DocumentURLFilter) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Code to check if an already-fetched document should be ingested.
isDisallowed(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
See if path is disallowed.
isDocumentAndHostLegal(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
Check if both a document and host are legal.
isDocumentIndexable(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
Check if the document identifier is indexable.
isDocumentLegal(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
Check if the document identifier is legal.
isDocumentText(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Is the document text, as far as we can tell?
isEnabled - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
 
isFetchAllowed(String, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
Check if fetch is allowed
isHostLegal(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
Check if a host is legal.
isHTMLWhitespace(char) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
Is a character HTML whitespace?
isInitialized - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
This flag is set when the instance has been initialized
isStrange(byte) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Check if character is not typical ASCII.
isText(byte[], int) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Test to see if a document is text or not.
isWhiteSpace(byte) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Check if a byte is a whitespace character.
IThrottledConnection - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface represents an established connection to a URL.
IXMLHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes the functionality needed by an XML processor in order to handle an XML document.
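
The isText, isStrange, and isWhiteSpace entries above describe a byte-level test for whether a document is text. The index gives no implementation details, so the following is only a sketch of one common heuristic, assuming a buffer counts as text when a small fraction of its bytes fall outside printable ASCII. The class name TextSniffer and the 10% threshold are hypothetical.

```java
// Hypothetical illustration; not the WebcrawlerConnector implementation.
public class TextSniffer {
    // Assumed threshold: at most 10% of the sampled bytes may be "strange".
    private static final double MAX_STRANGE_FRACTION = 0.10;

    // Space, tab, carriage return, and line feed count as whitespace.
    static boolean isWhiteSpace(byte b) {
        return b == ' ' || b == '\t' || b == '\r' || b == '\n';
    }

    // A byte is "strange" if it is a control character or outside 7-bit ASCII.
    // Java bytes are signed, so values 0x80-0xFF appear negative and satisfy b < 0x20.
    static boolean isStrange(byte b) {
        return (b < 0x20 || b == 0x7f) && !isWhiteSpace(b);
    }

    // Examine the first 'length' bytes and decide whether they look like text.
    static boolean isText(byte[] data, int length) {
        if (length == 0) {
            return true; // assumption: an empty sample is treated as text
        }
        int strange = 0;
        for (int i = 0; i < length; i++) {
            if (isStrange(data[i])) {
                strange++;
            }
        }
        return (double) strange / length <= MAX_STRANGE_FRACTION;
    }
}
```

A heuristic of this shape is cheap because it inspects only a prefix of the document, but it will misclassify text in encodings that rely heavily on bytes above 0x7F.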

K

keyField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 

L

lastFetchCookies - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The cookies from the last fetch
lastFetchTime - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
This is the last time a fetch was done on this bin
linkField - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedItemContextClass
 
linkField - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFItemContextClass
 
linkField - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
 
linkField - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
 
LinkParseState - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class recognizes and interprets all links
LinkParseState(IHTMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.LinkParseState
 
linkType - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
 
logFetchCount(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Log the fetch of a number of bytes, from within a stream.
LoginCookies - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes cookies obtained during sequential authentication.
LoginParameters - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes login parameters to be used to submit a page during sequential authentication.
lookup(String, long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
Given a host name, look up the ip address and fqdn.
lookupIPAddress(String, IVersionActivity, String, long, StringBuilder) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Look up an IP address given a non-canonical host name.

M

makeCredentialsObject(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
Turn this instance into a Credentials object, given the specified target host name
makeCredentialsObject(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
Turn this instance into a Credentials object, given the specified target host name
makeCredentialsObject(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.PageCredentials
Turn this instance into a Credentials object, given the specified target host name
makeDNSEventName(INamingActivity, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Calculate the event name for DNS access.
makeDocumentIdentifier(String, String, WebcrawlerConnector.DocumentURLFilter) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Convert an absolute or relative URL to a document identifier.
makeReadable(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Convert a string from the robots file into a readable form that does NOT contain NUL characters (since postgresql does not accept those).
makeRobotsEventName(INamingActivity, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Construct a name for the global web-connector robots event.
makeRobotsKey(String, String, int) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Construct the robots key for a host.
makeSessionLoginEventName(INamingActivity, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Calculate the event name for session login.
mapChunk(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
Map an entity reference back to a character
mapLookup - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
mark(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Mark.
markSupported() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Check if mark is supported.
matches(ThrottledFetcher.ConnectionBin[], String, String, int, PageCredentials, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
See if this instance matches a given server and port.
matchPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
maxOpenConnections - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
The maximum open connections, or null if no limit.
Messages - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
Messages() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
Constructor - do not instantiate
MetaParseState - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class recognizes and interprets all meta tags
MetaParseState(IMetaTagHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.MetaParseState
 
minimumMillisecondsPerByte - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
The minimum milliseconds between bytes, or null if no limit.
minimumMillisecondsPerBytePerServer - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Stream throttling parameters
minimumMillisecondsPerFetch - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
The minimum milliseconds per fetch, or null if no limit
minMillisecondsPerByte - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
These are the bandwidth limits, per bin
mustHaveReference(ThrottledFetcher.ConnectionBin) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
 
myFactory - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
 
myUrl - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The current URL being fetched
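
The minimumMillisecondsPerByte and minMillisecondsPerByte entries above express bandwidth limits as a minimum time per byte. How ThrottledFetcher applies that limit is not shown in this index. As a rough sketch of the arithmetic such a limit implies, assuming the reader sleeps until the elapsed time catches up with the byte count, the hypothetical helper below computes how long to wait. The class name ByteThrottle and the method requiredWaitMillis are illustrative only.

```java
// Hypothetical illustration; not the ThrottledFetcher implementation.
public class ByteThrottle {
    // Given the bytes read so far and the time already spent, return how many
    // more milliseconds must pass before the per-byte minimum is satisfied.
    static long requiredWaitMillis(long bytesRead, long elapsedMillis,
                                   double minimumMillisecondsPerByte) {
        long earliestAllowed = (long) Math.ceil(bytesRead * minimumMillisecondsPerByte);
        return Math.max(0L, earliestAllowed - elapsedMillis);
    }
}
```

For example, with a limit of 0.5 ms per byte, 1000 bytes may not complete in under 500 ms, so a reader that has spent 200 ms would wait a further 300 ms.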

N

name - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
 
name - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
 
nameField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
namePattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
Compiled name pattern
nameRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
Name regexp
next() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
Get the next one
next() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
 
NODE_ACCESSCREDENTIAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Access control description node
NODE_AUTHPAGE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Authentication page description node
NODE_AUTHPARAMETER - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Authentication parameter node
NODE_BINDESC - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
The bin description node
NODE_EXCLUDES - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Exclude regexps node.
NODE_EXCLUDESINDEX - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Exclude regexps node.
NODE_INCLUDES - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Include regexps node.
NODE_INCLUDESINDEX - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Include regexps node.
NODE_LIMITTOSEEDS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Limit to seeds.
NODE_MAXCONNECTIONS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
The max connections node
NODE_MAXFETCHESPERMINUTE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
The max fetch rate node
NODE_MAXKBPERSECOND - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
The bandwidth node
NODE_SEEDS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
The seeds node.
NODE_TRUST - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Trust store description node
noteAHREF(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
Note discovered href
noteAHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLFormHandler
Note discovered href
noteAHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLHrefHandler
Note discovered href
noteAHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Note discovered href
noteConnectionCreation() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
Note the creation of an active connection that belongs to this bin.
noteConnectionDestruction() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
Note the destruction of an active connection that belongs to this bin.
noteDiscoveredLink(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IDiscoveredLinkHandler
Inform the world of a discovered link.
noteDiscoveredLink(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHandler
Inform the world of a discovered link.
noteDiscoveredLink(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLHrefHandler
Override noteDiscoveredLink
noteDiscoveredLink(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindPreferredRedirectionHandler
Override noteDiscoveredLink
noteDiscoveredLink(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
Inform the world of a discovered link.
noteDiscoveredTtlValue(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IXMLHandler
Inform the world of a discovered ttl value.
noteDiscoveredTtlValue(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityXMLHandler
Inform the world of a discovered ttl value.
noteEndTag(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
noteEndTag(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
 
noteFormEnd() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
Note the end of a form
noteFormEnd() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLFormHandler
Note the end of a form
noteFormEnd() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLHrefHandler
Note the end of a form
noteFormEnd() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Note the end of a form
noteFormInput(Map) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
Note an input tag
noteFormInput(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLFormHandler
Note an input tag
noteFormInput(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLHrefHandler
Note an input tag
noteFormInput(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Note an input tag
noteFormStart(Map) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
Note the start of a form
noteFormStart(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLFormHandler
Note the start of a form
noteFormStart(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLHrefHandler
Note the start of a form
noteFormStart(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Note the start of a form
noteFRAMESRC(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
Note discovered FRAME SRC
noteFRAMESRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLFormHandler
Note discovered FRAME SRC
noteFRAMESRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLHrefHandler
Note discovered FRAME SRC
noteFRAMESRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Note discovered FRAME SRC
noteIMGSRC(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
Note discovered IMG SRC
noteIMGSRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLFormHandler
Note discovered IMG SRC
noteIMGSRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLHrefHandler
Note discovered IMG SRC
noteIMGSRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Note discovered IMG SRC
noteInterrupted(Throwable) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Note that the connection fetch was interrupted by something.
noteInterrupted(Throwable) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Note that the connection fetch was interrupted by something.
noteLINKHREF(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
Note discovered href
noteLINKHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLFormHandler
Note discovered href
noteLINKHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLHrefHandler
Note discovered href
noteLINKHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Note discovered href
noteMetaTag(Map) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IMetaTagHandler
Inform the world of a discovered metadata tag.
noteMetaTag(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLFormHandler
Note a meta tag
noteMetaTag(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLHrefHandler
Note a meta tag
noteMetaTag(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Note a meta tag
noteNonscriptEndTag(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
noteNonscriptEndTag(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
 
noteNonscriptTag(String, Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
noteNonscriptTag(String, Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.LinkParseState
 
noteNonscriptTag(String, Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.MetaParseState
 
noteNonscriptTag(String, Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
 
noteNormalCharacter(char) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
noteNormalCharacter(char) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
noteTag(String, Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.BasicParseState
 
noteTag(String, Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
 

O

optionSelected - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
optionValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
optionValueText - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
ordinalField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
org.apache.manifoldcf.crawler.connectors.webcrawler - package org.apache.manifoldcf.crawler.connectors.webcrawler
 
outerTagCount - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
Keep track of the number of valid feed signals we saw
outputConfigurationBody(IThreadContext, IHTTPOutput, Locale, ConfigParams, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Output the configuration body section.
outputConfigurationHeader(IThreadContext, IHTTPOutput, Locale, ConfigParams, List<String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Output the configuration header section.
outputResource(IHTTPOutput, Locale, String, Map<String, String>, boolean) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
outputResourceWithVelocity(IHTTPOutput, Locale, String, Map<String, String>, boolean) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
outputResourceWithVelocity(IHTTPOutput, Locale, String, Map<String, Object>) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
outputSpecificationBody(IHTTPOutput, Locale, DocumentSpecification, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Output the specification body section.
outputSpecificationHeader(IHTTPOutput, Locale, DocumentSpecification, List<String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Output the specification header section.

P

PageCredentials - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes immutable classes that represent authentication information for page-based authentication.
PARAMETER_EMAIL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Email (a parameter)
PARAMETER_PROXYAUTHDOMAIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Proxy auth domain (parameter)
PARAMETER_PROXYAUTHPASSWORD - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Proxy auth password (parameter)
PARAMETER_PROXYAUTHUSERNAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Proxy auth username (parameter)
PARAMETER_PROXYHOST - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Proxy host name (parameter)
PARAMETER_PROXYPORT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Proxy port (parameter)
PARAMETER_ROBOTSUSAGE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Robots usage (a parameter)
parameters - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
The list of the parameters we want to add for this pattern.
parentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHandler
 
parseRobotsTxt(BufferedReader, String, IVersionActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
Parse the robots.txt file using a reader.
password - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
 
password - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
 
pathField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
pathSpecifiedField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
pattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
The bin-matching pattern.
pattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
URL match pattern
pattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
The bin-matching pattern.
pattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription.TrustsItem
The bin-matching pattern.
patternHash - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
This is the hash that contains everything.
patternHash - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
This is the hash that contains everything.
patternHash - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription
This is the hash that contains everything.
poll() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
This method is periodically called for all connectors that are connected but not in active use.
poolLock - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
This global lock protects the "distributed pool" resource, and ensures that a connection can get pulled out of all the right pools and wind up in the hands of only one thread.
port - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.SocketCreateThread
 
port - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Port
portBlankField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
portField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
portSpecifiedField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
portsToString(int[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Convert a port array to a string.
preferredLinkPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
The preferred link pattern, or null if there's no preferred link
preferredLinkPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLHrefHandler
 
preferredLinkRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
The preferred link regexp
preferredRedirectionPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
The preferred redirection pattern, or null if there's no preferred redirection
preferredRedirectionRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
The preferred redirection regexp
process() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
Process this data
process(IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedItemContextClass
Process the data accumulated for this item
process() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
Process this data
process(IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFItemContextClass
Process the data accumulated for this item
process() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
Process this data
process(IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
Process the data accumulated for this item
process() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
Process this data
process(IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
Process the data accumulated for this item
processConfigurationPost(IThreadContext, IPostParameters, Locale, ConfigParams) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Process a configuration post.
processDocuments(String[], String[], IProcessActivity, DocumentSpecification, boolean[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Process a set of documents.
processSpecificationPost(IPostParameters, Locale, DocumentSpecification) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Process a specification post.
protocol - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Protocol
proxyAuthDomain - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Proxy auth domain
proxyAuthPassword - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Proxy auth password
proxyAuthUsername - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Proxy auth user name
proxyHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Proxy host
proxyPort - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Proxy port
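
The index lists CookieManager.portsToString(int[]) and CookieManager.stringToPorts(String) as a matched pair of conversions, but it does not say how a port array is encoded. The following is a minimal sketch of such a round trip, assuming a comma-separated encoding; the class name PortListSketch and the format are hypothetical illustrations, not the actual CookieManager implementation.

```java
import java.util.Arrays;

public class PortListSketch {
    // Assumption: ports are joined with commas, e.g. "80,443".
    // The real CookieManager encoding is not documented in this index.
    public static String portsToString(int[] ports) {
        if (ports == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < ports.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(ports[i]);
        }
        return sb.toString();
    }

    // Inverse of portsToString under the same comma-separated assumption.
    public static int[] stringToPorts(String value) {
        if (value == null) {
            return null;
        }
        if (value.isEmpty()) {
            return new int[0];
        }
        String[] parts = value.split(",");
        int[] ports = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            ports[i] = Integer.parseInt(parts[i].trim());
        }
        return ports;
    }

    public static void main(String[] args) {
        String encoded = portsToString(new int[] {80, 443});
        System.out.println(encoded);
        System.out.println(Arrays.toString(stringToPorts(encoded)));
    }
}
```

Keeping the two conversions as exact inverses means a value written to a string field can be read back without loss.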

R

rateEstimate - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottleBin
The inverse rate estimate of the first fetch, in ms/byte
rawQueryPart - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
read() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Read a byte.
read(byte[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Read lots of bytes.
read(byte[], int, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Read lots of specific bytes.
READ_CHUNK_LENGTH - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
The read chunk length
readCookies(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Read cookies currently in effect for a given session key.
readCookiesUncached(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Read cookies from database, uncached.
readDNSInfo(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
Read DNS data, if it exists.
readRobotsData(String, IVersionActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Read robots data, if it exists.
recordEverything - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
This flag determines whether we record everything to the disk, as a means of doing a web snapshot
records - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
 
redirectionURIPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindPreferredRedirectionHandler
 
refCount - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottleBin
This is the reference count for this bin (which records active references)
referralURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
The referral URI
regexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
URL regexp
REL_LINK - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
REL_REDIRECT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
releaseDocumentVersions(String[], String[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Free a set of documents.
remove() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
 
remove() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
 
removeAspSession - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
removeBVSession - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
removeJavaSession - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
removePhpSession - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
reorder - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
reservedHeaders - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
reset() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Reset.
resolve(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
responseCode - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
The response code
responseCode - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.DataSession
 
RESULT_NO_DOCUMENT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
RESULT_NO_VERSION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
RESULT_RETRY_DOCUMENT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
RESULT_VERSION_NEEDED - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
resultLogFile - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
 
RESULTSTATUS_FALSE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
RESULTSTATUS_NOTYETDETERMINED - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
RESULTSTATUS_TRUE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
returnValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
 
returnValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
 
returnValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
 
ROBOTS_ALL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
ROBOTS_DATA - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
ROBOTS_NONE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
robotsCacheClass - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
 
robotsField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
 
RobotsManager - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class manages the database table into which we write robots.txt files for hosts.
RobotsManager(IThreadContext, IDBInterface) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Constructor.
robotsManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
The robots manager currently used by this instance
RobotsManager.HostDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is the object description for a robots host object.
RobotsManager.HostDescription(String, StringSet) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
 
RobotsManager.HostExecutor - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is the executor object for locating robots host objects.
RobotsManager.HostExecutor(RobotsManager, IVersionActivity, RobotsManager.HostDescription) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
Constructor.
RobotsManager.Record - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class represents a record in a robots.txt file.
RobotsManager.Record() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
Constructor.
RobotsManager.RobotsCacheClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Cache class for robots.
RobotsManager.RobotsCacheClass() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsCacheClass
 
RobotsManager.RobotsData - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is a cached data item.
RobotsManager.RobotsData(InputStream, long, String, IVersionActivity) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
Constructor.
robotsUsage - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Robots usage flag
rules - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicies
 
run() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.SocketCreateThread
 
run() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection.ExecuteMethodThread
 
rval - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.SocketCreateThread
 
rval - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection.ExecuteMethodThread
 

S

sanityCheck() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
 
ScriptParseState - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class interprets the tag stream generated by the BasicParseState class, and causes script sections to be skipped
ScriptParseState() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
 
scriptParseState - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
 
SCRIPTPARSESTATE_INSCRIPT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
 
SCRIPTPARSESTATE_NORMAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
 
secureField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
secureSocketFactory - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Protocol socket factory
seedHosts - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
The hash map of seed hosts, to limit URLs by, if non-null
selectMultiple - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
selectName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
SequenceCredentials - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes immutable classes that represent authentication information for sequence-based authentication.
sequenceKey - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
 
seriesStartTime - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottleBin
The start time of this series
server - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Server
sessionKey - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
 
sessionPageIterator - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
 
sessionPages - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
 
sessionPages - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
 
SESSIONSTATE_LOGIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
We're in 'login mode'
SESSIONSTATE_NORMAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Normal fetch of content document.
setCredential(AuthenticationCredentials) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
Set Credentials
setEnabled(boolean) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
 
setLastFetchTime(long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
Note a new time for connection fetch for this pool.
setMaxOpenConnections(Integer) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
Set maximum open connections.
setMinimumMillisecondsPerByte(Double) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
Set minimum milliseconds per byte.
setMinimumMillisecondsPerFetch(Long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
Set minimum milliseconds per fetch
setResponseCode(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.DataSession
 
setup(ThrottleDescription) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Set up the connection.
setValue(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
 
shouldIndex() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Decide whether we should index.
shouldIndex() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityRedirectionHandler
 
shouldIndex() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityXMLHandler
 
skip(long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Skip
socketFactory - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.SocketCreateThread
 
socketFactory - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WebSecureSocketFactory
This is the javax.net socket factory.
socketTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Socket timeout, milliseconds
startFetchTime - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The start of the current fetch
statusCode - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The status code fetched, if any
stringToArray(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Read a string as a sequence of individual expressions, URLs, etc.
stringToBoolean(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Convert a boolean string to a boolean.
stringToPorts(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Convert a string to a port array.
submitMethod - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
The form's submit method
SUBMITMETHOD_GET - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
 
SUBMITMETHOD_POST - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
 
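
ThrottleDescription.ThrottleItem exposes a minimum-milliseconds-per-byte setting, and ThrottledFetcher.ThrottleBin keeps an inverse rate estimate in ms/byte. As a rough illustration of what a limit expressed in ms/byte implies, the sketch below computes how long a reader would have to pause so that its average rate stays at or below the limit. The class ThrottleWaitSketch and the method waitMillis are hypothetical names; this is not the logic ThrottledFetcher actually uses.

```java
public class ThrottleWaitSketch {
    /**
     * Returns how many milliseconds to wait so that reading bytesRead bytes
     * takes at least bytesRead * minMsPerByte milliseconds in total.
     * Hypothetical helper for illustration only.
     */
    public static long waitMillis(long bytesRead, long elapsedMs, double minMsPerByte) {
        // Earliest elapsed time at which this many bytes is allowed under the limit.
        long earliestAllowedMs = (long) Math.ceil(bytesRead * minMsPerByte);
        // Already slow enough: no wait needed.
        return Math.max(0L, earliestAllowedMs - elapsedMs);
    }

    public static void main(String[] args) {
        // 1000 bytes in 200 ms with a 0.5 ms/byte minimum.
        System.out.println(waitMillis(1000, 200, 0.5));
    }
}
```

Expressing the limit as ms/byte rather than bytes/second keeps the computation a single multiplication and lets a value of zero mean "no throttling".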

T

tagCleanup() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
 
takeFromPool(ThrottledFetcher.ThrottledConnection) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
Activate a connection that should be in the pool.
targetURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHandler
 
theURL - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
thisDescription - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
 
thisHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
 
thisHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
 
thisManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
 
thisManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
 
thisManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
 
throttleBinArray - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The connection has resolved pointers to the ThrottleBin structures that help manage bandwidth throttling.
throttleBins - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
This is the static pool of ThrottleBin objects, keyed by bin name.
throttledConnection - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
The throttled connection we belong to
ThrottleDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class describes complex throttling criteria pulled from a configuration.
ThrottleDescription(ConfigParams) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
Constructor.
throttleDescription - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
The throttle description
ThrottleDescription.ThrottleItem - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class representing an individual throttle item.
ThrottleDescription.ThrottleItem(Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
Constructor.
ThrottledFetcher - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class uses httpclient to fetch content from web servers.
ThrottledFetcher() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
Constructor.
ThrottledFetcher.ConnectionBin - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Connection pool for a bin.
ThrottledFetcher.ConnectionBin(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionBin
Constructor.
ThrottledFetcher.DataRecorder - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class takes care of recording data and results for posterity
ThrottledFetcher.DataRecorder() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.DataRecorder
 
ThrottledFetcher.DataSession - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Helper class for ThrottledFetcher.DataRecorder
ThrottledFetcher.DataSession(ThrottledFetcher.DataRecorder, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.DataSession
 
ThrottledFetcher.PoolException - Exception in org.apache.manifoldcf.crawler.connectors.webcrawler
Pool exception class
ThrottledFetcher.PoolException(String) - Constructor for exception org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.PoolException
 
ThrottledFetcher.SocketCreateThread - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Create a secure socket in a thread, so that we can "give up" after a while if the socket fails to connect.
ThrottledFetcher.SocketCreateThread(SSLSocketFactory, String, int, InetAddress, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.SocketCreateThread
Create the thread
ThrottledFetcher.ThrottleBin - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Throttles for a bin.
ThrottledFetcher.ThrottleBin(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottleBin
Constructor.
ThrottledFetcher.ThrottledConnection - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Throttled connections.
ThrottledFetcher.ThrottledConnection(String, String, int, PageCredentials, ProtocolFactory, String, ThrottledFetcher.ConnectionBin[]) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Constructor.
ThrottledFetcher.ThrottledConnection.ExecuteMethodThread - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
ThrottledFetcher.ThrottledConnection.ExecuteMethodThread(HttpClient, HostConfiguration, HttpMethodBase) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection.ExecuteMethodThread
 
ThrottledFetcher.ThrottledInputstream - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class throttles an input stream based on the specified byte rate parameters.
ThrottledFetcher.ThrottledInputstream(ThrottledFetcher.ThrottledConnection, InputStream, ThrottledFetcher.DataSession) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Constructor.
ThrottledFetcher.WaitException - Exception in org.apache.manifoldcf.crawler.connectors.webcrawler
Wait exception class
ThrottledFetcher.WaitException(long) - Constructor for exception org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WaitException
 
ThrottledFetcher.WebSecureSocketFactory - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
HTTPClient secure socket factory, which implements SecureProtocolSocketFactory
ThrottledFetcher.WebSecureSocketFactory(SSLSocketFactory) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WebSecureSocketFactory
Constructor
throwable - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.SocketCreateThread
 
throwable - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The error trace, if any
TIME_15MIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
 
TIME_1DAY - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
 
TIME_2HRS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
 
TIME_5MIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
 
TIME_6HRS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
 
toASCIIString() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
toString() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
totalBytesRead - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottleBin
Total actual bytes read in this series; this includes fetches in progress
TrustsDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class describes trust information pulled from a configuration.
TrustsDescription(ConfigParams) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription
Constructor.
trustsDescription - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
The trusts description
TrustsDescription.TrustsItem - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class representing an individual credential item.
TrustsDescription.TrustsItem(Pattern, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription.TrustsItem
Constructor.
trustStore - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Trust store
trustStore - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription.TrustsItem
The credential, or null if this is a "trust everything" item
trustStoreString - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Trust store string
ttlValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
ttl value
ttlValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
ttl value
ttlValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
TTL value is set on a per-channel basis
ttlValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
ttl value
type - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
 
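The ThrottledFetcher.ThrottledInputstream entry above describes throttling an input stream based on byte rate parameters. As a rough illustration of that general idea only, the sketch below wraps a stream and sleeps after each read so that the average rate stays at or below a limit. The class name RateLimitedInputStream, the bytesPerSecond parameter, and the minElapsedMillis helper are all hypothetical; this is not the ManifoldCF implementation, which is tied to ThrottledConnection and DataSession.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;

/** Hypothetical sketch of byte-rate throttling; not the ManifoldCF class. */
public class RateLimitedInputStream extends FilterInputStream {
    private final long bytesPerSecond;           // assumed single rate parameter
    private final long startNanos = System.nanoTime();
    private long totalBytes = 0L;                // bytes delivered so far

    public RateLimitedInputStream(InputStream in, long bytesPerSecond) {
        super(in);
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("bytesPerSecond must be positive");
        }
        this.bytesPerSecond = bytesPerSecond;
    }

    /** Minimum time that must have elapsed once totalBytes have been read. */
    public static long minElapsedMillis(long totalBytes, long bytesPerSecond) {
        return (totalBytes * 1000L) / bytesPerSecond;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            pace(1);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            pace(n);
        }
        return n;
    }

    /** Sleep long enough that the average rate does not exceed the limit. */
    private void pace(int justRead) throws IOException {
        totalBytes += justRead;
        long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000L;
        long waitMillis = minElapsedMillis(totalBytes, bytesPerSecond) - elapsedMillis;
        if (waitMillis > 0) {
            try {
                Thread.sleep(waitMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new InterruptedIOException("interrupted while throttling");
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[2048];
        long t0 = System.nanoTime();
        try (InputStream in = new RateLimitedInputStream(new ByteArrayInputStream(data), 4096)) {
            byte[] buf = new byte[512];
            while (in.read(buf, 0, buf.length) != -1) {
                // drain the stream; each read is paced by the wrapper
            }
        }
        System.out.println("elapsed ms: " + (System.nanoTime() - t0) / 1_000_000L);
    }
}
```

The pacing here is applied after each read rather than before it, which keeps the sketch short; a production throttle would more likely reserve bandwidth before fetching and share the budget across connections in the same bin.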

U

understoodProtocols - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
updateCookies(String, LoginCookies) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Update cookies that are in effect for a given session key.
url - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.DataSession
 
userAgent - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
The user-agent for this connector instance
userAgents - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
 
userName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
 
userName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
 

V

value - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
Value
value - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
 
value - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
 
valueField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
versionField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
versionSpecifiedField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
versionString - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
The version string
viewConfiguration(IThreadContext, IHTTPOutput, Locale, ConfigParams) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
View configuration.
viewSpecification(IHTTPOutput, Locale, DocumentSpecification) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
View specification.

W

WebcrawlerConfig - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Constants for the Webcrawler connector configuration.
WebcrawlerConfig() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
 
WebcrawlerConnector - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is the Web Crawler implementation of the IRepositoryConnector interface.
WebcrawlerConnector() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Constructor.
WebcrawlerConnector.CanonicalizationPolicies - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class representing a list of canonicalization rules
WebcrawlerConnector.CanonicalizationPolicies() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicies
 
WebcrawlerConnector.CanonicalizationPolicy - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class representing a URL regular expression match, for the purposes of determining canonicalization policy
WebcrawlerConnector.CanonicalizationPolicy(Pattern, boolean, boolean, boolean, boolean, boolean) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
WebcrawlerConnector.DocumentURLFilter - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class describes the url filtering information (for crawling and indexing) obtained from a digested DocumentSpecification.
WebcrawlerConnector.DocumentURLFilter(DocumentSpecification) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
Process a document specification to produce a filter.
WebcrawlerConnector.FeedContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
WebcrawlerConnector.FeedContextClass(XMLStream, String, String, String, Attributes, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
 
WebcrawlerConnector.FeedItemContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
WebcrawlerConnector.FeedItemContextClass(XMLStream, String, String, String, Attributes) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedItemContextClass
 
WebcrawlerConnector.FindHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class is used to discover links in a session login context
WebcrawlerConnector.FindHandler(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHandler
 
WebcrawlerConnector.FindHTMLFormHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class is the handler for HTML form parsing during state transitions
WebcrawlerConnector.FindHTMLFormHandler(String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLFormHandler
 
WebcrawlerConnector.FindHTMLHrefHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class is the handler for HTML parsing during state transitions
WebcrawlerConnector.FindHTMLHrefHandler(String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindHTMLHrefHandler
 
WebcrawlerConnector.FindPreferredRedirectionHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class is the handler for redirection handling during state transitions
WebcrawlerConnector.FindPreferredRedirectionHandler(String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindPreferredRedirectionHandler
 
WebcrawlerConnector.FindRedirectionHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class is the handler for redirection parsing during state transitions
WebcrawlerConnector.FindRedirectionHandler(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FindRedirectionHandler
 
WebcrawlerConnector.NameValue - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Name/value class
WebcrawlerConnector.NameValue(String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
 
WebcrawlerConnector.OuterContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class handles the outermost XML context for the feed document.
WebcrawlerConnector.OuterContextClass(XMLStream, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
 
WebcrawlerConnector.ProcessActivityHTMLHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class that describes HTML handling
WebcrawlerConnector.ProcessActivityHTMLHandler(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Constructor.
WebcrawlerConnector.ProcessActivityLinkHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class is the handler for links that get added into an IProcessActivity object.
WebcrawlerConnector.ProcessActivityLinkHandler(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter, String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
Constructor.
WebcrawlerConnector.ProcessActivityRedirectionHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class that describes redirection handling
WebcrawlerConnector.ProcessActivityRedirectionHandler(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityRedirectionHandler
Constructor.
WebcrawlerConnector.ProcessActivityXMLHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class that describes XML handling
WebcrawlerConnector.ProcessActivityXMLHandler(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityXMLHandler
Constructor.
WebcrawlerConnector.RDFContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
WebcrawlerConnector.RDFContextClass(XMLStream, String, String, String, Attributes, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
 
WebcrawlerConnector.RDFItemContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
WebcrawlerConnector.RDFItemContextClass(XMLStream, String, String, String, Attributes) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFItemContextClass
 
WebcrawlerConnector.RSSChannelContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
WebcrawlerConnector.RSSChannelContextClass(XMLStream, String, String, String, Attributes, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
 
WebcrawlerConnector.RSSContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
WebcrawlerConnector.RSSContextClass(XMLStream, String, String, String, Attributes, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSContextClass
 
WebcrawlerConnector.RSSItemContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
WebcrawlerConnector.RSSItemContextClass(XMLStream, String, String, String, Attributes) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
 
WebcrawlerConnector.UrlsetContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
WebcrawlerConnector.UrlsetContextClass(XMLStream, String, String, String, Attributes, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
 
WebcrawlerConnector.UrlsetItemContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
WebcrawlerConnector.UrlsetItemContextClass(XMLStream, String, String, String, Attributes) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
 
WebURL - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Replacement class for java.net.URI, which is broken in many ways.
WebURL(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
WebURL(String, String, int, String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
WebURL(URI) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
WebURL(URI, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
write(byte[], int, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.DataSession
 
writeDNSData(String, String, String, long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
Write DNS data, replacing any existing row.
writeResponseRecord(String, int, ArrayList, ArrayList) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.DataRecorder
Atomically write a result log record, returning the data file name to use.
writeRobotsData(String, long, InputStream) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Write robots.txt, replacing any existing row.

_

_rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.AuthenticationCredentials
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
 
_rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
 
_rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataElement
 
_rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
 
_rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginCookies
 
_rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
 
_rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.PageCredentials
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
 
_rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.SequenceCredentials
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
