
A

abort() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
abortCheck - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Abort checker
abortCheck() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.AbortChecker
 
AbortChecker - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class furnishes an abort signal whenever the job activity says it should.
AbortChecker(IAbortActivity) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.AbortChecker
 
abortThread - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
acceptNewTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
 
actionURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
The form's action URI
activities - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.AbortChecker
 
activities - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
 
activities - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
 
ACTIVITY_FETCH - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
ACTIVITY_LOGON_END - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
ACTIVITY_LOGON_START - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
ACTIVITY_PROCESS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
ACTIVITY_ROBOTSPARSE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
add(WebcrawlerConnector.MappingRule) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRules
 
addAgent(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
Add a user-agent.
addAllow(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
Add an allow.
addAuthPage(String, Pattern, String, String, Pattern, String, Pattern, String, Pattern, String, Pattern) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
Add an auth page
addCookie(Cookie) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
 
addCookie(Cookie) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
Adds an HTTP cookie, replacing any existing equivalent cookies.
addCookies(Cookie[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
Adds an array of HTTP cookies.
addData(IProcessActivity, String, IThrottledConnection) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
Add a data entry into the cache.
addDisallow(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
Add a disallow.
addElement(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
 
addPageParameter(int, String, String, Pattern, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
Add a page parameter
addParameter(String, Pattern, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Add parameter
addRule(WebcrawlerConnector.CanonicalizationPolicy) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicies
 
addSeedDocuments(ISeedingActivity, Specification, String, long, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Queue "seed" documents.
advance() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
Go on to next token.
allows - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
 
amt - Variable in exception org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WaitException
 
applyFormOverrides(LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
 
applyOverrides(LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
Apply overrides
applyOverrides(LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
Apply overrides
applyOverrides(LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindPreferredRedirectionHandler
Apply overrides
applyOverrides(LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
 
ATTR_ASPSESSIONREMOVAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
aspsessionremoval attribute
ATTR_BINREGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
The bin regular expression
ATTR_BVSESSIONREMOVAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
bvsessionremoval attribute
ATTR_DESCRIPTION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
description attribute
ATTR_DOMAIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Domain/realm part of credentials (if any)
ATTR_INSENSITIVE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Whether the match is case insensitive
ATTR_JAVASESSIONREMOVAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
javasessionremoval attribute
ATTR_LOWERCASE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
map to lower case
ATTR_MAP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Map attribute
ATTR_MATCH - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Match attribute
ATTR_MATCHREGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Form name or link target regexp for authentication page
ATTR_NAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
name attribute
ATTR_NAMEREGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Authentication parameter name regexp
ATTR_OVERRIDETARGETURL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
URL to fetch next in a sequence (an override)
ATTR_PASSWORD - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Password part of credentials
ATTR_PHPSESSIONREMOVAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
phpsessionremoval attribute
ATTR_REGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
regexp attribute
ATTR_REORDER - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
reorder attribute
ATTR_TOKEN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
token attribute
ATTR_TRUSTEVERYTHING - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
"Trust everything" attribute - replacing truststore if set to 'true'
ATTR_TRUSTSTORE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Trust store section of authentication record
ATTR_TYPE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Type of security
ATTR_URLREGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Regexp for access control node
ATTR_USERNAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Username part of credentials
ATTR_VALUE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
The value attribute (used for maxconnections and maxkbpersecond)
ATTRVALUE_BASIC - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Type value for basic authentication
ATTRVALUE_CONTENT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Authentication page type: Content
ATTRVALUE_FALSE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Value false
ATTRVALUE_FORM - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Authentication page type: Form
ATTRVALUE_LINK - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Authentication page type: Link
ATTRVALUE_NO - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Value no
ATTRVALUE_NTLM - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Type value for NTLM authentication
ATTRVALUE_REDIRECTION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Authentication page type: Redirection
ATTRVALUE_SESSION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Type value for session-based authentication
ATTRVALUE_TRUE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Value true
ATTRVALUE_YES - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Value yes
authentication - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
The credential
authentication - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
 
authentication - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
 
authentication - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Authentication
AuthenticationCredentials - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes immutable classes which represent authentication information for all kinds of authentication.
available() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Get available.
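
The AbortChecker entry above describes a class that furnishes an abort signal whenever the job activity says it should. The following is a minimal sketch of that pattern only, not the ManifoldCF implementation: the `JobActivity` interface, its `isJobStillActive()` method, the unchecked `AbortedException`, and `processChunks` are all hypothetical names introduced for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class AbortCheckSketch {
  /** Hypothetical stand-in for the job activity; not the real ManifoldCF interface. */
  interface JobActivity {
    boolean isJobStillActive();
  }

  /** Hypothetical signal that the surrounding work should stop. */
  static final class AbortedException extends RuntimeException {
    AbortedException(String message) {
      super(message);
    }
  }

  /** Wraps the activity and turns "job no longer active" into an exception. */
  static final class Checker {
    private final JobActivity activity;

    Checker(JobActivity activity) {
      this.activity = activity;
    }

    void abortCheck() {
      if (!activity.isJobStillActive()) {
        throw new AbortedException("job is no longer active");
      }
    }
  }

  /** Polls the checker before each unit of work; returns the number of units completed. */
  static int processChunks(int totalChunks, Checker checker) {
    int done = 0;
    for (int i = 0; i < totalChunks; i++) {
      checker.abortCheck();
      done++;
    }
    return done;
  }

  public static void main(String[] args) {
    // Simulate a job that is reported inactive after three polls.
    AtomicInteger polls = new AtomicInteger();
    Checker checker = new Checker(() -> polls.incrementAndGet() <= 3);
    try {
      processChunks(10, checker);
    } catch (AbortedException e) {
      System.out.println("Stopped early: " + e.getMessage());
    }
  }
}
```

Polling before each unit of work keeps the abort latency bounded by the cost of one unit, which is the usual reason for threading a checker object through long-running fetch loops.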

B

baseDocumentIdentifier - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
 
baseFactory - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
 
BasicCredential(String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
Constructor
basicRead(byte[], int, int, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Basic read, which uses the server object to throttle activity.
beginFetch(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Begin the fetch process.
beginFetch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Begin the fetch process.
beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
 
beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedItemContextClass
 
beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
Handle the tag beginning to set the correct second-level parsing context
beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
 
beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFItemContextClass
 
beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
 
beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSContextClass
 
beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
 
beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
 
beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
 
bodyStream - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
booleanToString(boolean) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Convert a boolean to a boolean string.

C

cache - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
This is where we keep data around between the getVersions() phase and the processDocuments() phase.
cacheData - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
 
cacheKeys - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
 
cacheKeys - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
 
cacheKeys - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
 
calculateDocumentEvents(INamingActivity, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Calculate events that should be associated with a document.
canLowercase() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
canonicalizationPolicies - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
Canonicalization policies
CanonicalizationPolicies() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicies
 
CanonicalizationPolicy(Pattern, boolean, boolean, boolean, boolean, boolean, boolean) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
canRemoveAspSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
canRemoveBvSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
canRemoveJavaSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
canRemovePhpSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
canReorder() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
check() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Check status of connection.
checkException(Throwable) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
checkFetchAllowed(String, String, long, String, IProcessActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Read robots.txt data from the cache or from the database.
checkFetchAllowed(String, String, String, int, PageCredentials, IKeystoreManager, String, String[], long, String, IProcessActivity, int, String, int, String, String, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Check robots to see if fetch is allowed.
checkIfValidFeed() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
Check if feed was valid
checkMatch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
checkMatch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRule
 
checkSum - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
 
clear() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
Clears all cookies.
clearExpired(Date) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
Removes all cookies in this HTTP state that have expired by the specified date.
clearThreadContext() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Clear out any state information specific to a given thread.
close() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Close the connection.
close() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Close the connection.
close() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Close.
commentField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
commentURLField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
compileList(List<Pattern>, List<String>) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Compile all regexp entries in the passed-in list, and add them to the output list.
ConnectionPool(IConnectionThrottler, String, String, int, PageCredentials, SSLSocketFactory, String, int, String, String, String, int, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
 
ConnectionPoolKey(String, String, int, PageCredentials, String, String, int, String, String, String, int, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
 
connectionPools - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
Connection pools.
connections - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
The actual pool of connections
connectionThrottler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
Throttler
connectionTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
 
connectionTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
 
connectionTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Connection timeout milliseconds
connectionTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Connection timeout, milliseconds.
connManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The http connection manager.
contentBuffer - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
 
contentPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
The content pattern, or null if no content is sought
contentPatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
 
contentRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
The content regexp
contentType - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
The content-type header value
contextDescription - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
 
contextException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
 
contextMessage - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
 
cookieException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
cookieList - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieSet
 
cookieManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
The cookie manager used by this instance
CookieManager - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class manages the database table into which we write cookies.
CookieManager(IThreadContext, IDBInterface) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Constructor.
CookieManager.CookiesCacheClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Cache class for cookies.
CookieManager.CookiesDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is the object description for a session key object.
CookieManager.CookiesExecutor - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is the executor object for locating cookie session objects.
CookieManager.DynamicCookieSet - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is a set of cookies, built dynamically.
cookies - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
 
cookies - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
cookiesCacheClass - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
CookiesCacheClass() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesCacheClass
 
CookiesDescription(String, StringSet) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
 
CookieSet - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class represents a bunch of cookies
CookieSet(List<Cookie>) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieSet
 
CookiesExecutor(CookieManager, CookieManager.CookiesDescription) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
Constructor.
cookieStore - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
create(HttpContext) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.LaxBrowserCompatSpecProvider
 
create(ICacheDescription[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
Create a set of new objects to operate on and cache.
create(ICacheDescription[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
Create a set of new objects to operate on and cache.
create(ICacheDescription[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
Create a set of new objects to operate on and cache.
credentialsDescription - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
The credentials description
CredentialsDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class describes credential information pulled from a configuration.
CredentialsDescription(ConfigParams) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
Constructor.
CredentialsDescription.BasicCredential - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Basic type credentials
CredentialsDescription.CredentialsItem - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class representing an individual credential item.
CredentialsDescription.LoginParameterIterator - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
LoginParameter iterator
CredentialsDescription.NTLMCredential - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
NTLM-style credentials
CredentialsDescription.SessionCredential - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Session credentials
CredentialsDescription.SessionCredentialItem - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Session credential helper class
CredentialsDescription.SessionCredentialParameter - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Session credential parameter class
CredentialsItem(Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
Constructor.
credentialsObject - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
 
criticalSectionName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
 
criticalSectionName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
 
criticalSectionName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
 
currentFormData - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
 
currentIndex - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
 
currentOne - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
 
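
The compileList(List<Pattern>, List<String>) entry above is described as compiling every regexp in the passed-in list and adding the results to the output list. A rough sketch of that idea with `java.util.regex` follows. It assumes the first parameter is the output list and the second the input, and it reports a bad expression as an `IllegalArgumentException`; both choices are assumptions, since the index does not show the actual error handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class CompileListSketch {
  /**
   * Compiles each regular expression in input and appends the Pattern to output.
   * Parameter order and exception type are assumptions made for this sketch.
   */
  static void compileList(List<Pattern> output, List<String> input) {
    for (String regexp : input) {
      try {
        output.add(Pattern.compile(regexp));
      } catch (PatternSyntaxException e) {
        throw new IllegalArgumentException(
            "Bad regular expression '" + regexp + "': " + e.getDescription(), e);
      }
    }
  }

  public static void main(String[] args) {
    List<Pattern> patterns = new ArrayList<>();
    compileList(patterns, Arrays.asList("^https?://example\\.com/", "\\.pdf$"));
    // Check a URL against every compiled pattern.
    String url = "http://example.com/docs/guide.pdf";
    for (Pattern p : patterns) {
      System.out.println(p.pattern() + " found: " + p.matcher(url).find());
    }
  }
}
```

Compiling once and reusing the `Pattern` objects avoids recompiling the same expression for every URL that is checked.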

D

data - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
The cache file for the data
DataCache - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class is a cache of a specific URL's data.
DataCache() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
Constructor.
DataCache.DocumentData - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class represents everything we need to know about a document that's getting passed from the getDocumentVersions() phase to the processDocuments() phase.
DEFAULT_BUNDLE_NAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
DEFAULT_PATH_NAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
deflate - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
deinstall() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Uninstall the manager.
deinstall() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
Uninstall the manager.
deinstall() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Uninstall the manager.
deinstall(IThreadContext) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Uninstall the connector.
deleteData(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
Delete specified item of data.
destroy() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Destroy the connection.
destroy() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Destroy the connection forever
disallows - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
 
discardField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
disconnect() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Close the connection.
discoveredFormData - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
 
dnsCacheClass - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
 
DNSCacheClass() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSCacheClass
 
DNSInfo(String, String, long, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
Constructor.
dnsManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
The DNS manager currently used by this instance
DNSManager - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class manages the database table into which we write DNS entries for hosts.
DNSManager(IThreadContext, IDBInterface) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
Constructor.
DNSManager.DNSCacheClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Cache class for DNS entries.
DNSManager.DNSInfo - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is a cached data item.
DNSManager.HostDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is the object description for a DNS host object.
DNSManager.HostExecutor - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is the executor object for locating DNS host objects.
doCanonicalization(WebcrawlerConnector.DocumentURLFilter, WebURL) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Code to canonicalize a URL.
DocumentData(File, int, String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
Constructor.
documentIdentifier - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
 
documentIdentifier - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
 
documentIdentifiertoFileName(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Convert a document identifier to a filename.
documentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
The document identifier
documentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
The document uri
documentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
The document identifier
documentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
The document identifier
documentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSContextClass
The document identifier
documentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
The document identifier
DocumentURLFilter(Specification) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
Process a document specification to produce a filter.
doesPathMatch(String, int, String, int) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Recursive method for matching specification to path.
doesPathMatch(String, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Check if path matches specification
domain - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
 
domainField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
domainSpecifiedField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
doneFetch(IProcessActivity) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Done with the fetch.
doneFetch(IProcessActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Done with the fetch.
DynamicCookieSet() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
 

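The two doesPathMatch entries above describe a recursive match of a robots specification against a URL path. The index does not show the implementation, so the following is only a minimal sketch of that kind of matcher. It assumes (this is not confirmed by the index) that a rule is a prefix match, that `*` matches any run of characters, and that a trailing `$` anchors the end of the path. The class and method names are hypothetical and are not part of RobotsManager.

```java
// Hypothetical sketch; not the RobotsManager implementation.
final class RobotsPathMatchSketch {

    /** Convenience entry point: match the whole path against the whole spec. */
    static boolean matches(String path, String spec) {
        return matches(path, 0, spec, 0);
    }

    /** Recursive matcher over the remaining path and the remaining spec. */
    static boolean matches(String path, int pathIndex, String spec, int specIndex) {
        // Spec exhausted: rules are treated as prefixes (assumption),
        // so whatever remains of the path is accepted.
        if (specIndex == spec.length()) {
            return true;
        }
        char c = spec.charAt(specIndex);
        if (c == '*') {
            // Let the wildcard absorb zero or more path characters.
            for (int i = pathIndex; i <= path.length(); i++) {
                if (matches(path, i, spec, specIndex + 1)) {
                    return true;
                }
            }
            return false;
        }
        if (c == '$' && specIndex == spec.length() - 1) {
            // Trailing '$' (assumption): the path must end here.
            return pathIndex == path.length();
        }
        // Literal character: must match exactly, then recurse on the rest.
        if (pathIndex == path.length() || path.charAt(pathIndex) != c) {
            return false;
        }
        return matches(path, pathIndex + 1, spec, specIndex + 1);
    }

    public static void main(String[] args) {
        System.out.println(matches("/private/data.html", "/private/"));
        System.out.println(matches("/a/b/c.pdf", "/*.pdf$"));
    }
}
```

The two-argument overload simply starts the recursion at index 0 of both strings, mirroring the pairing of a (String, String) entry with a (String, int, String, int) entry in the index.
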
E

ELEMENTCATEGORY_FIXEDEXCLUSIVE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
 
ELEMENTCATEGORY_FIXEDINCLUSIVE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
 
ELEMENTCATEGORY_FREEFORM - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
 
elementList - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
The set of elements
elementList - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
 
endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
 
endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
Handle the tag ending
endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
 
endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFItemContextClass
Convert the individual sub-fields of the item context into their final forms
endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
 
endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSContextClass
 
endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
Convert the individual sub-fields of the item context into their final forms
endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
 
endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
Convert the individual sub-fields of the item context into their final forms
equals(Object) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.AuthenticationCredentials
Compare against another object
equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
 
equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
Compare against another object
equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
Compare against another object
equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
Compare against another object
equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
 
equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
 
equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
 
equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
 
equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
 
evalExpression - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRule
 
EvaluatorToken() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
 
EvaluatorToken(int, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
 
EvaluatorToken(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
 
EvaluatorTokenStream(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
Constructor.
excludeContentIndexPatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
List of content exclusion patterns
excludeIndexPatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
The arraylist of index exclude patterns
excludePatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
The arraylist of exclude patterns
execute() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
Perform the desired operation.
execute() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
Perform the desired operation.
execute() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
Perform the desired operation.
executeFetch(String, String, String, boolean, String, FormData, LoginCookies) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Execute the fetch and get the return code.
executeFetch(String, String, String, boolean, String, FormData, LoginCookies) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Execute the fetch and get the return code.
executeMethod - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
ExecuteMethodThread(ThrottledFetcher.ThrottledConnection, IFetchThrottler, HttpClient, HttpHost, HttpRequestBase, CookieStore) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
exists(ICacheDescription, Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
Notify the implementing class of the existence of a cached version of the object.
exists(ICacheDescription, Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
Notify the implementing class of the existence of a cached version of the object.
exists(ICacheDescription, Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
Notify the implementing class of the existence of a cached version of the object.
expiration - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
 
expiration - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
 
expirationDateField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
expirationField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
 
expirationField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
 
expireTime - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
This is when the connection will expire.
extractContentType(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
extractEncoding(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
extractLinks(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Code to extract links from an already-fetched document.
extractMimeType(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 

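The index gives no description for extractContentType, extractEncoding, or extractMimeType. Their names suggest that they pull pieces out of a Content-Type header value, but that is an assumption. Under that assumption, a minimal sketch of splitting such a value into its media type and charset might look like the following. The class and method names are hypothetical, and the actual WebcrawlerConnector methods may behave differently.

```java
import java.util.Locale;

// Hypothetical sketch; not the WebcrawlerConnector implementation.
final class ContentTypeSketch {

    /** Returns the lower-cased media type (the part before the first ';'), or null. */
    static String mimeType(String contentType) {
        if (contentType == null) {
            return null;
        }
        int semi = contentType.indexOf(';');
        String mime = (semi == -1 ? contentType : contentType.substring(0, semi)).trim();
        return mime.isEmpty() ? null : mime.toLowerCase(Locale.ROOT);
    }

    /** Returns the value of the charset parameter, without surrounding quotes, or null. */
    static String charset(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq > 0 && param.substring(0, eq).trim().equalsIgnoreCase("charset")) {
                String value = param.substring(eq + 1).trim();
                // Strip optional double quotes around the value.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String header = "Text/HTML; charset=UTF-8";
        System.out.println(mimeType(header) + " / " + charset(header));
    }
}
```
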
F

FeedContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
 
FeedItemContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedItemContextClass
 
FETCH_BAD_URI - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
 
FETCH_CIRCULAR_REDIRECT - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
 
FETCH_INTERRUPTED - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
 
FETCH_IO_ERROR - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
 
FETCH_LOGIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
FETCH_NOT_TRIED - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
 
FETCH_ROBOTS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
FETCH_SEQUENCE_ERROR - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
 
FETCH_STANDARD - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
FETCH_UNKNOWN_ERROR - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
 
fetchCounter - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The current bytes in the current fetch
fetchMethod - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The method object
FetchStatus() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
 
fetchThrottler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
The fetch throttler
fetchThrottler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Fetch throttler
fetchType - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The kind of fetch we are doing
filter - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
 
FindContentHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class is the handler for HTML content grepping during state transitions
FindContentHandler(String, List<Pattern>) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
 
FindContentHandler(String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
 
findExcludedHeaders(Specification) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Read a document specification to get a set of excluded headers
FindHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class is used to discover links in a session login context
FindHandler(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
 
findHTMLForm(String, LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Find matching HTML form data, if present.
FindHTMLFormHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class is the handler for HTML form parsing during state transitions
FindHTMLFormHandler(String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
 
FindHTMLHrefHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class is the handler for HTML parsing during state transitions
FindHTMLHrefHandler(String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
 
findHTMLLinkURI(String, LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Find HTML link URI, if present, making sure specified preference is matched.
findLoginParameters(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
For a given login page, specific information may need to be submitted to the server to properly log in.
findLoginParameters(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.SequenceCredentials
For a given login page, specific information may need to be submitted to the server to properly log in.
findMatch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicies
 
findNextOne() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
Find next one
FindPreferredRedirectionHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class is the handler for redirection handling during state transitions
FindPreferredRedirectionHandler(String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindPreferredRedirectionHandler
 
findPreferredRedirectionURI(String, LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Find a preferred redirection URI, if it exists
FindRedirectionHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class is the handler for redirection parsing during state transitions
FindRedirectionHandler(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindRedirectionHandler
 
findRedirectionURI(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Find a redirection URI, if it exists
findSpecifiedContent(String, List<Pattern>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
 
findSpecifiedContent(String, LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Find existence of specific content on the page (never finds a URL)
finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
Finish up all processing.
finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
 
finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
 
finishUp() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
Done with the document.
finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.LinkParseState
 
finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
 
flushIdleConnections() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
 
flushIdleConnections(IThreadContext) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
Flush connections that have timed out from inactivity.
FormData - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes the form data gleaned from an HTML page.
FormDataAccumulator - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class accumulates form data and allows overrides
FormDataAccumulator(String, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
 
FormDataAccumulator.FormItemIterator - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Iterator over FormItems
FormDataElement - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes individual form data elements, for form submission.
FormItem - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class provides an individual data item
FormItem(String, String, int, boolean) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
 
FormItemIterator(ArrayList) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
 
formNamePattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
The form name pattern, or null if no form is expected
formNamePattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
 
formNameRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
The form name regexp
formParseState - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
FormParseState - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class interprets the tag stream generated by the BasicParseState class, and keeps track of the form tags.
FormParseState(IHTMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
FORMPARSESTATE_IN_FORM - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
FORMPARSESTATE_IN_OPTION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
FORMPARSESTATE_IN_SELECT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
FORMPARSESTATE_IN_TEXTAREA - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
FORMPARSESTATE_NORMAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
fqdn - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
 
fqdnField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
 
from - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
The email address for this connector instance

G

generalException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
getAcls(Specification) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Grab forced acl out of document specification.
getActionURI() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
Get the full action URI for this form.
getActionURI() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
Get the full action URI for this form.
getActivitiesList() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Return the list of activities that this connector supports.
getAttributeJavascriptString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getAttributeJavascriptString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getAttributeJavascriptString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getAttributeString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getAttributeString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getAttributeString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getBinNames(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Get the bin name string for a document identifier.
getBodyJavascriptString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getBodyJavascriptString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getBodyJavascriptString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getBodyString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getBodyString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getBodyString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getCanonicalizationPolicies() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
Get canonicalization policies
getClassName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesCacheClass
Get the name of the object class.
getClassName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSCacheClass
Get the name of the object class.
getClassName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsCacheClass
Get the name of the object class.
getConnection(IThreadContext, String, String, String, int, PageCredentials, IKeystoreManager, IThrottleSpec, String[], int, String, int, String, String, String, int, int, IAbortActivity) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
Obtain a connection to specified protocol, server, and port.
getConnectorModel() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Tell the world what model this connector uses for getDocumentIdentifiers().
getContentPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Get the content pattern.
getContentPattern() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
Get the content pattern.
getContentType() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
Get the contentType
getContentType(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
Get the content type.
getCookie(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
 
getCookie(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieSet
 
getCookie(int) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginCookies
Get the cookie name
getCookieCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
 
getCookieCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieSet
 
getCookieCount() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginCookies
Get the cookie count
getCookies() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
getCookies() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
Returns an immutable array of cookies that this HTTP state currently contains.
getCookiesCacheKey(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Construct a global key which represents an individual session.
getCredential() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
Get credential type
getCriticalSectionName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
 
getCriticalSectionName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
 
getCriticalSectionName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
 
getData() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
Get the data
getData(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
Fetch binary data entry from the cache.
getDataLength(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
Fetch binary data length.
getDNSKey(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
Construct a key which represents an individual host name.
getElementIterator() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
Iterate over the active form data elements.
getElementIterator() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
Iterate over the active form data elements.
getElementName() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataElement
Get the element name
getElementName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
Get the element name
getElementValue() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataElement
Get the element value
getElementValue() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
Get the element value
getEnabled() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
 
getExpirationTime() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
Get the expiration time.
getExpirationTime() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
Get expiration
getFinalURL(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
If the initial URL is permanently or temporarily redirected (code 301 or 302), the method returns the destination URL.
getFirstHeader(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
getFormData() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
 
getFormNamePattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Get the form name pattern.
getFormNamePattern() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
Get the form name pattern.
getFQDN() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
Get the fqdn
getGroupNumber() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
 
getGroupStyle() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
 
getHost() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
getHostName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
Get the host name
getHostName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
 
getHostName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
 
getIPAddress() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
Get the ipaddress
getLastFetchCookies() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Get the last fetch cookies.
getLastFetchCookies() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Get the last fetch cookies.
getLimitedResponseBody(int, String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Get limited response as a string.
getLimitedResponseBody(int, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Get limited response as a string.
getMaxDocumentRequest() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Get the maximum number of documents to amalgamate together into one batch, for this connector.
getMaxLRUCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesCacheClass
Get the maximum LRU count of the object class.
getMaxLRUCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSCacheClass
Get the maximum LRU count of the object class.
getMaxLRUCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsCacheClass
Get the maximum LRU count of the object class.
getMaxOpenConnections() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
Get maximum open connections.
getMaxOpenConnections(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
Given a bin name, find the max open connections to use for that bin.
getMinimumMillisecondsPerByte() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
Get minimum milliseconds per byte.
getMinimumMillisecondsPerByte(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
Look up minimum milliseconds per byte for a bin.
getMinimumMillisecondsPerFetch() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
Get minimum milliseconds per fetch
getMinimumMillisecondsPerFetch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
Look up minimum milliseconds for a fetch for a bin.
getName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
 
getNamePattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
 
getObjectClass() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
Get the object class for an object.
getObjectClass() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
Get the object class for an object.
getObjectClass() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
Get the object class for an object.
getObjectKeys() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
Get the cache keys for an object (which may or may not exist yet in the cache).
getObjectKeys() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
Get the cache keys for an object (which may or may not exist yet in the cache).
getObjectKeys() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
Get the cache keys for an object (which may or may not exist yet in the cache).
getOverrideTargetURL() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Get the override target URL.
getOverrideTargetURL() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
Get the override target URL.
getPageCredential(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
Given a URL, find the right PageCredentials object to use.
getPageCredential(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Get the page credentials for a given document identifier (URL)
getParameter(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Get the actual parameter
getParameterCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Get the parameter count
getParameterCount() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
Get the number of parameters.
getParameterNamePattern(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Get the name of the i'th parameter.
getParameterNamePattern(int) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
Get the name of the i'th parameter.
getParameterValue(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Get the desired value of the i'th parameter.
getParameterValue(int) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
Get the desired value of the i'th parameter.
getPath() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
getPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
Get the pattern.
getPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Get the pattern
getPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
Get the pattern.
getPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription.TrustsItem
Get the pattern.
getPort() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
getPreferredLinkPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Get the preferred link pattern.
getPreferredLinkPattern() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
Get the preferred link pattern.
getPreferredRedirectionPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Get the preferred redirection pattern.
getPreferredRedirectionPattern() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
Get the preferred redirection pattern.
getRawQuery() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
getReferralURI() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
Get the referral URI
getReferralURI(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
Get the referral URI.
getRelationshipTypes() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Return the list of relationship types that this connector recognizes.
getResponseBodyStream() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Get the response input stream.
getResponseBodyStream() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Get the response input stream.
getResponseCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
Get the response code
getResponseCode() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Get the HTTP response code.
getResponseCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
getResponseCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Get the HTTP response code.
getResponseCode(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
Get the response code.
getResponseHeader(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Get a specified response header, if it exists.
getResponseHeader(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Get a specified response header, if it exists.
getResponseHeaders() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Get response headers
getResponseHeaders() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
getResponseHeaders() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Get response headers
getResults() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
Get the result.
getResults() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
Get the result.
getResults() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
Get the result.
getRobotsKey(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Construct a key which represents an individual host name.
getSafeInputStream() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
getScheme() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
getSequenceCredential(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
Given a URL, find the right SequenceCredentials object to use.
getSequenceCredential(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Get the sequence credentials for a given document identifier (URL)
getSequenceKey() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
Fetch the unique key value for this particular credential.
getSequenceKey() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.SequenceCredentials
Fetch the unique key value for this particular credential.
getSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Start a session
getSessionKey() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
 
getString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
getSubmitMethod() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
Get the submit method for this form.
getSubmitMethod() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
Get the submit method for this form.
getTargetURI() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
 
getTextValue() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
 
getTrustStore() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription.TrustsItem
Get keystore
getTrustStore(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription
Given a URL, build the right trust certificate store, or return null if all certs should be accepted.
getTrustStore(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Get the trust store for a given document identifier (URL)
getType() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
 
getType() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
 
getValue() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
 
getValue() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
 
getVersionString() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
Get whatever contribution to the version string should come from this data.
getWaitAmount() - Method in exception org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WaitException
 
grab(IAbortActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
 
groupNumber - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
 
groupStyle - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
 
GROUPSTYLE_LOWER - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
 
GROUPSTYLE_MIXED - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
 
GROUPSTYLE_NONE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
 
GROUPSTYLE_UPPER - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
 
guidField - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
 
gzip - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 

H

handleHTML(String, IHTMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Handle document references from HTML
handleHTTPException(HttpException, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
 
handleIOException(IOException, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
 
handleIOException(IOException, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.LinkParseState
 
handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.MetaParseState
 
handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
XML handler
handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
The link handler
handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
XML handler
handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
Link handler
handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSContextClass
Link notification interface
handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
XML handler
handleRedirects(String, IRedirectionHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Handle extracting the redirect link from a redirect response.
handleXML(String, IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Handle document references from XML.
hasExpired(long) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Check whether the connection has expired.
hasExpired(long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Check whether the connection has expired.
hashCode() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.AuthenticationCredentials
Calculate a hash function
hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
 
hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
Calculate a hash function
hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
Calculate a hash function
hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
Calculate a hash function
hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
 
hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
 
hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
 
hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
 
hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
 
hasNext() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
Check for next
hasNext() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
 
headerData - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
 
HostDescription(String, StringSet) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
 
HostDescription(String, StringSet) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
 
HostExecutor(DNSManager, DNSManager.HostDescription) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
Constructor.
HostExecutor(RobotsManager, IProcessActivity, RobotsManager.HostDescription) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
Constructor.
hostField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
 
hostField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
 
hostName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
 
hostName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
 
hostName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
 
httpClient - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
Client and method, all preconfigured
httpClient - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The HTTP client object.
httpsSocketFactory - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Https protocol

I

IDiscoveredLinkHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes the functionality needed by a link extractor to note a discovered link.
idleTimeout - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
Idle timeout
IHTMLHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes the functionality needed by an HTML processor in order to handle an HTML document.
IMetaTagHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes the functionality needed by a parser to handle metadata tags.
includeIndexPatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
The arraylist of index include patterns
includePatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
The arraylist of include patterns
inputStream - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
The stream we are wrapping.
install() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Install the manager.
install() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
Install the manager.
install() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Install the manager.
install(IThreadContext) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Install the connector.
interestingMimeTypeArray - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
This represents a list of the mime types that this connector knows how to extract links from.
interestingMimeTypeMap - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
ipaddress - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
 
ipaddressField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
 
IRedirectionHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes the functionality needed by a redirection processor in order to handle a redirection.
isAgentMatch(String, boolean) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
See if user-agent matches.
isAllowed(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
See if path is allowed.
isContentInteresting(IFingerprintActivity, String, int, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Code to check if data is interesting, based on response code and content type.
isDeflateStream() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
isDisallowed(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
See if path is disallowed.
isDocumentAndHostLegal(String, IHistoryActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
Check if both a document and host are legal.
isDocumentContentIndexable(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
 
isDocumentIndexable(String, IProcessActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
Check if the document identifier is indexable, and return the indexing URL if found.
isDocumentLegal(String, IHistoryActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
Check if the document identifier is legal.
isDocumentText(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Is the document text, as far as we can tell?
isEnabled - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
 
isFetchAllowed(String, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
Check if fetch is allowed
isGZipStream() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
isHostLegal(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
Check if a host is legal.
isInitialized - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
This flag is set when the instance has been initialized
isMatch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRules
 
isStrange(byte) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Check if character is not typical ASCII or UTF-8.
isText(byte[], int) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Test to see if a document is text or not.
isWhiteSpace(byte) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Check if a byte is a whitespace character.
IThrottledConnection - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface represents an established connection to a URL.
IXMLHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes the functionality needed by an XML processor in order to handle an XML document.

K

keyField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 

L

lastFetchCookies - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The cookies from the last fetch
LaxBrowserCompatSpecProvider() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.LaxBrowserCompatSpecProvider
 
linkField - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedItemContextClass
 
linkField - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFItemContextClass
 
linkField - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
 
linkField - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
 
LinkParseState - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class recognizes and interprets all links
LinkParseState(IHTMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.LinkParseState
 
linkType - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
 
logFetchCount(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Log the fetch of a number of bytes, from within a stream.
loginAndFetch(WebcrawlerConnector.FetchStatus, IProcessActivity, String, SequenceCredentials, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
LoginCookies - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes cookies obtained during sequential authentication.
LoginParameterIterator(List<CredentialsDescription.SessionCredentialItem>, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
Constructor
LoginParameters - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes login parameters to be used to submit a page during sequential authentication.
lookup(String, long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
Given a host name, look up the IP address and FQDN.
lookupIPAddress(String, IProcessActivity, String, long, StringBuilder) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Look up an IP address given a non-canonical host name.
lowercasing - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 

M

makeCredentialsObject(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
Turn this instance into a Credentials object, given the specified target host name
makeCredentialsObject(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
Turn this instance into a Credentials object, given the specified target host name
makeCredentialsObject(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.PageCredentials
Turn this instance into a Credentials object, given the specified target host name
makeDNSEventName(INamingActivity, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Calculate the event name for DNS access.
makeDocumentIdentifier(String, String, WebcrawlerConnector.DocumentURLFilter, IHistoryActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Convert an absolute or relative URL to a document identifier.
makeReadable(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Convert a string from the robots file into a readable form that does NOT contain NUL characters (since PostgreSQL does not accept those).
makeRobotsEventName(INamingActivity, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Construct a name for the global web-connector robots event.
makeRobotsKey(String, String, int) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Construct the robots key for a host.
makeSessionLoginEventName(INamingActivity, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Calculate the event name for session login.
map(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRule
 
map(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRules
 
MappingRule(Pattern, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRule
 
MappingRules() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRules
 
mappings - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
Mapping rules
mappings - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRules
 
mark(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Mark.
markSupported() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Check if mark is supported.
matchPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
matchPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRule
 
MAX_LENGTH - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
 
maxOpenConnections - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
The maximum open connections, or null if no limit.
mcfException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.AbortChecker
 
Messages - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
Messages() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
Constructor - do not instantiate
META_ROBOTS_ALL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
META_ROBOTS_NONE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
MetaParseState - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class recognizes and interprets all meta tags
MetaParseState(IMetaTagHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.MetaParseState
 
metaRobotsTagsUsage - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Meta robots tag usage flag
methodThread - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The thread that is actually doing the work
minimumMillisecondsPerByte - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
The minimum milliseconds between bytes, or null if no limit.
minimumMillisecondsPerFetch - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
The minimum milliseconds per fetch, or null if no limit
myPool - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Connection pool
myUrl - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The current URL being fetched

N

name - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
 
name - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
 
nameField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
namePattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
Compiled name pattern
nameRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
Name regexp
NameValue(String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
 
next() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
Get the next one
next() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
 
nextToken() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
 
NODE_ACCESS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Forced ACL access token node.
NODE_ACCESSCREDENTIAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Access control description node
NODE_AUTHPAGE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Authentication page description node
NODE_AUTHPARAMETER - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Authentication parameter node
NODE_BINDESC - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
The bin description node
NODE_EXCLUDEHEADER - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Exclude header node.
NODE_EXCLUDES - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Exclude regexps node.
NODE_EXCLUDESCONTENTINDEX - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Exclude from the index any page whose body matches the specified regex.
NODE_EXCLUDESINDEX - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Exclude regexps node.
NODE_FORCEINCLUSION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Force the inclusion of redirections.
NODE_INCLUDES - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Include regexps node.
NODE_INCLUDESINDEX - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Include regexps node.
NODE_LIMITTOSEEDS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Limit to seeds.
NODE_MAP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Map entry specification node.
NODE_MAXCONNECTIONS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
The max connections node
NODE_MAXFETCHESPERMINUTE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
The max fetch rate node
NODE_MAXKBPERSECOND - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
The bandwidth node
NODE_SEEDS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
The seeds node.
NODE_TRUST - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Trust store description node
NODE_URLSPEC - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Canonicalization rule.
noteAHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
Note discovered href
noteAHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
Note discovered href
noteAHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
Note discovered href
noteAHREF(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
Note discovered href
noteAHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Note discovered href
noteBASEHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
Note discovered base href
noteBASEHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
Note discovered base href
noteBASEHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
Note discovered base href
noteBASEHREF(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
Note base href
noteBASEHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Note discovered base href
noteDiscoveredBase(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
 
noteDiscoveredBase(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
 
noteDiscoveredBase(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindPreferredRedirectionHandler
 
noteDiscoveredBase(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IDiscoveredLinkHandler
Inform the world of a new base HREF.
noteDiscoveredBase(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
 
noteDiscoveredLink(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
Inform the world of a discovered link.
noteDiscoveredLink(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
Override noteDiscoveredLink
noteDiscoveredLink(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindPreferredRedirectionHandler
Override noteDiscoveredLink
noteDiscoveredLink(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IDiscoveredLinkHandler
Inform the world of a discovered link.
noteDiscoveredLink(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
Inform the world of a discovered link.
noteDiscoveredTtlValue(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IXMLHandler
Inform the world of a discovered ttl value.
noteDiscoveredTtlValue(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityXMLHandler
Inform the world of a discovered ttl value.
noteFormEnd() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
Note the end of a form
noteFormEnd() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
Note the end of a form
noteFormEnd() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
Note the end of a form
noteFormEnd() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
Note the end of a form
noteFormEnd() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Note the end of a form
noteFormInput(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
Note an input tag
noteFormInput(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
Note an input tag
noteFormInput(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
Note an input tag
noteFormInput(Map) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
Note an input tag
noteFormInput(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Note an input tag
noteFormStart(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
Note the start of a form
noteFormStart(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
Note the start of a form
noteFormStart(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
Note the start of a form
noteFormStart(Map) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
Note the start of a form
noteFormStart(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Note the start of a form
noteFRAMESRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
Note discovered FRAME SRC
noteFRAMESRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
Note discovered FRAME SRC
noteFRAMESRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
Note discovered FRAME SRC
noteFRAMESRC(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
Note discovered FRAME SRC
noteFRAMESRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Note discovered FRAME SRC
noteIMGSRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
Note discovered IMG SRC
noteIMGSRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
Note discovered IMG SRC
noteIMGSRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
Note discovered IMG SRC
noteIMGSRC(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
Note discovered IMG SRC
noteIMGSRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Note discovered IMG SRC
noteInterrupted(Throwable) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Note that the connection fetch was interrupted by something.
noteInterrupted(Throwable) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Note that the connection fetch was interrupted by something.
noteLINKHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
Note discovered href
noteLINKHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
Note discovered href
noteLINKHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
Note discovered href
noteLINKHREF(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
Note discovered href
noteLINKHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Note discovered href
noteMetaTag(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
Note a meta tag
noteMetaTag(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
Note a meta tag
noteMetaTag(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
Note a meta tag
noteMetaTag(Map) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IMetaTagHandler
Inform the world of a discovered metadata tag.
noteMetaTag(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Note a meta tag
noteNonscriptEndTag(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
noteNonscriptEndTag(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
 
noteNonscriptTag(String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
noteNonscriptTag(String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.LinkParseState
 
noteNonscriptTag(String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.MetaParseState
 
noteNonscriptTag(String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
 
noteNormalCharacter(char) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
noteTag(String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
 
noteTagEnd(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
 
noteTextCharacter(char) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
Note a character of text.
noteTextCharacter(char) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
Note a character of text.
noteTextCharacter(char) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
Note a character of text.
noteTextCharacter(char) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
Note a character of text.
noteTextCharacter(char) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Note a character of text.
NTLMCredential(String, String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
Constructor

O

optionSelected - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
optionValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
optionValueText - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
ordinalField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
org.apache.manifoldcf.crawler.connectors.webcrawler - package org.apache.manifoldcf.crawler.connectors.webcrawler
 
OurBasicCookieStore() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
 
OuterContextClass(XMLFuzzyHierarchicalParseState, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
 
outerTagCount - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
Keep track of the number of valid feed signals we saw
outputConfigurationBody(IThreadContext, IHTTPOutput, Locale, ConfigParams, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Output the configuration body section.
outputConfigurationHeader(IThreadContext, IHTTPOutput, Locale, ConfigParams, List<String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Output the configuration header section.
outputResource(IHTTPOutput, Locale, String, Map<String, String>, boolean) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
outputResourceWithVelocity(IHTTPOutput, Locale, String, Map<String, Object>) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
outputResourceWithVelocity(IHTTPOutput, Locale, String, Map<String, String>, boolean) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
 
outputSpecificationBody(IHTTPOutput, Locale, Specification, int, int, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Output the specification body section.
outputSpecificationHeader(IHTTPOutput, Locale, Specification, int, List<String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Output the specification header section.
OVERLAP_AMOUNT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
 
overrideActionURI(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
 
overrideTargetURL - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Override target URL

P

PageCredentials - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes immutable classes that represent authentication information for page-based authentication.
PARAMETER_EMAIL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Email (a parameter)
PARAMETER_META_ROBOTS_TAGS_USAGE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Meta robots tags usage (a parameter)
PARAMETER_PROXYAUTHDOMAIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Proxy auth domain (parameter)
PARAMETER_PROXYAUTHPASSWORD - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Proxy auth password (parameter)
PARAMETER_PROXYAUTHUSERNAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Proxy auth username (parameter)
PARAMETER_PROXYHOST - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Proxy host name (parameter)
PARAMETER_PROXYPORT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Proxy port (parameter)
PARAMETER_ROBOTSUSAGE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
Robots usage (a parameter)
parameters - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
The list of the parameters we want to add for this pattern.
parentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
 
parseRobotsTxt(BufferedReader, String, IProcessActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
Parse the robots.txt file using a reader.
password - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
 
password - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
 
pathField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
pathSpecifiedField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
pattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
The bin-matching pattern.
pattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
URL match pattern
pattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
The bin-matching pattern.
pattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription.TrustsItem
The bin-matching pattern.
patternHash - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
This is the hash that contains everything.
patternHash - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
This is the hash that contains everything.
patternHash - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription
This is the hash that contains everything.
peek() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
Get current token.
poll() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
This method is periodically called for all connectors that are connected but not in active use.
PoolException(String) - Constructor for exception org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.PoolException
 
port - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
 
port - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
 
port - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Port
portBlankField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
portField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
portSpecifiedField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
portsToString(int[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Convert a port array to a string.
pos - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
 
potentiallyExcludedHeaders - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
preferredLinkPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
The preferred link pattern, or null if there's no preferred link
preferredLinkPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
 
preferredLinkRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
The preferred link regexp
preferredRedirectionPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
The preferred redirection pattern, or null if there's no preferred redirection
preferredRedirectionRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
The preferred redirection regexp
process() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
Process this data
process() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
Process this data
process() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
Process this data
process() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
Process this data
process(IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedItemContextClass
Process the data accumulated for this item
process(IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFItemContextClass
Process the data accumulated for this item
process(IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
Process the data accumulated for this item
process(IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
Process the data accumulated for this item
ProcessActivityHTMLHandler(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Constructor.
ProcessActivityLinkHandler(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter, String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
Constructor.
ProcessActivityRedirectionHandler(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityRedirectionHandler
Constructor.
ProcessActivityXMLHandler(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityXMLHandler
Constructor.
processBuffer() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
 
processConfigurationPost(IThreadContext, IPostParameters, Locale, ConfigParams) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Process a configuration post.
processDocument(IProcessActivity, String, String, boolean, Map<String, Set<String>>, String[], WebcrawlerConnector.DocumentURLFilter) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
processDocuments(String[], IExistingVersions, Specification, IProcessActivity, int, boolean) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Process a set of documents.
processSpecificationPost(IPostParameters, Locale, Specification, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Process a specification post.
protocol - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
 
protocol - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
 
protocol - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Protocol
proxyAuthDomain - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
 
proxyAuthDomain - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
 
proxyAuthDomain - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Proxy auth domain
proxyAuthDomain - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Proxy auth domain
proxyAuthPassword - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
 
proxyAuthPassword - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
 
proxyAuthPassword - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Proxy auth password
proxyAuthPassword - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Proxy auth password
proxyAuthUsername - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
 
proxyAuthUsername - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
 
proxyAuthUsername - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Proxy auth user name
proxyAuthUsername - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Proxy auth user name
proxyHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
 
proxyHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
 
proxyHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Proxy host
proxyHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Proxy host
proxyPort - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
 
proxyPort - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
 
proxyPort - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Proxy port
proxyPort - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Proxy port

R

rawQueryPart - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
RDFContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
 
RDFItemContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFItemContextClass
 
read() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Read a byte.
read(byte[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Read lots of bytes.
read(byte[], int, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Read lots of specific bytes.
READ_CHUNK_LENGTH - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
The read chunk length
readCookies(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Read cookies currently in effect for a given session key.
readCookiesUncached(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Read cookies from database, uncached.
readDNSInfo(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
Read DNS data, if it exists.
readRobotsData(String, IProcessActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Read robots data, if it exists.
Record() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
Constructor.
recordEverything - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
This flag determines whether we record everything to the disk, as a means of doing a web snapshot
records - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
 
redirectionURIPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindPreferredRedirectionHandler
 
referralURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
The referral URI
regexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
URL regexp
REL_LINK - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
REL_REDIRECT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
release(IThrottledConnection) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
 
remove() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
 
remove() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
 
removeAspSession - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
removeBVSession - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
removeJavaSession - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
removePhpSession - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
reorder - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
 
reservedHeaders - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
reset() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Reset.
resolve(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
response - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
responseCode - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
The response code
responseException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
RESULT_NO_DOCUMENT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
RESULT_NO_VERSION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
RESULT_RETRY_DOCUMENT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
RESULT_VERSION_NEEDED - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
resultSignal - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
 
RESULTSTATUS_FALSE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
RESULTSTATUS_NOTYETDETERMINED - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
RESULTSTATUS_TRUE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
rethrowExceptions() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.AbortChecker
 
returnValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
 
returnValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
 
returnValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
 
ROBOTS_ALL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
ROBOTS_DATA - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
ROBOTS_NONE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
robotsCacheClass - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
 
RobotsCacheClass() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsCacheClass
 
RobotsData(InputStream, long, String, IProcessActivity) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
Constructor.
robotsField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
 
robotsManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
The robots manager currently used by this instance
RobotsManager - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class manages the database table into which we write robots.txt files for hosts.
RobotsManager(IThreadContext, IDBInterface) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Constructor.
RobotsManager.HostDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is the object description for a robots host object.
RobotsManager.HostExecutor - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is the executor object for locating robots host objects.
RobotsManager.Record - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class represents a record in a robots.txt file.
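
As a rough illustration of what a "record" in a robots.txt file is, the sketch below groups consecutive User-agent lines with the rule lines that follow them. It is not the RobotsManager.Record implementation: the class name RobotsRecordSketch, its parse method, and the disallows field are hypothetical, and only Disallow rules are kept for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration; not the ManifoldCF RobotsManager.Record class.
class RobotsRecordSketch {
    final List<String> userAgents = new ArrayList<>();
    final List<String> disallows = new ArrayList<>();

    // Split robots.txt text into records: one or more User-agent lines
    // followed by the rule lines that apply to those agents.
    static List<RobotsRecordSketch> parse(String text) {
        List<RobotsRecordSketch> records = new ArrayList<>();
        RobotsRecordSketch current = null;
        boolean inRules = false; // true once the current record has seen a rule line
        for (String raw : text.split("\\r?\\n")) {
            // Strip trailing comments, then surrounding whitespace.
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                // A User-agent line after rule lines starts a new record.
                if (current == null || inRules) {
                    current = new RobotsRecordSketch();
                    records.add(current);
                    inRules = false;
                }
                current.userAgents.add(value.toLowerCase(Locale.ROOT));
            } else if (current != null) {
                inRules = true;
                if (field.equals("disallow")) {
                    current.disallows.add(value);
                }
            }
        }
        return records;
    }

    public static void main(String[] args) {
        String txt = "User-agent: a\nUser-agent: b\nDisallow: /private\n\n"
                   + "User-agent: *\nDisallow: /tmp\n";
        for (RobotsRecordSketch r : parse(txt)) {
            System.out.println(r.userAgents + " -> " + r.disallows);
        }
    }
}
```
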
RobotsManager.RobotsCacheClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Cache class for robots.
RobotsManager.RobotsData - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is a cached data item.
robotsUsage - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Robots usage flag
RSSChannelContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
 
RSSContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSContextClass
 
RSSItemContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
 
rules - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicies
 
run() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 

S

scriptParseState - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
 
ScriptParseState - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class interprets the tag stream generated by the HTMLParseState class, and causes script sections to be skipped.
ScriptParseState() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
 
SCRIPTPARSESTATE_INSCRIPT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
 
SCRIPTPARSESTATE_NORMAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
 
secureField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
seedHosts - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
The hash map of seed hosts, to limit urls by, if non-null
selectMultiple - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
selectName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
 
SequenceCredentials - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
This interface describes immutable classes that represent authentication information for sequence-based authentication.
sequenceKey - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
 
server - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
 
server - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
 
server - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Server
serviceInterruption - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.AbortChecker
 
SessionCredential(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
Constructor
sessionCredentialIndex - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
 
SessionCredentialItem(String, Pattern, String, String, Pattern, String, Pattern, String, Pattern, String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
Constructor
SessionCredentialParameter(String, Pattern, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
 
sessionKey - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
 
sessionPages - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
 
sessionPages - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
 
sessionState - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
 
SESSIONSTATE_LOGIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
We're in 'login mode'
SESSIONSTATE_NORMAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Normal fetch of content document.
setAbortChecker(AbortChecker) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
Set the abort checker.
setAbortChecker(AbortChecker) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Set the abort checker.
setCredential(AuthenticationCredentials) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
Set Credentials
setEnabled(boolean) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
 
setMaxOpenConnections(Integer) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
Set maximum open connections.
setMinimumMillisecondsPerByte(Double) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
Set minimum milliseconds per byte.
setMinimumMillisecondsPerFetch(Long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
Set minimum milliseconds per fetch
setValue(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
 
shouldIndex() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
Decide whether we should index.
shouldIndex() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityRedirectionHandler
 
shouldIndex() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityXMLHandler
 
shutdownException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
skip(long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Skip
socketTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
 
socketTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
 
socketTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Socket timeout milliseconds
socketTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Socket timeout, milliseconds
startFetchTime - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The start of the current fetch
statusCode - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The status code fetched, if any
streamCreated - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
streamException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
streamThrottler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Stream throttler
stringToArray(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Read a string as a sequence of individual expressions, urls, etc.
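
A minimal sketch of reading a string as a sequence of individual expressions might look like the following. The index does not state the input format, so the details here are assumptions: one entry per line, blank lines skipped, and lines starting with '#' treated as comments. The class LineListSketch and its toList method are hypothetical names, not part of WebcrawlerConnector.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the ManifoldCF stringToArray method.
class LineListSketch {
    // Assumed format: one expression or URL per line,
    // with blank lines and '#' comment lines ignored.
    static List<String> toList(String input) {
        List<String> out = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(input))) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty() || line.startsWith("#")) {
                    continue;
                }
                out.add(line);
            }
        } catch (IOException e) {
            // A StringReader should not fail, but the Reader API declares it.
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(toList("http://example.com/\n\n# a comment\n  .*\\.pdf$  \n"));
    }
}
```
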
stringToBoolean(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Convert a boolean string to a boolean.
stringToPorts(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Convert a string to a port array.
submitMethod - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
The form's submit method
SUBMITMETHOD_GET - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
 
SUBMITMETHOD_POST - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
 

T

tagCleanup() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
 
target - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
targetURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
 
text - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
 
textValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
 
theConnection - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
The connection
theURL - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
thisDescription - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
 
thisHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
 
thisHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
 
thisManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
 
thisManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
 
thisManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
 
threadStarted - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Set if thread has been started
threadStream - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
 
throttledConnection - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
The throttled connection we belong to
ThrottledConnection(ThrottledFetcher.ConnectionPool, IFetchThrottler, String, String, int, PageCredentials, SSLSocketFactory, String, int, String, String, String, int, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
Constructor.
throttleDescription - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
The throttle description
ThrottleDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class describes complex throttling criteria pulled from a configuration.
ThrottleDescription(ConfigParams) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
Constructor.
ThrottleDescription.ThrottleItem - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class representing an individual throttle item.
ThrottledFetcher - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class uses httpclient to fetch content from web servers.
ThrottledFetcher.ConnectionPool - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Each connection pool has identical connections we can draw on.
ThrottledFetcher.ConnectionPoolKey - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Connection pool key
ThrottledFetcher.ExecuteMethodThread - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This thread does the actual socket communication with the server.
ThrottledFetcher.LaxBrowserCompatSpecProvider - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class to create a cookie spec.
ThrottledFetcher.OurBasicCookieStore - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
ThrottledFetcher.PoolException - Exception in org.apache.manifoldcf.crawler.connectors.webcrawler
Pool exception class
ThrottledFetcher.ThrottledConnection - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Throttled connections.
ThrottledFetcher.ThrottledInputstream - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class throttles an input stream based on the specified byte rate parameters.
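
The general idea of throttling an input stream by byte rate can be sketched as below. This is not the ThrottledFetcher.ThrottledInputstream implementation, which works with an IStreamThrottler. The class SimpleThrottledInputStream, its single "minimum milliseconds per byte" parameter, and the timedDrainMillis helper are hypothetical and chosen only to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;
import java.io.UncheckedIOException;

// Hypothetical illustration; not the ManifoldCF ThrottledInputstream.
class SimpleThrottledInputStream extends FilterInputStream {
    private final double minMillisPerByte;
    private final long startNanos = System.nanoTime();
    private long bytesRead = 0;

    SimpleThrottledInputStream(InputStream in, double minMillisPerByte) {
        super(in);
        this.minMillisPerByte = minMillisPerByte;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            pause(1);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            pause(n);
        }
        return n;
    }

    // Sleep until the total bytes read so far are "allowed" by the rate limit.
    private void pause(int n) throws IOException {
        bytesRead += n;
        long earliestMillis = (long) (bytesRead * minMillisPerByte);
        long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000L;
        long sleepMillis = earliestMillis - elapsedMillis;
        if (sleepMillis > 0) {
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new InterruptedIOException("interrupted while throttling");
            }
        }
    }

    // Demo helper: drain data through the throttle and return elapsed milliseconds.
    static long timedDrainMillis(byte[] data, double minMillisPerByte) {
        long t0 = System.nanoTime();
        try (InputStream in = new SimpleThrottledInputStream(
                new ByteArrayInputStream(data), minMillisPerByte)) {
            byte[] buf = new byte[16];
            while (in.read(buf) != -1) {
                // discard the bytes; only the timing matters here
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return (System.nanoTime() - t0) / 1_000_000L;
    }

    public static void main(String[] args) {
        // 50 bytes at a minimum of 2 ms per byte should take roughly 100 ms or more.
        System.out.println("elapsed ms: " + timedDrainMillis(new byte[50], 2.0));
    }
}
```

The sketch throttles only the two read methods; skip and reset pass through unthrottled.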
ThrottledFetcher.WaitException - Exception in org.apache.manifoldcf.crawler.connectors.webcrawler
Wait exception class
ThrottledInputstream(IStreamThrottler, ThrottledFetcher.ThrottledConnection, InputStream) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
Constructor.
throttleGroupName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Throttle group name
ThrottleItem(Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
Constructor.
throwable - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
The error trace, if any
TIME_15MIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
 
TIME_1DAY - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
 
TIME_2HRS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
 
TIME_5MIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
 
TIME_6HRS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
 
toASCIIString() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
token - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
 
toString() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
 
toString() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
trustsDescription - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
The trusts description
TrustsDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class describes trust information pulled from a configuration.
TrustsDescription(ConfigParams) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription
Constructor.
TrustsDescription.TrustsItem - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class representing an individual trust item.
TrustsItem(Pattern, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription.TrustsItem
Constructor.
trustStore - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription.TrustsItem
The credential, or null if this is a "trust everything" item
trustStoreString - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
 
ttlValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
ttl value
ttlValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
ttl value
ttlValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
TTL value is set on a per-channel basis
ttlValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
ttl value
type - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
 
type - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
 
TYPE_COMMA - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
 
TYPE_GROUP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
 
TYPE_TEXT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
 

U

understoodProtocols - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 
updateCookies(String, LoginCookies) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
Update cookies that are in effect for a given session key.
UrlsetContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
 
UrlsetItemContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
 
userAgent - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
The user-agent for this connector instance
userAgents - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
 
userName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
 
userName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
 

V

value - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
Value
value - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
 
value - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
 
valueField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
versionField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
versionSpecifiedField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
versionString - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
The version string
viewConfiguration(IThreadContext, IHTTPOutput, Locale, ConfigParams) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
View configuration.
viewSpecification(IHTTPOutput, Locale, Specification, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
View specification.

W

WaitException(long) - Constructor for exception org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WaitException
 
WebcrawlerConfig - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Constants for the Webcrawler connector configuration.
WebcrawlerConfig() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
 
WebcrawlerConnector - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This is the Web Crawler implementation of the IRepositoryConnector interface.
WebcrawlerConnector() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
Constructor.
WebcrawlerConnector.CanonicalizationPolicies - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class representing a list of canonicalization rules
WebcrawlerConnector.CanonicalizationPolicy - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class representing a URL regular expression match, for the purposes of determining canonicalization policy
WebcrawlerConnector.DocumentURLFilter - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class describes the url filtering information (for crawling and indexing) obtained from a digested DocumentSpecification.
WebcrawlerConnector.EvaluatorToken - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Evaluator token.
WebcrawlerConnector.EvaluatorTokenStream - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Token stream.
WebcrawlerConnector.FeedContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
WebcrawlerConnector.FeedItemContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
WebcrawlerConnector.FetchStatus - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
WebcrawlerConnector.MappingRule - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class representing a mapping rule
WebcrawlerConnector.MappingRules - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class that represents all mappings
WebcrawlerConnector.NameValue - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Name/value class
WebcrawlerConnector.OuterContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class handles the outermost XML context for the feed document.
WebcrawlerConnector.ProcessActivityHTMLHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class that describes HTML handling
WebcrawlerConnector.ProcessActivityLinkHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
This class is the handler for links that get added into an IProcessActivity object.
WebcrawlerConnector.ProcessActivityRedirectionHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class that describes redirection handling
WebcrawlerConnector.ProcessActivityXMLHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Class that describes XML handling
WebcrawlerConnector.RDFContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
WebcrawlerConnector.RDFItemContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
WebcrawlerConnector.RSSChannelContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
WebcrawlerConnector.RSSContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
WebcrawlerConnector.RSSItemContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
WebcrawlerConnector.UrlsetContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
WebcrawlerConnector.UrlsetItemContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
 
webThrottleGroupType - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
Web throttle group type
WebURL - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
Replacement class for java.net.URI, which is broken in many ways.
WebURL(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
WebURL(String, String, int, String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
WebURL(URI) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
WebURL(URI, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
 
writeDNSData(String, String, String, long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
Write DNS data, replacing any existing row.
writeRobotsData(String, long, InputStream) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
Write robots.txt, replacing any existing row.

_

_rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.AuthenticationCredentials
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
 
_rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
 
_rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataElement
 
_rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
 
_rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginCookies
 
_rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
 
_rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.PageCredentials
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
 
_rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.SequenceCredentials
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
 
_rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
 