A B C D E F G H I K L M N O P R S T U V W _
A
- abort() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- abortCheck - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Abort checker
- abortCheck() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.AbortChecker
- AbortChecker - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class furnishes an abort signal whenever the job activity says it should.
- AbortChecker(IAbortActivity) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.AbortChecker
- abortThread - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- acceptNewTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
- actionURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
-
The form's action URI
- activities - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.AbortChecker
- activities - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
- activities - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
- ACTIVITY_FETCH - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- ACTIVITY_LOGON_END - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- ACTIVITY_LOGON_START - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- ACTIVITY_PROCESS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- ACTIVITY_ROBOTSPARSE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- add(WebcrawlerConnector.MappingRule) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRules
- addAgent(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
-
Add a user-agent.
- addAllow(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
-
Add an allow.
- addAuthPage(String, Pattern, String, String, Pattern, String, Pattern, String, Pattern, String, Pattern) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
-
Add an auth page
- addCookie(Cookie) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
- addCookie(Cookie) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
-
Adds an HTTP cookie, replacing any existing equivalent cookies.
- addCookies(Cookie[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
-
Adds an array of HTTP cookies.
- addData(IProcessActivity, String, IThrottledConnection) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
-
Add a data entry into the cache.
- addDisallow(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
-
Add a disallow.
- addElement(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
- addPageParameter(int, String, String, Pattern, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
-
Add a page parameter
- addParameter(String, Pattern, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Add parameter
- addRule(WebcrawlerConnector.CanonicalizationPolicy) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicies
- addSeedDocuments(ISeedingActivity, Specification, String, long, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Queue "seed" documents.
- advance() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
-
Go on to next token.
- allows - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
- amt - Variable in exception org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WaitException
- applyFormOverrides(LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
- applyOverrides(LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Apply overrides
- applyOverrides(LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Apply overrides
- applyOverrides(LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindPreferredRedirectionHandler
-
Apply overrides
- applyOverrides(LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
- ATTR_ASPSESSIONREMOVAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
aspsessionremoval attribute
- ATTR_BINREGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
The bin regular expression
- ATTR_BVSESSIONREMOVAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
bvsessionremoval attribute
- ATTR_DESCRIPTION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
description attribute
- ATTR_DOMAIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Domain/realm part of credentials (if any)
- ATTR_INSENSITIVE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Whether the match is case insensitive
- ATTR_JAVASESSIONREMOVAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
javasessionremoval attribute
- ATTR_LOWERCASE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
map to lower case
- ATTR_MAP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Map attribute
- ATTR_MATCH - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Match attribute
- ATTR_MATCHREGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Form name or link target regexp for authentication page
- ATTR_NAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
name attribute
- ATTR_NAMEREGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Authentication parameter name regexp
- ATTR_OVERRIDETARGETURL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
URL to fetch next in a sequence (an override)
- ATTR_PASSWORD - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Password part of credentials
- ATTR_PHPSESSIONREMOVAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
phpsessionremoval attribute
- ATTR_REGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
regexp attribute
- ATTR_REORDER - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
reorder attribute
- ATTR_TOKEN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
token attribute
- ATTR_TRUSTEVERYTHING - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
"Trust everything" attribute - replacing truststore if set to 'true'
- ATTR_TRUSTSTORE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Trust store section of authentication record
- ATTR_TYPE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Type of security
- ATTR_URLREGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Regexp for access control node
- ATTR_USERNAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Username part of credentials
- ATTR_VALUE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
The value attribute (used for maxconnections and maxkbpersecond)
- ATTRVALUE_BASIC - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Type value for basic authentication
- ATTRVALUE_CONTENT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Authentication page type: Access
- ATTRVALUE_FALSE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Value false
- ATTRVALUE_FORM - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Authentication page type: Form
- ATTRVALUE_LINK - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Authentication page type: Link
- ATTRVALUE_NO - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Value no
- ATTRVALUE_NTLM - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Type value for NTLM authentication
- ATTRVALUE_REDIRECTION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Authentication page type: Redirection
- ATTRVALUE_SESSION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Type value for session-based authentication
- ATTRVALUE_TRUE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Value true
- ATTRVALUE_YES - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Value yes
- authentication - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
-
The credential
- authentication - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
- authentication - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
- authentication - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Authentication
- AuthenticationCredentials - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes immutable classes that represent authentication information for all kinds of authentication.
- available() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
Get available.
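
The `addAgent(String)`, `addAllow(String)`, and `addDisallow(String)` entries above suggest that a `RobotsManager.Record` accumulates one robots.txt group. The following is a minimal sketch of that idea, not the ManifoldCF implementation: the class name `RobotsRecordSketch`, the `appliesTo` and `isAllowed` helpers, and the longest-prefix decision rule are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for one robots.txt record; not the ManifoldCF class. */
final class RobotsRecordSketch {
    private final List<String> agents = new ArrayList<>();
    private final List<String> allows = new ArrayList<>();
    private final List<String> disallows = new ArrayList<>();

    /** Add a user-agent token, stored in lower case for case-insensitive matching. */
    void addAgent(String agent) { agents.add(agent.toLowerCase(Locale.ROOT)); }

    /** Add an allow path prefix; empty values are ignored. */
    void addAllow(String path) { if (!path.isEmpty()) allows.add(path); }

    /** Add a disallow path prefix; an empty "Disallow:" restricts nothing, so it is ignored. */
    void addDisallow(String path) { if (!path.isEmpty()) disallows.add(path); }

    /** True if the record names "*" or a token contained in the given user-agent string. */
    boolean appliesTo(String userAgent) {
        String lower = userAgent.toLowerCase(Locale.ROOT);
        for (String agent : agents) {
            if (agent.equals("*") || lower.contains(agent)) return true;
        }
        return false;
    }

    /** Assumed rule: the longest matching prefix wins, and allow wins a tie. */
    boolean isAllowed(String path) {
        return longestPrefix(allows, path) >= longestPrefix(disallows, path);
    }

    /** Length of the longest rule that is a prefix of the path, or -1 if none matches. */
    private static int longestPrefix(List<String> rules, String path) {
        int best = -1;
        for (String rule : rules) {
            if (path.startsWith(rule) && rule.length() > best) best = rule.length();
        }
        return best;
    }

    public static void main(String[] args) {
        RobotsRecordSketch record = new RobotsRecordSketch();
        record.addAgent("*");
        record.addDisallow("/private");
        record.addAllow("/private/public");
        System.out.println(record.isAllowed("/private/report.html"));
        System.out.println(record.isAllowed("/private/public/index.html"));
    }
}
```

The sketch uses plain prefix matching only; wildcard handling of the kind a path-matching routine would need is deliberately left out.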
B
- baseDocumentIdentifier - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
- baseFactory - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
- BasicCredential(String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
-
Constructor
- basicRead(byte[], int, int, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
Basic read, which uses the server object to throttle activity.
- beginFetch(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Begin the fetch process.
- beginFetch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Begin the fetch process.
- beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
- beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedItemContextClass
- beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
-
Handle the tag beginning to set the correct second-level parsing context
- beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
- beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFItemContextClass
- beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
- beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSContextClass
- beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
- beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
- beginTag(String, String, String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
- bodyStream - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- booleanToString(boolean) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
Convert a boolean to a boolean string.
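
The `basicRead(byte[], int, int, int)` entry above describes a read that consults a throttling object before touching the underlying stream. A minimal sketch of that pattern follows. It is not the `ThrottledFetcher.ThrottledInputstream` code: the `ByteThrottle` interface, its `permitted` and `consumed` methods, and the `chunkSizes` demo helper are hypothetical names introduced here.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical throttled stream; each read first asks a throttle how many bytes it may take. */
final class ThrottledStreamSketch extends FilterInputStream {
    /** Assumed hook: a real throttler could block inside permitted() to enforce a rate. */
    interface ByteThrottle {
        int permitted(int requested) throws IOException;
        void consumed(int actual);
    }

    private final ByteThrottle throttle;

    ThrottledStreamSketch(InputStream in, ByteThrottle throttle) {
        super(in);
        this.throttle = throttle;
    }

    @Override
    public int read() throws IOException {
        byte[] one = new byte[1];
        int n = read(one, 0, 1);
        return n < 0 ? -1 : (one[0] & 0xff);
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (len == 0) return 0;
        // Never request more than the throttle permits, but always at least one byte.
        int allowed = Math.max(1, Math.min(len, throttle.permitted(len)));
        int n = in.read(b, off, allowed);
        throttle.consumed(Math.max(n, 0));
        return n;
    }

    /** Demo helper: drains the data with a fixed per-read cap and records each chunk size. */
    static List<Integer> chunkSizes(byte[] data, int cap, int bufferSize) {
        List<Integer> sizes = new ArrayList<>();
        ByteThrottle fixedCap = new ByteThrottle() {
            @Override public int permitted(int requested) { return cap; }
            @Override public void consumed(int actual) { }
        };
        try (InputStream s = new ThrottledStreamSketch(new ByteArrayInputStream(data), fixedCap)) {
            byte[] buf = new byte[bufferSize];
            int n;
            while ((n = s.read(buf, 0, buf.length)) != -1) sizes.add(n);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(chunkSizes(new byte[10], 4, 8));
    }
}
```

Keeping the throttle behind a small interface lets the stream stay unaware of how the limit is computed, which is one plausible reason to route reads through a separate server or throttler object.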
C
- cache - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
This is where we keep data around between the getVersions() phase and the processDocuments() phase.
- cacheData - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
- cacheKeys - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
- cacheKeys - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
- cacheKeys - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
- calculateDocumentEvents(INamingActivity, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Calculate events that should be associated with a document.
- canLowercase() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
- canonicalizationPolicies - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
Canonicalization policies
- CanonicalizationPolicies() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicies
- CanonicalizationPolicy(Pattern, boolean, boolean, boolean, boolean, boolean, boolean) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
- canRemoveAspSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
- canRemoveBvSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
- canRemoveJavaSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
- canRemovePhpSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
- canReorder() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
- check() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Check status of connection.
- checkException(Throwable) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- checkFetchAllowed(String, String, long, String, IProcessActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
-
Read robots.txt data from the cache or from the database.
- checkFetchAllowed(String, String, String, int, PageCredentials, IKeystoreManager, String, String[], long, String, IProcessActivity, int, String, int, String, String, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Check robots to see if fetch is allowed.
- checkIfValidFeed() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
-
Check if feed was valid
- checkMatch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
- checkMatch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRule
- checkSum - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
- clear() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
-
Clears all cookies.
- clearExpired(Date) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
-
Removes all cookies in this HTTP state that have expired by the specified date.
- clearThreadContext() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Clear out any state information specific to a given thread.
- close() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Close the connection.
- close() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Close the connection.
- close() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
Close.
- commentField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- commentURLField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- compileList(List<Pattern>, List<String>) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Compile all regexp entries in the passed in list, and add them to the output list.
- ConnectionPool(IConnectionThrottler, String, String, int, PageCredentials, SSLSocketFactory, String, int, String, String, String, int, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
- ConnectionPoolKey(String, String, int, PageCredentials, String, String, int, String, String, String, int, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
- connectionPools - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
-
Connection pools.
- connections - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
-
The actual pool of connections
- connectionThrottler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
-
Throttler
- connectionTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
- connectionTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
- connectionTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Connection timeout milliseconds
- connectionTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Connection timeout, milliseconds.
- connManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The http connection manager.
- contentBuffer - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
- contentPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
The content pattern, or null if no content is sought
- contentPatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
- contentRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
The content regexp
- contentType - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
-
The content-type header value
- contextDescription - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
- contextException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
- contextMessage - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
- cookieException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- cookieList - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieSet
- cookieManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
The cookie manager used by this instance
- CookieManager - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class manages the database table into which we write cookies.
- CookieManager(IThreadContext, IDBInterface) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
Constructor.
- CookieManager.CookiesCacheClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Cache class for cookies.
- CookieManager.CookiesDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This is the object description for a session key object.
- CookieManager.CookiesExecutor - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This is the executor object for locating cookies session objects.
- CookieManager.DynamicCookieSet - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This is a set of cookies, built dynamically.
- cookies - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
- cookies - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- cookiesCacheClass - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- CookiesCacheClass() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesCacheClass
- CookiesDescription(String, StringSet) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
- CookieSet - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class represents a bunch of cookies
- CookieSet(List<Cookie>) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieSet
- CookiesExecutor(CookieManager, CookieManager.CookiesDescription) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
-
Constructor.
- cookieStore - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- create(HttpContext) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.LaxBrowserCompatSpecProvider
- create(ICacheDescription[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
-
Create a set of new objects to operate on and cache.
- create(ICacheDescription[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
-
Create a set of new objects to operate on and cache.
- create(ICacheDescription[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
-
Create a set of new objects to operate on and cache.
- credentialsDescription - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
The credentials description
- CredentialsDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class describes credential information pulled from a configuration.
- CredentialsDescription(ConfigParams) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
-
Constructor.
- CredentialsDescription.BasicCredential - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Basic type credentials
- CredentialsDescription.CredentialsItem - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Class representing an individual credential item.
- CredentialsDescription.LoginParameterIterator - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
LoginParameter iterator
- CredentialsDescription.NTLMCredential - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
NTLM-style credentials
- CredentialsDescription.SessionCredential - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Session credentials
- CredentialsDescription.SessionCredentialItem - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Session credential helper class
- CredentialsDescription.SessionCredentialParameter - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Session credential parameter class
- CredentialsItem(Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
-
Constructor.
- credentialsObject - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
- criticalSectionName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
- criticalSectionName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
- criticalSectionName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
- currentFormData - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
- currentIndex - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
- currentOne - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
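
The `CanonicalizationPolicy` entries above (`canReorder()`, `canRemovePhpSession()`, `canLowercase()`, and the related session-removal checks) indicate that each policy switches individual URL rewrites on or off. The sketch below shows how three such flags might be applied to a URL. It is an illustration under assumptions, not the connector's logic: the class `CanonicalizeSketch`, the `canonicalize` method, the choice of `PHPSESSID` as the session argument, and lower-casing the whole URL are all hypothetical.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Hypothetical URL canonicalizer driven by policy-style boolean flags. */
final class CanonicalizeSketch {
    /**
     * Canonicalize an absolute http(s) URL. The fragment is always dropped.
     * Assumes the URL has a scheme and an authority.
     */
    static String canonicalize(String url, boolean reorder, boolean removePhpSession, boolean lowercase) {
        URI uri = URI.create(url);
        String query = uri.getRawQuery();
        if (query != null) {
            List<String> args = new ArrayList<>(Arrays.asList(query.split("&")));
            // Assumed session argument name; the real connector may look for other forms.
            if (removePhpSession) args.removeIf(a -> a.startsWith("PHPSESSID="));
            // Sorting makes URLs that differ only in argument order compare equal.
            if (reorder) Collections.sort(args);
            query = args.isEmpty() ? null : String.join("&", args);
        }
        String result = uri.getScheme() + "://" + uri.getRawAuthority() + uri.getRawPath()
            + (query == null ? "" : "?" + query);
        return lowercase ? result.toLowerCase(Locale.ROOT) : result;
    }

    public static void main(String[] args) {
        System.out.println(canonicalize("http://Example.com/a?b=2&PHPSESSID=abc&a=1#top", true, true, true));
    }
}
```

Canonicalizing before a URL is queued means that two links to the same page, differing only in session identifiers or argument order, map to a single document identifier.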
D
- data - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
-
The cache file for the data
- DataCache - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class is a cache of a specific URL's data.
- DataCache() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
-
Constructor.
- DataCache.DocumentData - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class represents everything we need to know about a document that's getting passed from the getDocumentVersions() phase to the processDocuments() phase.
- DEFAULT_BUNDLE_NAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- DEFAULT_PATH_NAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- deflate - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- deinstall() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
Uninstall the manager.
- deinstall() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
-
Uninstall the manager.
- deinstall() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
-
Uninstall the manager.
- deinstall(IThreadContext) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Uninstall the connector.
- deleteData(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
-
Delete specified item of data.
- destroy() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Destroy the connection.
- destroy() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Destroy the connection forever
- disallows - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
- discardField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- disconnect() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Close the connection.
- discoveredFormData - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
- dnsCacheClass - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
- DNSCacheClass() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSCacheClass
- DNSInfo(String, String, long, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
-
Constructor.
- dnsManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
The DNS manager currently used by this instance
- DNSManager - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class manages the database table into which we write DNS entries for hosts.
- DNSManager(IThreadContext, IDBInterface) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
-
Constructor.
- DNSManager.DNSCacheClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Cache class for DNS entries.
- DNSManager.DNSInfo - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This is a cached data item.
- DNSManager.HostDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This is the object description for a DNS host object.
- DNSManager.HostExecutor - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This is the executor object for locating DNS host objects.
- doCanonicalization(WebcrawlerConnector.DocumentURLFilter, WebURL) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Code to canonicalize a URL.
- DocumentData(File, int, String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
-
Constructor.
- documentIdentifier - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
- documentIdentifier - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
- documentIdentifiertoFileName(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Convert a document identifier to filename.
- documentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
-
The document identifier
- documentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
-
The document uri
- documentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
-
The document identifier
- documentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
-
The document identifier
- documentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSContextClass
-
The document identifier
- documentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
-
The document identifier
- DocumentURLFilter(Specification) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
Process a document specification to produce a filter.
- doesPathMatch(String, int, String, int) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
-
Recursive method for matching specification to path.
- doesPathMatch(String, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
-
Check if path matches specification
- domain - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
- domainField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- domainSpecifiedField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- doneFetch(IProcessActivity) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Done with the fetch.
- doneFetch(IProcessActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Done with the fetch.
- DynamicCookieSet() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
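The two doesPathMatch entries above in RobotsManager are described only as a path check and its recursive helper. The following is a minimal sketch of how such a recursive match could work, not the ManifoldCF implementation. The class name RobotsPathMatchSketch is hypothetical, and the matching rules (prefix match, `*` as a wildcard, a trailing `$` as an end-of-path anchor) are assumptions taken from common robots.txt conventions rather than from this index.

```java
// Hypothetical illustration only; this is not the code in RobotsManager.
public final class RobotsPathMatchSketch {
    private RobotsPathMatchSketch() {}

    /** Entry point: match the whole path against the whole specification. */
    public static boolean doesPathMatch(String path, String spec) {
        return doesPathMatch(path, 0, spec, 0);
    }

    /** Recursive step: match path[pathIndex..] against spec[specIndex..]. */
    public static boolean doesPathMatch(String path, int pathIndex, String spec, int specIndex) {
        while (specIndex < spec.length()) {
            char c = spec.charAt(specIndex);
            if (c == '*') {
                // Assumed rule: '*' matches any run of characters, including none.
                // Try every possible end position and recurse on the remainder.
                for (int i = pathIndex; i <= path.length(); i++) {
                    if (doesPathMatch(path, i, spec, specIndex + 1)) {
                        return true;
                    }
                }
                return false;
            }
            if (c == '$' && specIndex == spec.length() - 1) {
                // Assumed rule: a trailing '$' anchors the match at the end of the path.
                return pathIndex == path.length();
            }
            if (pathIndex >= path.length() || path.charAt(pathIndex) != c) {
                return false;
            }
            pathIndex++;
            specIndex++;
        }
        // Specification exhausted: under the assumed prefix semantics this is a match.
        return true;
    }
}
```

Under these assumptions, a specification such as `/private/` matches any path that starts with it, while `/*.php$` matches only paths that end in `.php`.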
E
- ELEMENTCATEGORY_FIXEDEXCLUSIVE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
- ELEMENTCATEGORY_FIXEDINCLUSIVE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
- ELEMENTCATEGORY_FREEFORM - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
- elementList - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
-
The set of elements
- elementList - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
- endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
- endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
-
Handle the tag ending
- endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
- endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFItemContextClass
-
Convert the individual sub-fields of the item context into their final forms
- endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
- endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSContextClass
- endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
-
Convert the individual sub-fields of the item context into their final forms
- endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
- endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
-
Convert the individual sub-fields of the item context into their final forms
- equals(Object) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.AuthenticationCredentials
-
Compare against another object
- equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
- equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
-
Compare against another object
- equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
-
Compare against another object
- equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
-
Compare against another object
- equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
- equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
- equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
- equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
- equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
- evalExpression - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRule
- EvaluatorToken() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
- EvaluatorToken(int, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
- EvaluatorToken(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
- EvaluatorTokenStream(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
-
Constructor.
- excludeContentIndexPatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
List of content exclusion patterns
- excludeIndexPatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
The arraylist of index exclude patterns
- excludePatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
The arraylist of exclude patterns
- execute() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
-
Perform the desired operation.
- execute() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
-
Perform the desired operation.
- execute() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
-
Perform the desired operation.
- executeFetch(String, String, String, boolean, String, FormData, LoginCookies) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Execute the fetch and get the return code.
- executeFetch(String, String, String, boolean, String, FormData, LoginCookies) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Execute the fetch and get the return code.
- executeMethod - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- ExecuteMethodThread(ThrottledFetcher.ThrottledConnection, IFetchThrottler, HttpClient, HttpHost, HttpRequestBase, CookieStore) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- exists(ICacheDescription, Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
-
Notify the implementing class of the existence of a cached version of the object.
- exists(ICacheDescription, Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
-
Notify the implementing class of the existence of a cached version of the object.
- exists(ICacheDescription, Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
-
Notify the implementing class of the existence of a cached version of the object.
- expiration - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
- expiration - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
- expirationDateField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- expirationField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
- expirationField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
- expireTime - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
This is when the connection will expire.
- extractContentType(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- extractEncoding(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- extractLinks(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Code to extract links from an already-fetched document.
- extractMimeType(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
F
- FeedContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
- FeedItemContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedItemContextClass
- FETCH_BAD_URI - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
- FETCH_CIRCULAR_REDIRECT - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
- FETCH_INTERRUPTED - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
- FETCH_IO_ERROR - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
- FETCH_LOGIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- FETCH_NOT_TRIED - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
- FETCH_ROBOTS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- FETCH_SEQUENCE_ERROR - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
- FETCH_STANDARD - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- FETCH_UNKNOWN_ERROR - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
- fetchCounter - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The current bytes in the current fetch
- fetchMethod - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The method object
- FetchStatus() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
- fetchThrottler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
The fetch throttler
- fetchThrottler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Fetch throttler
- fetchType - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The kind of fetch we are doing
- filter - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
- FindContentHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class is the handler for HTML content grepping during state transitions
- FindContentHandler(String, List<Pattern>) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
- FindContentHandler(String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
- findExcludedHeaders(Specification) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Read a document specification to get a set of excluded headers
- FindHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class is used to discover links in a session login context
- FindHandler(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
- findHTMLForm(String, LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Find matching HTML form data, if present.
- FindHTMLFormHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class is the handler for HTML form parsing during state transitions
- FindHTMLFormHandler(String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
- FindHTMLHrefHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class is the handler for HTML parsing during state transitions
- FindHTMLHrefHandler(String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
- findHTMLLinkURI(String, LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Find HTML link URI, if present, making sure specified preference is matched.
- findLoginParameters(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
-
For a given login page, specific information may need to be submitted to the server to properly log in.
- findLoginParameters(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.SequenceCredentials
-
For a given login page, specific information may need to be submitted to the server to properly log in.
- findMatch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicies
- findNextOne() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
-
Find next one
- FindPreferredRedirectionHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class is the handler for redirection handling during state transitions
- FindPreferredRedirectionHandler(String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindPreferredRedirectionHandler
- findPreferredRedirectionURI(String, LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Find a preferred redirection URI, if it exists
- FindRedirectionHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class is the handler for redirection parsing during state transitions
- FindRedirectionHandler(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindRedirectionHandler
- findRedirectionURI(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Find a redirection URI, if it exists
- findSpecifiedContent(String, List<Pattern>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
- findSpecifiedContent(String, LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Find existence of specific content on the page (never finds a URL)
- finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Finish up all processing.
- finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
- finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
- finishUp() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Done with the document.
- finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.LinkParseState
- finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
- flushIdleConnections() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
- flushIdleConnections(IThreadContext) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
-
Flush connections that have timed out from inactivity.
- FormData - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes the form data gleaned from an HTML page.
- FormDataAccumulator - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class accumulates form data and allows overrides
- FormDataAccumulator(String, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
- FormDataAccumulator.FormItemIterator - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Iterator over FormItems
- FormDataElement - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes individual form data elements, for form submission.
- FormItem - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class provides an individual data item
- FormItem(String, String, int, boolean) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
- FormItemIterator(ArrayList) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
- formNamePattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
The form name pattern, or null if no form is expected
- formNamePattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
- formNameRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
The form name regexp
- formParseState - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
- FormParseState - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class interprets the tag stream generated by the BasicParseState class, and keeps track of the form tags.
- FormParseState(IHTMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
- FORMPARSESTATE_IN_FORM - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
- FORMPARSESTATE_IN_OPTION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
- FORMPARSESTATE_IN_SELECT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
- FORMPARSESTATE_IN_TEXTAREA - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
- FORMPARSESTATE_NORMAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
- fqdn - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
- fqdnField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
- from - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
The email address for this connector instance
G
- generalException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- getAcls(Specification) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Grab forced acl out of document specification.
- getActionURI() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
-
Get the full action URI for this form.
- getActionURI() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
-
Get the full action URI for this form.
- getActivitiesList() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Return the list of activities that this connector supports.
- getAttributeJavascriptString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- getAttributeJavascriptString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- getAttributeJavascriptString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- getAttributeString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- getAttributeString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- getAttributeString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- getBinNames(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Get the bin name string for a document identifier.
- getBodyJavascriptString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- getBodyJavascriptString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- getBodyJavascriptString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- getBodyString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- getBodyString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- getBodyString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- getCanonicalizationPolicies() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
Get canonicalization policies
- getClassName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesCacheClass
-
Get the name of the object class.
- getClassName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSCacheClass
-
Get the name of the object class.
- getClassName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsCacheClass
-
Get the name of the object class.
- getConnection(IThreadContext, String, String, String, int, PageCredentials, IKeystoreManager, IThrottleSpec, String[], int, String, int, String, String, String, int, int, IAbortActivity) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
-
Obtain a connection to specified protocol, server, and port.
- getConnectorModel() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Tell the world what model this connector uses for getDocumentIdentifiers().
- getContentPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the content pattern.
- getContentPattern() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
-
Get the content pattern.
- getContentType() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
-
Get the contentType
- getContentType(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
-
Get the content type.
- getCookie(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
- getCookie(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieSet
- getCookie(int) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginCookies
-
Get the cookie name
- getCookieCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
- getCookieCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieSet
- getCookieCount() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginCookies
-
Get the cookie count
- getCookies() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- getCookies() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
-
Returns an immutable array of cookies that this HTTP state currently contains.
- getCookiesCacheKey(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
Construct a global key which represents an individual session.
- getCredential() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
-
Get credential type
- getCriticalSectionName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
- getCriticalSectionName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
- getCriticalSectionName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
- getData() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
-
Get the data
- getData(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
-
Fetch binary data entry from the cache.
- getDataLength(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
-
Fetch binary data length.
- getDNSKey(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
-
Construct a key which represents an individual host name.
- getElementIterator() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
-
Iterate over the active form data elements.
- getElementIterator() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
-
Iterate over the active form data elements.
- getElementName() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataElement
-
Get the element name
- getElementName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
-
Get the element name
- getElementValue() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataElement
-
Get the element value
- getElementValue() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
-
Get the element value
- getEnabled() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
- getExpirationTime() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
-
Get the expiration time.
- getExpirationTime() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
-
Get expiration
- getFinalURL(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
If the initial URL is permanently or temporarily redirected (code 301 or 302), the method returns the destination URL.
- getFirstHeader(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- getFormData() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
- getFormNamePattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the form name pattern.
- getFormNamePattern() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
-
Get the form name pattern.
- getFQDN() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
-
Get the FQDN
- getGroupNumber() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
- getGroupStyle() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
- getHost() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
- getHostName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
-
Get the host name
- getHostName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
- getHostName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
- getIPAddress() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
-
Get the IP address
- getLastFetchCookies() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Get the last fetch cookies.
- getLastFetchCookies() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Get the last fetch cookies.
- getLimitedResponseBody(int, String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Get limited response as a string.
- getLimitedResponseBody(int, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Get limited response as a string.
- getMaxDocumentRequest() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Get the maximum number of documents to amalgamate together into one batch, for this connector.
- getMaxLRUCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesCacheClass
-
Get the maximum LRU count of the object class.
- getMaxLRUCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSCacheClass
-
Get the maximum LRU count of the object class.
- getMaxLRUCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsCacheClass
-
Get the maximum LRU count of the object class.
- getMaxOpenConnections() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
Get maximum open connections.
- getMaxOpenConnections(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
-
Given a bin name, find the max open connections to use for that bin.
- getMinimumMillisecondsPerByte() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
Get minimum milliseconds per byte.
- getMinimumMillisecondsPerByte(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
-
Look up minimum milliseconds per byte for a bin.
- getMinimumMillisecondsPerFetch() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
Get minimum milliseconds per fetch
- getMinimumMillisecondsPerFetch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
-
Look up minimum milliseconds for a fetch for a bin.
- getName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
- getNamePattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
- getObjectClass() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
-
Get the object class for an object.
- getObjectClass() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
-
Get the object class for an object.
- getObjectClass() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
-
Get the object class for an object.
- getObjectKeys() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
-
Get the cache keys for an object (which may or may not exist yet in the cache).
- getObjectKeys() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
-
Get the cache keys for an object (which may or may not exist yet in the cache).
- getObjectKeys() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
-
Get the cache keys for an object (which may or may not exist yet in the cache).
- getOverrideTargetURL() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the override target URL.
- getOverrideTargetURL() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
-
Get the override target URL.
- getPageCredential(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
-
Given a URL, find the right PageCredentials object to use.
- getPageCredential(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Get the page credentials for a given document identifier (URL)
- getParameter(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the actual parameter
- getParameterCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the parameter count
- getParameterCount() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
-
Get the number of parameters.
- getParameterNamePattern(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the name of the i'th parameter.
- getParameterNamePattern(int) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
-
Get the name of the i'th parameter.
- getParameterValue(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the desired value of the i'th parameter.
- getParameterValue(int) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
-
Get the desired value of the i'th parameter.
- getPath() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
- getPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
-
Get the pattern.
- getPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the pattern
- getPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
Get the pattern.
- getPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription.TrustsItem
-
Get the pattern.
- getPort() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
- getPreferredLinkPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the preferred link pattern.
- getPreferredLinkPattern() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
-
Get the preferred link pattern.
- getPreferredRedirectionPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the preferred redirection pattern.
- getPreferredRedirectionPattern() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
-
Get the preferred redirection pattern.
- getRawQuery() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
- getReferralURI() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
-
Get the referral URI
- getReferralURI(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
-
Get the referral URI.
- getRelationshipTypes() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Return the list of relationship types that this connector recognizes.
- getResponseBodyStream() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Get the response input stream.
- getResponseBodyStream() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Get the response input stream.
- getResponseCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
-
Get the response code
- getResponseCode() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Get the HTTP response code.
- getResponseCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- getResponseCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Get the HTTP response code.
- getResponseCode(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
-
Get the response code.
- getResponseHeader(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Get a specified response header, if it exists.
- getResponseHeader(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Get a specified response header, if it exists.
- getResponseHeaders() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Get response headers
- getResponseHeaders() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- getResponseHeaders() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Get response headers
- getResults() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
-
Get the result.
- getResults() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
-
Get the result.
- getResults() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
-
Get the result.
- getRobotsKey(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
-
Construct a key which represents an individual host name.
- getSafeInputStream() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- getScheme() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
- getSequenceCredential(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
-
Given a URL, find the right SequenceCredentials object to use.
- getSequenceCredential(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Get the sequence credentials for a given document identifier (URL)
- getSequenceKey() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
-
Fetch the unique key value for this particular credential.
- getSequenceKey() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.SequenceCredentials
-
Fetch the unique key value for this particular credential.
- getSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Start a session
- getSessionKey() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
- getString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- getString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- getString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- getSubmitMethod() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
-
Get the submit method for this form.
- getSubmitMethod() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
-
Get the submit method for this form.
- getTargetURI() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
- getTextValue() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
- getTrustStore() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription.TrustsItem
-
Get keystore
- getTrustStore(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription
-
Given a URL, build the right trust certificate store, or return null if all certs should be accepted.
- getTrustStore(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Get the trust store for a given document identifier (URL)
- getType() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
- getType() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
- getValue() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
- getValue() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
- getVersionString() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
Get whatever contribution to the version string should come from this data.
- getWaitAmount() - Method in exception org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WaitException
- grab(IAbortActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
- groupNumber - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
- groupStyle - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
- GROUPSTYLE_LOWER - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
- GROUPSTYLE_MIXED - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
- GROUPSTYLE_NONE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
- GROUPSTYLE_UPPER - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
- guidField - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
- gzip - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
H
- handleHTML(String, IHTMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Handle document references from HTML
- handleHTTPException(HttpException, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
- handleIOException(IOException, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
- handleIOException(IOException, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.LinkParseState
- handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.MetaParseState
- handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
-
XML handler
- handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
-
The link handler
- handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
-
XML handler
- handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
-
Link handler
- handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSContextClass
-
Link notification interface
- handler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
-
XML handler
- handleRedirects(String, IRedirectionHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Handle extracting the redirect link from a redirect response.
- handleXML(String, IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Handle document references from XML.
- hasExpired(long) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Check whether the connection has expired.
- hasExpired(long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Check whether the connection has expired.
- hashCode() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.AuthenticationCredentials
-
Calculate a hash function
- hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
- hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
-
Calculate a hash function
- hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
-
Calculate a hash function
- hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
-
Calculate a hash function
- hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
- hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
- hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
- hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
- hashCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
- hasNext() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
-
Check for next
- hasNext() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
- headerData - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
- HostDescription(String, StringSet) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
- HostDescription(String, StringSet) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
- HostExecutor(DNSManager, DNSManager.HostDescription) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
-
Constructor.
- HostExecutor(RobotsManager, IProcessActivity, RobotsManager.HostDescription) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
-
Constructor.
- hostField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
- hostField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
- hostName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
- hostName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
- hostName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
- httpClient - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
Client and method, all preconfigured
- httpClient - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The HTTP client object.
- httpsSocketFactory - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Https protocol
I
- IDiscoveredLinkHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes the functionality needed by a link extractor to note a discovered link.
- idleTimeout - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
-
Idle timeout
- IHTMLHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes the functionality needed by an HTML processor in order to handle an HTML document.
- IMetaTagHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes the functionality needed by a parser to handle metadata tags.
- includeIndexPatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
The arraylist of index include patterns
- includePatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
The arraylist of include patterns
- inputStream - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
The stream we are wrapping.
- install() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
Install the manager.
- install() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
-
Install the manager.
- install() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
-
Install the manager.
- install(IThreadContext) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Install the connector.
- interestingMimeTypeArray - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
This represents a list of the mime types that this connector knows how to extract links from.
- interestingMimeTypeMap - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- ipaddress - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
- ipaddressField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
- IRedirectionHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes the functionality needed by a redirection processor in order to handle a redirection.
- isAgentMatch(String, boolean) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
-
See if user-agent matches.
- isAllowed(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
-
See if path is allowed.
- isContentInteresting(IFingerprintActivity, String, int, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Code to check if data is interesting, based on response code and content type.
- isDeflateStream() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- isDisallowed(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
-
See if path is disallowed.
- isDocumentAndHostLegal(String, IHistoryActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
Check if both a document and host are legal.
- isDocumentContentIndexable(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
- isDocumentIndexable(String, IProcessActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
Check if the document identifier is indexable, and return the indexing URL if found.
- isDocumentLegal(String, IHistoryActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
Check if the document identifier is legal.
- isDocumentText(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Is the document text, as far as we can tell?
- isEnabled - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
- isFetchAllowed(String, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
-
Check if fetch is allowed
- isGZipStream() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- isHostLegal(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
Check if a host is legal.
- isInitialized - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
This flag is set when the instance has been initialized
- isMatch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRules
- isStrange(byte) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Check if character is not typical ASCII or UTF-8.
- isText(byte[], int) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Test to see if a document is text or not.
- isWhiteSpace(byte) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Check if a byte is a whitespace character.
- IThrottledConnection - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface represents an established connection to a URL.
- IXMLHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes the functionality needed by an XML processor in order to handle an XML document.
K
- keyField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
L
- lastFetchCookies - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The cookies from the last fetch
- LaxBrowserCompatSpecProvider() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.LaxBrowserCompatSpecProvider
- linkField - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedItemContextClass
- linkField - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFItemContextClass
- linkField - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
- linkField - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
- LinkParseState - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class recognizes and interprets all links
- LinkParseState(IHTMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.LinkParseState
- linkType - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
- logFetchCount(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Log the fetch of a number of bytes, from within a stream.
- loginAndFetch(WebcrawlerConnector.FetchStatus, IProcessActivity, String, SequenceCredentials, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- LoginCookies - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes cookies obtained during sequential authentication.
- LoginParameterIterator(List<CredentialsDescription.SessionCredentialItem>, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
-
Constructor
- LoginParameters - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes login parameters to be used to submit a page during sequential authentication.
- lookup(String, long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
-
Given a host name, look up the IP address and FQDN.
- lookupIPAddress(String, IProcessActivity, String, long, StringBuilder) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Look up an IP address given a non-canonical host name.
- lowercasing - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
M
- makeCredentialsObject(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
-
Turn this instance into a Credentials object, given the specified target host name
- makeCredentialsObject(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
-
Turn this instance into a Credentials object, given the specified target host name
- makeCredentialsObject(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.PageCredentials
-
Turn this instance into a Credentials object, given the specified target host name
- makeDNSEventName(INamingActivity, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Calculate the event name for DNS access.
- makeDocumentIdentifier(String, String, WebcrawlerConnector.DocumentURLFilter, IHistoryActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Convert an absolute or relative URL to a document identifier.
- makeReadable(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
-
Convert a string from the robots file into a readable form that does NOT contain NUL characters (since PostgreSQL does not accept those).
- makeRobotsEventName(INamingActivity, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Construct a name for the global web-connector robots event.
- makeRobotsKey(String, String, int) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Construct the robots key for a host.
- makeSessionLoginEventName(INamingActivity, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Calculate the event name for session login.
- map(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRule
- map(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRules
- MappingRule(Pattern, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRule
- MappingRules() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRules
- mappings - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
Mapping rules
- mappings - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRules
- mark(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
Mark.
- markSupported() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
Check if mark is supported.
- matchPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
- matchPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRule
- MAX_LENGTH - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
- maxOpenConnections - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
The maximum open connections, or null if no limit.
- mcfException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.AbortChecker
- Messages - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
- Messages() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
-
Constructor - do not instantiate
- META_ROBOTS_ALL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- META_ROBOTS_NONE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- MetaParseState - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class recognizes and interprets all meta tags
- MetaParseState(IMetaTagHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.MetaParseState
- metaRobotsTagsUsage - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Meta robots tag usage flag
- methodThread - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The thread that is actually doing the work
- minimumMillisecondsPerByte - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
The minimum milliseconds between bytes, or null if no limit.
- minimumMillisecondsPerFetch - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
The minimum milliseconds per fetch, or null if no limit
- myPool - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Connection pool
- myUrl - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The current URL being fetched
N
- name - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
- name - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
- nameField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- namePattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
-
Compiled name pattern
- nameRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
-
Name regexp
- NameValue(String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
- next() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
-
Get the next one
- next() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
- nextToken() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
- NODE_ACCESS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Forced ACL access token node.
- NODE_ACCESSCREDENTIAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Access control description node
- NODE_AUTHPAGE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Authentication page description node
- NODE_AUTHPARAMETER - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Authentication parameter node
- NODE_BINDESC - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
The bin description node
- NODE_EXCLUDEHEADER - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Exclude header node.
- NODE_EXCLUDES - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Exclude regexps node.
- NODE_EXCLUDESCONTENTINDEX - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Exclude from the index any page whose body contains the specified regexp.
- NODE_EXCLUDESINDEX - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Exclude regexps node.
- NODE_FORCEINCLUSION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Force the inclusion of redirections.
- NODE_INCLUDES - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Include regexps node.
- NODE_INCLUDESINDEX - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Include regexps node.
- NODE_LIMITTOSEEDS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Limit to seeds.
- NODE_MAP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Map entry specification node.
- NODE_MAXCONNECTIONS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
The max connections node
- NODE_MAXFETCHESPERMINUTE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
The max fetch rate node
- NODE_MAXKBPERSECOND - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
The bandwidth node
- NODE_SEEDS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
The seeds node.
- NODE_TRUST - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Trust store description node
- NODE_URLSPEC - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Canonicalization rule.
- noteAHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note discovered href
- noteAHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note discovered href
- noteAHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note discovered href
- noteAHREF(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Note discovered href
- noteAHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note discovered href
- noteBASEHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note discovered base href
- noteBASEHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note discovered base href
- noteBASEHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note discovered base href
- noteBASEHREF(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Note discovered base href
- noteBASEHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note discovered base href
- noteDiscoveredBase(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
- noteDiscoveredBase(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
- noteDiscoveredBase(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindPreferredRedirectionHandler
- noteDiscoveredBase(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IDiscoveredLinkHandler
-
Inform the world of a new base HREF.
- noteDiscoveredBase(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
- noteDiscoveredLink(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
-
Inform the world of a discovered link.
- noteDiscoveredLink(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Override noteDiscoveredLink
- noteDiscoveredLink(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindPreferredRedirectionHandler
-
Override noteDiscoveredLink
- noteDiscoveredLink(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IDiscoveredLinkHandler
-
Inform the world of a discovered link.
- noteDiscoveredLink(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
-
Inform the world of a discovered link.
- noteDiscoveredTtlValue(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IXMLHandler
-
Inform the world of a discovered TTL value.
- noteDiscoveredTtlValue(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityXMLHandler
-
Inform the world of a discovered TTL value.
- noteFormEnd() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note the end of a form
- noteFormEnd() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note the end of a form
- noteFormEnd() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note the end of a form
- noteFormEnd() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Note the end of a form
- noteFormEnd() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note the end of a form
- noteFormInput(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note an input tag
- noteFormInput(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note an input tag
- noteFormInput(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note an input tag
- noteFormInput(Map) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Note an input tag
- noteFormInput(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note an input tag
- noteFormStart(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note the start of a form
- noteFormStart(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note the start of a form
- noteFormStart(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note the start of a form
- noteFormStart(Map) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Note the start of a form
- noteFormStart(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note the start of a form
- noteFRAMESRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note discovered FRAME SRC
- noteFRAMESRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note discovered FRAME SRC
- noteFRAMESRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note discovered FRAME SRC
- noteFRAMESRC(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Note discovered FRAME SRC
- noteFRAMESRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note discovered FRAME SRC
- noteIMGSRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note discovered IMG SRC
- noteIMGSRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note discovered IMG SRC
- noteIMGSRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note discovered IMG SRC
- noteIMGSRC(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Note discovered IMG SRC
- noteIMGSRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note discovered IMG SRC
- noteInterrupted(Throwable) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Note that the connection fetch was interrupted by something.
- noteInterrupted(Throwable) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Note that the connection fetch was interrupted by something.
- noteLINKHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note discovered href
- noteLINKHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note discovered href
- noteLINKHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note discovered href
- noteLINKHREF(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Note discovered href
- noteLINKHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note discovered href
- noteMetaTag(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note a meta tag
- noteMetaTag(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note a meta tag
- noteMetaTag(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note a meta tag
- noteMetaTag(Map) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IMetaTagHandler
-
Inform the world of a discovered metadata tag.
- noteMetaTag(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note a meta tag
- noteNonscriptEndTag(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
- noteNonscriptEndTag(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
- noteNonscriptTag(String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
- noteNonscriptTag(String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.LinkParseState
- noteNonscriptTag(String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.MetaParseState
- noteNonscriptTag(String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
- noteNormalCharacter(char) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
- noteTag(String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
- noteTagEnd(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
- noteTextCharacter(char) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note a character of text.
- noteTextCharacter(char) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note a character of text.
- noteTextCharacter(char) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note a character of text.
- noteTextCharacter(char) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Note a character of text.
- noteTextCharacter(char) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note a character of text.
- NTLMCredential(String, String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
-
Constructor
O
- optionSelected - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
- optionValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
- optionValueText - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
- ordinalField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- org.apache.manifoldcf.crawler.connectors.webcrawler - package org.apache.manifoldcf.crawler.connectors.webcrawler
- OurBasicCookieStore() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
- OuterContextClass(XMLFuzzyHierarchicalParseState, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
- outerTagCount - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
-
Keep track of the number of valid feed signals we saw
- outputConfigurationBody(IThreadContext, IHTTPOutput, Locale, ConfigParams, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Output the configuration body section.
- outputConfigurationHeader(IThreadContext, IHTTPOutput, Locale, ConfigParams, List<String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Output the configuration header section.
- outputResource(IHTTPOutput, Locale, String, Map<String, String>, boolean) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- outputResourceWithVelocity(IHTTPOutput, Locale, String, Map<String, Object>) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- outputResourceWithVelocity(IHTTPOutput, Locale, String, Map<String, String>, boolean) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
- outputSpecificationBody(IHTTPOutput, Locale, Specification, int, int, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Output the specification body section.
- outputSpecificationHeader(IHTTPOutput, Locale, Specification, int, List<String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Output the specification header section.
- OVERLAP_AMOUNT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
- overrideActionURI(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
- overrideTargetURL - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Override target URL
P
- PageCredentials - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes immutable classes that represent authentication information for page-based authentication.
- PARAMETER_EMAIL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Email (a parameter)
- PARAMETER_META_ROBOTS_TAGS_USAGE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Meta robots tags usage (a parameter)
- PARAMETER_PROXYAUTHDOMAIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Proxy auth domain (parameter)
- PARAMETER_PROXYAUTHPASSWORD - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Proxy auth password (parameter)
- PARAMETER_PROXYAUTHUSERNAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Proxy auth username (parameter)
- PARAMETER_PROXYHOST - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Proxy host name (parameter)
- PARAMETER_PROXYPORT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Proxy port (parameter)
- PARAMETER_ROBOTSUSAGE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Robots usage (a parameter)
- parameters - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
The list of the parameters we want to add for this pattern.
- parentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
- parseRobotsTxt(BufferedReader, String, IProcessActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
-
Parse the robots.txt file using a reader.
- password - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
- password - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
- pathField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- pathSpecifiedField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- pattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
-
The bin-matching pattern.
- pattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
URL match pattern
- pattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
The bin-matching pattern.
- pattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription.TrustsItem
-
The bin-matching pattern.
- patternHash - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
-
This is the hash that contains everything.
- patternHash - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
-
This is the hash that contains everything.
- patternHash - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription
-
This is the hash that contains everything.
- peek() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
-
Get current token.
- poll() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
This method is periodically called for all connectors that are connected but not in active use.
- PoolException(String) - Constructor for exception org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.PoolException
- port - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
- port - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
- port - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Port
- portBlankField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- portField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- portSpecifiedField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- portsToString(int[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
Convert a port array to a string.
- pos - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
- potentiallyExcludedHeaders - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- preferredLinkPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
The preferred link pattern, or null if there's no preferred link
- preferredLinkPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
- preferredLinkRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
The preferred link regexp
- preferredRedirectionPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
The preferred redirection pattern, or null if there's no preferred redirection
- preferredRedirectionRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
The preferred redirection regexp
- process() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
-
Process this data
- process() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
-
Process this data
- process() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
-
Process this data
- process() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
-
Process this data
- process(IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedItemContextClass
-
Process the data accumulated for this item
- process(IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFItemContextClass
-
Process the data accumulated for this item
- process(IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
-
Process the data accumulated for this item
- process(IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
-
Process the data accumulated for this item
- ProcessActivityHTMLHandler(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Constructor.
- ProcessActivityLinkHandler(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter, String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
-
Constructor.
- ProcessActivityRedirectionHandler(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityRedirectionHandler
-
Constructor.
- ProcessActivityXMLHandler(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityXMLHandler
-
Constructor.
- processBuffer() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
- processConfigurationPost(IThreadContext, IPostParameters, Locale, ConfigParams) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Process a configuration post.
- processDocument(IProcessActivity, String, String, boolean, Map<String, Set<String>>, String[], WebcrawlerConnector.DocumentURLFilter) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- processDocuments(String[], IExistingVersions, Specification, IProcessActivity, int, boolean) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Process a set of documents.
- processSpecificationPost(IPostParameters, Locale, Specification, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Process a specification post.
- protocol - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
- protocol - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
- protocol - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Protocol
- proxyAuthDomain - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
- proxyAuthDomain - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
- proxyAuthDomain - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Proxy auth domain
- proxyAuthDomain - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Proxy auth domain
- proxyAuthPassword - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
- proxyAuthPassword - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
- proxyAuthPassword - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Proxy auth password
- proxyAuthPassword - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Proxy auth password
- proxyAuthUsername - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
- proxyAuthUsername - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
- proxyAuthUsername - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Proxy auth user name
- proxyAuthUsername - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Proxy auth user name
- proxyHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
- proxyHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
- proxyHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Proxy host
- proxyHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Proxy host
- proxyPort - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
- proxyPort - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
- proxyPort - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Proxy port
- proxyPort - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Proxy port
R
- rawQueryPart - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
- RDFContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
- RDFItemContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFItemContextClass
- read() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
Read a byte.
- read(byte[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
Read lots of bytes.
- read(byte[], int, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
Read lots of specific bytes.
- READ_CHUNK_LENGTH - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
-
The read chunk length
- readCookies(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
Read cookies currently in effect for a given session key.
- readCookiesUncached(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
Read cookies from database, uncached.
- readDNSInfo(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
-
Read DNS data, if it exists.
- readRobotsData(String, IProcessActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
-
Read robots data, if it exists.
- Record() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
-
Constructor.
- recordEverything - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
-
This flag determines whether we record everything to disk, as a means of taking a web snapshot.
- records - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
- redirectionURIPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindPreferredRedirectionHandler
- referralURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
-
The referral URI
- regexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
URL regexp
- REL_LINK - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- REL_REDIRECT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- release(IThrottledConnection) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
- remove() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
- remove() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
- removeAspSession - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
- removeBVSession - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
- removeJavaSession - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
- removePhpSession - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
- reorder - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
- reservedHeaders - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- reset() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
Reset.
- resolve(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
- response - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- responseCode - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
-
The response code
- responseException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- RESULT_NO_DOCUMENT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- RESULT_NO_VERSION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- RESULT_RETRY_DOCUMENT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- RESULT_VERSION_NEEDED - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- resultSignal - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
- RESULTSTATUS_FALSE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- RESULTSTATUS_NOTYETDETERMINED - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- RESULTSTATUS_TRUE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- rethrowExceptions() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.AbortChecker
- returnValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
- returnValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
- returnValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
- ROBOTS_ALL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- ROBOTS_DATA - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- ROBOTS_NONE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- robotsCacheClass - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
- RobotsCacheClass() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsCacheClass
- RobotsData(InputStream, long, String, IProcessActivity) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
-
Constructor.
- robotsField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
- robotsManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
The robots manager currently used by this instance
- RobotsManager - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class manages the database table into which we write robots.txt files for hosts.
- RobotsManager(IThreadContext, IDBInterface) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
-
Constructor.
- RobotsManager.HostDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This is the object description for a robots host object.
- RobotsManager.HostExecutor - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This is the executor object for locating robots host objects.
- RobotsManager.Record - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class represents a record in a robots.txt file.
- RobotsManager.RobotsCacheClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Cache class for robots.
- RobotsManager.RobotsData - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This is a cached data item.
- robotsUsage - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Robots usage flag
- RSSChannelContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
- RSSContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSContextClass
- RSSItemContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
- rules - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicies
- run() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
S
- scriptParseState - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
- ScriptParseState - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class interprets the tag stream generated by the HTMLParseState class, and causes script sections to be skipped
- ScriptParseState() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
- SCRIPTPARSESTATE_INSCRIPT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
- SCRIPTPARSESTATE_NORMAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
- secureField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- seedHosts - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
The hash map of seed hosts, to limit urls by, if non-null
- selectMultiple - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
- selectName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
- SequenceCredentials - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes immutable classes which represent authentication information for sequence-based authentication.
- sequenceKey - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
- server - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
- server - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
- server - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Server
- serviceInterruption - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.AbortChecker
- SessionCredential(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
-
Constructor
- sessionCredentialIndex - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
- SessionCredentialItem(String, Pattern, String, String, Pattern, String, Pattern, String, Pattern, String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Constructor
- SessionCredentialParameter(String, Pattern, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
- sessionKey - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
- sessionPages - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
- sessionPages - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
- sessionState - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
- SESSIONSTATE_LOGIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
We're in 'login mode'
- SESSIONSTATE_NORMAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Normal fetch of content document.
- setAbortChecker(AbortChecker) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Set the abort checker.
- setAbortChecker(AbortChecker) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Set the abort checker.
- setCredential(AuthenticationCredentials) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
-
Set Credentials
- setEnabled(boolean) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
- setMaxOpenConnections(Integer) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
Set maximum open connections.
- setMinimumMillisecondsPerByte(Double) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
Set minimum milliseconds per byte.
- setMinimumMillisecondsPerFetch(Long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
Set minimum milliseconds per fetch
- setValue(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
- shouldIndex() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Decide whether we should index.
- shouldIndex() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityRedirectionHandler
- shouldIndex() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityXMLHandler
- shutdownException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- skip(long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
Skip
- socketTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
- socketTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
- socketTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Socket timeout milliseconds
- socketTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Socket timeout, milliseconds
- startFetchTime - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The start of the current fetch
- statusCode - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The status code fetched, if any
- streamCreated - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- streamException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- streamThrottler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
Stream throttler
- stringToArray(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Read a string as a sequence of individual expressions, urls, etc.
- stringToBoolean(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
Convert a boolean string to a boolean.
- stringToPorts(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
Convert a string to a port array.
- submitMethod - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
-
The form's submit method
- SUBMITMETHOD_GET - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
- SUBMITMETHOD_POST - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
T
- tagCleanup() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
- target - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- targetURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
- text - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
- textValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
- theConnection - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
The connection
- theURL - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
- thisDescription - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
- thisHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
- thisHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
- thisManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
- thisManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
- thisManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
- threadStarted - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Set if thread has been started
- threadStream - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
- throttledConnection - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
The throttled connection we belong to
- ThrottledConnection(ThrottledFetcher.ConnectionPool, IFetchThrottler, String, String, int, PageCredentials, SSLSocketFactory, String, int, String, String, String, int, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Constructor.
- throttleDescription - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
The throttle description
- ThrottleDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class describes complex throttling criteria pulled from a configuration.
- ThrottleDescription(ConfigParams) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
-
Constructor.
- ThrottleDescription.ThrottleItem - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Class representing an individual throttle item.
- ThrottledFetcher - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class uses httpclient to fetch stuff from webservers.
- ThrottledFetcher.ConnectionPool - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Each connection pool has identical connections we can draw on.
- ThrottledFetcher.ConnectionPoolKey - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Connection pool key
- ThrottledFetcher.ExecuteMethodThread - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This thread does the actual socket communication with the server.
- ThrottledFetcher.LaxBrowserCompatSpecProvider - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Class to create a cookie spec.
- ThrottledFetcher.OurBasicCookieStore - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
- ThrottledFetcher.PoolException - Exception in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Pool exception class
- ThrottledFetcher.ThrottledConnection - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Throttled connections.
- ThrottledFetcher.ThrottledInputstream - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class throttles an input stream based on the specified byte rate parameters.
- ThrottledFetcher.WaitException - Exception in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Wait exception class
- ThrottledInputstream(IStreamThrottler, ThrottledFetcher.ThrottledConnection, InputStream) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
Constructor.
- throttleGroupName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Throttle group name
- ThrottleItem(Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
Constructor.
- throwable - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The error trace, if any
- TIME_15MIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
- TIME_1DAY - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
- TIME_2HRS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
- TIME_5MIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
- TIME_6HRS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
- toASCIIString() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
- token - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
- toString() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
- toString() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
- trustsDescription - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
The trusts description
- TrustsDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class describes trust information pulled from a configuration.
- TrustsDescription(ConfigParams) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription
-
Constructor.
- TrustsDescription.TrustsItem - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Class representing an individual credential item.
- TrustsItem(Pattern, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription.TrustsItem
-
Constructor.
- trustStore - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription.TrustsItem
-
The credential, or null if this is a "trust everything" item
- trustStoreString - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
- ttlValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
-
ttl value
- ttlValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
-
ttl value
- ttlValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
-
TTL value is set on a per-channel basis
- ttlValue - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
-
ttl value
- type - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
- type - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
- TYPE_COMMA - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
- TYPE_GROUP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
- TYPE_TEXT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
U
- understoodProtocols - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
- updateCookies(String, LoginCookies) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
Update cookies that are in effect for a given session key.
- UrlsetContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
- UrlsetItemContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
- userAgent - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
The user-agent for this connector instance
- userAgents - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
- userName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
- userName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
V
- value - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
-
Value
- value - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
- value - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
- valueField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- versionField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- versionSpecifiedField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- versionString - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
The version string
- viewConfiguration(IThreadContext, IHTTPOutput, Locale, ConfigParams) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
View configuration.
- viewSpecification(IHTTPOutput, Locale, Specification, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
View specification.
W
- WaitException(long) - Constructor for exception org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WaitException
- WebcrawlerConfig - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Constants for the Webcrawler connector configuration.
- WebcrawlerConfig() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
- WebcrawlerConnector - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This is the Web Crawler implementation of the IRepositoryConnector interface.
- WebcrawlerConnector() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Constructor.
- WebcrawlerConnector.CanonicalizationPolicies - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Class representing a list of canonicalization rules
- WebcrawlerConnector.CanonicalizationPolicy - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Class representing a URL regular expression match, for the purposes of determining canonicalization policy
- WebcrawlerConnector.DocumentURLFilter - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class describes the url filtering information (for crawling and indexing) obtained from a digested DocumentSpecification.
- WebcrawlerConnector.EvaluatorToken - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Evaluator token.
- WebcrawlerConnector.EvaluatorTokenStream - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Token stream.
- WebcrawlerConnector.FeedContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
- WebcrawlerConnector.FeedItemContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
- WebcrawlerConnector.FetchStatus - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
- WebcrawlerConnector.MappingRule - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Class representing a mapping rule
- WebcrawlerConnector.MappingRules - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Class that represents all mappings
- WebcrawlerConnector.NameValue - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Name/value class
- WebcrawlerConnector.OuterContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class handles the outermost XML context for the feed document.
- WebcrawlerConnector.ProcessActivityHTMLHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Class that describes HTML handling
- WebcrawlerConnector.ProcessActivityLinkHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class is the handler for links that get added into an IProcessActivity object.
- WebcrawlerConnector.ProcessActivityRedirectionHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Class that describes redirection handling
- WebcrawlerConnector.ProcessActivityXMLHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Class that describes XML handling
- WebcrawlerConnector.RDFContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
- WebcrawlerConnector.RDFItemContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
- WebcrawlerConnector.RSSChannelContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
- WebcrawlerConnector.RSSContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
- WebcrawlerConnector.RSSItemContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
- WebcrawlerConnector.UrlsetContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
- WebcrawlerConnector.UrlsetItemContextClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
- webThrottleGroupType - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
-
Web throttle group type
- WebURL - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Replacement class for java.net.URI, which is broken in many ways.
- WebURL(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
- WebURL(String, String, int, String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
- WebURL(URI) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
- WebURL(URI, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
- writeDNSData(String, String, String, long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
-
Write DNS data, replacing any existing row.
- writeRobotsData(String, long, InputStream) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
-
Write robots.txt, replacing any existing row.
_
- _rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.AuthenticationCredentials
- _rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
- _rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
- _rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
- _rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
- _rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
- _rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataElement
- _rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
- _rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginCookies
- _rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
- _rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.PageCredentials
- _rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
- _rcsid - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.SequenceCredentials
- _rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
- _rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
- _rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription
- _rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
- _rcsid - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector