- abort() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- abortCheck() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.AbortChecker
-
- abortCheck - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Abort checker
- AbortChecker - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class furnishes an abort signal whenever the job activity says it should.
- AbortChecker(IAbortActivity) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.AbortChecker
-
- abortThread - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- acceptNewTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
-
- actionURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
-
The form's action URI
- activities - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.AbortChecker
-
- activities - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
-
- activities - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
-
- ACTIVITY_FETCH - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
- ACTIVITY_LOGON_END - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
- ACTIVITY_LOGON_START - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
- ACTIVITY_PROCESS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
- ACTIVITY_ROBOTSPARSE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
- add(WebcrawlerConnector.MappingRule) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRules
-
- addAgent(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
-
Add a user-agent.
- addAllow(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
-
Add an allow.
- addAuthPage(String, Pattern, String, String, Pattern, String, Pattern, String, Pattern, String, Pattern) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
-
Add an auth page
- addCookie(Cookie) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
-
- addCookie(Cookie) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
-
Adds an HTTP cookie, replacing any existing equivalent cookies.
- addCookies(Cookie[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
-
Adds an array of HTTP cookies.
- addData(IProcessActivity, String, IThrottledConnection) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
-
Add a data entry into the cache.
- addDisallow(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
-
Add a disallow.
- addElement(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
-
- addPageParameter(int, String, String, Pattern, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
-
Add a page parameter
- addParameter(String, Pattern, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Add parameter
- addRule(WebcrawlerConnector.CanonicalizationPolicy) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicies
-
- addSeedDocuments(ISeedingActivity, Specification, String, long, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Queue "seed" documents.
- advance() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
-
Go on to next token.
- allows - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
-
- amt - Variable in exception org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WaitException
-
- applyFormOverrides(LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
- applyOverrides(LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Apply overrides
- applyOverrides(LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Apply overrides
- applyOverrides(LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindPreferredRedirectionHandler
-
Apply overrides
- applyOverrides(LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
-
- ATTR_ASPSESSIONREMOVAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
aspsessionremoval attribute
- ATTR_BINREGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
The bin regular expression
- ATTR_BVSESSIONREMOVAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
bvsessionremoval attribute
- ATTR_DESCRIPTION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
description attribute
- ATTR_DOMAIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Domain/realm part of credentials (if any)
- ATTR_INSENSITIVE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Whether the match is case insensitive
- ATTR_JAVASESSIONREMOVAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
javasessionremoval attribute
- ATTR_LOWERCASE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
map to lower case
- ATTR_MAP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Map attribute
- ATTR_MATCH - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Match attribute
- ATTR_MATCHREGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Form name or link target regexp for authentication page
- ATTR_NAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
name attribute
- ATTR_NAMEREGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Authentication parameter name regexp
- ATTR_OVERRIDETARGETURL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
URL to fetch next in a sequence (an override)
- ATTR_PASSWORD - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Password part of credentials
- ATTR_PHPSESSIONREMOVAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
phpsessionremoval attribute
- ATTR_REGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
regexp attribute
- ATTR_REORDER - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
reorder attribute
- ATTR_TOKEN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
token attribute
- ATTR_TRUSTEVERYTHING - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
"Trust everything" attribute - replacing truststore if set to 'true'
- ATTR_TRUSTSTORE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Trust store section of authentication record
- ATTR_TYPE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Type of security
- ATTR_URLREGEXP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Regexp for access control node
- ATTR_USERNAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Username part of credentials
- ATTR_VALUE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
The value attribute (used for maxconnections and maxkbpersecond)
- ATTRVALUE_BASIC - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Type value for basic authentication
- ATTRVALUE_CONTENT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Authentication page type: Access
- ATTRVALUE_FALSE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Value false
- ATTRVALUE_FORM - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Authentication page type: Form
- ATTRVALUE_LINK - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Authentication page type: Link
- ATTRVALUE_NO - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Value no
- ATTRVALUE_NTLM - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Type value for NTLM authentication
- ATTRVALUE_REDIRECTION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Authentication page type: Redirection
- ATTRVALUE_SESSION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Type value for session-based authentication
- ATTRVALUE_TRUE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Value true
- ATTRVALUE_YES - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Value yes
- authentication - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
-
The credential
- authentication - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
-
- authentication - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
-
- authentication - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Authentication
- AuthenticationCredentials - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes immutable classes that represent authentication information for all kinds of authentication.
- available() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
Get available.
- cache - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
This is where we keep data around between the getVersions() phase and the processDocuments() phase.
- cacheData - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
-
- cacheKeys - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
-
- cacheKeys - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
-
- cacheKeys - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
-
- calculateDocumentEvents(INamingActivity, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Calculate events that should be associated with a document.
- canLowercase() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
-
- CanonicalizationPolicies() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicies
-
- canonicalizationPolicies - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
Canonicalization policies
- CanonicalizationPolicy(Pattern, boolean, boolean, boolean, boolean, boolean, boolean) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
-
- canRemoveAspSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
-
- canRemoveBvSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
-
- canRemoveJavaSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
-
- canRemovePhpSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
-
- canReorder() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
-
- check() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Check status of connection.
- checkException(Throwable) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- checkFetchAllowed(String, String, long, String, IProcessActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
-
Read robots.txt data from the cache or from the database.
- checkFetchAllowed(String, String, String, int, PageCredentials, IKeystoreManager, String, String[], long, String, IProcessActivity, int, String, int, String, String, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Check robots to see if fetch is allowed.
- checkIfValidFeed() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
-
Check if feed was valid
- checkMatch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
-
- checkMatch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRule
-
- checkSum - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
-
- clear() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
-
Clears all cookies.
- clearExpired(Date) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
-
Removes all cookies in this HTTP state that have expired by the specified date.
- clearThreadContext() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Clear out any state information specific to a given thread.
- close() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Close the connection.
- close() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Close the connection.
- close() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
Close.
- commentField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
- commentURLField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
- compileList(List<Pattern>, List<String>) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Compile all regexp entries in the passed-in list, and add them to the output list.
- ConnectionPool(IConnectionThrottler, String, String, int, PageCredentials, SSLSocketFactory, String, int, String, String, String, int, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
-
- ConnectionPoolKey(String, String, int, PageCredentials, String, String, int, String, String, String, int, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
-
- connectionPools - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
-
Connection pools.
- connections - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
-
The actual pool of connections
- connectionThrottler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
-
Throttler
- connectionTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
-
- connectionTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
-
- connectionTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Connection timeout milliseconds
- connectionTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Connection timeout, milliseconds.
- connManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The http connection manager.
- contentBuffer - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
- contentPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
The content pattern, or null if no content is sought
- contentPatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
- contentRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
The content regexp
- contentType - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
-
The content-type header value
- contextDescription - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
-
- contextException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
-
- contextMessage - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
-
- cookieException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- cookieList - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieSet
-
- CookieManager - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class manages the database table into which we write cookies.
- CookieManager(IThreadContext, IDBInterface) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
Constructor.
- cookieManager - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
The cookie manager used by this instance
- CookieManager.CookiesCacheClass - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Cache class for cookies.
- CookieManager.CookiesDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This is the object description for a session key object.
- CookieManager.CookiesExecutor - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This is the executor object for locating cookie session objects.
- CookieManager.DynamicCookieSet - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This is a set of cookies, built dynamically.
- cookies - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
-
- cookies - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- cookiesCacheClass - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
- CookiesCacheClass() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesCacheClass
-
- CookiesDescription(String, StringSet) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
-
- CookieSet - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class represents a bunch of cookies
- CookieSet(List<Cookie>) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieSet
-
- CookiesExecutor(CookieManager, CookieManager.CookiesDescription) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
-
Constructor.
- cookieStore - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- create(ICacheDescription[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
-
Create a set of new objects to operate on and cache.
- create(ICacheDescription[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
-
Create a set of new objects to operate on and cache.
- create(ICacheDescription[]) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
-
Create a set of new objects to operate on and cache.
- create(HttpContext) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.LaxBrowserCompatSpecProvider
-
- CredentialsDescription - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class describes credential information pulled from a configuration.
- CredentialsDescription(ConfigParams) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
-
Constructor.
- credentialsDescription - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
The credentials description
- CredentialsDescription.BasicCredential - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Basic type credentials
- CredentialsDescription.CredentialsItem - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Class representing an individual credential item.
- CredentialsDescription.LoginParameterIterator - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
LoginParameter iterator
- CredentialsDescription.NTLMCredential - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
NTLM-style credentials
- CredentialsDescription.SessionCredential - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Session credentials
- CredentialsDescription.SessionCredentialItem - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Session credential helper class
- CredentialsDescription.SessionCredentialParameter - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Session credential parameter class
- CredentialsItem(Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
-
Constructor.
- credentialsObject - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
-
- criticalSectionName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
-
- criticalSectionName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
-
- criticalSectionName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
-
- currentFormData - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
- currentIndex - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
-
- currentOne - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
-
- ELEMENTCATEGORY_FIXEDEXCLUSIVE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
-
- ELEMENTCATEGORY_FIXEDINCLUSIVE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
-
- ELEMENTCATEGORY_FREEFORM - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
-
- elementList - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
-
The set of elements
- elementList - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
-
- endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
-
- endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.OuterContextClass
-
Handle the tag ending
- endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
-
- endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFItemContextClass
-
Convert the individual sub-fields of the item context into their final forms
- endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
-
- endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSContextClass
-
- endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
-
Convert the individual sub-fields of the item context into their final forms
- endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
-
- endTag() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
-
Convert the individual sub-fields of the item context into their final forms
- equals(Object) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.AuthenticationCredentials
-
Compare against another object
- equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
-
- equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
-
Compare against another object
- equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
-
Compare against another object
- equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
-
Compare against another object
- equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
- equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
-
- equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
-
- equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
-
- equals(Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
-
- evalExpression - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRule
-
- EvaluatorToken() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
-
- EvaluatorToken(int, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
-
- EvaluatorToken(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
-
- EvaluatorTokenStream(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
-
Constructor.
- excludeContentIndexPatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
List of content exclusion patterns
- excludeIndexPatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
The arraylist of index exclude patterns
- excludePatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
The arraylist of exclude patterns
- execute() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
-
Perform the desired operation.
- execute() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
-
Perform the desired operation.
- execute() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
-
Perform the desired operation.
- executeFetch(String, String, String, boolean, String, FormData, LoginCookies) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Execute the fetch and get the return code.
- executeFetch(String, String, String, boolean, String, FormData, LoginCookies) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Execute the fetch and get the return code.
- executeMethod - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- ExecuteMethodThread(ThrottledFetcher.ThrottledConnection, IFetchThrottler, HttpClient, HttpHost, HttpRequestBase, CookieStore) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- exists(ICacheDescription, Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
-
Notify the implementing class of the existence of a cached version of the object.
- exists(ICacheDescription, Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
-
Notify the implementing class of the existence of a cached version of the object.
- exists(ICacheDescription, Object) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
-
Notify the implementing class of the existence of a cached version of the object.
- expiration - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
-
- expiration - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
-
- expirationDateField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
- expirationField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
-
- expirationField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
-
- expireTime - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
This is when the connection will expire.
- extractContentType(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
- extractEncoding(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
- extractLinks(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Code to extract links from an already-fetched document.
- extractMimeType(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
- FeedContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>, String, IXMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
-
- FeedItemContextClass(XMLFuzzyHierarchicalParseState, String, String, String, Map<String, String>) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedItemContextClass
-
- FETCH_BAD_URI - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
- FETCH_CIRCULAR_REDIRECT - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
- FETCH_INTERRUPTED - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
- FETCH_IO_ERROR - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
- FETCH_LOGIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
- FETCH_NOT_TRIED - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
- FETCH_ROBOTS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
- FETCH_SEQUENCE_ERROR - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
- FETCH_STANDARD - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
- FETCH_UNKNOWN_ERROR - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
- fetchCounter - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The current bytes in the current fetch
- fetchMethod - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The method object
- FetchStatus() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
-
- fetchThrottler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
The fetch throttler
- fetchThrottler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Fetch throttler
- fetchType - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The kind of fetch we are doing
- filter - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
-
- FindContentHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class is the handler for HTML content grepping during state transitions
- FindContentHandler(String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
- FindContentHandler(String, List<Pattern>) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
- findExcludedHeaders(Specification) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Read a document specification to get a set of excluded headers
- FindHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class is used to discover links in a session login context
- FindHandler(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
-
- findHTMLForm(String, LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Find matching HTML form data, if present.
- FindHTMLFormHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class is the handler for HTML form parsing during state transitions
- FindHTMLFormHandler(String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
- FindHTMLHrefHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class is the handler for HTML parsing during state transitions
- FindHTMLHrefHandler(String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
- findHTMLLinkURI(String, LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Find HTML link URI, if present, making sure specified preference is matched.
- findLoginParameters(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
-
For a given login page, specific information may need to be submitted to the server to properly log in.
- findLoginParameters(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.SequenceCredentials
-
For a given login page, specific information may need to be submitted to the server to properly log in.
- findMatch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicies
-
- findNextOne() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
-
Find next one
- FindPreferredRedirectionHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class is the handler for redirection handling during state transitions
- FindPreferredRedirectionHandler(String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindPreferredRedirectionHandler
-
- findPreferredRedirectionURI(String, LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Find a preferred redirection URI, if it exists
- FindRedirectionHandler - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class is the handler for redirection parsing during state transitions
- FindRedirectionHandler(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FindRedirectionHandler
-
- findRedirectionURI(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Find a redirection URI, if it exists
- findSpecifiedContent(String, List<Pattern>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
- findSpecifiedContent(String, LoginParameters) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Find existence of specific content on the page (never finds a URL)
- finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Finish up all processing.
- finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
- finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
- finishUp() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Done with the document.
- finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.LinkParseState
-
- finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- finishUp() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
- flushIdleConnections() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
-
- flushIdleConnections(IThreadContext) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
-
Flush connections that have timed out from inactivity.
- FormData - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes the form data gleaned from an HTML page.
- FormDataAccumulator - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class accumulates form data and allows overrides
- FormDataAccumulator(String, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
-
- FormDataAccumulator.FormItemIterator - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
Iterator over FormItems
- FormDataElement - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes individual form data elements, for form submission.
- FormItem - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class provides an individual data item
- FormItem(String, String, int, boolean) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
-
- FormItemIterator(ArrayList) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
-
- formNamePattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
The form name pattern, or null if no form is expected
- formNamePattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
- formNameRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
The form name regexp
- FormParseState - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class interprets the tag stream generated by the BasicParseState class, and keeps track of the form tags.
- FormParseState(IHTMLHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
-
- formParseState - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
-
- FORMPARSESTATE_IN_FORM - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
-
- FORMPARSESTATE_IN_OPTION - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
-
- FORMPARSESTATE_IN_SELECT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
-
- FORMPARSESTATE_IN_TEXTAREA - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
-
- FORMPARSESTATE_NORMAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
-
- fqdn - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
-
- fqdnField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
-
- from - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
The email address for this connector instance
- generalException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- getAcls(Specification) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Grab the forced ACL out of the document specification.
- getActionURI() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
-
Get the full action URI for this form.
- getActionURI() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
-
Get the full action URI for this form.
- getActivitiesList() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Return the list of activities that this connector supports.
- getAttributeJavascriptString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
-
- getAttributeJavascriptString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
-
- getAttributeJavascriptString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
-
- getAttributeString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
-
- getAttributeString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
-
- getAttributeString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
-
- getBinNames(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Get the bin name string for a document identifier.
- getBodyJavascriptString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
-
- getBodyJavascriptString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
-
- getBodyJavascriptString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
-
- getBodyString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
-
- getBodyString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
-
- getBodyString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
-
- getCanonicalizationPolicies() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
Get canonicalization policies
- getClassName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesCacheClass
-
Get the name of the object class.
- getClassName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSCacheClass
-
Get the name of the object class.
- getClassName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsCacheClass
-
Get the name of the object class.
- getConnection(IThreadContext, String, String, String, int, PageCredentials, IKeystoreManager, IThrottleSpec, String[], int, String, int, String, String, String, int, int, IAbortActivity) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
-
Obtain a connection to specified protocol, server, and port.
- getConnectorModel() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Tell the world what model this connector uses for getDocumentIdentifiers().
- getContentPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the content pattern.
- getContentPattern() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
-
Get the content pattern.
- getContentType() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
-
Get the contentType
- getContentType(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
-
Get the content type.
- getCookie(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
-
- getCookie(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieSet
-
- getCookie(int) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginCookies
-
Get the cookie name
- getCookieCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.DynamicCookieSet
-
- getCookieCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieSet
-
- getCookieCount() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginCookies
-
Get the cookie count
- getCookies() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- getCookies() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.OurBasicCookieStore
-
Returns an immutable array of cookies that this HTTP state currently contains.
- getCookiesCacheKey(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
Construct a global key which represents an individual session.
- getCredential() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
-
Get credential type
- getCriticalSectionName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
-
- getCriticalSectionName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
-
- getCriticalSectionName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
-
- getData() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
-
Get the data
- getData(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
-
Fetch binary data entry from the cache.
- getDataLength(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
-
Fetch binary data length.
- getDNSKey(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
-
Construct a key which represents an individual host name.
- getElementIterator() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
-
Iterate over the active form data elements.
- getElementIterator() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
-
Iterate over the active form data elements.
- getElementName() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataElement
-
Get the element name
- getElementName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
-
Get the element name
- getElementValue() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataElement
-
Get the element value
- getElementValue() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
-
Get the element value
- getEnabled() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
-
- getExpirationTime() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
-
Get the expiration time.
- getExpirationTime() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
-
Get expiration
- getFirstHeader(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- getFormData() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
- getFormNamePattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the form name pattern.
- getFormNamePattern() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
-
Get the form name pattern.
- getFQDN() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
-
Get the FQDN
- getGroupNumber() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
-
- getGroupStyle() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
-
- getHost() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
-
- getHostName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
-
Get the host name
- getHostName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
-
- getHostName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
-
- getIPAddress() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
-
Get the IP address
- getLastFetchCookies() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Get the last fetch cookies.
- getLastFetchCookies() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Get the last fetch cookies.
- getLimitedResponseBody(int, String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Get limited response as a string.
- getLimitedResponseBody(int, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Get limited response as a string.
- getMaxDocumentRequest() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Get the maximum number of documents to amalgamate together into one batch, for this connector.
- getMaxLRUCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesCacheClass
-
Get the maximum LRU count of the object class.
- getMaxLRUCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSCacheClass
-
Get the maximum LRU count of the object class.
- getMaxLRUCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsCacheClass
-
Get the maximum LRU count of the object class.
- getMaxOpenConnections(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
-
Given a bin name, find the max open connections to use for that bin.
- getMaxOpenConnections() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
Get maximum open connections.
- getMinimumMillisecondsPerByte(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
-
Look up minimum milliseconds per byte for a bin.
- getMinimumMillisecondsPerByte() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
Get minimum milliseconds per byte.
- getMinimumMillisecondsPerFetch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
-
Look up minimum milliseconds for a fetch for a bin.
- getMinimumMillisecondsPerFetch() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
Get minimum milliseconds per fetch
- getName() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
-
- getNamePattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
-
- getObjectClass() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
-
Get the object class for an object.
- getObjectClass() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
-
Get the object class for an object.
- getObjectClass() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
-
Get the object class for an object.
- getObjectKeys() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
-
Get the cache keys for an object (which may or may not exist yet in the cache).
- getObjectKeys() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostDescription
-
Get the cache keys for an object (which may or may not exist yet in the cache).
- getObjectKeys() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostDescription
-
Get the cache keys for an object (which may or may not exist yet in the cache).
- getOverrideTargetURL() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the override target URL.
- getOverrideTargetURL() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
-
Get the override target URL.
- getPageCredential(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
-
Given a URL, find the right PageCredentials object to use.
- getPageCredential(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Get the page credentials for a given document identifier (URL)
- getParameter(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the actual parameter
- getParameterCount() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the parameter count
- getParameterCount() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
-
Get the number of parameters.
- getParameterNamePattern(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the name of the i'th parameter.
- getParameterNamePattern(int) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
-
Get the name of the i'th parameter.
- getParameterValue(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the desired value of the i'th parameter.
- getParameterValue(int) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
-
Get the desired value of the i'th parameter.
- getPath() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
-
- getPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
-
Get the pattern.
- getPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the pattern
- getPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
Get the pattern.
- getPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription.TrustsItem
-
Get the pattern.
- getPort() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
-
- getPreferredLinkPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the preferred link pattern.
- getPreferredLinkPattern() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
-
Get the preferred link pattern.
- getPreferredRedirectionPattern() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Get the preferred redirection pattern.
- getPreferredRedirectionPattern() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.LoginParameters
-
Get the preferred redirection pattern.
- getRawQuery() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
-
- getReferralURI() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
-
Get the referral URI
- getReferralURI(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
-
Get the referral URI.
- getRelationshipTypes() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Return the list of relationship types that this connector recognizes.
- getResponseBodyStream() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Get the response input stream.
- getResponseBodyStream() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Get the response input stream.
- getResponseCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.DocumentData
-
Get the response code
- getResponseCode(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache
-
Get the response code.
- getResponseCode() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Get the HTTP response code.
- getResponseCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- getResponseCode() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Get the HTTP response code.
- getResponseHeader(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Get a specified response header, if it exists.
- getResponseHeader(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Get a specified response header, if it exists.
- getResponseHeaders() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Get response headers
- getResponseHeaders() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- getResponseHeaders() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Get response headers
- getResults() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesExecutor
-
Get the result.
- getResults() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.HostExecutor
-
Get the result.
- getResults() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.HostExecutor
-
Get the result.
- getRobotsKey(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
-
Construct a key which represents an individual host name.
- getSafeInputStream() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- getScheme() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebURL
-
- getSequenceCredential(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
-
Given a URL, find the right SequenceCredentials object to use.
- getSequenceCredential(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Get the sequence credentials for a given document identifier (URL)
- getSequenceKey() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
-
Fetch the unique key value for this particular credential.
- getSequenceKey() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.SequenceCredentials
-
Fetch the unique key value for this particular credential.
- getSession() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Start a session
- getSessionKey() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
-
- getString(Locale, String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
-
- getString(Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
-
- getString(String, Locale, String, Object[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
-
- getSubmitMethod() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
-
Get the submit method for this form.
- getSubmitMethod() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
-
Get the submit method for this form.
- getTargetURI() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
-
- getTextValue() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
-
- getTrustStore(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription
-
Given a URL, build the right trust certificate store, or return null if all certs should be accepted.
- getTrustStore() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription.TrustsItem
-
Get keystore
- getTrustStore(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Get the trust store for a given document identifier (URL)
- getType() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
-
- getType() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
-
- getValue() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
-
- getValue() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
-
- getVersionString() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
Get whatever contribution to the version string should come from this data.
- getWaitAmount() - Method in exception org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.WaitException
-
- grab(IAbortActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
-
- groupNumber - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
-
- groupStyle - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
-
- GROUPSTYLE_LOWER - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
-
- GROUPSTYLE_MIXED - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
-
- GROUPSTYLE_NONE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
-
- GROUPSTYLE_UPPER - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorToken
-
- guidField - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
-
- gzip - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- IDiscoveredLinkHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes the functionality needed by a link extractor to note a discovered link.
- idleTimeout - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
-
Idle timeout
- IHTMLHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes the functionality needed by an HTML processor in order to handle an HTML document.
- IMetaTagHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes the functionality needed by a parser to handle metadata tags.
- includeIndexPatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
The arraylist of index include patterns
- includePatterns - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
The arraylist of include patterns
- inputStream - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
The stream we are wrapping.
- install() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
Install the manager.
- install() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
-
Install the manager.
- install() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
-
Install the manager.
- install(IThreadContext) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Install the connector.
- interestingMimeTypeArray - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
This represents a list of the mime types that this connector knows how to extract links from.
- interestingMimeTypeMap - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
- ipaddress - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager.DNSInfo
-
- ipaddressField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.DNSManager
-
- IRedirectionHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes the functionality needed by a redirection processor in order to handle a redirection.
- isAgentMatch(String, boolean) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
-
See if user-agent matches.
- isAllowed(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
-
See if path is allowed.
- isContentInteresting(IFingerprintActivity, String, int, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Code to check if data is interesting, based on response code and content type.
- isDeflateStream() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- isDisallowed(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.Record
-
See if path is disallowed.
- isDocumentAndHostLegal(String, IHistoryActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
Check if both a document and host are legal.
- isDocumentContentIndexable(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
- isDocumentIndexable(String, IProcessActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
Check if the document identifier is indexable, and return the indexing URL if found.
- isDocumentLegal(String, IHistoryActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
Check if the document identifier is legal.
- isDocumentText(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Is the document text, as far as we can tell?
- isEnabled - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
-
- isFetchAllowed(String, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
-
Check if fetch is allowed
- isGZipStream() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- isHostLegal(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
Check if a host is legal.
- isInitialized - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
This flag is set when the instance has been initialized
- isMatch(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRules
-
- isStrange(byte) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Check if character is not typical ASCII or UTF-8.
- isText(byte[], int) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Test to see if a document is text or not.
- isWhiteSpace(byte) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Check if a byte is a whitespace character.
- IThrottledConnection - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface represents an established connection to a URL.
- IXMLHandler - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes the functionality needed by an XML processor in order to handle an XML document.
- makeCredentialsObject(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
-
Turn this instance into a Credentials object, given the specified target host name
- makeCredentialsObject(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
-
Turn this instance into a Credentials object, given the specified target host name
- makeCredentialsObject(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.PageCredentials
-
Turn this instance into a Credentials object, given the specified target host name
- makeDNSEventName(INamingActivity, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Calculate the event name for DNS access.
- makeDocumentIdentifier(String, String, WebcrawlerConnector.DocumentURLFilter, IHistoryActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Convert an absolute or relative URL to a document identifier.
- makeReadable(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager
-
Convert a string from the robots file into a readable form that does NOT contain NUL characters (since PostgreSQL does not accept those).
- makeRobotsEventName(INamingActivity, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Construct a name for the global web-connector robots event.
- makeRobotsKey(String, String, int) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Construct the robots key for a host.
- makeSessionLoginEventName(INamingActivity, String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Calculate the event name for session login.
- map(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRule
-
- map(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRules
-
- MappingRule(Pattern, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRule
-
- MappingRules() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRules
-
- mappings - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
Mapping rules
- mappings - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRules
-
- mark(int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
Mark.
- markSupported() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
Check if mark is supported.
- matchPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.CanonicalizationPolicy
-
- matchPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.MappingRule
-
- MAX_LENGTH - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
- maxOpenConnections - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
The maximum open connections, or null if no limit.
- mcfException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.AbortChecker
-
- Messages - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
- Messages() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.Messages
-
Constructor - do not instantiate
- META_ROBOTS_ALL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
- META_ROBOTS_NONE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
- MetaParseState - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class recognizes and interprets all meta tags
- MetaParseState(IMetaTagHandler) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.MetaParseState
-
- metaRobotsTagsUsage - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Meta robots tag usage flag
- methodThread - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The thread that is actually doing the work
- minimumMillisecondsPerByte - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
The minimum milliseconds between bytes, or null if no limit.
- minimumMillisecondsPerFetch - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
The minimum milliseconds per fetch, or null if no limit
- myPool - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Connection pool
- myUrl - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The current URL being fetched
- name - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
-
- name - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
-
- nameField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
- namePattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
-
Compiled name pattern
- nameRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
-
Name regexp
- NameValue(String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.NameValue
-
- next() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
-
Get the next one
- next() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator.FormItemIterator
-
- nextToken() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
-
- NODE_ACCESS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Forced acl access token node.
- NODE_ACCESSCREDENTIAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Access control description node
- NODE_AUTHPAGE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Authentication page description node
- NODE_AUTHPARAMETER - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Authentication parameter node
- NODE_BINDESC - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
The bin description node
- NODE_EXCLUDEHEADER - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Exclude header node.
- NODE_EXCLUDES - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Exclude regexps node.
- NODE_EXCLUDESCONTENTINDEX - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Exclude any page containing specified regex in their body from index
- NODE_EXCLUDESINDEX - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Exclude regexps node.
- NODE_INCLUDES - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Include regexps node.
- NODE_INCLUDESINDEX - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Include regexps node.
- NODE_LIMITTOSEEDS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Limit to seeds.
- NODE_MAP - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Map entry specification node.
- NODE_MAXCONNECTIONS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
The max connections node
- NODE_MAXFETCHESPERMINUTE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
The max fetch rate node
- NODE_MAXKBPERSECOND - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
The bandwidth node
- NODE_SEEDS - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
The seeds node.
- NODE_TRUST - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Trust store description node
- NODE_URLSPEC - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Canonicalization rule.
- noteAHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note discovered href
- noteAHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note discovered href
- noteAHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note discovered href
- noteAHREF(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Note discovered href
- noteAHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note discovered href
- noteBASEHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note discovered base href
- noteBASEHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note discovered base href
- noteBASEHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note discovered base
- noteBASEHREF(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Note base href
- noteBASEHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note discovered base
- noteDiscoveredBase(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
-
- noteDiscoveredBase(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
- noteDiscoveredBase(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindPreferredRedirectionHandler
-
- noteDiscoveredBase(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IDiscoveredLinkHandler
-
Inform the world of a new base HREF.
- noteDiscoveredBase(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
-
- noteDiscoveredLink(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
-
Inform the world of a discovered link.
- noteDiscoveredLink(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Override noteDiscoveredLink
- noteDiscoveredLink(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindPreferredRedirectionHandler
-
Override noteDiscoveredLink
- noteDiscoveredLink(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IDiscoveredLinkHandler
-
Inform the world of a discovered link.
- noteDiscoveredLink(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
-
Inform the world of a discovered link.
- noteDiscoveredTtlValue(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IXMLHandler
-
Inform the world of a discovered ttl value.
- noteDiscoveredTtlValue(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityXMLHandler
-
Inform the world of a discovered ttl value.
- noteFormEnd() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note the end of a form
- noteFormEnd() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note the end of a form
- noteFormEnd() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note the end of a form
- noteFormEnd() - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Note the end of a form
- noteFormEnd() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note the end of a form
- noteFormInput(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note an input tag
- noteFormInput(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note an input tag
- noteFormInput(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note an input tag
- noteFormInput(Map) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Note an input tag
- noteFormInput(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note an input tag
- noteFormStart(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note the start of a form
- noteFormStart(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note the start of a form
- noteFormStart(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note the start of a form
- noteFormStart(Map) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Note the start of a form
- noteFormStart(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note the start of a form
- noteFRAMESRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note discovered FRAME SRC
- noteFRAMESRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note discovered FRAME SRC
- noteFRAMESRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note discovered FRAME SRC
- noteFRAMESRC(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Note discovered FRAME SRC
- noteFRAMESRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note discovered FRAME SRC
- noteIMGSRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note discovered IMG SRC
- noteIMGSRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note discovered IMG SRC
- noteIMGSRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note discovered IMG SRC
- noteIMGSRC(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Note discovered IMG SRC
- noteIMGSRC(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note discovered IMG SRC
- noteInterrupted(Throwable) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Note that the connection fetch was interrupted by something.
- noteInterrupted(Throwable) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Note that the connection fetch was interrupted by something.
- noteLINKHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note discovered href
- noteLINKHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note discovered href
- noteLINKHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note discovered href
- noteLINKHREF(String) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Note discovered href
- noteLINKHREF(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note discovered href
- noteMetaTag(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note a meta tag
- noteMetaTag(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note a meta tag
- noteMetaTag(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note a meta tag
- noteMetaTag(Map) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IMetaTagHandler
-
Inform the world of a discovered metadata tag.
- noteMetaTag(Map) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note a meta tag
- noteNonscriptEndTag(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
-
- noteNonscriptEndTag(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
-
- noteNonscriptTag(String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
-
- noteNonscriptTag(String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.LinkParseState
-
- noteNonscriptTag(String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.MetaParseState
-
- noteNonscriptTag(String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
-
- noteNormalCharacter(char) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
-
- noteTag(String, Map<String, String>) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
-
- noteTagEnd(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
-
- noteTextCharacter(char) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
Note a character of text.
- noteTextCharacter(char) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
Note a character of text.
- noteTextCharacter(char) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
Note a character of text.
- noteTextCharacter(char) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IHTMLHandler
-
Note a character of text.
- noteTextCharacter(char) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Note a character of text.
- NTLMCredential(String, String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
-
Constructor
- PageCredentials - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes immutable classes that represent authentication information for page-based authentication.
- PARAMETER_EMAIL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Email (a parameter)
- PARAMETER_META_ROBOTS_TAGS_USAGE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Meta robots tags usage (a parameter)
- PARAMETER_PROXYAUTHDOMAIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Proxy auth domain (parameter)
- PARAMETER_PROXYAUTHPASSWORD - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Proxy auth password (parameter)
- PARAMETER_PROXYAUTHUSERNAME - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Proxy auth username (parameter)
- PARAMETER_PROXYHOST - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Proxy host name (parameter)
- PARAMETER_PROXYPORT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Proxy port (parameter)
- PARAMETER_ROBOTSUSAGE - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig
-
Robots usage (a parameter)
- parameters - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
The list of the parameters we want to add for this pattern.
- parentURI - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
-
- parseRobotsTxt(BufferedReader, String, IProcessActivity) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
-
Parse the robots.txt file using a reader.
- password - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.BasicCredential
-
- password - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.NTLMCredential
-
- pathField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
- pathSpecifiedField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
- pattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
-
The bin-matching pattern.
- pattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Url match pattern
- pattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
The bin-matching pattern.
- pattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription.TrustsItem
-
The bin-matching pattern.
- patternHash - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription
-
This is the hash that contains everything.
- patternHash - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription
-
This is the hash that contains everything.
- patternHash - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.TrustsDescription
-
This is the hash that contains everything.
- peek() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
-
Get current token.
- poll() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
This method is periodically called for all connectors that are connected but not in active use.
- PoolException(String) - Constructor for exception org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.PoolException
-
- port - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
-
- port - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
-
- port - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Port
- portBlankField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
- portField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
- portSpecifiedField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
- portsToString(int[]) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
Convert a port array to a string.
- pos - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.EvaluatorTokenStream
-
- potentiallyExcludedHeaders - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
- preferredLinkPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
The preferred link pattern, or null if there's no preferred link
- preferredLinkPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLHrefHandler
-
- preferredLinkRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
The preferred link regexp
- preferredRedirectionPattern - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
The preferred redirection pattern, or null if there's no preferred redirection
- preferredRedirectionRegexp - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
The preferred redirection regexp
- process() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedContextClass
-
Process this data
- process(IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FeedItemContextClass
-
Process the data accumulated for this item
- process() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFContextClass
-
Process this data
- process(IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RDFItemContextClass
-
Process the data accumulated for this item
- process() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSChannelContextClass
-
Process this data
- process(IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.RSSItemContextClass
-
Process the data accumulated for this item
- process() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetContextClass
-
Process this data
- process(IXMLHandler) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.UrlsetItemContextClass
-
Process the data accumulated for this item
- ProcessActivityHTMLHandler(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter, int) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Constructor.
- ProcessActivityLinkHandler(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter, String, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
-
Constructor.
- ProcessActivityRedirectionHandler(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityRedirectionHandler
-
Constructor.
- ProcessActivityXMLHandler(String, IProcessActivity, WebcrawlerConnector.DocumentURLFilter) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityXMLHandler
-
Constructor.
- processBuffer() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FindContentHandler
-
- processConfigurationPost(IThreadContext, IPostParameters, Locale, ConfigParams) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Process a configuration post.
- processDocument(IProcessActivity, String, String, boolean, Map<String, Set<String>>, String[], WebcrawlerConnector.DocumentURLFilter) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
- processDocuments(String[], IExistingVersions, Specification, IProcessActivity, int, boolean) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Process a set of documents.
- processSpecificationPost(IPostParameters, Locale, Specification, int) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Process a specification post.
- protocol - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
-
- protocol - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
-
- protocol - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Protocol
- proxyAuthDomain - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
-
- proxyAuthDomain - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
-
- proxyAuthDomain - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Proxy auth domain
- proxyAuthDomain - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Proxy auth domain
- proxyAuthPassword - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
-
- proxyAuthPassword - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
-
- proxyAuthPassword - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Proxy auth password
- proxyAuthPassword - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Proxy auth password
- proxyAuthUsername - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
-
- proxyAuthUsername - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
-
- proxyAuthUsername - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Proxy auth user name
- proxyAuthUsername - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Proxy auth user name
- proxyHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
-
- proxyHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
-
- proxyHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Proxy host
- proxyHost - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Proxy host
- proxyPort - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
-
- proxyPort - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
-
- proxyPort - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Proxy port
- proxyPort - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Proxy port
- ScriptParseState - Class in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This class interprets the tag stream generated by the HTMLParseState class and causes script sections to be skipped.
- ScriptParseState() - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
-
- scriptParseState - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
-
- SCRIPTPARSESTATE_INSCRIPT - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
-
- SCRIPTPARSESTATE_NORMAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
-
- secureField - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
- seedHosts - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.DocumentURLFilter
-
The hash map of seed hosts used to limit URLs, if non-null.
- selectMultiple - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
-
- selectName - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
-
- SequenceCredentials - Interface in org.apache.manifoldcf.crawler.connectors.webcrawler
-
This interface describes immutable classes which represent authentication information for sequence-based authentication.
- sequenceKey - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
-
- server - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
-
- server - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
-
- server - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Server
- serviceInterruption - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.AbortChecker
-
- SessionCredential(String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
-
Constructor
- sessionCredentialIndex - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
-
- SessionCredentialItem(String, Pattern, String, String, Pattern, String, Pattern, String, Pattern, String, Pattern) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialItem
-
Constructor
- SessionCredentialParameter(String, Pattern, String) - Constructor for class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredentialParameter
-
- sessionKey - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager.CookiesDescription
-
- sessionPages - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.LoginParameterIterator
-
- sessionPages - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.SessionCredential
-
- sessionState - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.FetchStatus
-
- SESSIONSTATE_LOGIN - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
We're in 'login mode'
- SESSIONSTATE_NORMAL - Static variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Normal fetch of content document.
- setAbortChecker(AbortChecker) - Method in interface org.apache.manifoldcf.crawler.connectors.webcrawler.IThrottledConnection
-
Set the abort checker.
- setAbortChecker(AbortChecker) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Set the abort checker.
- setCredential(AuthenticationCredentials) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CredentialsDescription.CredentialsItem
-
Set Credentials
- setEnabled(boolean) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
-
- setMaxOpenConnections(Integer) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
Set maximum open connections.
- setMinimumMillisecondsPerByte(Double) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
Set minimum milliseconds per byte.
- setMinimumMillisecondsPerFetch(Long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottleDescription.ThrottleItem
-
Set minimum milliseconds per fetch
- setValue(String) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormItem
-
- shouldIndex() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityHTMLHandler
-
Decide whether we should index.
- shouldIndex() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityRedirectionHandler
-
- shouldIndex() - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityXMLHandler
-
- shutdownException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- skip(long) - Method in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
Skip
- socketTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPool
-
- socketTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ConnectionPoolKey
-
- socketTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
Socket timeout, milliseconds
- socketTimeoutMilliseconds - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Socket timeout, milliseconds
- startFetchTime - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The start of the current fetch
- statusCode - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledConnection
-
The status code fetched, if any
- streamCreated - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- streamException - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ExecuteMethodThread
-
- streamThrottler - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher.ThrottledInputstream
-
Stream throttler
- stringToArray(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector
-
Read a string as a sequence of individual expressions, urls, etc.
- stringToBoolean(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
Convert a boolean string to a boolean.
- stringToPorts(String) - Static method in class org.apache.manifoldcf.crawler.connectors.webcrawler.CookieManager
-
Convert a string to a port array.
- submitMethod - Variable in class org.apache.manifoldcf.crawler.connectors.webcrawler.FormDataAccumulator
-
The form's submit method
- SUBMITMETHOD_GET - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
-
- SUBMITMETHOD_POST - Static variable in interface org.apache.manifoldcf.crawler.connectors.webcrawler.FormData
-