A
AbstractDroid
- Class in
org.apache.droids
AbstractDroid()
- Constructor for class org.apache.droids.
AbstractDroid
accept(String)
- Method in class org.apache.droids.helper.factories.
URLFiltersFactory
Run all defined filters.
accept(String, String)
- Method in class org.apache.droids.helper.factories.
URLFiltersFactory
Run a specific filter class.
accept()
- Method in class org.apache.droids.net.
RegexRule
Returns whether this rule is used for filtering in or filtering out.
acceptDepth(int)
- Method in class org.apache.droids.queue.
QueueBean
Can we accept the new depth, or would it violate the maxDepth allowed for the queue?
acceptSize(int)
- Method in class org.apache.droids.queue.
QueueBean
Can we accept the new size, or would it violate the maxSize allowed for the queue?
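The two limit checks above can be illustrated with a minimal sketch. It is not the QueueBean source: the class name QueueLimitsSketch, its constructor, and the choice of an inclusive "<=" comparison are assumptions made only for illustration.

```java
// Hypothetical sketch of depth/size limit checks; not the actual QueueBean code.
class QueueLimitsSketch {
    private final int maxDepth; // assumed: highest depth that is still accepted
    private final int maxSize;  // assumed: highest queue size that is still accepted

    QueueLimitsSketch(int maxDepth, int maxSize) {
        this.maxDepth = maxDepth;
        this.maxSize = maxSize;
    }

    // True if the new depth does not exceed the configured maximum depth.
    boolean acceptDepth(int depth) {
        return depth <= maxDepth;
    }

    // True if the new size does not exceed the configured maximum size.
    boolean acceptSize(int size) {
        return size <= maxSize;
    }
}
```

A caller would typically check both limits before queuing a newly discovered task and skip the task if either check fails.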
AGENT
- Variable in class org.apache.http.
PostFile
The AGENT name
B
bufferSize
- Variable in class org.apache.droids.handle.
Save
C
Cli
- Class in
org.apache.droids
The principal class to start droids.
contentTypes
- Static variable in class org.apache.droids.protocol.
MediaType
Deprecated.
Officially known contentTypes
Core
- Class in
org.apache.droids
Core configuration mainly holding the different factories we are using.
Core()
- Constructor for class org.apache.droids.
Core
createCommitDocument()
- Method in class org.apache.droids.handle.
Solr
Creates a commit command.
createOptimizeDocument()
- Method in class org.apache.droids.handle.
Solr
Creates an optimize command.
createUpdateDocument(URL, Parse)
- Method in class org.apache.droids.handle.
Solr
Creates an add command (and returns it as OutputStream) out of the parse result.
currentSize()
- Method in class org.apache.droids.queue.
QueueBean
Return the number of threads that are currently running.
D
DelayTimer
- Interface in
org.apache.droids.api
Define the timer delay interface.
DelayWorker
- Interface in
org.apache.droids.api
Droid
- Interface in
org.apache.droids.api
Interface for a droid.
DroidFactory
- Class in
org.apache.droids.helper.factories
Factory that looks up a droid by its name and returns it.
DroidFactory()
- Constructor for class org.apache.droids.helper.factories.
DroidFactory
DroidsException
- Exception in
org.apache.droids.exception
Wrapper object to limit the number of different exceptions we can throw.
DroidsException(String)
- Constructor for exception org.apache.droids.exception.
DroidsException
Constructs a new exception with the specified detail message.
DroidsException(Throwable)
- Constructor for exception org.apache.droids.exception.
DroidsException
For more information, see Exception.
E
existUrl(URL)
- Static method in class org.apache.droids.net.
UrlHelper
Does the url exist?
F
FileProtocol
- Class in
org.apache.droids.protocol.file
FileProtocol()
- Constructor for class org.apache.droids.protocol.file.
FileProtocol
filter(String)
- Method in interface org.apache.droids.api.
URLFilter
Transforms the URL: can pass the original URL through or "delete" the URL by returning null
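The pass-through-or-null contract described for filter(String) can be sketched as follows. This is only an illustration of the idea: SketchUrlFilter and PrefixFilter are hypothetical names introduced here and are not part of the Droids API.

```java
// Hypothetical stand-in for a URL filter; not the org.apache.droids.api.URLFilter source.
interface SketchUrlFilter {
    // Returns the URL (possibly unchanged) to keep it, or null to "delete" it.
    String filter(String url);
}

// Assumed example filter: keeps only URLs that start with a given prefix.
class PrefixFilter implements SketchUrlFilter {
    private final String prefix;

    PrefixFilter(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public String filter(String url) {
        if (url == null) {
            return null; // nothing to keep
        }
        // Pass the original URL through when it matches, otherwise drop it.
        return url.startsWith(prefix) ? url : null;
    }
}
```

Returning null rather than a boolean lets a filter also rewrite the URL it passes on, which fits the "transforms the URL" wording of the description.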
filter(Parse)
- Method in class org.apache.droids.
HelloWorker
filter(String)
- Method in class org.apache.droids.net.
RegexURLFilter
filterLinks(Parse)
- Method in class org.apache.droids.
HelloWorker
findRobotsUrl(URL, String)
- Static method in class org.apache.droids.net.
UrlHelper
Searches for the url of the robots.txt that is responsible for the given base url.
finishedWorker(long)
- Method in class org.apache.droids.
AbstractDroid
finishedWorker(long)
- Method in interface org.apache.droids.api.
Droid
Notification that we finished a given worker.
from
- Variable in class org.apache.droids.protocol.
HttpBase
G
GaussianRandomDelayTimer
- Class in
org.apache.droids.delay
GaussianRandomDelayTimer()
- Constructor for class org.apache.droids.delay.
GaussianRandomDelayTimer
GenericFactory<T>
- Class in
org.apache.droids.helper.factories
So far, essentially all factories extend this generic factory.
GenericFactory()
- Constructor for class org.apache.droids.helper.factories.
GenericFactory
getAnchor()
- Method in class org.apache.droids.parse.
Outlink
Get the anchor url.
getContentType(String)
- Method in interface org.apache.droids.api.
Protocol
Returns the content type of the url
getContentType()
- Method in exception org.apache.droids.exception.
ParserNotFoundException
If not constructed via message only, it will return the content type which has caused the problem
getContentType(String)
- Method in class org.apache.droids.protocol.file.
FileProtocol
getContentType(String)
- Method in class org.apache.droids.protocol.
HttpBase
Will analyze and return the content type of the given url.
getCore()
- Method in class org.apache.droids.
AbstractDroid
getCore()
- Method in interface org.apache.droids.api.
Droid
Return the core configuration for the current Droid.
getCurrentSize()
- Method in class org.apache.droids.queue.
QueueBean
What is the current size of all queue items actively worked on.
getData()
- Method in interface org.apache.droids.api.
Parse
Other data extracted from the page.
getData()
- Method in class org.apache.droids.parse.
ParseImpl
getDelay()
- Method in class org.apache.droids.delay.
SimpleDelayTimer
Returns the delay time.
getDelayMillis()
- Method in interface org.apache.droids.api.
DelayTimer
Returns the value of the delay between requests.
getDelayMillis()
- Method in class org.apache.droids.delay.
GaussianRandomDelayTimer
getDelayMillis()
- Method in class org.apache.droids.delay.
RandomDelayTimer
getDelayMillis()
- Method in class org.apache.droids.delay.
SimpleDelayTimer
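As a rough illustration of how a fixed and a randomized implementation of getDelayMillis() might differ, consider the sketch below. The interface SketchDelayTimer, both class names, the constructor parameters, and the jitter formula are assumptions for this example; the actual Droids timers may compute their delays differently.

```java
import java.util.Random;

// Hypothetical stand-in for a delay timer contract.
interface SketchDelayTimer {
    long getDelayMillis();
}

// Always returns the same configured delay.
class FixedDelaySketch implements SketchDelayTimer {
    private final long delay;

    FixedDelaySketch(long delay) {
        this.delay = delay;
    }

    @Override
    public long getDelayMillis() {
        return delay;
    }
}

// Adds a non-negative, Gaussian-distributed jitter to a fixed base delay (assumed formula).
class GaussianDelaySketch implements SketchDelayTimer {
    private final long fixedDelay;
    private final long spread;
    private final Random random;

    GaussianDelaySketch(long fixedDelay, long spread, Random random) {
        this.fixedDelay = fixedDelay;
        this.spread = spread;
        this.random = random;
    }

    @Override
    public long getDelayMillis() {
        // Math.abs keeps the jitter non-negative so the result never drops below fixedDelay.
        long jitter = Math.round(Math.abs(random.nextGaussian()) * spread);
        return fixedDelay + jitter;
    }
}
```

Passing the Random instance in makes the randomized timer reproducible in tests by using a fixed seed.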
getDelayTimer()
- Method in class org.apache.droids.
HelloCrawler
Get the DelayTimer implementation that we want to use.
getDelayTimer()
- Method in class org.apache.droids.
HelloWorker
getDepth()
- Method in interface org.apache.droids.api.
Task
The depth of the current task.
getDepth()
- Method in interface org.apache.droids.api.
Worker
getDepth()
- Method in class org.apache.droids.
HelloWorker
getDepth()
- Method in class org.apache.droids.parse.
Outlink
getDepth()
- Method in class org.apache.droids.queue.
QueueLink
getDoneTasks()
- Method in class org.apache.droids.queue.
QueueBean
Return an array of all tasks that we already finished.
getDroid(String)
- Method in class org.apache.droids.
Core
Return the droid we want to use identified by the given name.
getDroid()
- Method in class org.apache.droids.
HelloWorker
getDroid(String)
- Method in class org.apache.droids.helper.factories.
DroidFactory
Lookup a droid by its name and return it.
getElements()
- Method in class org.apache.droids.parse.html.
HtmlParser
getEventFactory()
- Method in class org.apache.droids.helper.
StAX
Get the ready-to-use EventFactory
getEventParser(InputStream)
- Method in class org.apache.droids.helper.
StAX
Get an event Parser based on the incoming stream
getFiltersFactory()
- Method in class org.apache.droids.
Core
Returns the filtersFactory that knows all registered filters.
getFiltersFactory()
- Method in class org.apache.droids.
HelloWorker
getFixedDelay()
- Method in class org.apache.droids.delay.
GaussianRandomDelayTimer
getFreeSlots()
- Method in class org.apache.droids.
AbstractDroid
Get number of slots that we have currently open to accept new workers.
getFrom()
- Method in interface org.apache.droids.api.
Link
From where the link was created
getFrom()
- Method in class org.apache.droids.protocol.
HttpBase
Returns the email address of the bot.
getFrom()
- Method in class org.apache.droids.queue.
QueueLink
getHandlerFactory()
- Method in class org.apache.droids.
Core
Returns the handlerFactory that knows all registered handlers.
getHandlerFactory()
- Method in class org.apache.droids.
HelloWorker
getId()
- Method in interface org.apache.droids.api.
Task
The id of the task.
getId()
- Method in interface org.apache.droids.api.
Worker
getId()
- Method in class org.apache.droids.
HelloWorker
getId()
- Method in class org.apache.droids.parse.
Outlink
getId()
- Method in class org.apache.droids.queue.
QueueLink
getLastModifiedDate()
- Method in interface org.apache.droids.api.
Link
last modified date
getLastModifiedDate()
- Method in class org.apache.droids.queue.
QueueLink
getLink()
- Method in class org.apache.droids.
HelloWorker
getMap()
- Method in class org.apache.droids.helper.factories.
GenericFactory
Get the register which contains all components.
getMaxDepth()
- Method in class org.apache.droids.queue.
QueueBean
The limitation of how many loops we want to admit.
getMaxSize()
- Method in class org.apache.droids.queue.
QueueBean
The limitation of how many queue items we want to admit.
getMaxThreads()
- Method in class org.apache.droids.
AbstractDroid
Get number of maximum allowed threads
getOutlinks()
- Method in class org.apache.droids.parse.
ParseData
Get the outlinks of the page.
getOutputDir()
- Method in class org.apache.droids.handle.
Save
Get the directory where we want to save the stream.
getParse(InputStream, Task)
- Method in interface org.apache.droids.api.
Parser
Creates the parse for some content.
getParse()
- Method in class org.apache.droids.
HelloWorker
getParse(InputStream, Task)
- Method in class org.apache.droids.parse.html.
HtmlParser
getParser(String)
- Method in class org.apache.droids.helper.factories.
ParserFactory
Lookup a parser by its identifier (content type) and return it.
getParser(InputStream)
- Method in class org.apache.droids.helper.
StAX
Get a stream Parser based on the incoming stream
getParserFactory()
- Method in class org.apache.droids.
Core
Returns the parserFactory that knows all registered parsers.
getParserFactory()
- Method in class org.apache.droids.
HelloWorker
getPool()
- Method in class org.apache.droids.
AbstractDroid
Get our pool.
getProtocol()
- Method in class org.apache.droids.
HelloWorker
getProtocol(String)
- Method in class org.apache.droids.helper.factories.
ProtocolFactory
Will look up a protocol based on the underlying uri
getProtocolFactory()
- Method in class org.apache.droids.
Core
Returns the protocolFactory that knows all registered protocols.
getProtocolFactory()
- Method in class org.apache.droids.
HelloWorker
getQueue()
- Method in class org.apache.droids.
AbstractDroid
Get the queue implementation that we want to use.
getQueue()
- Method in class org.apache.droids.
HelloWorker
getRefer()
- Method in class org.apache.droids.protocol.
HttpBase
Return the refer URI where the bot is sent from.
getResponseBodyAsStream()
- Method in class org.apache.http.
PostFile
getRunningThreads()
- Method in class org.apache.droids.
AbstractDroid
Get number of currently running threads
getRunningWorker()
- Method in class org.apache.droids.
AbstractDroid
Return the map of running workers
getsolrBase()
- Method in class org.apache.http.
PostFile
getSrc()
- Method in class org.apache.http.
PostFile
getStreamWriter(OutputStream)
- Method in class org.apache.droids.helper.
StAX
Get a stream writer based on the incoming stream
getTask(String)
- Method in interface org.apache.droids.api.
Queue
Return the task that is identified with the given id
getTask(String)
- Method in class org.apache.droids.queue.
Simple
getTaskDate()
- Method in class org.apache.droids.
AbstractDroid
When did the task first show up in the queue?
getTaskDate()
- Method in interface org.apache.droids.api.
Task
When did the task first show up in the queue?
getTaskDate()
- Method in class org.apache.droids.parse.
Outlink
getTaskDate()
- Method in class org.apache.droids.queue.
QueueLink
getText()
- Method in interface org.apache.droids.api.
Parse
The textual content of the page.
getText()
- Method in class org.apache.droids.parse.
ParseImpl
getTimeout()
- Method in class org.apache.droids.protocol.
HttpBase
Get the timeout we want for the connection.
getTo()
- Method in interface org.apache.droids.api.
Link
Where the link is pointing to
getTo()
- Method in class org.apache.droids.queue.
QueueLink
getToDoTasks()
- Method in class org.apache.droids.queue.
QueueBean
Return an array of all tasks that we still need to finish.
getToUrl()
- Method in class org.apache.droids.parse.
Outlink
Get the destination url.
getUpdateUrl()
- Method in class org.apache.droids.handle.
Solr
Get the update url of the Apache Solr server in use
getUri()
- Method in class org.apache.droids.
HelloWorker
getUrl()
- Method in exception org.apache.droids.exception.
ParserNotFoundException
If not constructed via message only, it will return the url which has caused the problem
getUrl()
- Method in exception org.apache.droids.exception.
ProtocolNotFoundException
Will return the url which has caused the problem
getUrl()
- Method in class org.apache.droids.
HelloCrawler
Return the initial url
getUrlPrefix(URL)
- Static method in class org.apache.droids.net.
UrlHelper
Creating a valid protocol prefix.
getUserAgent()
- Method in class org.apache.droids.protocol.
HttpBase
Get the name of our UserAgent
getWorker()
- Method in class org.apache.droids.
AbstractDroid
Get the default worker for the class.
getWorker()
- Method in class org.apache.droids.
HelloCrawler
getWriter(OutputStream)
- Method in class org.apache.droids.helper.
StAX
Get an event writer based on the incoming stream
H
handle(InputStream, URL, Parse)
- Method in interface org.apache.droids.api.
Handler
handle(InputStream, URL, Parse)
- Method in class org.apache.droids.handle.
Save
handle(InputStream, URL, Parse)
- Method in class org.apache.droids.handle.
Solr
handle(InputStream, URL, Parse)
- Method in class org.apache.droids.handle.
Sysout
handle(Parse)
- Method in class org.apache.droids.
HelloWorker
handle(InputStream, URL, Parse)
- Method in class org.apache.droids.helper.factories.
HandlerFactory
Will traverse all registered handlers and execute them.
Handler
- Interface in
org.apache.droids.api
A handler is a component that uses the stream, the parse and url to invoke arbitrary business logic on the objects.
HandlerFactory
- Class in
org.apache.droids.helper.factories
Factory that will traverse all registered handlers and execute them.
HandlerFactory()
- Constructor for class org.apache.droids.helper.factories.
HandlerFactory
hasNext()
- Method in interface org.apache.droids.api.
Queue
Do we have more tasks waiting for service?
hasNext()
- Method in class org.apache.droids.queue.
Simple
HelloCrawler
- Class in
org.apache.droids
Default implementation of a crawler.
HelloCrawler()
- Constructor for class org.apache.droids.
HelloCrawler
HelloWorker
- Class in
org.apache.droids
HelloWorker()
- Constructor for class org.apache.droids.
HelloWorker
HtmlParser
- Class in
org.apache.droids.parse.html
HtmlParser()
- Constructor for class org.apache.droids.parse.html.
HtmlParser
Http
- Class in
org.apache.droids.protocol.http
Simple implementation for http protocol.
Http()
- Constructor for class org.apache.droids.protocol.http.
Http
HttpBase
- Class in
org.apache.droids.protocol
Helper class that provides basic methods like returning the agent string and content type.
HttpBase()
- Constructor for class org.apache.droids.protocol.
HttpBase
I
id
- Variable in class org.apache.droids.
HelloWorker
init(Task[])
- Method in interface org.apache.droids.api.
Queue
Create the initial task list as queue
init(Task[])
- Method in class org.apache.droids.queue.
Simple
initQueue()
- Method in interface org.apache.droids.api.
Droid
Initialize the queue.
initQueue()
- Method in class org.apache.droids.
HelloCrawler
isAllowed(String)
- Method in interface org.apache.droids.api.
Protocol
Some protocols (like http) offer a mechanism to evaluate whether the client can request a given url (in http this is the robots.txt configuration)
isAllowed(String)
- Method in class org.apache.droids.protocol.file.
FileProtocol
isAllowed(String)
- Method in class org.apache.droids.protocol.http.
Http
isAllowed(String)
- Method in interface org.apache.http.norobots.
Rule
Boolean.TRUE means it is allowed.
isContentType(String)
- Static method in class org.apache.droids.protocol.
MediaType
Deprecated.
Test whether a given type is in our array of known media types.
isForceAllow()
- Method in class org.apache.droids.protocol.http.
Http
You can force that a site is allowed (ignoring the robots.txt).
isIncludeHost()
- Method in class org.apache.droids.handle.
Save
Do we want to prefix the export dir with the host name.
isUrlAllowed(URL)
- Method in class org.apache.http.norobots.
NoRobotClient
Decide if the parsed website will allow this URL to be seen.
L
Link
- Interface in
org.apache.droids.api
Simple extension of a Task.
log
- Variable in class org.apache.droids.handle.
WriterHandler
log
- Variable in class org.apache.droids.
HelloWorker
log
- Variable in class org.apache.droids.helper.factories.
GenericFactory
log
- Variable in class org.apache.droids.helper.
Loggable
log
- Variable in class org.apache.droids.queue.
QueueBean
Loggable
- Class in
org.apache.droids.helper
Simple wrapper class for easier debugging and logging.
Loggable()
- Constructor for class org.apache.droids.helper.
Loggable
M
main(String[])
- Static method in class org.apache.droids.
Cli
Invoke the processing with droids.
main(String[])
- Static method in class org.apache.droids.
SimpleThreads
match(String)
- Method in class org.apache.droids.net.
RegexRule
Checks if a url matches this rule.
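The combination of accept() and match(String) suggests a rule that pairs a pattern with an include/exclude flag. The sketch below shows that idea with java.util.regex; the class name RegexRuleSketch and its two-argument constructor are hypothetical, and the use of a partial match (find) rather than a full match is an assumption.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of a regex rule; not the org.apache.droids.net.RegexRule source.
class RegexRuleSketch {
    private final boolean accept;  // true: matching URLs are filtered in; false: filtered out
    private final Pattern pattern;

    RegexRuleSketch(boolean accept, String regex) {
        this.accept = accept;
        this.pattern = Pattern.compile(regex);
    }

    // Whether this rule is used for filtering in (true) or out (false).
    boolean accept() {
        return accept;
    }

    // True if the pattern is found anywhere in the url (assumed partial-match semantics).
    boolean match(String url) {
        return pattern.matcher(url).find();
    }
}
```

A filter built on such rules could walk them in order and let the first matching rule decide, via accept(), whether the URL is kept.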
MediaType
- Class in
org.apache.droids.protocol
Deprecated.
Soon to be replaced by the Tika mediaType support
merge(Task[])
- Method in interface org.apache.droids.api.
Queue
Merge a given list of tasks with the current queue.
merge(Task[])
- Method in class org.apache.droids.queue.
Simple
N
next()
- Method in interface org.apache.droids.api.
Queue
Return the next task that is waiting for service
next()
- Method in class org.apache.droids.queue.
Simple
NoRobotClient
- Class in
org.apache.http.norobots
A Client which may be used to decide which urls on a website may be looked at, according to the norobots specification located at: http://www.robotstxt.org/wc/norobots-rfc.html
NoRobotClient(String)
- Constructor for class org.apache.http.norobots.
NoRobotClient
Create a Client for a particular user-agent name.
NoRobotException
- Exception in
org.apache.http.norobots
Application exception for anything that might go wrong in the checking of a robots.txt file.
NoRobotException(String)
- Constructor for exception org.apache.http.norobots.
NoRobotException
NoRobotException(String, Throwable)
- Constructor for exception org.apache.http.norobots.
NoRobotException
O
openStream(String)
- Method in interface org.apache.droids.api.
Protocol
Return the stream representation of the url
openStream(String)
- Method in class org.apache.droids.protocol.file.
FileProtocol
openStream(String)
- Method in class org.apache.droids.protocol.http.
Http
org.apache.droids
- package org.apache.droids
This package is the principal package for Apache Droids.
org.apache.droids.api
- package org.apache.droids.api
This package defines all interfaces that we are using for droids.
org.apache.droids.delay
- package org.apache.droids.delay
This package is the principal package for Apache Droids Delay Timers.
org.apache.droids.exception
- package org.apache.droids.exception
This package defines some custom exceptions that we are using in droids.
org.apache.droids.handle
- package org.apache.droids.handle
This package contains some basic implementations of various handlers.
org.apache.droids.helper
- package org.apache.droids.helper
This package contains various helpers.
org.apache.droids.helper.factories
- package org.apache.droids.helper.factories
This package contains all core factories that we use in Droids.
org.apache.droids.net
- package org.apache.droids.net
This package contains various helpers for working with protocols and network communication.
org.apache.droids.parse
- package org.apache.droids.parse
This package contains various helpers and implementations around parsing.
org.apache.droids.parse.html
- package org.apache.droids.parse.html
This package contains various parsers.
org.apache.droids.protocol
- package org.apache.droids.protocol
This package contains various classes around the support of protocol-specific classes.
org.apache.droids.protocol.file
- package org.apache.droids.protocol.file
This package contains various file protocol implementations.
org.apache.droids.protocol.http
- package org.apache.droids.protocol.http
This package contains various http protocol implementations.
org.apache.droids.queue
- package org.apache.droids.queue
This package contains various classes around the support of protocol-specific classes.
org.apache.http
- package org.apache.http
org.apache.http.norobots
- package org.apache.http.norobots
Using norobots-rfc
Outlink
- Class in
org.apache.droids.parse
An outlink that implements the task interface.
Outlink(String, String, int)
- Constructor for class org.apache.droids.parse.
Outlink
Create a new instance for the given parameters.
Outlink(String, int)
- Constructor for class org.apache.droids.parse.
Outlink
Create a new instance for the given parameters.
P
Parse
- Interface in
org.apache.droids.api
Wrapper object that encapsulates the result of the parsing of the underlying document.
parse(URL)
- Method in class org.apache.http.norobots.
NoRobotClient
Head to a website and suck in their robots.txt file.
ParseData
- Class in
org.apache.droids.parse
The result object that is filled by a parser
ParseData(Outlink[])
- Constructor for class org.apache.droids.parse.
ParseData
Create a new instance of Parse data for the given outlinks
ParseImpl
- Class in
org.apache.droids.parse
Default implementation of Parse
ParseImpl(String, ParseData)
- Constructor for class org.apache.droids.parse.
ParseImpl
Create a new instance of a Parse for the given text and ParseData
Parser
- Interface in
org.apache.droids.api
Simple parser that is only required to return a parse object.
ParserFactory
- Class in
org.apache.droids.helper.factories
Factory that will lookup a parser by its identifier and return it.
ParserFactory()
- Constructor for class org.apache.droids.helper.factories.
ParserFactory
ParserNotFoundException
- Exception in
org.apache.droids.exception
ParserNotFoundException gives a detailed exception for problems that can occur while parsing a task.
ParserNotFoundException(String, String)
- Constructor for exception org.apache.droids.exception.
ParserNotFoundException
Create an exception for the given url and content type
ParserNotFoundException(String, String, String)
- Constructor for exception org.apache.droids.exception.
ParserNotFoundException
Create an exception for the given url and content type
ParserNotFoundException(String)
- Constructor for exception org.apache.droids.exception.
ParserNotFoundException
Constructs a new exception with the specified detail message.
parseText(String)
- Method in class org.apache.http.norobots.
NoRobotClient
pipe(Reader, Writer)
- Static method in class org.apache.droids.handle.
WriterHandler
Pipes everything from the reader to the writer via a buffer
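Piping a reader into a writer through a buffer is a common pattern; a minimal sketch follows. The class PipeSketch, the 1024-character buffer size, and the helper copyToString are assumptions for this example and are not taken from WriterHandler.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical sketch of buffered piping; not the WriterHandler source.
class PipeSketch {
    // Copies all characters from reader to writer using a fixed-size buffer.
    static void pipe(Reader reader, Writer writer) throws IOException {
        char[] buffer = new char[1024]; // assumed buffer size
        int read;
        while ((read = reader.read(buffer)) != -1) {
            writer.write(buffer, 0, read);
        }
        writer.flush();
    }

    // Convenience helper for the example: pipes a reader into a String.
    static String copyToString(Reader reader) {
        StringWriter out = new StringWriter();
        try {
            pipe(reader, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

The buffer bounds memory use regardless of how much data the reader delivers, which is why this shape is preferred over reading the whole input first.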
post(String, String)
- Method in class org.apache.http.
PostFile
PostFile
- Class in
org.apache.http
PostFile(String, InputStream)
- Constructor for class org.apache.http.
PostFile
prepareConnection(URL)
- Method in class org.apache.droids.protocol.
HttpBase
Will prepare a HttpURLConnection with the userAgent, from, the refer and the timeout
Protocol
- Interface in
org.apache.droids.api
The protocol interface is a wrapper to hide the underlying implementation of the communication at protocol level.
ProtocolFactory
- Class in
org.apache.droids.helper.factories
Factory that will lookup a protocol plugin and return it.
ProtocolFactory()
- Constructor for class org.apache.droids.helper.factories.
ProtocolFactory
ProtocolNotFoundException
- Exception in
org.apache.droids.exception
Thrown if we do not have any instance of a protocol registered for the given url.
ProtocolNotFoundException(String)
- Constructor for exception org.apache.droids.exception.
ProtocolNotFoundException
Create an exception for the given url
ProtocolNotFoundException(String, String)
- Constructor for exception org.apache.droids.exception.
ProtocolNotFoundException
Create an exception for the given url and detailed message
Q
Queue
- Interface in
org.apache.droids.api
A queue is the data structure where the different tasks are waiting for service.
QueueBean
- Class in
org.apache.droids.queue
Simple bean that holds all information (as wrapper bean) that is needed when working with queue objects.
QueueBean()
- Constructor for class org.apache.droids.queue.
QueueBean
QueueLink
- Class in
org.apache.droids.queue
Simple implementation of a link that is also an implementation of a task
QueueLink(String, String, int)
- Constructor for class org.apache.droids.queue.
QueueLink
Create a new instance of a QueueLink based on the input parameter.
R
random
- Variable in class org.apache.droids.delay.
RandomDelayTimer
RandomDelayTimer
- Class in
org.apache.droids.delay
RandomDelayTimer()
- Constructor for class org.apache.droids.delay.
RandomDelayTimer
refer
- Variable in class org.apache.droids.protocol.
HttpBase
RegexRule
- Class in
org.apache.droids.net
A generic regular expression rule.
RegexRule(boolean)
- Constructor for class org.apache.droids.net.
RegexRule
Constructs a new regular expression rule.
RegexURLFilter
- Class in
org.apache.droids.net
Regular expression implementation of a URLFilter.
RegexURLFilter()
- Constructor for class org.apache.droids.net.
RegexURLFilter
resolve(String)
- Method in class org.apache.droids.helper.factories.
GenericFactory
Will look up which component is linked to the name and will return it.
Rule
- Interface in
org.apache.http.norobots
A robots.txt rule.
run()
- Method in interface org.apache.droids.api.
Droid
Invoke an instance of the worker used in the droid
run()
- Method in class org.apache.droids.
HelloCrawler
Do the work (whatever is defined in the Droid and its workers)
run()
- Method in class org.apache.droids.
HelloWorker
S
Save
- Class in
org.apache.droids.handle
Handler that writes the stream to the file system.
Save()
- Constructor for class org.apache.droids.handle.
Save
setCore(Core)
- Method in class org.apache.droids.
AbstractDroid
Set the fully configured core and inject it in the
setCurrentSize(int)
- Method in class org.apache.droids.queue.
QueueBean
Set the current size of all queue items actively worked on.
setDelay(long)
- Method in class org.apache.droids.delay.
SimpleDelayTimer
Sets the delay time.
setDelayTimer(DelayTimer)
- Method in interface org.apache.droids.api.
DelayWorker
setDelayTimer(DelayTimer)
- Method in class org.apache.droids.
HelloCrawler
setDelayTimer(DelayTimer)
- Method in class org.apache.droids.
HelloWorker
setDepth(int)
- Method in interface org.apache.droids.api.
Task
The limit of nested task extractions we want to allow
setDepth(int)
- Method in interface org.apache.droids.api.
Worker
setDepth(int)
- Method in class org.apache.droids.
HelloWorker
setDepth(int)
- Method in class org.apache.droids.parse.
Outlink
setDepth(int)
- Method in class org.apache.droids.queue.
QueueLink
setDoneTasks(Task[])
- Method in class org.apache.droids.queue.
QueueBean
Set an array of all tasks that we already finished.
setDroid(Droid)
- Method in interface org.apache.droids.api.
Worker
setDroid(Droid)
- Method in class org.apache.droids.
HelloWorker
setDroids(DroidFactory)
- Method in class org.apache.droids.
Core
Set the droidsFactory we are using.
setElements(Map<String, String>)
- Method in class org.apache.droids.parse.html.
HtmlParser
setFile(String)
- Method in class org.apache.droids.net.
RegexURLFilter
setFiltersFactory(URLFiltersFactory)
- Method in class org.apache.droids.
Core
Set the pre-configured filtersFactory that knows all registered filters.
setFixedDelay(long)
- Method in class org.apache.droids.delay.
GaussianRandomDelayTimer
setForceAllow(boolean)
- Method in class org.apache.droids.protocol.http.
Http
You can force that a site is allowed (ignoring the robots.txt).
setFreeSlots(int)
- Method in class org.apache.droids.
AbstractDroid
Set number of slots that we have currently open to accept new workers
setFrom(String)
- Method in class org.apache.droids.protocol.
HttpBase
Set the email address of the bot.
setHandlerFactory(HandlerFactory)
- Method in class org.apache.droids.
Core
Set the pre-configured handlerFactory that knows all registered handlers.
setIncludeHost(boolean)
- Method in class org.apache.droids.handle.
Save
Do we want to prefix the export dir with the host name.
setMap(Map)
- Method in class org.apache.droids.helper.factories.
GenericFactory
Set the register which contains all components.
setMaxDepth(int)
- Method in class org.apache.droids.queue.
QueueBean
The limitation of how many loops we want to admit.
setMaxSize(int)
- Method in class org.apache.droids.queue.
QueueBean
The limitation of how many queue items we want to admit.
setMaxThreads(int)
- Method in class org.apache.droids.
AbstractDroid
Adjust number of allowed threads
setOutputDir(String)
- Method in class org.apache.droids.handle.
Save
Set the directory where we want to save the stream.
setParserFactory(ParserFactory)
- Method in class org.apache.droids.
Core
Set the pre-configured parserFactory that knows all registered parsers.
setPool(ThreadPoolExecutor)
- Method in class org.apache.droids.
AbstractDroid
Set our pool.
setProtocol(Protocol)
- Method in class org.apache.droids.
HelloWorker
setProtocolFactory(ProtocolFactory)
- Method in class org.apache.droids.
Core
Set the pre-configured protocolFactory that knows all registered protocols.
setQueue(Queue)
- Method in class org.apache.droids.
AbstractDroid
setQueue(Queue)
- Method in interface org.apache.droids.api.
Droid
Which implementation of a queue are we using.
setQueue(Queue)
- Method in interface org.apache.droids.api.
Worker
setQueue(Queue)
- Method in class org.apache.droids.
HelloWorker
setRefer(String)
- Method in class org.apache.droids.protocol.
HttpBase
Set the refer URI where the bot is sent from.
setRunningThreads(int)
- Method in class org.apache.droids.
AbstractDroid
To set the number of running threads.
setRunningWorker(ConcurrentHashMap<Long, Worker>)
- Method in class org.apache.droids.
AbstractDroid
Set the map of running workers
setTaskDate(String)
- Method in class org.apache.droids.
AbstractDroid
When did the task first show up in the queue?
setTimeout(int)
- Method in class org.apache.droids.protocol.
HttpBase
Set the timeout we want for the connection.
setToDoTasks(Task[])
- Method in class org.apache.droids.queue.
QueueBean
Set an array of all tasks that we have not finished yet and add them to the queue.
setUpdateUrl(String)
- Method in class org.apache.droids.handle.
Solr
Set the update url of the Apache Solr server in use
setUri(String)
- Method in class org.apache.droids.
HelloWorker
setUrl(String)
- Method in class org.apache.droids.
HelloCrawler
Set the initial url
setUserAgent(String)
- Method in class org.apache.droids.protocol.
HttpBase
Set the name of our UserAgent
shutdownAndAwaitTermination()
- Method in class org.apache.droids.
AbstractDroid
Shutdown all threads, close the pools and leave.
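Since AbstractDroid exposes a ThreadPoolExecutor through setPool, a shutdown routine of this kind plausibly follows the two-phase pattern recommended in the java.util.concurrent.ExecutorService documentation. The sketch below shows that standard pattern; the class name ShutdownSketch, the timeout parameter, and the boolean return value are assumptions, and the actual Droids method may behave differently.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a two-phase pool shutdown; not the AbstractDroid source.
class ShutdownSketch {
    // Returns true if the pool terminated within the allowed time.
    static boolean shutdownAndAwaitTermination(ExecutorService pool, long timeoutSeconds) {
        pool.shutdown(); // stop accepting new tasks, let running ones finish
        try {
            if (!pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                pool.shutdownNow(); // cancel tasks that are still running
                return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
            }
            return true;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }
}
```

Calling shutdown() first gives workers a chance to finish cleanly; shutdownNow() is only the fallback when they do not finish in time.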
Simple
- Class in
org.apache.droids.queue
Simple()
- Constructor for class org.apache.droids.queue.
Simple
Simple queue constructor.
SimpleDelayTimer
- Class in
org.apache.droids.delay
SimpleDelayTimer()
- Constructor for class org.apache.droids.delay.
SimpleDelayTimer
SimpleThreads
- Class in
org.apache.droids
SimpleThreads()
- Constructor for class org.apache.droids.
SimpleThreads
Solr
- Class in
org.apache.droids.handle
Handler specialized for the communication with Apache Solr.
Solr()
- Constructor for class org.apache.droids.handle.
Solr
start(String)
- Method in class org.apache.droids.
Core
Start a given Droid.
startWorkers()
- Method in class org.apache.droids.
AbstractDroid
Will start a new worker.
startWorkers()
- Method in class org.apache.droids.
HelloCrawler
statusCode()
- Method in class org.apache.http.
PostFile
StAX
- Class in
org.apache.droids.helper
Helper class that eases the usage of StAX in your plugins.
StAX()
- Constructor for class org.apache.droids.helper.
StAX
Easy helper to get StAX based parser and writer.
streamCopy(InputStream)
- Static method in class org.apache.droids.helper.
Streams
Reads all bytes from the given input stream and returns the result in an array.
streamCopy(InputStream, int)
- Static method in class org.apache.droids.helper.
Streams
Reads the specified number of bytes from the given input stream and returns the result in an array.
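The two streamCopy variants can be approximated with plain java.io code, as in the sketch below. It is not the Streams source: the class name StreamsSketch is hypothetical, wrapping IOException in UncheckedIOException is a simplification for the example, and returning a shorter array when the stream ends early is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical sketch of byte-copy helpers; not the org.apache.droids.helper.Streams source.
class StreamsSketch {
    // Reads all bytes from the stream and returns them in an array.
    static byte[] streamCopy(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096]; // assumed buffer size
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Reads up to length bytes; returns fewer if the stream ends first (assumed behavior).
    static byte[] streamCopy(InputStream in, int length) {
        byte[] result = new byte[length];
        int offset = 0;
        try {
            while (offset < length) {
                int read = in.read(result, offset, length - offset);
                if (read == -1) {
                    break; // end of stream reached early
                }
                offset += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return offset == length ? result : Arrays.copyOf(result, offset);
    }
}
```

The loop in the length-limited variant matters because a single read call may return fewer bytes than requested even when more data is still available.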
Streams
- Class in
org.apache.droids.helper
Helper class for the low level treatment of bytes and streams.
Sysout
- Class in
org.apache.droids.handle
Handler that writes the stream to the sysout.
Sysout()
- Constructor for class org.apache.droids.handle.
Sysout
T
Task
- Interface in
org.apache.droids.api
A task is a working instruction for a droid.
threadMessage(String)
- Static method in class org.apache.droids.
Core
Since we are for now exclusively using the command line, this method should be used to send messages to the user.
timeout
- Variable in class org.apache.droids.protocol.
HttpBase
totalSize()
- Method in interface org.apache.droids.api.
Queue
How many tasks do we have left in the queue.
totalSize()
- Method in class org.apache.droids.queue.
Simple
U
URLFilter
- Interface in
org.apache.droids.api
Filter to limit the urls that we want to allow in our queue.
URLFiltersFactory
- Class in
org.apache.droids.helper.factories
Factory that will traverse all registered filters and execute them.
URLFiltersFactory()
- Constructor for class org.apache.droids.helper.factories.
URLFiltersFactory
UrlHelper
- Class in
org.apache.droids.net
Helper class that offers a couple of methods to work with urls
userAgent
- Variable in class org.apache.droids.protocol.
HttpBase
W
Worker
- Interface in
org.apache.droids.api
A worker is the unit that is doing the actual work.
WriterHandler
- Class in
org.apache.droids.handle
Wrapper that allows you to pipe a stream from a reader to a writer via a buffer
WriterHandler()
- Constructor for class org.apache.droids.handle.
WriterHandler
Copyright © 2008 The Apache Software Foundation