Class CrawlerSessionManagerValve

All Implemented Interfaces:
MBeanRegistration, Contained, JmxEnabled, Lifecycle, Valve

public class CrawlerSessionManagerValve extends ValveBase
Web crawlers can trigger the creation of many thousands of sessions as they crawl a site, which may result in significant memory consumption. This Valve ensures that each crawler is associated with a single session - just like a normal user - regardless of whether or not it provides a session token with its requests.
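The idea of "one session per crawler" can be modelled in a few lines. The following is a minimal sketch only, not Tomcat's implementation: the class `CrawlerSessionSketch`, its method names, and the use of a random UUID as a session ID are all assumptions made for illustration. It records one session ID per client IP and reuses it whenever a later request from that IP arrives without a session token.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Simplified, hypothetical model of "one session per crawler". */
final class CrawlerSessionSketch {
    // Hypothetical store: client IP -> session ID already issued to that crawler.
    private final Map<String, String> clientIpSessionId = new ConcurrentHashMap<>();

    /**
     * Returns the session ID to use for a crawler request.
     * A token supplied by the client is kept as-is; otherwise the ID
     * recorded for the same IP is reused, or a new one is created and recorded.
     */
    String sessionFor(String clientIp, String requestedSessionId) {
        if (requestedSessionId != null) {
            return requestedSessionId;
        }
        return clientIpSessionId.computeIfAbsent(
                clientIp, ip -> UUID.randomUUID().toString());
    }

    /** Number of client IPs that currently have a recorded session ID. */
    int trackedCrawlers() {
        return clientIpSessionId.size();
    }
}
```

In this model, two token-less requests from the same address share a session, so a crawler that never sends a session cookie still ends up with one session rather than one per request.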
  • Constructor Details

    • CrawlerSessionManagerValve

      public CrawlerSessionManagerValve()
      Specifies a default constructor so async support can be configured.
  • Method Details

    • setCrawlerUserAgents

      public void setCrawlerUserAgents(String crawlerUserAgents)
      Specify the regular expression (using Pattern) that will be used to identify crawlers based on the User-Agent header provided. The default is ".*GoogleBot.*|.*bingbot.*|.*Yahoo! Slurp.*".
      Parameters:
      crawlerUserAgents - The regular expression using Pattern
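To see how the default expression behaves, the sketch below tests it against a few header values with `java.util.regex.Pattern`. It assumes the expression is compiled without flags and must match the whole header value, which the leading and trailing `.*` in the default suggest; the class `CrawlerUserAgentCheck`, the helper `isCrawler`, and the sample header strings are hypothetical.

```java
import java.util.regex.Pattern;

final class CrawlerUserAgentCheck {
    // Default expression quoted in the setCrawlerUserAgents description.
    static final String DEFAULT_CRAWLER_USER_AGENTS =
            ".*GoogleBot.*|.*bingbot.*|.*Yahoo! Slurp.*";

    // Assumption: no flags, and the expression must match the entire header value.
    static boolean isCrawler(String regex, String userAgent) {
        return userAgent != null
                && Pattern.compile(regex).matcher(userAgent).matches();
    }

    public static void main(String[] args) {
        String bing = "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)";
        String browser = "Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0";
        String lowerCaseB = "ExampleAgent Googlebot/2.1";

        System.out.println(isCrawler(DEFAULT_CRAWLER_USER_AGENTS, bing));       // true
        System.out.println(isCrawler(DEFAULT_CRAWLER_USER_AGENTS, browser));    // false
        // Pattern is case-sensitive by default, so "Googlebot" is not "GoogleBot".
        System.out.println(isCrawler(DEFAULT_CRAWLER_USER_AGENTS, lowerCaseB)); // false
        // A custom expression with the inline (?i) flag ignores case.
        System.out.println(isCrawler("(?i).*googlebot.*", lowerCaseB));         // true
    }
}
```

Because matching is case-sensitive unless the expression says otherwise, it is worth checking a custom expression against the exact header values the target crawlers send before passing it to setCrawlerUserAgents.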
    • getCrawlerUserAgents

      public String getCrawlerUserAgents()
      Returns:
      The current regular expression being used to match user agents.
    • setCrawlerIps

      public void setCrawlerIps(String crawlerIps)
      Specify the regular expression (using Pattern) that will be used to identify crawlers based on their IP address. The default is no crawler IPs.
      Parameters:
      crawlerIps - The regular expression using Pattern
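An IP expression can be checked the same way before it is configured. This is a sketch under assumptions: the class `CrawlerIpCheck`, the helper `isCrawlerIp`, and the example expression are hypothetical, and it assumes the expression must match the whole address string. A missing or empty expression is treated as "no crawler IPs", mirroring the documented default.

```java
import java.util.regex.Pattern;

final class CrawlerIpCheck {
    // Hypothetical expression for addresses of the form 192.0.2.N.
    // The dots are escaped so they match a literal '.' only.
    static final String EXAMPLE_CRAWLER_IPS = "192\\.0\\.2\\.\\d{1,3}";

    // Assumption: the expression must match the entire address string.
    static boolean isCrawlerIp(String regex, String remoteAddr) {
        if (regex == null || regex.isEmpty()) {
            return false; // no expression configured: nothing is treated as a crawler IP
        }
        return Pattern.compile(regex).matcher(remoteAddr).matches();
    }
}
```

For example, `isCrawlerIp(EXAMPLE_CRAWLER_IPS, "192.0.2.15")` is true, while the same call with `"198.51.100.4"` is false.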
    • getCrawlerIps

      public String getCrawlerIps()
      Returns:
      The current regular expression being used to match IP addresses.
    • setSessionInactiveInterval

      public void setSessionInactiveInterval(int sessionInactiveInterval)
      Specify the session timeout (in seconds) for a crawler's session. This is typically lower than that for a user session. The default is 60 seconds.
      Parameters:
      sessionInactiveInterval - The new timeout for crawler sessions
    • getSessionInactiveInterval

      public int getSessionInactiveInterval()
      Returns:
      The current timeout in seconds
    • getClientIpSessionId

      public Map<String,String> getClientIpSessionId()
    • isHostAware

      public boolean isHostAware()
    • setHostAware

      public void setHostAware(boolean isHostAware)
    • isContextAware

      public boolean isContextAware()
    • setContextAware

      public void setContextAware(boolean isContextAware)
    • initInternal

      protected void initInternal() throws LifecycleException
      Description copied from class: LifecycleBase
      Sub-classes implement this method to perform any instance initialisation required.
      Overrides:
      initInternal in class ValveBase
      Throws:
      LifecycleException - If the initialisation fails
    • invoke

      public void invoke(Request request, Response response) throws IOException, ServletException
      Description copied from interface: Valve

      Perform request processing as required by this Valve.

      An individual Valve MAY perform the following actions, in the specified order:

      • Examine and/or modify the properties of the specified Request and Response.
      • Examine the properties of the specified Request, completely generate the corresponding Response, and return control to the caller.
      • Examine the properties of the specified Request and Response, wrap either or both of these objects to supplement their functionality, and pass them on.
      • If the corresponding Response was not generated (and control was not returned), call the next Valve in the pipeline (if there is one) by executing getNext().invoke().
      • Examine, but not modify, the properties of the resulting Response (which was created by a subsequently invoked Valve or Container).

      A Valve MUST NOT do any of the following things:

      • Change request properties that have already been used to direct the flow of processing control for this request (for instance, trying to change the virtual host to which a Request should be sent from a pipeline attached to a Host or Context in the standard implementation).
      • Create a completed Response AND pass this Request and Response on to the next Valve in the pipeline.
      • Consume bytes from the input stream associated with the Request, unless it is completely generating the response, or wrapping the request before passing it on.
      • Modify the HTTP headers included with the Response after the getNext().invoke() method has returned.
      • Perform any actions on the output stream associated with the specified Response after the getNext().invoke() method has returned.
      Parameters:
      request - The servlet request to be processed
      response - The servlet response to be created
      Throws:
      IOException - if an input/output error occurs, or is thrown by a subsequently invoked Valve, Filter, or Servlet
      ServletException - if a servlet error occurs, or is thrown by a subsequently invoked Valve, Filter, or Servlet
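The ordering described for invoke (examine, delegate to the next Valve, then only read the result) can be sketched without any Tomcat types. Everything in this example is a hypothetical stand-in: `MiniValve` replaces the real Valve interface, and a list of strings replaces the Request and Response objects so that the order of steps is visible.

```java
import java.util.ArrayList;
import java.util.List;

final class ValveContractSketch {
    /** Hypothetical stand-in for Valve; it receives a log instead of Request/Response. */
    interface MiniValve {
        void invoke(List<String> log);
    }

    /** Examines the request, delegates to the next valve, then only reads the result. */
    static final class ExaminingValve implements MiniValve {
        private final MiniValve next;

        ExaminingValve(MiniValve next) {
            this.next = next;
        }

        @Override
        public void invoke(List<String> log) {
            log.add("examine request");              // permitted before delegating
            if (next != null) {
                next.invoke(log);                    // stands in for getNext().invoke(...)
            }
            log.add("examine response (read-only)"); // no header or output changes after this point
        }
    }

    /** Runs a two-stage pipeline and returns the recorded order of steps. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        MiniValve terminal = l -> l.add("generate response");
        new ExaminingValve(terminal).invoke(log);
        return log;
    }
}
```

Calling `ValveContractSketch.run()` records the three steps in the order the contract allows: the request is examined first, the downstream stage generates the response, and the first stage then inspects the outcome without modifying it.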