HTTP state management
Originally HTTP was designed as a stateless, request / response oriented protocol that
made no special provisions for stateful sessions spanning several logically related
request / response exchanges. As the HTTP protocol grew in popularity and adoption, more and
more systems began to use it for applications it was never intended for, for instance as a
transport for e-commerce applications. Thus, support for state management became a
necessity.
Netscape Communications, at that time a leading developer of web client and server
software, implemented support for HTTP state management in their products based on a
proprietary specification. Later, Netscape tried to standardise the mechanism by publishing
a specification draft. Those efforts contributed to the formal specification defined through
the RFC standards track. However, state management in a significant number of applications is
still largely based on the Netscape draft and is incompatible with the official
specification. All major developers of web browsers felt compelled to retain compatibility
with those applications, which greatly contributed to the fragmentation of standards
compliance.
HTTP cookies
An HTTP cookie is a token or short packet of state information that the HTTP agent and the
target server can exchange to maintain a session. Netscape engineers used to refer to it
as a "magic cookie" and the name stuck.
HttpClient uses the Cookie interface to represent an
abstract cookie token. In its simplest form an HTTP cookie is merely a name / value pair.
Usually an HTTP cookie also contains a number of attributes such as a version, a domain
for which it is valid, a path that specifies the subset of URLs on the origin server to
which this cookie applies, and the maximum period of time for which the cookie is valid.
The SetCookie interface represents a
Set-Cookie response header sent by the origin server to the HTTP
agent in order to maintain a conversational state.
The SetCookie2 interface extends SetCookie with
Set-Cookie2 specific methods.
The ClientCookie interface extends the
Cookie interface with additional client-specific
functionality such as the ability to retrieve original cookie attributes exactly as they were
specified by the origin server. This is important for generating the
Cookie header because some cookie specifications require that the
Cookie header should include certain attributes only if they were
specified in the Set-Cookie or Set-Cookie2
header.
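The distinction between a cookie's effective values and the attributes literally sent by the server can be sketched as follows. This is a minimal illustration that assumes the HttpClient 4.x jars (httpclient, httpcore, commons-logging) are on the classpath; the class name ClientCookieAttributes and the domain are hypothetical placeholders.

```java
import org.apache.http.cookie.ClientCookie;
import org.apache.http.impl.cookie.BasicClientCookie;

final class ClientCookieAttributes {

    /** Models a cookie whose header carried a domain attribute but no path attribute. */
    static BasicClientCookie cookieWithoutExplicitPath() {
        BasicClientCookie cookie = new BasicClientCookie("name", "value");
        // Effective values, used when matching the cookie against a request origin.
        cookie.setDomain(".mycompany.com"); // placeholder domain
        cookie.setPath("/");                // default path, NOT sent by the server
        // Record only what was literally present in the server's header.
        cookie.setAttribute(ClientCookie.DOMAIN_ATTR, ".mycompany.com");
        return cookie;
    }

    public static void main(String[] args) {
        ClientCookie cookie = cookieWithoutExplicitPath();
        System.out.println("domain sent by server: "
                + cookie.containsAttribute(ClientCookie.DOMAIN_ATTR));
        System.out.println("path sent by server:   "
                + cookie.containsAttribute(ClientCookie.PATH_ATTR));
    }
}
```

A cookie specification can consult containsAttribute to decide whether an attribute should be echoed back in the Cookie header.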
Cookie versions
Cookies compatible with the Netscape draft specification but non-compliant with the
official specification are considered to be of version 0. Standard-compliant cookies
are expected to have version 1. HttpClient may handle cookies differently depending
on the version.
Here is an example of re-creating a Netscape cookie:
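A minimal sketch, assuming HttpClient 4.x on the classpath and using BasicClientCookie as the concrete cookie class. The wrapper class name, the cookie name / value and the domain are placeholders.

```java
import org.apache.http.impl.cookie.BasicClientCookie;

final class NetscapeCookieExample {

    static BasicClientCookie create() {
        BasicClientCookie netscapeCookie = new BasicClientCookie("name", "value");
        netscapeCookie.setVersion(0);                // version 0 marks a Netscape draft cookie
        netscapeCookie.setDomain(".mycompany.com");  // placeholder domain
        netscapeCookie.setPath("/");
        return netscapeCookie;
    }

    public static void main(String[] args) {
        System.out.println(create());
    }
}
```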
Here is an example of re-creating a standard cookie. Please note that a
standard-compliant cookie must retain all attributes as sent by the origin server:
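A minimal sketch under the same assumptions (HttpClient 4.x on the classpath, placeholder names and domain). The setAttribute calls record the attributes as the server is assumed to have sent them.

```java
import org.apache.http.cookie.ClientCookie;
import org.apache.http.impl.cookie.BasicClientCookie;

final class StandardCookieExample {

    static BasicClientCookie create() {
        BasicClientCookie stdCookie = new BasicClientCookie("name", "value");
        stdCookie.setVersion(1);                 // version 1 marks a standard cookie
        stdCookie.setDomain(".mycompany.com");   // placeholder domain
        stdCookie.setPath("/");
        stdCookie.setSecure(true);
        // Record the attributes exactly as sent by the server.
        stdCookie.setAttribute(ClientCookie.VERSION_ATTR, "1");
        stdCookie.setAttribute(ClientCookie.DOMAIN_ATTR, ".mycompany.com");
        return stdCookie;
    }

    public static void main(String[] args) {
        System.out.println(create());
    }
}
```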
Here is an example of re-creating a Set-Cookie2 compliant
cookie. Please note that a standard-compliant cookie must retain all attributes as
sent by the origin server:
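A minimal sketch, assuming HttpClient 4.x on the classpath and using BasicClientCookie2, which adds the Set-Cookie2 specific port list. The port numbers, names and domain are placeholders.

```java
import org.apache.http.cookie.ClientCookie;
import org.apache.http.impl.cookie.BasicClientCookie2;

final class SetCookie2Example {

    static BasicClientCookie2 create() {
        BasicClientCookie2 stdCookie = new BasicClientCookie2("name", "value");
        stdCookie.setVersion(1);
        stdCookie.setDomain(".mycompany.com");      // placeholder domain
        stdCookie.setPorts(new int[] {80, 8080});   // ports the cookie may be sent to
        stdCookie.setPath("/");
        stdCookie.setSecure(true);
        // Record the attributes exactly as sent by the server.
        stdCookie.setAttribute(ClientCookie.VERSION_ATTR, "1");
        stdCookie.setAttribute(ClientCookie.DOMAIN_ATTR, ".mycompany.com");
        stdCookie.setAttribute(ClientCookie.PORT_ATTR, "80,8080");
        return stdCookie;
    }

    public static void main(String[] args) {
        System.out.println(create());
    }
}
```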
Cookie specifications
The CookieSpec interface represents a cookie management
specification. The cookie management specification is expected to enforce:
rules of parsing Set-Cookie and optionally
Set-Cookie2 headers.
rules of validation of parsed cookies.
formatting of the Cookie header for a given host, port and path
of origin.
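The three responsibilities listed above can be sketched as a round trip through a single specification. This example assumes HttpClient 4.x on the classpath and picks BrowserCompatSpec purely for illustration; the host name and the class name CookieSpecRoundTrip are hypothetical.

```java
import java.util.List;

import org.apache.http.Header;
import org.apache.http.cookie.Cookie;
import org.apache.http.cookie.CookieOrigin;
import org.apache.http.cookie.CookieSpec;
import org.apache.http.cookie.MalformedCookieException;
import org.apache.http.impl.cookie.BrowserCompatSpec;
import org.apache.http.message.BasicHeader;

final class CookieSpecRoundTrip {

    /** Parses a Set-Cookie value, validates the cookies, and formats Cookie headers. */
    static List<Header> roundTrip(String setCookieValue) {
        CookieSpec spec = new BrowserCompatSpec();
        // Host, port, path and security flag of the (placeholder) origin server.
        CookieOrigin origin = new CookieOrigin("www.mycompany.com", 80, "/", false);
        try {
            // 1. Parsing of the Set-Cookie header.
            List<Cookie> cookies = spec.parse(
                    new BasicHeader("Set-Cookie", setCookieValue), origin);
            // 2. Validation of each parsed cookie against the origin.
            for (Cookie cookie : cookies) {
                spec.validate(cookie, origin);
            }
            // 3. Formatting of the Cookie request header(s).
            return spec.formatCookies(cookies);
        } catch (MalformedCookieException ex) {
            throw new IllegalArgumentException("Rejected cookie: " + setCookieValue, ex);
        }
    }

    public static void main(String[] args) {
        for (Header header : roundTrip("name=value; path=/")) {
            System.out.println(header);
        }
    }
}
```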
HttpClient ships with several CookieSpec
implementations:
Netscape draft:
This specification conforms to the original draft specification published
by Netscape Communications. It should be avoided unless absolutely necessary
for compatibility with legacy code.
RFC 2109:
Older version of the official HTTP state management specification
superseded by RFC 2965.
RFC 2965:
The official HTTP state management specification.
Browser compatibility:
This implementation strives to closely mimic the (mis)behavior of common web
browser applications such as Microsoft Internet Explorer and Mozilla
Firefox.
Best match:
'Meta' cookie specification that picks a cookie policy based on the
format of cookies sent with the HTTP response. It essentially aggregates all
of the above implementations into one class.
Ignore cookies:
All cookies are ignored.
It is strongly recommended to use the Best Match policy and let
HttpClient pick up an appropriate compliance level at runtime based on the execution
context.
HTTP cookie and state management parameters
These are parameters that can be used to customize HTTP state management and the behaviour of
individual cookie specifications:
CookieSpecPNames.DATE_PATTERNS='http.protocol.cookie-datepatterns':
defines valid date patterns to be used for parsing non-standard
expires attribute. Only required for compatibility
with non-compliant servers that still use expires defined
in the Netscape draft instead of the standard max-age
attribute. This parameter expects a value of type
java.util.Collection. The collection
elements must be of type java.lang.String compatible
with the syntax of java.text.SimpleDateFormat. If
this parameter is not set the choice of a default value is
CookieSpec implementation specific.
CookieSpecPNames.SINGLE_COOKIE_HEADER='http.protocol.single-cookie-header':
defines whether cookies should be forced into a single
Cookie request header. Otherwise, each cookie is
formatted as a separate Cookie header. This parameter
expects a value of type java.lang.Boolean. If this
parameter is not set, the choice of a default value is CookieSpec
implementation specific. Please note that this parameter applies to strict cookie
specifications (RFC 2109 and RFC 2965) only. Browser compatibility and
Netscape draft policies will always put all cookies into one request
header.
ClientPNames.COOKIE_POLICY='http.protocol.cookie-policy':
defines the name of a cookie specification to be used for HTTP state
management. This parameter expects a value of type
java.lang.String. If this parameter is not set,
a default policy is used (the Best Match policy in the default
HttpClient implementation).
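To make the expected shape of the date patterns value concrete, the following sketch uses only the JDK. It builds a collection of SimpleDateFormat pattern strings of the kind that could be supplied as the CookieSpecPNames.DATE_PATTERNS value and tries each pattern in turn against an expires value. The class name, the two patterns and the fallback loop are illustrative assumptions, not HttpClient's internal parsing code.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Collection;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class ExpiresPatterns {

    // A java.util.Collection of java.lang.String elements in SimpleDateFormat syntax.
    static final Collection<String> PATTERNS = Arrays.asList(
            "EEE, dd MMM yyyy HH:mm:ss zzz",   // RFC 1123 style date
            "EEE, dd-MMM-yy HH:mm:ss zzz");    // Netscape draft style date

    /** Returns the date from the first pattern that matches, or null if none does. */
    static Date parseExpires(String value) {
        for (String pattern : PATTERNS) {
            SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.US);
            format.setTimeZone(TimeZone.getTimeZone("GMT"));
            try {
                return format.parse(value);
            } catch (ParseException ignore) {
                // Fall through and try the next pattern.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(parseExpires("Wed, 09 Jun 2021 10:18:14 GMT"));
        System.out.println(parseExpires("Wed, 09-Jun-21 10:18:14 GMT"));
    }
}
```

Patterns are locale sensitive, so the sketch pins Locale.US to keep English day and month names parseable regardless of the platform default.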
Cookie specification registry
HttpClient maintains a registry of available cookie specifications using
the CookieSpecRegistry class. The following specifications are
registered by default:
compatibility:
Browser compatibility (lenient policy).
netscape:
Netscape draft.
rfc2109:
RFC 2109 (outdated strict policy).
rfc2965:
RFC 2965 (standard conformant strict policy).
best-match:
Best match meta-policy.
ignoreCookies:
All cookies are ignored.
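The registry of a default client can be inspected as sketched below. The example assumes HttpClient 4.x on the classpath and uses DefaultHttpClient; the class name RegistryLookup is hypothetical.

```java
import java.util.List;

import org.apache.http.client.params.CookiePolicy;
import org.apache.http.cookie.CookieSpec;
import org.apache.http.cookie.CookieSpecRegistry;
import org.apache.http.impl.client.DefaultHttpClient;

final class RegistryLookup {

    /** Returns the names under which cookie specifications are registered. */
    static List<String> registeredNames() {
        DefaultHttpClient httpclient = new DefaultHttpClient();
        try {
            CookieSpecRegistry registry = httpclient.getCookieSpecs();
            return registry.getSpecNames();
        } finally {
            httpclient.getConnectionManager().shutdown();
        }
    }

    /** Creates a new specification instance by its registered name. */
    static CookieSpec lookup(String name) {
        DefaultHttpClient httpclient = new DefaultHttpClient();
        try {
            return httpclient.getCookieSpecs().getCookieSpec(name);
        } finally {
            httpclient.getConnectionManager().shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(registeredNames());
        System.out.println(lookup(CookiePolicy.BEST_MATCH));
    }
}
```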
Choosing cookie policy
Cookie policy can be set at the HTTP client level and overridden at the HTTP request level
if required.
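A minimal sketch of both levels, assuming HttpClient 4.x on the classpath and the parameter-based configuration described in this chapter. The class name and the request URI are placeholders, and no request is actually sent.

```java
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.params.ClientPNames;
import org.apache.http.client.params.CookiePolicy;
import org.apache.http.impl.client.DefaultHttpClient;

final class CookiePolicySelection {

    /** Client whose default policy is the strict RFC 2965 specification. */
    static DefaultHttpClient strictClient() {
        DefaultHttpClient httpclient = new DefaultHttpClient();
        httpclient.getParams().setParameter(
                ClientPNames.COOKIE_POLICY, CookiePolicy.RFC_2965);
        return httpclient;
    }

    /** Request that overrides the client default with the lenient browser policy. */
    static HttpGet lenientRequest(String uri) {
        HttpGet httpget = new HttpGet(uri);
        httpget.getParams().setParameter(
                ClientPNames.COOKIE_POLICY, CookiePolicy.BROWSER_COMPATIBILITY);
        return httpget;
    }

    public static void main(String[] args) {
        DefaultHttpClient httpclient = strictClient();
        HttpGet httpget = lenientRequest("http://localhost:8080/"); // placeholder URI
        System.out.println(httpclient.getParams().getParameter(ClientPNames.COOKIE_POLICY));
        System.out.println(httpget.getParams().getParameter(ClientPNames.COOKIE_POLICY));
        httpclient.getConnectionManager().shutdown();
    }
}
```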
Custom cookie policy
In order to implement a custom cookie policy one should create a custom implementation
of the CookieSpec interface, create a
CookieSpecFactory implementation to create and
initialize instances of the custom specification and register the factory with
HttpClient. Once the custom specification has been registered, it can be activated the
same way as a standard cookie specification.
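The steps above can be sketched as follows, assuming HttpClient 4.x on the classpath. The policy name "easy" and the class name are hypothetical; the custom specification reuses BrowserCompatSpec and simply skips validation, which is useful only as an illustration.

```java
import org.apache.http.client.params.ClientPNames;
import org.apache.http.cookie.Cookie;
import org.apache.http.cookie.CookieOrigin;
import org.apache.http.cookie.CookieSpec;
import org.apache.http.cookie.CookieSpecFactory;
import org.apache.http.cookie.MalformedCookieException;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.cookie.BrowserCompatSpec;
import org.apache.http.params.HttpParams;

final class EasyCookiePolicy {

    static final String NAME = "easy"; // hypothetical policy name

    // Factory that creates and initializes instances of the custom specification.
    static final CookieSpecFactory FACTORY = new CookieSpecFactory() {
        public CookieSpec newInstance(HttpParams params) {
            return new BrowserCompatSpec() {
                @Override
                public void validate(Cookie cookie, CookieOrigin origin)
                        throws MalformedCookieException {
                    // Deliberately accept every cookie without validation.
                }
            };
        }
    };

    /** Registers the factory with a client and activates it by name. */
    static DefaultHttpClient newClient() {
        DefaultHttpClient httpclient = new DefaultHttpClient();
        httpclient.getCookieSpecs().register(NAME, FACTORY);
        httpclient.getParams().setParameter(ClientPNames.COOKIE_POLICY, NAME);
        return httpclient;
    }

    public static void main(String[] args) {
        DefaultHttpClient httpclient = newClient();
        System.out.println(httpclient.getCookieSpecs().getCookieSpec(NAME));
        httpclient.getConnectionManager().shutdown();
    }
}
```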
Cookie persistence
HttpClient can work with any physical representation of a persistent cookie store that
implements the CookieStore interface. The default
CookieStore implementation called
BasicCookieStore is a simple implementation backed by a
java.util.ArrayList. Cookies stored in a
BasicCookieStore object are lost when the container object
gets garbage collected. Users can provide more complex implementations if
necessary.
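A minimal sketch of populating a cookie store and handing it to a client, assuming HttpClient 4.x on the classpath. BasicCookieStore stands in for whatever CookieStore implementation an application provides; the class name and cookie values are placeholders.

```java
import org.apache.http.client.CookieStore;
import org.apache.http.impl.client.BasicCookieStore;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.cookie.BasicClientCookie;

final class CookieStoreSetup {

    /** Creates a store and populates it with one placeholder cookie. */
    static CookieStore populatedStore() {
        CookieStore cookieStore = new BasicCookieStore();
        BasicClientCookie cookie = new BasicClientCookie("name", "value");
        cookie.setVersion(0);
        cookie.setDomain(".mycompany.com"); // placeholder domain
        cookie.setPath("/");
        cookieStore.addCookie(cookie);
        return cookieStore;
    }

    /** Creates a client that reads cookies from and writes cookies to the given store. */
    static DefaultHttpClient clientWith(CookieStore cookieStore) {
        DefaultHttpClient httpclient = new DefaultHttpClient();
        httpclient.setCookieStore(cookieStore);
        return httpclient;
    }

    public static void main(String[] args) {
        DefaultHttpClient httpclient = clientWith(populatedStore());
        System.out.println(httpclient.getCookieStore().getCookies());
        httpclient.getConnectionManager().shutdown();
    }
}
```

A store that survives restarts would implement the same CookieStore interface and write its content to durable storage.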
HTTP state management and execution context
In the course of HTTP request execution HttpClient adds the following state management
related objects to the execution context:
ClientContext.COOKIESPEC_REGISTRY='http.cookiespec-registry':
CookieSpecRegistry instance representing the actual
cookie specification registry. The value of this attribute set in the local
context takes precedence over the default one.
ClientContext.COOKIE_SPEC='http.cookie-spec':
CookieSpec instance representing the actual
cookie specification.
ClientContext.COOKIE_ORIGIN='http.cookie-origin':
CookieOrigin instance representing the actual
details of the origin server.
ClientContext.COOKIE_STORE='http.cookie-store':
CookieStore instance representing the actual
cookie store. The value of this attribute set in the local context takes
precedence over the default one.
The local HttpContext object can be used to customize
the HTTP state management context prior to request execution, or to examine its state after
the request has been executed:
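The following sketch separates the two phases, assuming HttpClient 4.x on the classpath. The class and method names are hypothetical. The request execution itself is left out so that the example does not depend on a reachable server; in a real program the prepared context would be passed to httpclient.execute(httpget, localContext) between the two calls.

```java
import org.apache.http.client.CookieStore;
import org.apache.http.client.protocol.ClientContext;
import org.apache.http.cookie.CookieOrigin;
import org.apache.http.cookie.CookieSpec;
import org.apache.http.impl.client.BasicCookieStore;
import org.apache.http.protocol.BasicHttpContext;
import org.apache.http.protocol.HttpContext;

final class StateManagementContext {

    /** Customizes the context prior to request execution. */
    static HttpContext prepare(CookieStore cookieStore) {
        HttpContext localContext = new BasicHttpContext();
        localContext.setAttribute(ClientContext.COOKIE_STORE, cookieStore);
        return localContext;
    }

    /** Examines the state management attributes; they are null until a request has run. */
    static String describe(HttpContext context) {
        CookieOrigin cookieOrigin = (CookieOrigin) context.getAttribute(
                ClientContext.COOKIE_ORIGIN);
        CookieSpec cookieSpec = (CookieSpec) context.getAttribute(
                ClientContext.COOKIE_SPEC);
        return "Cookie origin: " + cookieOrigin + "; cookie spec used: " + cookieSpec;
    }

    public static void main(String[] args) {
        HttpContext localContext = prepare(new BasicCookieStore());
        System.out.println(describe(localContext));
    }
}
```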
Per user / thread state management
One can use an individual local execution context in order to implement per user (or
per thread) state management. A cookie specification registry and cookie store defined in
the local context will take precedence over the default ones set at the HTTP client
level.
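One possible arrangement is to keep one execution context, with its own cookie store, per user. The sketch below assumes HttpClient 4.x on the classpath and Java 8 or later; the class PerUserContexts and the user identifiers are hypothetical, and each context is assumed to be used by one thread at a time.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

import org.apache.http.client.protocol.ClientContext;
import org.apache.http.impl.client.BasicCookieStore;
import org.apache.http.protocol.BasicHttpContext;
import org.apache.http.protocol.HttpContext;

final class PerUserContexts {

    private final ConcurrentMap<String, HttpContext> contexts =
            new ConcurrentHashMap<String, HttpContext>();

    /** Returns the context for a user, creating it with a private cookie store if needed. */
    HttpContext contextFor(String userId) {
        return contexts.computeIfAbsent(userId, id -> {
            HttpContext localContext = new BasicHttpContext();
            // This store takes precedence over the one set at the HTTP client level.
            localContext.setAttribute(ClientContext.COOKIE_STORE, new BasicCookieStore());
            return localContext;
        });
    }

    public static void main(String[] args) {
        PerUserContexts users = new PerUserContexts();
        HttpContext alice = users.contextFor("alice"); // placeholder user id
        System.out.println(alice.getAttribute(ClientContext.COOKIE_STORE));
    }
}
```

The returned context would then be passed as the second argument of httpclient.execute for every request made on behalf of that user, so that cookies of different users never mix.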