HTTP Proxy Caching

Web proxy caching enables you to store copies of frequently-accessed web objects (such as documents, images, and articles) and then serve this information to users on demand. It improves performance and frees up Internet bandwidth for other tasks.

Understanding HTTP Web Proxy Caching

Internet users direct their requests to web servers all over the Internet. A caching server must act as a web proxy server so it can serve those requests. After a web proxy server receives requests for web objects, it either serves the requests or forwards them to the origin server (the web server that contains the original copy of the requested information). The Traffic Server proxy supports explicit proxy caching, in which the user’s client software must be configured to send requests directly to the Traffic Server proxy. The following overview illustrates how Traffic Server serves a user request.

Step 1 Traffic Server receives a user request for a web object.

Step 2 Using the object address, Traffic Server tries to locate the requested object in its object database (cache).

Step 3 If the object is in the cache, then Traffic Server checks to see if the object is fresh enough to serve. If it is fresh, then Traffic Server serves it to the user as a cache hit (see the figure below).

A cache hit

Step 4 If the data in the cache is stale, then Traffic Server connects to the origin server and checks if the object is still fresh (a revalidation). If it is, then Traffic Server immediately sends the cached copy to the user.

Step 5 If the object is not in the cache (a cache miss) or if the server indicates the cached copy is no longer valid, then Traffic Server obtains the object from the origin server. The object is then simultaneously streamed to the user and the Traffic Server local cache (see the figure below). Subsequent requests for the object can be served faster because the object is retrieved directly from cache.

A cache miss
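The five steps above can be sketched as a short decision routine. This is only an illustration of the overview, not Traffic Server code: the CachedObject type, the serve function, and the origin_fetch and origin_revalidate callbacks are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class CachedObject:
    body: bytes
    fresh: bool  # outcome of the freshness check in Step 3 (assumed precomputed)


def serve(url, cache, origin_fetch, origin_revalidate):
    """Return (body, outcome) for one request, following Steps 1 to 5."""
    obj = cache.get(url)                    # Step 2: look up by object address
    if obj is not None:
        if obj.fresh:                       # Step 3: fresh copy, cache hit
            return obj.body, "hit"
        if origin_revalidate(url):          # Step 4: stale, ask the origin server
            return obj.body, "revalidated"
    body = origin_fetch(url)                # Step 5: miss or copy no longer valid
    cache[url] = CachedObject(body, fresh=True)
    return body, "miss"
```

A first request for a URL returns the outcome "miss" and fills the cache; a repeat request for the same URL returns "hit".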

Caching is typically more complex than the preceding overview suggests. In particular, the overview does not discuss how Traffic Server ensures freshness, serves correct HTTP alternates, and treats requests for objects that cannot/should not be cached. The following sections discuss these issues in greater detail.

Ensuring Cached Object Freshness

When Traffic Server receives a request for a web object, it first tries to locate the requested object in its cache. If the object is in cache, then Traffic Server checks to see if the object is fresh enough to serve. For HTTP objects, Traffic Server supports optional author-specified expiration dates. Traffic Server adheres to these expiration dates; otherwise, it picks an expiration date based on how frequently the object is changing and on administrator-chosen freshness guidelines. Objects can also be revalidated by checking with the origin server to see if an object is still fresh.

HTTP Object Freshness

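Traffic Server determines whether an HTTP object in the cache is fresh by checking any explicit expiration information (the Expires or max-age headers), the Last-Modified and Date headers, and the absolute freshness limit, as described in the sections below.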

Modifying the Aging Factor for Freshness Computations

If an object does not contain any expiration information, then Traffic Server can estimate its freshness from the Last-Modified and Date headers. By default, Traffic Server stores an object for 10% of the time that elapsed since it last changed. You can increase or reduce the percentage according to your needs.
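As a worked example of the estimate described above, the following sketch multiplies the time between the Last-Modified and Date headers by the aging factor. The function name heuristic_lifetime is an assumption made for illustration, and the sketch ignores any other limits Traffic Server may apply.

```python
from email.utils import parsedate_to_datetime


def heuristic_lifetime(last_modified, date, lm_factor=0.10):
    """Estimated freshness lifetime in seconds from two HTTP date headers."""
    modified = parsedate_to_datetime(last_modified)
    served = parsedate_to_datetime(date)
    # Time that elapsed since the object last changed, never negative.
    elapsed = max((served - modified).total_seconds(), 0.0)
    return elapsed * lm_factor
```

With the default factor, an object that last changed ten days before it was served is estimated to stay fresh for about one day.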

To modify the aging factor for freshness computations:
  1. In a text editor, open the records.config file located in the Traffic Server config directory.
  2. Edit the following variable:

    Variable: proxy.config.http.cache.heuristic_lm_factor
    Description: Set this variable to specify the aging factor for freshness computations. Traffic Server stores an object for this percentage of the time that elapsed since it last changed. The default value is 0.10 (10 percent).

  3. Save and close the records.config file.
  4. Navigate to the Traffic Server bin directory.
  5. Run the command traffic_line -x to apply the configuration changes.

Setting an Absolute Freshness Limit

Some objects do not have Expires headers or do not have both Last-Modified and Date headers. To control how long these objects are considered fresh in the cache, specify an absolute freshness limit.

To specify an absolute freshness limit:
  1. In a text editor, open the records.config file located in the Traffic Server config directory.
  2. Edit the following variables:

    Variable: proxy.config.http.cache.heuristic_min_lifetime
    Description: Set this variable to specify the minimum amount of time that HTTP objects without an expiration date can remain fresh in the cache before being considered stale. The default value is 3600 seconds (1 hour).

    Variable: proxy.config.http.cache.heuristic_max_lifetime
    Description: Set this variable to specify the maximum amount of time that HTTP objects without an expiration date can remain fresh in the cache before being considered stale. The default value is 86400 seconds (1 day).

  3. Save and close the records.config file.
  4. Navigate to the Traffic Server bin directory.
  5. Run the command traffic_line -x to apply the configuration changes.
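One way to picture the two limits is as bounds on an estimated lifetime. The sketch below assumes that an estimate is simply clamped between the minimum and the maximum; the function name bounded_lifetime is hypothetical and the clamping order is an assumption, not a statement of how Traffic Server is implemented.

```python
def bounded_lifetime(estimate, min_lifetime=3600, max_lifetime=86400):
    """Clamp an estimated lifetime (seconds) to the configured limits."""
    # Defaults mirror the documented values: 1 hour and 1 day.
    return max(min_lifetime, min(estimate, max_lifetime))
```

Under this assumption, a 10-minute estimate is raised to one hour and a 10-day estimate is cut to one day.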

Specifying Header Requirements

To further ensure freshness of the objects in the cache, configure Traffic Server to cache only objects with specific headers. By default, Traffic Server caches all objects (including objects with no headers); you should change the default setting only for specialized proxy situations. If you configure Traffic Server to cache only HTTP objects with Expires or max-age headers, then the cache hit rate will be noticeably reduced (since very few objects will have explicit expiration information).

To configure Traffic Server to cache objects with specific headers:
  1. In a text editor, open the records.config file located in the Traffic Server config directory.
  2. Edit the following variable:

    Variable: proxy.config.http.cache.required_headers
    Description: Set this variable to one of the following values:
      0 = no headers required to make document cacheable
      1 = either the Last-Modified header, or an explicit lifetime header, Expires or Cache-Control: max-age, is required
      2 = explicit lifetime is required, Expires or Cache-Control: max-age

  3. Save and close the records.config file.
  4. Navigate to the Traffic Server bin directory.
  5. Run the command traffic_line -x to apply the configuration changes.
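The three values of proxy.config.http.cache.required_headers can be read as the following check on a response. This is a sketch for illustration only; the function name meets_required_headers and the plain dictionary of headers are assumptions.

```python
def meets_required_headers(level, headers):
    """Return True if a response satisfies the given required_headers level."""
    names = {name.lower(): value for name, value in headers.items()}
    # An explicit lifetime is either Expires or Cache-Control: max-age.
    explicit = ("expires" in names
                or "max-age" in names.get("cache-control", "").lower())
    if level == 0:
        return True
    if level == 1:
        return explicit or "last-modified" in names
    if level == 2:
        return explicit
    raise ValueError("required_headers must be 0, 1, or 2")
```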

Cache-Control Headers

Even though an object might be fresh in the cache, clients or servers often impose their own constraints that preclude retrieval of the object from the cache. For example, a client might request that an object not be retrieved from a cache or, if it is, that it not have been cached for more than 10 minutes. Traffic Server bases the servability of a cached object on Cache-Control headers that appear in both client requests and server responses. The following Cache-Control headers affect whether objects are served from cache:

Traffic Server applies Cache-Control servability criteria after HTTP freshness criteria. For example, an object might be considered fresh but will not be served if its age is greater than its max-age.
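The ordering described above (freshness first, then Cache-Control constraints) can be sketched for the max-age case. The function name servable is hypothetical, and only max-age is handled here; other directives are outside this illustration.

```python
import re


def servable(fresh, age_seconds, cache_control):
    """Apply a max-age constraint after the freshness check has passed."""
    if not fresh:
        return False
    match = re.search(r"(?:^|[\s,])max-age=(\d+)", cache_control, re.IGNORECASE)
    # Without a max-age constraint, a fresh object can be served.
    return match is None or age_seconds <= int(match.group(1))
```

For the 10-minute example mentioned earlier, a fresh object that has been cached for 15 minutes is not served when the constraint is max-age=600.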

Revalidating HTTP Objects

When a client requests an HTTP object that is stale in the cache, Traffic Server revalidates the object. A revalidation is a query to the origin server to check if the object is unchanged. The result of a revalidation is one of the following:

By default, Traffic Server revalidates a requested HTTP object in the cache if it considers the object to be stale. Traffic Server evaluates object freshness as described in HTTP Object Freshness. You can reconfigure how Traffic Server evaluates freshness by selecting one of the following options:

To configure how Traffic Server revalidates objects in the cache, you can set specific revalidation rules in the cache.config file (refer to cache.config).

To configure revalidation options:
  1. In a text editor, open the records.config file located in the Traffic Server config directory.
  2. Edit the following variable:

    Variable: proxy.config.http.cache.when_to_revalidate
    Description: Set this variable to one of the following values:
      0 = Configures Traffic Server to revalidate an HTTP object whenever it is considered stale in the cache. (Traffic Server checks the headers and the freshness limit, if applicable.) This is the default option.
      1 = Configures Traffic Server to revalidate HTTP objects that do not contain Expires or Cache-Control headers.
      2 = Configures Traffic Server to always revalidate HTTP objects; Traffic Server always considers HTTP objects to be stale.
      3 = Configures Traffic Server to never revalidate HTTP objects; Traffic Server always considers HTTP objects to be fresh.

  3. Save and close the records.config file.
  4. Navigate to the Traffic Server bin directory.
  5. Run the command traffic_line -x to apply the configuration changes.
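The four revalidation options can be summarized as a small decision function. This is a sketch under stated assumptions: the function name must_revalidate is hypothetical, and for option 1 it assumes that objects with explicit Expires or Cache-Control headers are still revalidated once they are stale.

```python
def must_revalidate(mode, is_stale, has_explicit_lifetime):
    """Decide whether to revalidate, given a when_to_revalidate value."""
    if mode == 0:
        return is_stale                      # default: only when stale
    if mode == 1:
        # Assumption: objects lacking Expires/Cache-Control are always checked.
        return is_stale or not has_explicit_lifetime
    if mode == 2:
        return True                          # always treated as stale
    if mode == 3:
        return False                         # always treated as fresh
    raise ValueError("when_to_revalidate must be 0, 1, 2, or 3")
```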

Scheduling Updates to Local Cache Content

To further increase performance and to ensure that HTTP objects are fresh in the cache, you can use the Scheduled Update option. This configures Traffic Server to load specific objects into the cache at scheduled times. You might find this especially beneficial when using Traffic Server as a reverse proxy so you can preload content you anticipate will be in demand.

To use the Scheduled Update option, you must perform the following tasks.

Traffic Server uses the information you specify to determine URLs for which it is responsible. For each URL, Traffic Server derives all recursive URLs (if applicable) and then generates a unique URL list. Using this list, Traffic Server initiates an HTTP GET for each unaccessed URL, ensuring that it remains within the user-defined limits for HTTP concurrency at any given time. The system logs the completion of all HTTP GET operations so you can monitor the performance of this feature.

Traffic Server also provides a Force Immediate Update option that enables you to update URLs immediately without waiting for the specified update time to occur. You can use this option to test your scheduled update configuration (refer to Forcing an Immediate Update).

Configuring the Scheduled Update Option

To configure the scheduled update option, follow the steps below:
  1. In a text editor, open the update.config file located in the Traffic Server config directory.
  2. Enter a line in the file for each URL you want to update (refer to update.config).
  3. Save and close the update.config file.
  4. In a text editor, open the records.config file located in the Traffic Server config directory.
  5. Edit the following variables:

    Variable: proxy.config.update.enabled
    Description: Set this variable to 1 to enable the scheduled update option.

    Variable: proxy.config.update.retry_count
    Description: Set this variable to specify the number of times you want to retry the scheduled update of a URL in the event of failure. The default value is 10.

    Variable: proxy.config.update.retry_interval
    Description: Set this variable to specify the delay in seconds between each scheduled update retry for a URL in the event of failure. The default value is 2.

    Variable: proxy.config.update.concurrent_updates
    Description: Set this variable to specify the maximum simultaneous update requests allowed at any point in time. This option prevents the scheduled update process from overburdening the host. The default value is 100.

  6. Save and close the records.config file.
  7. Navigate to the Traffic Server bin directory.
  8. Run the command traffic_line -x to apply the configuration changes.

Forcing an Immediate Update

Traffic Server provides a Force Immediate Update option that enables you to immediately verify the URLs listed in the update.config file. The Force Immediate Update option disregards the offset hour and interval set in the update.config file and immediately updates the URLs listed.

To configure the Force Immediate Update option, follow the steps below:

  1. In a text editor, open the records.config file located in the Traffic Server config directory.
  2. Edit the following variable:

    Variable: proxy.config.update.force
    Description: Set this variable to 1 to enable the Force Immediate Update option.

  3. Make sure that the variable proxy.config.update.enabled is set to 1.
  4. Save and close the records.config file.
  5. Navigate to the Traffic Server bin directory.
  6. Run the command traffic_line -x to apply the configuration changes.

IMPORTANT: When you enable the Force Immediate Update option, Traffic Server continually updates the URLs specified in the update.config file until you disable the option. To disable the Force Immediate Update option, set the variable proxy.config.update.force to 0 (zero).

Pushing Content into the Cache

Traffic Server supports the HTTP PUSH method of content delivery. Using HTTP PUSH, you can deliver content directly into the cache without user requests.

Configuring Traffic Server to Accept PUSH Requests

Before you can deliver content into your cache using HTTP PUSH, you must configure Traffic Server to accept PUSH requests.

To configure Traffic Server to accept PUSH requests:
  1. In a text editor, open the filter.config file located in the Traffic Server config directory.
  2. Add the following filter rules to the file to ensure that only certain IP addresses can deliver PUSH requests to the cache:
    domain=. src_ip=ipaddress method=PUSH action=allow
    domain=. method=PUSH action=deny

    where ipaddress is the IP address of the host or range of IP addresses of the hosts from which Traffic Server accepts PUSH requests.
  3. Save and close the filter.config file.
  4. In a text editor, open the records.config file located in the Traffic Server config directory.
  5. Edit the following variable:

    Variable: proxy.config.http.push_method_enabled
    Description: Set this variable to 1 to enable Traffic Server to accept PUSH requests.

  6. Save and close the records.config file.
  7. Navigate to the Traffic Server bin directory.
  8. Run the command traffic_line -x to apply the configuration changes.

Understanding HTTP PUSH

PUSH uses the HTTP 1.1 message format. The body of a PUSH request contains the response header and response body that you want to place in the cache. The following is an example of a PUSH request:

    PUSH http://www.company.com HTTP/1.0
    Content-length: 84

    HTTP/1.0 200 OK
    Content-type: text/html
    Content-length: 17

    <HTML>
    a
    </HTML>

IMPORTANT: The PUSH request header must include Content-length, and its value must count the bytes of both the embedded response header and the response body.
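Because the outer Content-length has to cover the embedded response header as well as the body, it is easiest to compute it rather than count by hand. The sketch below builds a PUSH message as bytes; the function name build_push is hypothetical, and the use of CRLF line endings is an assumption, so the computed value depends on that choice.

```python
def build_push(url, status_line, headers, body):
    """Build a PUSH request whose body is a complete HTTP response."""
    inner_head = status_line + "\r\n"
    inner_head += "".join(f"{name}: {value}\r\n" for name, value in headers)
    # Inner Content-length counts only the response body.
    inner_head += f"Content-length: {len(body)}\r\n\r\n"
    inner = inner_head.encode("ascii") + body
    # Outer Content-length counts the response header plus the response body.
    outer_head = f"PUSH {url} HTTP/1.0\r\nContent-length: {len(inner)}\r\n\r\n"
    return outer_head.encode("ascii") + inner
```

The returned bytes can be written to a socket connected to the proxy port once PUSH requests are enabled and permitted for the sending host.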

Pinning Content in the Cache

The Cache Pinning Option configures Traffic Server to keep certain HTTP objects in the cache for a specified time. You can use this option to ensure that the most popular objects are in cache when needed and to prevent Traffic Server from deleting important objects. Traffic Server observes Cache-Control headers and pins an object in the cache only if it is indeed cacheable.

To set cache pinning rules and enable Cache Pinning:
  1. In a text editor, open the cache.config file located in the Traffic Server config directory.
  2. Add a rule in the file for each URL you want Traffic Server to pin in the cache, as shown below.
    url_regex=URL pin-in-cache=12h
    where URL is the URL you want Traffic Server to pin in the cache. The time format can be d for days, h for hours (as shown), m for minutes, and s for seconds. You can also use mixed units: for example, 1h15m20s. You can add secondary specifiers (such as prefix and suffix) to the rule (refer to cache.config for more information).
  3. Save and close the cache.config file.
  4. In a text editor, open the records.config file located in the Traffic Server config directory.
  5. Edit the following variable:

    Variable: proxy.config.cache.permit.pinning
    Description: Set this variable to 1 to enable the cache pinning option.

  6. Save and close the records.config file.
  7. Navigate to the Traffic Server bin directory.
  8. Run the command traffic_line -x to apply the configuration changes.
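The time format used in the pin-in-cache rule (d, h, m, and s, optionally mixed) can be converted to seconds as follows. This sketch is meant as a way to check a value before writing it into cache.config; the function name parse_duration is hypothetical and is not part of Traffic Server.

```python
import re

_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def parse_duration(text):
    """Convert a value such as '12h' or '1h15m20s' to seconds."""
    if not re.fullmatch(r"(\d+[dhms])+", text):
        raise ValueError(f"unrecognized time format: {text!r}")
    parts = re.findall(r"(\d+)([dhms])", text)
    return sum(int(count) * _UNIT_SECONDS[unit] for count, unit in parts)
```

For example, 12h is 43200 seconds and 1h15m20s is 4520 seconds.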

To Cache or Not to Cache?

When Traffic Server receives a request for a web object that is not in the cache, it retrieves the object from the origin server and serves it to the client. At the same time, Traffic Server checks if the object is cacheable before storing it in its cache to serve future requests.

Caching HTTP Objects

Traffic Server responds to caching directives from clients and origin servers, as well as directives you specify through configuration options and files.

Client Directives

By default, Traffic Server does not cache objects with the following request headers:

Configuring Traffic Server to Ignore Client no-cache Headers

By default, Traffic Server strictly observes client Cache-Control: no-cache directives. If a client request contains a no-cache header, then Traffic Server forwards the request to the origin server even if it has a fresh copy in cache. You can configure Traffic Server to ignore no-cache headers in client requests and serve the object from its cache instead.

To configure Traffic Server to ignore client no-cache headers:
  1. In a text editor, open the records.config file located in the Traffic Server config directory.
  2. Edit the following variable:

    Variable: proxy.config.http.cache.ignore_client_no_cache
    Description: Set this variable to 1 to ignore client requests to bypass the cache.

  3. Save and close the records.config file.
  4. Navigate to the Traffic Server bin directory.
  5. Run the command traffic_line -x to apply the configuration changes.

Origin Server Directives

By default, Traffic Server does not cache objects with the following response headers:

Configuring Traffic Server to Ignore Server no-cache Headers

By default, Traffic Server strictly observes Cache-Control: no-cache directives from origin servers. A response from an origin server with a no-cache header is not stored in the cache, and any previous copy of the object in the cache is removed. If you configure Traffic Server to ignore no-cache headers, then Traffic Server also ignores no-store headers. The default behavior of observing no-cache directives is appropriate in most cases.

To configure Traffic Server to ignore server no-cache headers:
  1. In a text editor, open the records.config file located in the Traffic Server config directory.
  2. Edit the following variable:

    Variable: proxy.config.http.cache.ignore_server_no_cache
    Description: Set this variable to 1 to ignore server directives to bypass the cache.

  3. Save and close the records.config file.
  4. Navigate to the Traffic Server bin directory.
  5. Run the command traffic_line -x to apply the configuration changes.

Configuring Traffic Server to Ignore WWW-Authenticate Headers

By default, Traffic Server does not cache objects that contain WWW-Authenticate response headers. The WWW-Authenticate header contains authentication parameters the client uses when preparing the authentication challenge response to an origin server.

When you configure Traffic Server to ignore origin server WWW-Authenticate headers, all objects with WWW-Authenticate headers are stored in the cache for future requests. However, the default behavior of not caching objects with WWW-Authenticate headers is appropriate in most cases. Only configure Traffic Server to ignore server WWW-Authenticate headers if you are knowledgeable about HTTP 1.1.

To configure Traffic Server to ignore server WWW-Authenticate headers:
  1. In a text editor, open the records.config file located in the Traffic Server config directory.
  2. Edit the following variable:

    Variable: proxy.config.http.cache.ignore_authentication
    Description: Set this variable to 1 to cache objects with WWW-Authenticate headers.

  3. Save and close the records.config file.
  4. Navigate to the Traffic Server bin directory.
  5. Run the command traffic_line -x to apply the configuration changes.

Configuration Directives

In addition to client and origin server directives, Traffic Server responds to directives you specify through configuration options and files.
You can configure Traffic Server to do the following:

Disabling HTTP Object Caching

By default, Traffic Server caches all HTTP objects except those for which you have set never-cache rules in the cache.config file. You can disable HTTP object caching so that all HTTP objects are served directly from the origin server and never cached, as detailed below.

To disable HTTP object caching manually:
  1. In a text editor, open the records.config file located in the Traffic Server config directory.
  2. Edit the following variable:

    Variable: proxy.config.http.cache.http
    Description: Set this variable to 0 (zero) to disable HTTP object caching.

  3. Save and close the records.config file.
  4. Navigate to the Traffic Server bin directory.
  5. Run the command traffic_line -x to apply the configuration changes.

Caching Dynamic Content

A URL is considered dynamic if it ends in .asp or contains a question mark (?), a semicolon (;), or cgi. By default, Traffic Server does not cache dynamic content. You can configure Traffic Server to cache dynamic content, although this is recommended only for specialized proxy situations.
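The rule above for recognizing dynamic URLs can be written as a one-line test. This is a sketch of the rule as stated, not Traffic Server code; the function name looks_dynamic is hypothetical, and treating the match as case-sensitive is an assumption.

```python
def looks_dynamic(url):
    """Return True if the URL ends in .asp or contains '?', ';', or 'cgi'."""
    return url.endswith(".asp") or any(token in url for token in ("?", ";", "cgi"))
```

Under this rule, http://example.com/search?q=1 is treated as dynamic, while http://example.com/index.html is not.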

To configure Traffic Server to cache dynamic content:
  1. In a text editor, open the records.config file located in the Traffic Server config directory.
  2. Edit the following variable:

    Variable: proxy.config.http.cache_urls_that_look_dynamic
    Description: Set this variable to 1 to cache dynamic content.

  3. Save and close the records.config file.
  4. Navigate to the Traffic Server bin directory.
  5. Run the command traffic_line -x to apply the configuration changes.

Caching Cookied Objects

By default, Traffic Server caches objects served in response to requests that contain cookies (unless the object is text). Traffic Server does not cache cookied text content because object headers are stored along with the object, and personalized cookie header values could be saved with the object. With non-text objects, it is unlikely that personalized headers are delivered or used.

You can reconfigure Traffic Server to:

To configure how Traffic Server caches cookied content:
  1. In a text editor, open the records.config file located in the Traffic Server config directory.
  2. Edit the following variable:

    Variable: proxy.config.http.cache.cache_responses_to_cookies
    Description: Set this variable to specify how Traffic Server caches cookied content:
      0 = Do not cache any responses to cookies.
      1 = Cache all responses to cookies.
      2 = Cache responses to cookies of image type only.
      3 = Cache all responses to cookies except text content-types (the default).

  3. Save and close the records.config file.
  4. Navigate to the Traffic Server bin directory.
  5. Run the command traffic_line -x to apply the configuration changes.
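The four values of proxy.config.http.cache.cache_responses_to_cookies can be read as a check on the response content type. This sketch is illustrative only; the function name cacheable_with_cookies is hypothetical, and it assumes the type is taken from the part of Content-Type before the slash.

```python
def cacheable_with_cookies(setting, content_type):
    """Decide whether a response to a cookied request may be cached."""
    main_type = content_type.split("/", 1)[0].strip().lower()
    if setting == 0:
        return False                   # never cache responses to cookies
    if setting == 1:
        return True                    # cache all responses to cookies
    if setting == 2:
        return main_type == "image"    # images only
    if setting == 3:
        return main_type != "text"     # everything except text (the default)
    raise ValueError("cache_responses_to_cookies must be 0, 1, 2, or 3")
```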

Forcing Object Caching

You can force Traffic Server to cache specific URLs (including dynamic URLs) for a specified duration, regardless of Cache-Control response headers.

To force document caching:
  1. In a text editor, open the cache.config file located in the Traffic Server config directory.
  2. Add a rule in the file for each URL you want Traffic Server to force cache, as shown below.
    url_regex=URL ttl-in-cache=6h
    where URL is the URL you want Traffic Server to force cache. The time format can be d for days, h for hours (as shown), m for minutes, and s for seconds. You can also use mixed units: for example, 1h15m20s. In addition, you can add secondary specifiers (for example, prefix and suffix) to the rule (refer to cache.config).
  3. Save and close the cache.config file.
  4. Navigate to the Traffic Server bin directory.
  5. Run the command traffic_line -x to apply the configuration changes.

Caching HTTP Alternates

Some origin servers answer requests to the same URL with a variety of objects. The content of these objects can vary widely, according to whether a server delivers content for different languages, targets different browsers with different presentation styles, or provides different document formats (HTML, PDF). Different versions of the same object are termed alternates and are cached by Traffic Server based on Vary response headers. You can specify additional request and response headers for specific content types that Traffic Server will identify as alternates for caching. You can also limit the number of alternate versions of an object allowed in the cache.
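To illustrate how alternates relate to Vary headers, the following sketch picks the stored alternate whose recorded request headers match the incoming request on every header named in Vary. The data layout (a list of tuples) and the function name select_alternate are assumptions made for this example and do not describe the Traffic Server object store.

```python
def select_alternate(alternates, request_headers):
    """Return the body of the first matching alternate, or None.

    alternates is a list of (vary_names, stored_request_headers, body).
    """
    request = {name.lower(): value for name, value in request_headers.items()}
    for vary_names, stored_headers, body in alternates:   # sequential scan
        stored = {name.lower(): value for name, value in stored_headers.items()}
        if all(request.get(name.lower()) == stored.get(name.lower())
               for name in vary_names):
            return body
    return None
```

Because every alternate for a URL may need to be examined in turn, the cost of a lookup grows with the number of alternates stored for that URL.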

Configuring How Traffic Server Caches Alternates

To configure how Traffic Server caches alternates, follow the steps below:
  1. In a text editor, open the records.config file located in the Traffic Server config directory.
  2. Edit the following variables:

    Variable: proxy.config.http.cache.enable_default_vary_headers
    Description: Set this variable to 1 to cache alternate versions of HTTP objects that do not contain the Vary header.

    Variable: proxy.config.http.cache.vary_default_text
    Description: Set this variable to specify the HTTP header field on which you want to vary if the request is for text: for example, an HTML document.

    Variable: proxy.config.http.cache.vary_default_images
    Description: Set this variable to specify the HTTP header field on which you want to vary if the request is for images: for example, a .gif file.

    Variable: proxy.config.http.cache.vary_default_other
    Description: Set this variable to specify the HTTP header field on which you want to vary if the request is for anything other than text or image.

    Note: If you specify Cookie as the header field on which to vary in the above variables, then make sure that the variable proxy.config.http.cache.cache_responses_to_cookies is set appropriately. For example, if you set proxy.config.http.cache.cache_responses_to_cookies to 2 (cache responses to cookies of image type only) and set the proxy.config.http.cache.vary_default_text variable to specify cookie, then alternates by cookie will not apply to text.

  3. Save and close the records.config file.
  4. Navigate to the Traffic Server bin directory.
  5. Run the command traffic_line -x to apply the configuration changes.

Limiting the Number of Alternates for an Object

You can limit the number of alternates Traffic Server can cache per object (the default is 3).

IMPORTANT: Large numbers of alternates can affect Traffic Server cache performance because all alternates have the same URL. Although Traffic Server can look up the URL in the index very quickly, it must scan sequentially through available alternates in the object store.

To limit the number of alternates:
  1. In a text editor, open the records.config file located in the Traffic Server config directory.
  2. Edit the following variable:

    Variable: proxy.config.cache.limits.http.max_alts
    Description: Set this variable to specify the maximum number of alternate versions of an object you want Traffic Server to cache. The default value is 3.

  3. Save and close the records.config file.
  4. Navigate to the Traffic Server bin directory.
  5. Run the command traffic_line -x to apply the configuration changes.

Using Congestion Control

The Congestion Control option enables you to configure Traffic Server to stop forwarding HTTP requests to origin servers when they become congested. Traffic Server then sends the client a message to retry the congested origin server later.

To use the Congestion Control option, you must perform the following tasks:

To enable and configure the Congestion Control option:
  1. In a text editor, open the records.config file located in the Traffic Server config directory.
  2. Edit the following variable:

    Variable: proxy.config.http.congestion_control.enabled
    Description: Set this variable to 1 to enable the congestion control option.

  3. Save and close the records.config file.
  4. In a text editor, open the congestion.config file located in the Traffic Server config directory.
  5. Enter rules to specify which origin servers are tracked for congestion and the timeout values Traffic Server uses to determine congestion. Refer to congestion.config for the rule format.
  6. Save and close the congestion.config file.
  7. Navigate to the Traffic Server bin directory.
  8. Run the command traffic_line -x to apply the configuration changes.