eZ component: Webdav, RFC overview, 1.1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:Author: Tobias Schlitt
:Revision: $Rev$
:Date: $Date$
:Status: Draft

.. contents::

Scope
~~~~~

This document summarizes the major points of `RFC 2518`_ (WebDAV) and the
associated `RFC 2616`_ (HTTP/1.1) with respect to distributed editing. The
main points are the support of entity tags and locking. The document only
summarizes the RFCs and contains small design-related hints here and there.
The main design for the functionality analyzed here is contained in the
design-1.1.txt file.

.. _`RFC 2518`: http://www.ietf.org/rfc/rfc2518.txt
.. _`RFC 2616`: http://www.ietf.org/rfc/rfc2616.txt

Entity Tags
~~~~~~~~~~~

Entity tags are used in the HTTP/1.1 protocol as a mechanism for validating
that a resource is still in the same state. Whenever the state of a resource
changes, its entity tag needs to change. The following sections describe the
definition of an entity tag, the headers that work with entity tags (including
the ETag header) and the HTTP/1.1 validation mechanisms in general. In
addition, the usage of entity tags in the WebDAV RFC is described.

Section 3.11 of the HTTP/1.1 RFC describes entity tags. These strings identify
(tag) the state of a resource, named "entities" in the RFC. The entity tag
consists of a quoted string and an optional weakness modifier. The quoted
string must be unique for each state. ::

      entity-tag = [ weak ] opaque-tag
      weak       = "W/"
      opaque-tag = quoted-string

A non-weak entity tag must identify a certain state uniquely. With the added W/
prefix, one and the same tag may identify different states of a resource that
are semantically equivalent.
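
The grammar above could be applied roughly as in the following sketch. The
function name ``parseEntityTag()`` is hypothetical and not part of the Webdav
component, and backslash escapes (quoted-pair) inside the quoted string are
deliberately not handled: ::

    <?php
    /**
     * Splits an entity tag into its weakness flag and its opaque tag.
     * Hypothetical helper for illustration only.
     *
     * @param string $raw For example 'W/"xyzzy"' or '"xyzzy"'.
     * @return array|null array( 'weak' => bool, 'tag' => string ), or null
     *                    if $raw does not follow the grammar.
     */
    function parseEntityTag( $raw )
    {
        if ( preg_match( '#^(W/)?"([^"]*)"$#', trim( $raw ), $matches ) !== 1 )
        {
            return null;
        }
        return array(
            'weak' => ( $matches[1] === 'W/' ),
            'tag'  => $matches[2],
        );
    }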

Entity tags are used in HTTP/1.1 in combination with the following headers:

- Request
  - If-Match
  - If-None-Match
  - If-Range
- Response
  - ETag

Since the Webdav component will generate the entity tags, we should ensure
that it only generates strong entity tags.

Headers
=======

The following section describes the headers that are affected by entity tags
and how the server should respect them.

If-Match
--------

The If-Match header makes the method it is sent with conditional. The server
should perform the action associated with the method only if the conditions
defined by the header are met.

The header format is defined as follows (14.24): ::

       If-Match = "If-Match" ":" ( "*" | 1#entity-tag )

The If-Match header generally assumes that the affected resource exists. If it
does not, the request must fail, since there is no entity to compare the given
criteria to. The header either specifies "*", to indicate that any entity must
exist, or a list of one or more entity tags separated by ",". If one of the
given tags matches the current state of the resource, the method is performed
as if no If-Match header was given. Otherwise the method must fail with 412
(Precondition Failed).

If the request would have failed anyway (that is, it would not result in a 2xx
or 412 status), the If-Match condition is not checked at all and the error
response generated by the request is returned. The reaction of a server to a
combination of multiple If-* headers is undefined.

.. Note::
   We should just throw all If-* headers away if a combination of them occurs,
   so the back-end does not need to deal with it.

Examples: ::

       If-Match: "xyzzy"
       If-Match: "xyzzy", "r2d2xxxx", "c3piozzzz"
       If-Match: *

If-None-Match
-------------

This header behaves similarly to the If-Match header, except that the
operation is only performed if none of the submitted entity tags matches. If
the operation is a GET or a HEAD operation and one of the tags matches, the
server should return a 304 (Not Modified) code. For all other methods a
matching tag must result in 412 (Precondition Failed).

The header is defined like this: ::

       If-None-Match = "If-None-Match" ":" ( "*" | 1#entity-tag )

The '*' again matches any existing entity (as for If-Match). For the
If-None-Match header this means that the operation is executed only if no
entity of the resource exists, in other words if the resource does not exist.

Examples: ::

       If-None-Match: "xyzzy"
       If-None-Match: W/"xyzzy"
       If-None-Match: "xyzzy", "r2d2xxxx", "c3piozzzz"
       If-None-Match: W/"xyzzy", W/"r2d2xxxx", W/"c3piozzzz"
       If-None-Match: *
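
A sketch of how a server could derive the status code from this header. The
function name ``ifNoneMatchStatus()`` is hypothetical. The current entity tag
is assumed to be a strong tag including its quotes, with null standing for a
resource that does not exist. The weak comparison (ignoring the ``W/`` prefix)
is only applied for GET and HEAD, as RFC 2616 permits: ::

    <?php
    /**
     * Returns null if the method may be performed, otherwise the status
     * code to send (304 or 412). Hypothetical helper.
     */
    function ifNoneMatchStatus( $method, $headerValue, $currentETag )
    {
        if ( $currentETag === null )
        {
            // No entity exists, nothing can match: perform the method.
            return null;
        }
        $isRead  = in_array( $method, array( 'GET', 'HEAD' ), true );
        $matched = ( trim( $headerValue ) === '*' );
        if ( !$matched )
        {
            preg_match_all( '#(W/)?"[^"]*"#', $headerValue, $matches );
            foreach ( $matches[0] as $tag )
            {
                // Weak comparison is only allowed for GET and HEAD.
                if ( $isRead && strpos( $tag, 'W/' ) === 0 )
                {
                    $tag = substr( $tag, 2 );
                }
                if ( $tag === $currentETag )
                {
                    $matched = true;
                    break;
                }
            }
        }
        if ( !$matched )
        {
            return null;
        }
        return $isRead ? 304 : 412;
    }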

If-Range
--------

Respecting the If-Range header only makes sense if the server supports partial
GET requests and the resuming of such. Since the Webdav component does not
support this yet, the header will be ignored.

.. Note::
   If we support partial GET at some point, support for this header must be
   considered, too.

ETag
----

The ETag header is sent by a server to give the client an entity tag that
identifies the current state of a certain resource. This tag can later be used
by the client with any of the headers described above. The ETag header is
therefore a response header, while the headers described above are request
headers.

The ETag header is built up like this: ::
      
      ETag = "ETag" ":" entity-tag

Examples::

      ETag: "xyzzy"
      ETag: W/"xyzzy"
      ETag: ""

.. Attention::
   The following is an assumption that should be verified somehow. It seems that
   the ETag header is only defined for a single resource, which indicates that
   responses that affect multiple resources should not contain it.  For Webdav
   this includes several methods like COPY and MOVE, but also PROPFIND with a
   Depth header other than 0.

Validation
==========

The purpose of entity tags is to ensure that a certain operation is only
performed if a resource is in a certain state. This state is identified by an
entity tag. In pure HTTP/1.1 this most likely applies to the GET operation in
combination with caching. However, in combination with the WebDAV extension,
validation of entity tags might also be necessary for operations like PUT and
others.

The HTTP/1.1 RFC defines 2 different validator schemes: strong and weak
validators. Entity tags are generally considered strong validators, since they
should change as soon as the affected resource changes its state. However, the
protocol provides a way to declare entity tags to be weak validators, as
shown for the ETag header above.

.. Note::
   We should not make use of this way of weakening entity tags, but always
   provide strong entity tags.

Caches (like proxy servers and browser caches) use additional methods to
validate their content. The most common way here is to use the Last-Modified
header in addition, which indicates the last modification time of the resource.


.. Note::
   We need to decide if we should support this validator in addition. This
   would also involve more headers to react on, like If-Modified-Since. This is
   not mandatory.


Locking
~~~~~~~

This section tries to summarize the important facts about locking mentioned in
RFC 2518 in various places, enhanced by first pre-considerations of design and
implementation issues associated with them.

Lock types
==========

The WebDAV RFC distinguishes between read and write locks, but only write
locks are defined in detail. The RFC explains that "the syntax is extensible,
and permits the eventual specification of locking for other access types".

A write lock means that only the locking principal may write to the affected
resources. Reading remains possible for every other principal. If a principal
that does not hold the lock tries to perform a write operation on a resource
that is locked by another principal, this operation must fail.

Lock scopes
===========

The WebDAV RFC specifies 2 different scopes for locking:

- Exclusive
- Shared

For an exclusive lock exactly 1 principal may hold a lock on a specific
resource and only this principal is allowed to perform the affected operations
on the locked resource. For a shared lock it is possible that multiple
principals take part in the lock (group editing). Every principal that takes
part in a shared lock may perform the affected operations on the locked
resource.

WebDAV does not provide any channel for communication between the principals
involved in a shared lock. The communication between these principals must be
handled externally.

The following table shows lock compatibility:

+----------------------+-----------------+--------------+
| Current lock state/  |   Shared Lock   |   Exclusive  |
| Lock request         |                 |   Lock       |
+----------------------+-----------------+--------------+
| None                 |   True          |   True       |
+----------------------+-----------------+--------------+
| Shared Lock          |   True          |   False      |
+----------------------+-----------------+--------------+
| Exclusive Lock       |   False         |   False*     |
+----------------------+-----------------+--------------+
   
*Legend*: True = lock may be granted.  False = lock MUST NOT be granted. \*=It is
illegal for a principal to request the same lock twice.
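
The compatibility table can be expressed as a small check. The following is
only a sketch: the function name ``mayGrantLock()`` and the representation of
lock scopes as the strings ``'shared'`` and ``'exclusive'`` are assumptions
made for illustration: ::

    <?php
    /**
     * Decides whether a requested lock may be granted, given the scopes
     * of the locks that currently exist on the resource.
     * Hypothetical helper.
     *
     * @param array  $currentScopes  Scopes of existing locks, may be empty.
     * @param string $requestedScope 'shared' or 'exclusive'.
     * @return bool
     */
    function mayGrantLock( array $currentScopes, $requestedScope )
    {
        if ( count( $currentScopes ) === 0 )
        {
            // Row "None": every lock request may be granted.
            return true;
        }
        if ( in_array( 'exclusive', $currentScopes, true ) )
        {
            // Row "Exclusive Lock": nothing else may be granted.
            return false;
        }
        // Row "Shared Lock": only further shared locks are compatible.
        return $requestedScope === 'shared';
    }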

Lock tokens
===========

A lock token identifies a specific lock uniquely across all resources for all
time. Whenever a LOCK request is processed successfully, the response contains
the lock token for this lock. The lock token associates the locking principal
with the locked resource. Therefore multiple lock tokens might be assigned to
a single resource if the lock is a shared lock.

Because of this uniqueness requirement, the WebDAV RFC defines a lock token
scheme which can optionally be used by the server. The so-called
"opaquelocktoken" scheme makes use of UUID_, as defined in `ISO-11578`_. A
`PECL package for UUIDs`_ is available.

.. _`UUID`: http://en.wikipedia.org/wiki/UUID
.. _`ISO-11578`: http://www.iso.ch/cate/d2229.html
.. _`PECL package for UUIDs`: http://pecl.php.net/package/uuid

Since the opaquelocktoken scheme is not mandatory, the code snippet ::

    $token = md5( uniqid( rand(), true ) ); 

could be used as an alternative to provide the necessary amount of uniqueness.
To make the source of a lock token transparent, an ID generated this way could
be appended to the URI of the affected resource. For example:
http://webdav/foo/bar.txt#<id>.
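
If the opaquelocktoken scheme is to be used without the PECL package, a token
could also be built by hand. The following sketch is an assumption, not the
RFC's algorithm: it creates a random (version 4) UUID instead of the time and
node based UUID that RFC 2518 describes, and it relies on ``random_bytes()``,
which requires PHP 7 or later. The function name
``generateOpaqueLockToken()`` is hypothetical: ::

    <?php
    /**
     * Creates a lock token in the opaquelocktoken scheme from a random
     * (version 4) UUID. Hypothetical helper for illustration only.
     *
     * @return string
     */
    function generateOpaqueLockToken()
    {
        $bytes = random_bytes( 16 );
        // Set the UUID version (4) and the RFC 4122 variant bits.
        $bytes[6] = chr( ( ord( $bytes[6] ) & 0x0f ) | 0x40 );
        $bytes[8] = chr( ( ord( $bytes[8] ) & 0x3f ) | 0x80 );
        $hex = bin2hex( $bytes );
        return sprintf(
            'opaquelocktoken:%s-%s-%s-%s-%s',
            substr( $hex, 0, 8 ),
            substr( $hex, 8, 4 ),
            substr( $hex, 12, 4 ),
            substr( $hex, 16, 4 ),
            substr( $hex, 20, 12 )
        );
    }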

Every principal that can access the WebDAV server has access to lock tokens
through lock discovery (the LOCKDISCOVERY property), so knowing a lock token
must not be sufficient to use the lock. The lock must therefore be bound to a
separate authentication mechanism. An owner string is submitted with the LOCK
request, which might help here.

Affected requests
=================

Locks affect several requests besides the explicitly lock-related ones. The
following sections summarize the affected request methods and give a short
overview of how they are affected.

LOCK
----

The LOCK method is a new method, which needs to be supported. The request body
of the LOCK method contains a dedicated XML element. Both the request
abstraction object and the objects for the content already exist in the Webdav
component.

The method supports the Depth header, but only with the values 0 and INFINITY.
The value 1 is not supported. 0 means that only the resource itself is
affected and INFINITY includes all descendant resources. For non-collection
resources both mean the same. For collection resources 0 means that only the
collection should be locked, while INFINITY recursively locks all descendants
of the collection in addition to the collection itself. If no Depth header is
given, INFINITY is assumed.
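
The Depth rules for LOCK could be enforced as in the following sketch. The
function name ``parseLockDepth()`` is hypothetical, and so is the choice to
signal an unsupported value with an exception; how the component actually
reports this is not defined here: ::

    <?php
    /**
     * Normalizes the Depth header of a LOCK request.
     * Hypothetical helper for illustration only.
     *
     * @param string|null $headerValue Raw header value, null if absent.
     * @return string '0' or 'infinity'.
     * @throws InvalidArgumentException For any other value (such as '1').
     */
    function parseLockDepth( $headerValue )
    {
        if ( $headerValue === null )
        {
            // No Depth header: infinity is assumed.
            return 'infinity';
        }
        $value = strtolower( trim( $headerValue ) );
        if ( $value === '0' || $value === 'infinity' )
        {
            return $value;
        }
        throw new InvalidArgumentException(
            "Depth '$headerValue' is not allowed for LOCK."
        );
    }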

A LOCK method must only return a single lock token for all resources locked
with this request. If an UNLOCK method is successfully executed with this lock
token, all affected resources must be unlocked.

If a LOCK operation fails because of a conflict with one of the resources to
lock, the complete operation needs to fail (no partial success). The response
code 409 (Conflict) must be returned and the body must be a multistatus XML
element that contains the resource responsible for the conflict.

Status codes returned by the LOCK method are:

200 (OK) - The lock request succeeded and the value of the lockdiscovery
property is included in the body.

412 (Precondition Failed) - The included lock token was not enforceable on this
resource or the server could not satisfy the request in the lockinfo XML
element.

423 (Locked) - The resource is locked, so the method has been rejected.

Example - Simple LOCK request: ::

   >>Request

   LOCK /workspace/webdav/proposal.doc HTTP/1.1
   Host: webdav.sb.aol.com
   Timeout: Infinite, Second-4100000000
   Content-Type: text/xml; charset="utf-8"
   Content-Length: xxxx
   Authorization: Digest username="ejw",
      realm="ejw@webdav.sb.aol.com", nonce="...",
      uri="/workspace/webdav/proposal.doc",
      response="...", opaque="..."

   <?xml version="1.0" encoding="utf-8" ?>
   <D:lockinfo xmlns:D='DAV:'>
     <D:lockscope><D:exclusive/></D:lockscope>
     <D:locktype><D:write/></D:locktype>
     <D:owner>
          <D:href>http://www.ics.uci.edu/~ejw/contact.html</D:href>
     </D:owner>
   </D:lockinfo>

   >>Response

   HTTP/1.1 200 OK
   Content-Type: text/xml; charset="utf-8"
   Content-Length: xxxx

   <?xml version="1.0" encoding="utf-8" ?>
   <D:prop xmlns:D="DAV:">
     <D:lockdiscovery>
          <D:activelock>
               <D:locktype><D:write/></D:locktype>
               <D:lockscope><D:exclusive/></D:lockscope>
               <D:depth>Infinity</D:depth>
               <D:owner>
                    <D:href>
                         http://www.ics.uci.edu/~ejw/contact.html
                    </D:href>
               </D:owner>
               <D:timeout>Second-604800</D:timeout>
               <D:locktoken>
                    <D:href>
               opaquelocktoken:e71d4fae-5dec-22d6-fea5-00a0c91e6be4
                    </D:href>
               </D:locktoken>
          </D:activelock>
     </D:lockdiscovery>
   </D:prop>

Example - Refreshing LOCK request: ::

   >>Request

   LOCK /workspace/webdav/proposal.doc HTTP/1.1
   Host: webdav.sb.aol.com
   Timeout: Infinite, Second-4100000000
   If: (<opaquelocktoken:e71d4fae-5dec-22d6-fea5-00a0c91e6be4>)
   Authorization: Digest username="ejw",
      realm="ejw@webdav.sb.aol.com", nonce="...",
      uri="/workspace/webdav/proposal.doc",
      response="...", opaque="..."

   >>Response

   HTTP/1.1 200 OK
   Content-Type: text/xml; charset="utf-8"
   Content-Length: xxxx

   <?xml version="1.0" encoding="utf-8" ?>
   <D:prop xmlns:D="DAV:">
     <D:lockdiscovery>
          <D:activelock>
               <D:locktype><D:write/></D:locktype>
               <D:lockscope><D:exclusive/></D:lockscope>
               <D:depth>Infinity</D:depth>
               <D:owner>
                    <D:href>
                    http://www.ics.uci.edu/~ejw/contact.html
                    </D:href>
               </D:owner>
               <D:timeout>Second-604800</D:timeout>
               <D:locktoken>
                    <D:href>
               opaquelocktoken:e71d4fae-5dec-22d6-fea5-00a0c91e6be4
                    </D:href>
               </D:locktoken>
          </D:activelock>
     </D:lockdiscovery>
   </D:prop>

.. Note::
   The server is not required to honor the principal's Timeout header!

Example - Multi resource LOCK Request: ::

   >>Request

   LOCK /webdav/ HTTP/1.1
   Host: webdav.sb.aol.com
   Timeout: Infinite, Second-4100000000
   Depth: infinity
   Content-Type: text/xml; charset="utf-8"
   Content-Length: xxxx
   Authorization: Digest username="ejw",
      realm="ejw@webdav.sb.aol.com", nonce="...",
      uri="/workspace/webdav/proposal.doc",
      response="...", opaque="..."

   <?xml version="1.0" encoding="utf-8" ?>
   <D:lockinfo xmlns:D="DAV:">
     <D:locktype><D:write/></D:locktype>
     <D:lockscope><D:exclusive/></D:lockscope>
     <D:owner>
          <D:href>http://www.ics.uci.edu/~ejw/contact.html</D:href>
     </D:owner>
   </D:lockinfo>

   >>Response
   
   HTTP/1.1 207 Multi-Status
   Content-Type: text/xml; charset="utf-8"
   Content-Length: xxxx

   <?xml version="1.0" encoding="utf-8" ?>
   <D:multistatus xmlns:D="DAV:">
     <D:response>
          <D:href>http://webdav.sb.aol.com/webdav/secret</D:href>
          <D:status>HTTP/1.1 403 Forbidden</D:status>
     </D:response>
     <D:response>
          <D:href>http://webdav.sb.aol.com/webdav/</D:href>
          <D:propstat>
               <D:prop><D:lockdiscovery/></D:prop>
               <D:status>HTTP/1.1 424 Failed Dependency</D:status>
          </D:propstat>
     </D:response>
   </D:multistatus>

UNLOCK
------

The UNLOCK method handles the removal of a lock established via the LOCK
method. A lock may also disappear by itself, for example when a timeout is
reached. If the principal holding a lock has finished its operations on the
locked resources, it should use the UNLOCK method to release the lock.

The UNLOCK method only receives a lock token via the Lock-Token header. The
lock identified by this token is to be released.

.. Note::
   Not only the resource identified by the resource URI must be unlocked, but
   all other resources that are locked by the given lock token!

Example - UNLOCK request: ::

   >>Request

   UNLOCK /workspace/webdav/info.doc HTTP/1.1
   Host: webdav.sb.aol.com
   Lock-Token: <opaquelocktoken:a515cfa4-5da4-22e1-f5b5-00a0451e6bf7>
   Authorization: Digest username="ejw",
      realm="ejw@webdav.sb.aol.com", nonce="...",
      uri="/workspace/webdav/proposal.doc",
      response="...", opaque="..."

   >>Response

   HTTP/1.1 204 No Content
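
Extracting the token from the Lock-Token header could be sketched as follows.
The function name ``parseLockTokenHeader()`` is hypothetical. The header value
is expected to be a single URI enclosed in angle brackets, as in the example
above: ::

    <?php
    /**
     * Extracts the lock token from a Lock-Token header value.
     * Hypothetical helper for illustration only.
     *
     * @param string $headerValue For example '<opaquelocktoken:...>'.
     * @return string|null The token without brackets, null if malformed.
     */
    function parseLockTokenHeader( $headerValue )
    {
        if ( preg_match( '#^<([^<>]+)>$#', trim( $headerValue ), $matches ) !== 1 )
        {
            return null;
        }
        return $matches[1];
    }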

Affected base methods
---------------------

The following methods may only be performed on a locked resource if the
performing principal owns the lock.

- PUT
- POST
- PROPPATCH
- MOVE
- COPY
- DELETE
- MKCOL

In addition, the PROPFIND request is affected by lock support, since lock
information is visible to every principal through the LOCKDISCOVERY and
SUPPORTEDLOCK properties.

Locking resources
=================

Both types of resources (non-collection and collection resources) can be
locked. This section describes differences for both types and other points
directly related to locking resources.

Non-collection resources
------------------------

A non-collection resource can be affected directly or indirectly by a lock. In
the first case a principal has issued a LOCK request explicitly for this
resource, locking only this single resource. The second case occurs if a
principal locked a collection resource and the non-collection resource is a
direct or indirect descendant of it. For detailed information on this topic
see the next section about locking of collection resources.

Collections
-----------

The LOCK request allows the 'Depth' header to be set to specify the depth of
the created lock. A depth value of ZERO means that only the affected
collection itself is locked. This can make sense when new resources are to be
added to this collection. The depth value INFINITY means that the created lock
recursively affects all descendants of the collection. This way it is possible
to lock a complete sub-tree of the WebDAV repository.

Any lock (no matter which depth) on a collection prevents the addition and
removal of direct members of this collection by non-lock-owners. This affects
the following methods:

- PUT
- POST
- MKCOL
- DELETE

If a collection is to be locked and any of its members is already locked, this
conflicts with the lock to be set and must result in a 423 (Locked) error.
Members that are newly created inside a locked collection or copied/moved to
it are automatically included in the lock. This affects infinity-depth locks
as well as zero-depth ones for direct children of the locked collection.

Lock null resources
-------------------

A write lock might be acquired on a resource that does not (yet) exist. This
is called a "lock null resource". A lock null resource only supports the
following methods:

- PUT
- MKCOL
- OPTIONS
- PROPFIND
- LOCK
- UNLOCK

All other methods must return 404 (Not Found) or 405 (Method Not Allowed).
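
As a sketch, the method restriction could be checked like this. The function
name ``isAllowedOnLockNull()`` is hypothetical. Whether a rejected method is
answered with 404 or 405 is left to the caller: ::

    <?php
    /**
     * Checks whether a request method is supported by a lock null
     * resource. Hypothetical helper for illustration only.
     *
     * @param string $method Request method, for example 'PUT'.
     * @return bool
     */
    function isAllowedOnLockNull( $method )
    {
        $allowed = array(
            'PUT', 'MKCOL', 'OPTIONS', 'PROPFIND', 'LOCK', 'UNLOCK',
        );
        // HTTP method names are case-sensitive, so compare strictly.
        return in_array( $method, $allowed, true );
    }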

The properties of a null resource are mostly empty, except there must be
LOCKDISCOVERY and SUPPORTEDLOCK properties. If PUT or MKCOL are issued, a null
resource becomes a normal one. The RFC does not state if the lock stays on this
newly created (real) resource or if it is removed.

COPY/MOVE
---------

Neither method carries locks along: COPY does not duplicate the locks of the
source and MOVE does not move them with the resource. If a resource is moved
to a locked collection, it is automatically added to that lock (same principal
assumed). Both methods fail if the destination is locked and the principal
does not own the lock.

Refresh
=======

A principal must not request the same lock twice. To refresh a lock,
principals send a LOCK request with an empty body and an If header that
specifies the lock tokens of the locks to refresh. If this occurs, the timers
of these locks must be reset. A Timeout header might be sent by the principal,
but the server may safely ignore it and perform the refresh as it sees fit.

If header
~~~~~~~~~

In addition to the headers If-Match and If-None-Match, which are described in
the HTTP/1.1 RFC, RFC 2518 (WebDAV) describes the If header. The If header is
used by the client to define conditional actions, similar to the 2 headers
named before. However, it is constructed in a much more complex and weird way.

The If header is described with the following pseudo EBNF: ::

   If = "If" ":" ( 1*No-tag-list | 1*Tagged-list)
   No-tag-list = List
   Tagged-list = Resource 1*List
   Resource = Coded-URL
   List = "(" 1*(["Not"](State-token | "[" entity-tag "]")) ")"
   State-token = Coded-URL
   Coded-URL = "<" absoluteURI ">"

It may contain entity tags (see `Entity Tags`_), lock tokens (see `Lock
tokens`_) and combinations of both. In addition, the header may contain
resource URIs, so that conditions can be defined not only for the main request
URI. Luckily, the If header contains either tagged lists (including affected
resource URIs) or no-tag lists (without resource URIs). It cannot contain a
combination of those.

Both kinds of lists (tagged and no-tag) contain an unlimited number of lock
tokens and/or entity tags, each of which may be prefixed by the keyword "Not".
The keyword reverses the condition that immediately follows it: the state
token or entity tag must then not match the current state of the resource.
This is loosely comparable to the If-None-Match header specified by the
HTTP/1.1 RFC.

To illustrate this complex definition, some examples are presented and
explained in the following.

No-tag list
------------

::

   If: (<locktoken:a-write-lock-token> ["I am an ETag"]) (["I am another ETag"])

This If header consists of 2 no-tag lists (it does not contain any resource
URIs). The first list consists of a lock token and an entity tag, the second
only contains an entity tag. The semantics of this example is, that the method
containing this If header may only be executed if

- either the first combination of lock token and entity tag is matched
- or if the second entity tag is matched.

Note that the first list item describes a logical AND operation, while the
whole list concatenates its items by logical OR.
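
The AND/OR semantics can be sketched as follows. The function name
``ifListsMatch()`` and the array layout of the parsed lists are assumptions
made for illustration. Parsing of the header string and the "Not" keyword are
left out: ::

    <?php
    /**
     * Evaluates parsed no-tag lists: the conditions inside one list are
     * combined with AND, the lists themselves with OR.
     * Hypothetical helper for illustration only.
     *
     * @param array  $lists       Lists of array( type, value ) conditions,
     *                            where type is 'token' or 'etag'.
     * @param array  $lockTokens  Lock tokens active on the resource.
     * @param string $currentETag Current entity tag of the resource.
     * @return bool
     */
    function ifListsMatch( array $lists, array $lockTokens, $currentETag )
    {
        foreach ( $lists as $list )
        {
            $allHold = true;
            foreach ( $list as $condition )
            {
                list( $type, $value ) = $condition;
                $holds = ( $type === 'token' )
                    ? in_array( $value, $lockTokens, true )
                    : ( $value === $currentETag );
                if ( !$holds )
                {
                    $allHold = false;
                    break;
                }
            }
            if ( $allHold )
            {
                // One fully matching list is enough (logical OR).
                return true;
            }
        }
        return false;
    }

    // The example header from above, already split into its 2 lists.
    $lists = array(
        array(
            array( 'token', 'locktoken:a-write-lock-token' ),
            array( 'etag', '"I am an ETag"' ),
        ),
        array(
            array( 'etag', '"I am another ETag"' ),
        ),
    );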

Tagged list
-----------

::

   COPY /resource1 HTTP/1.1
   Host: www.foo.bar
   Destination: http://www.foo.bar/resource2
   If: <http://www.foo.bar/resource1> (<locktoken:a-write-lock-token>
   [W/"A weak ETag"]) (["strong ETag"])
   <http://www.bar.bar/random>(["another strong ETag"])

This example not only shows an If header with a tagged list, but also a
context in which this makes sense: The COPY method affects several resources
at once. It works at least on a source (the request URI) and a destination
(see Destination header). Additionally, it can affect whole sub-trees, using
the Depth header.

The If header in this case names 2 resources, but only the conditions for one
of them are checked. The first tag refers to the request URI and is followed
by 2 lists concatenated with logical OR. The first list consists of an
AND-combination of a lock token and an entity tag. The second only contains an
entity tag. This is similar to the example shown above for the no-tag list,
except that it contains the "tagging" URI. For the second tag only an entity
tag is listed. However, this condition is not checked but simply ignored,
since the resource in the tag is not affected by the request.

Not
---

::

   If: (Not <locktoken:write1> <locktoken:write2>)

This simple If header shows the use of the "Not" keyword. The keyword precedes
the list item it affects and only applies to the state token or entity tag
that immediately follows it. The 2 conditions in the list are combined with
logical AND. In plain words, the request carrying this If header will be
executed only if the affected resources are not locked with the first lock
token (write1) and are locked with the second one (write2).
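
Following the grammar of the If header, where "Not" prefixes a single state
token or entity tag, one list could be split into its conditions as sketched
below. The function name ``parseIfList()`` and the returned array layout are
hypothetical, and the pattern is a simplification that does not validate the
list as a whole: ::

    <?php
    /**
     * Splits one If header list into its conditions.
     * Hypothetical helper for illustration only.
     *
     * @param string $list For example '(Not <locktoken:write1> ["etag"])'.
     * @return array Conditions with the keys 'negate', 'type' and 'value'.
     */
    function parseIfList( $list )
    {
        $conditions = array();
        preg_match_all(
            '#(Not\s*)?(?:<([^>]+)>|\[((?:W/)?"[^"]*")\])#',
            $list,
            $matches,
            PREG_SET_ORDER
        );
        foreach ( $matches as $match )
        {
            // Group 2 holds a state token, group 3 an entity tag.
            $isToken = ( $match[2] !== '' );
            $conditions[] = array(
                'negate' => ( $match[1] !== '' ),
                'type'   => $isToken ? 'token' : 'etag',
                'value'  => $isToken ? $match[2] : $match[3],
            );
        }
        return $conditions;
    }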


..
   Local Variables:
   mode: rst
   fill-column: 79
   End: 
   vim: et syn=rst tw=79