                                                                -*- text -*-

This file documents what users should expect from path-based authz,
and the responsibilities of the implementor of said feature.


============================================================================
WHAT USERS SHOULD EXPECT FROM PATH-BASED AUTHZ
============================================================================

1. CHECKOUTS

   Unreadable paths will not be downloaded into a working copy.
   However, 'svn update' may cause working paths to disappear or
   re-appear based on changing server authorization policies.

   (Note: the .svn/entries file may still leak the name of the
   unreadable path; see the 'Known Leakage' section below.)

2. LOG MESSAGES

   Log information may be restricted, based on readability of
   changed-paths.

   * If the target of 'svn log' wanders into unreadable territory,
     then log output will simply stop at the last readable revision.
     If the log is tracing backwards through time, as the plain
     "svn log" command does, the target will appear to be added
     (without history) in that revision.

   * If a revision returned by 'svn log' contains a mixture of
     readable/unreadable changed-paths, then the log message is
     suppressed, along with the unreadable changed-paths.  Only the
     revision number, author, date, and readable paths are displayed.

   * If a revision returned by 'svn log' contains only unreadable
     changed-paths, then only the revision number is displayed.

   It's an official recommendation ("best practice") to avoid the
   "mixed" changed-path situation; users should avoid making a single
   commit that includes changes to files in both readable and
   unreadable areas.  This scenario is quite annoying for people who
   can't read all the changed-paths.

3. COPIES (BRANCHING AND TAGGING)

   Subversion does O(1) copies of entire trees, but unfortunately,
   this isn't completely compatible with path-based access control.
   In order to copy an entire tree, every path in the tree must be
   checked for readability: this is an O(N) operation.
   Depending on the specific path-based authz module being used,
   however, there are sometimes solutions that aren't quite so
   expensive as O(N).

4. TRACING PATH CHANGES

   If a Subversion 1.1 client attempts to fetch an older version of a
   file or directory, e.g.:

        svn cat -r5 foo.c
        svn diff -r10:28 bar.c

   ...then there is a potential for failure, should older versions of
   the file exist at unreadable paths.  In other words, the tracing of
   copies/renames is subject to readability checks.  If
   history-tracing wanders into unreadable territory, the process
   halts; no further information is retrieved.

   Example 1: while 'bar.c' might be perfectly readable in both
   revisions 10 and 28, the 'svn diff' command (above) will return an
   error if the file has an unreadable ancestor somewhere between
   those two revisions.

   Example 2: 'svn blame bar.c' will not be able to retrieve
   unauthorized versions of a file, or any ancestors that precede it.
   So it will appear that 'bar.c' was wholly added -- without history
   -- in the first public version *after* the unreadable version.

   So again, an official recommendation ("best practice") is to avoid
   renaming or copying files between public and private areas.  For
   users without omnipotent read permissions, this will make renames
   difficult to follow, and client commands which attempt to trace
   history are likely to fail.

5. REVISION PROPERTIES

   Users are allowed to attach arbitrary, unversioned properties to
   revisions.  Additionally, most revisions also have "standard"
   revision props (revprops), such as svn:author, svn:date, and
   svn:log.

   Access to revprops may be restricted, based on readability of
   changed-paths.

   * If a revision contains nothing but unreadable changed-paths, then
     all revprops are unreadable and unwritable.

   * If a revision has a mixture of readable/unreadable changed-paths,
     then all revprops are unreadable, except for svn:author and
     svn:date.  All revprops are unwritable.

   It's an official recommendation ("best practice") to avoid the
   latter situation; users should avoid making a single commit that
   includes changes to files in both readable and unreadable areas.
   This situation is quite annoying for people who can't read all the
   changed-paths.

   Notice that for the purposes of gating read and write access to
   revision properties, Subversion never considers the user's *write*
   access to the changed-paths.  To understand the reason behind this,
   it helps to understand why revprop access is gated at all.

   Subversion assumes that revprops for a given revision -- especially
   the log message (svn:log) property -- are likely to reveal paths
   modified in that revision.  It is precisely because Subversion
   tries not to reveal unreadable paths to users that revprop access
   is limited as described above.  So as long as the user has the
   requisite read access to the changed-paths, it's okay if he or she
   lacks write access to one or more of those paths when attempting to
   set or change revprops -- the information Subversion is trying to
   protect through its revprop access control is considered safe to
   reveal to that user.

6. KNOWN LEAKAGE OF UNREADABLE PATHS

   Subversion may (occasionally) leak knowledge of the existence of an
   unreadable path.  However, the *contents* of an unreadable file or
   directory will never be leaked.

   Here are the known times when this happens:

   * 'svn ls directory-URL': an unreadable directory entry is still
     listed along with other entries.

   * 'svn checkout/update': an unreadable child doesn't appear in the
     working copy, but the .svn/entries file still contains an entry
     for it (marked 'absent').

7. LOCKING

   If a client attempts to lock or unlock an unreadable path, the
   command will fail.

   If a client attempts to retrieve a lock on one path, or a list of
   all locks "below" a directory, only readable paths will ever be
   returned; unreadable locked paths remain unknown.
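
   The log and revprop rules in sections 2 and 5 follow one pattern:
   classify a revision by how many of its changed-paths the user can
   read, then decide what to reveal.  The following is a minimal
   Python sketch of that pattern, not Subversion's actual code.  The
   names classify(), revprop_readable(), revprop_writable(),
   filter_log_entry() and the can_read callback are all hypothetical,
   and treating a revision with no changed-paths as fully readable is
   an assumption.

```python
from typing import Callable, Dict, Iterable, List

# Revprops that stay readable when only some changed-paths are readable.
READABLE_WHEN_MIXED = frozenset({"svn:author", "svn:date"})


def classify(changed_paths: Iterable[str], can_read: Callable[[str], bool]) -> str:
    """Return 'full', 'partial', or 'none' for one revision."""
    paths = list(changed_paths)
    readable = [p for p in paths if can_read(p)]
    if len(readable) == len(paths):
        # Assumption: a revision with no changed-paths counts as 'full'.
        return "full"
    return "partial" if readable else "none"


def revprop_readable(name: str, level: str) -> bool:
    """Apply the read rule for a single revprop."""
    if level == "full":
        return True
    if level == "partial":
        return name in READABLE_WHEN_MIXED
    return False


def revprop_writable(level: str) -> bool:
    """Revprops are writable only when every changed-path is readable.

    Write access to the changed-paths themselves is deliberately ignored.
    """
    return level == "full"


def filter_log_entry(revnum: int,
                     revprops: Dict[str, str],
                     changed_paths: List[str],
                     can_read: Callable[[str], bool]) -> dict:
    """Build what 'svn log' would be allowed to show for one revision."""
    level = classify(changed_paths, can_read)
    return {
        "revision": revnum,
        "revprops": {k: v for k, v in revprops.items()
                     if revprop_readable(k, level)},
        "changed_paths": [p for p in changed_paths if can_read(p)],
    }
```

   With a policy that hides everything under a hypothetical /private/
   directory, a commit touching both /trunk/a.c and /private/b.c comes
   back with its author, date, and /trunk/a.c only; the log message is
   dropped, and revprop_writable() reports False for that revision.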


============================================================================
HOW TO IMPLEMENT PATH-BASED AUTHZ
============================================================================

If an RA server implementation wants to implement path-based authz,
here are its responsibilities:

1. Implement a read-authz callback (see svn_repos_authz_read_func_t),
   and pass it to the following svn_repos.h functions:

      svn_repos_begin_report()
      svn_repos_dir_delta()
      svn_repos_history2()
      svn_repos_get_logs3()
      svn_repos_trace_node_locations()
      svn_repos_get_file_revs()
      svn_repos_fs_get_locks()
      svn_repos_fs_change_rev_prop2()
      svn_repos_fs_revision_prop()
      svn_repos_fs_revision_proplist()
      svn_repos_get_commit_editor3()
      svn_repos_replay2()

2. Manually implement authz for incoming network requests that
   represent calls to:

      RA->get_file()
      RA->get_dir()
      RA->check_path()
      RA->stat()
      RA->lock()
      RA->lock_many()
      RA->unlock()
      RA->unlock_many()
      RA->get_lock()

   (These concepts aren't wrapped by libsvn_repos because it's just as
   easy to call an authz func directly on a single path, rather than
   pass it to a repos wrapper.)

3. Manually implement authz when receiving network requests that
   represent calls to a commit editor:

      - do write checks for most editor operations
      - do read *and* write checks for copy operations.

   (Note that doing full-out authz on whole trees fundamentally
   contradicts Subversion's O(1) copy philosophy; in practice,
   however, specific authz implementations are able to get the same
   effect while being less expensive than O(N).)
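
   The commit-editor checks in item 3 can be sketched as follows.
   This is a minimal Python illustration under stated assumptions, not
   the code of any real RA server: check_editor_op(), AuthzDenied, the
   operation names, and the can_read/can_write callbacks are all
   hypothetical.  It also assumes that the read check for a copy
   applies to the copy source and the write check to the destination.

```python
from typing import Callable, Iterable


class AuthzDenied(Exception):
    """Raised when an editor operation fails an authz check (hypothetical)."""


def check_editor_op(op: str,
                    path: str,
                    can_read: Callable[[str], bool],
                    can_write: Callable[[str], bool],
                    copy_source_paths: Iterable[str] = ()) -> None:
    """Check one incoming commit-editor operation.

    op is a hypothetical label such as 'add', 'delete', 'modify' or 'copy'.
    copy_source_paths lists every path in the tree being copied; walking it
    is the O(N) cost that a real authz module may be able to avoid.
    """
    # Every editor operation changes the target path: require write access.
    if not can_write(path):
        raise AuthzDenied(f"write access denied: {path}")

    # A copy also exposes the source tree: require read access to all of it.
    if op == "copy":
        for src in copy_source_paths:
            if not can_read(src):
                raise AuthzDenied(f"read access denied: {src}")
```

   A server would call such a check for each editor request before
   driving the real commit editor, and abort the commit on the first
   denial.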