-*-text-*-

If you're looking for a task to take on, please see the TASKS file.
This file merely lists immediate, short-term (next few days) stuff on
people's stacks. It may make reference to phone conversations and
private emails, so a given item might not make much sense unless one
is familiar with its out-of-band context.

We keep this file under version control mainly for convenience -- feel
free to add your own short-term tasks to it, if that's helpful.

------------------------------------------------------------------------

General Problems (someone grab these):

   - updates (right now) don't bump the revision of every entry as
     they should; instead we're only bumping the changed entries.

   - unable to update specific targets right now; we need to
     distinguish between specific targets and general full recursive
     updates.

   - don't use stdio's BUFSIZ (portability?). define an SVN specific
     constant and use it throughout.

   - refactor client/main.c's cmdline handling and place it into
     libsvn_subr. use the new functions within svnadmin/main.c.

   - svn_string.c::svn_string_createf(): follow the comments about
     refactoring to prevent double memory usage

   - update authentication mechanisms so that multiple systems can be
     defined, and the client will choose the "best" one (given
     whatever consideration it chooses).

Ben todo:

   - finish svn_fs_is_different, make status command use it.

Karl:

   - Propagate changes down into libsvn_client. Currently, the client
     usually takes a `targets' list and iterates over it, invoking
     some libsvn_client routine each time. There are cases where the
     libsvn_client routine should get the target list directly and do
     the iteration internally, or even pass the list on down into
     libsvn_wc.

   - create .alt files instead of .rej for those that `patch' can't
     handle.

   - with Ben, review locking protocol in wc adm directories for
     sanity.

   - check apr_open calls, do they assume failure implies null handle?
   - fix working copy identification/allergy code

   - make sure type-changing replacements work right

Ben and Karl (longer term):

   - revision numbers in URLs? Why are we depending on the implied
     `HEAD'?? Ben points out that this solves the "Where does the
     repos begin?" problem too. :-)

   - xml dtd, for both xml deltas and wc formats

Greg:

   - mod_dav_svn:
     - implement copy/move in the repos
     - liveprop hooks: rest of liveprops
       - what is missing?
       - what is needed for SVN vs DAV/DeltaV interop?
     - vsn hooks: any others for DAV/DeltaV interop?
     - version resource URL prep should look up the node
     - change the dav_svn_ prefix to ??? (svn_mod_dav_svn too long;
       svn_mds? svn_dav? leave it?)

   - mod_dav:
     - switch property handling to buckets/brigades
     - switch dav_stream to buckets/brigades
     - complete the propdb API conversion (step 1; step 2 is the
       brigade thing above)

   M3 ITEMS:
   - VERSION stuff. VPATH builds.
   - cvs2svn
   - Python bindings
     - just enough for cvs2svn
     - the rest
   - DB_RUN_RECOVERY handling
   - pool review

   WHENEVER:
   - APR: move apr_copy_file() from SVN. and apr_append_file, etc.
   - security checks (e.g. system() usage in wc/get_editor)

Joe:

   - implementing authentication callbacks for ra_dav/client layers
     [ still needed? maybe this could be client certs? ]

Working Copy:

   TBDesigned:

   The WC will need to have some knowledge of "the repository"
   associated with any given resource. Given a working copy, the
   client cannot know whether two URLs are within the same repository
   or not, so it doesn't know whether one or multiple commit sequences
   are required to commit the whole working copy.

   The RA layer (in conjunction with the server) will need to somehow
   tell the WC about the repository associated with a given directory
   (or long term, each file?). At commit time, the WC sorts out how
   many repositories are involved, and performs a commit per
   repository.

   [ We never really had a story for discriminating multiple
     repositories within a given WC. Stopping to think about it, it
     was just a hand-wave.
     Given the flexibility in URLs and virtual hosting on the server
     and whatnot, static analysis of a URL will never be sufficient,
     but the server can always state definitively the repository for
     any given resource. ]

Filesystem:

   Anyone, WRT filesystem:

   We need a function that will remove the *first*, not the last,
   component from a path. We need this to support full-path lookups
   in the new FS interface, specifically, for open_path. Suggest
   something like this:

      int svn_path_first_component (const char **name,
                                    char **path,
                                    enum svn_path_style style);

   This would point *NAME to the first component in *PATH, and modify
   *PATH to split the first component off. Return 0 if *NAME is
   empty, otherwise 1. See the pseudocode in open_path in tree.c to
   see how this would be used.

Changes to svn_delta_edit_fns_t:

   These are the summary emails Ben sent to the dev list (lightly
   edited), concerning changes proposed and accepted at the meetings
   in Chicago on January 14-16. A few of these are still being
   discussed on the list; see the "STATUS" lines for more.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Change #6: Move textdeltas to the "other side" of the editor.

   STATUS: Being discussed on list; holding off on making this change.

   The Way It Works Now:

   The driver of an editor takes a source and target stream, puts them
   together via svn_txdelta() to produce a stream of "windows". The
   driver then pushes these windows at the editor's window-handler.

   Proposed Change:

   Move this process to the other side of the interface, into the
   editor implementation, giving the editor the power to deal with the
   source and target streams directly.

   Specifically:

      - remove the editor's apply_txdelta() routine

      - create two new routines:

           apply_delta (filebaton, [src_stream], target_stream)
           set_file_contents (filebaton, [src_stream], delta,
                              enum delta_type)

        apply_delta() sends svndiff deltas. set_file_contents() can
        send plain text or other types.
        It becomes the obligation of the editor implementation to
        implement at least one of these two routines; if one routine
        is NULL, the driver must use the other.

   Rationale:

   It's too restrictive to force every editor implementation to accept
   and deal with small svndiff windows. For example: Greg Stein wants
   to send plain text while debugging his commit-editor and network
   layer. It's best to allow the RA layer to make its own choice
   about how to break up the two streams most efficiently.

   The reason [src_stream] is optional is that it may be NULL; this
   presumably means that the editor already has access to the src
   stream. (Greg and Jim, did I get this explanation totally wrong?
   My notes here aren't perfectly clear. Please elaborate if you need
   to.)

   Problem:

   Editor composition becomes more difficult if we use streams. A
   window is a discrete chunk of data that can be used by several
   consumers, but streams are different: if consumer A reads some data
   off a stream, then when consumer B reads, she'll get different
   results. You'd have to design your streams in a funky way to make
   this not be a problem.

   In some circumstances, this isn't an issue. After all, usually a
   set of composed editors is a bunch of lightweight editors, that
   don't do much, surrounding a core editor that does the real work.
   For example, an editor that prints out filenames wrapped with an
   editor that actually updates those files. In such cases, the
   lightweight editor simply never reads data off the stream, so the
   core editor is not deprived of anything.

   But other editors (say, a commit guard?) might want to actually
   examine file data. That could have bad consequences if we switch
   from windows to streams.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Change #7: New filesystem "copy" nodes (and renaming add_*() args)

   STATUS: Will be done by Brane, now that he's dug what this is all
   about.
   The Way It Works Now:

   When we copy (or move) a node in the filesystem, we're currently
   losing information about where it came from. Our "lazy cloning"
   model goes a long way, but in the end we get just a bunch of
   duplicated pointers to the same node revision numbers. E.g.: if a
   node A points to the same node revision that two others (B and C)
   point to, there's no way to know whether A was copied from B or C.

   Proposed Changes:

   1. In the filesystem model, create a 3rd node type called a "copy"
      node. (This is in addition to our "file" and "dir" nodes.) A
      copy node contains a pointer to a revision and a path. When we
      create node B as a "copy" of node A, we create a new copy node.
      This copy node allows us to discover the proper node-revision,
      but it also tracks the history of the copy.

   2. Remember that a copy command is really just an "add with
      history", and a move command is really just a "delete, followed
      by an add with history". Thus it's the *add* command, which,
      when given an ancestor path and revision, creates a copy node.
      (If there's no history given, then the add command creates a
      regular node.)

      For clarity's sake, the history arguments to the editor's add()
      function should reflect this copying. Instead of
      "ancestor_path" and "ancestor_revision", we'd like to call the
      arguments "copyfrom_url" and "copyfrom_revision".

   Rationale:

   I think it's all been explained above. Other folks, feel free to
   add to this explanation.

Anyone:

   These three mails between Greg Stein and Ben explain the issue:

----------------------------------------------------------------------
From: Greg Stein
Subject: Re: CVS update: subversion/subversion/libsvn_ra_local \
 Makefile.am ra_local.h ra_plugin.c split_url.c
To: dev@subversion.tigris.org
Date: Fri, 9 Feb 2001 17:27:06 -0800

On Sat, Feb 10, 2001 at 12:59:19AM -0000, sussman@tigris.org wrote:
>...
> @@ -283,7 +296,10 @@
>                    apr_pool_t *pool,
>                    const svn_ra_plugin_t **plugin)
>  {
> -  *plugin = ra_local_plugin;
> +  svn_ra_plugin_t *p = apr_pcalloc (pool, sizeof (*p));
> +  memcpy (p, &ra_local_plugin, sizeof (ra_local_plugin));
> +
> +  *plugin = p;

Overkill. Use:

  *plugin = &ra_local_plugin;

>    /* are we ever going to care about abi_version? */

Yes.

>... [ split_url ]

This function would be a *lot* simpler, if you:

1) make a dup of the URL
2) strip the leading file:// portion (leaving "/abs/path/foo/bar")
3) loop:
   a) try to open FS with the path. succeed: break
   b) chop the last component off
4) fs_path is what remains, repos_path is &URL[strlen(fs_path) + 7]
   (the +7 is to account for "file://" at the start of the URL)

The whole algorithm only requires one string dup to hold the shrinking
path. stripping the leading "file://" is simply advancing string->data
(we should have a utility function for this, because string->blocksize
must shrink). The chopping just drops in '\0' into the dup'd path. The
return is done by duplicating the input URL, or altering the input
->data and ->blocksize field values.

And warning: I'm not sure whether a file URL is in "local" or "URL"
style separators. Strictly speaking, our "URL style" is really "http
scheme URL style". If the file URL uses "/" no matter what, then we
would (strictly) need to convert the dup'd path (the fs_path) to local
style before beginning the loop/test.

Cheers,
-g

----------------------------------------------------------------------
From: Ben Collins-Sussman
Subject: Re: CVS update: subversion/subversion/libsvn_ra_local \
 Makefile.am ra_local.h ra_plugin.c split_url.c
To: Greg Stein
Cc: dev@subversion.tigris.org
Date: 09 Feb 2001 20:06:42 -0600

Greg Stein writes:

> Overkill. Use:
>
>   *plugin = &ra_local_plugin;
>

Heh, sure. :)

>
> >...
[ split_url ]
>
> This function would be a *lot* simpler, if you:
>
> 1) make a dup of the URL
> 2) strip the leading file:// portion (leaving "/abs/path/foo/bar")
> 3) loop:
>    a) try to open FS with the path. succeed: break
>    b) chop the last component off
> 4) fs_path is what remains, repos_path is &URL[strlen(fs_path) + 7]
>    (the +7 is to account for "file://" at the start of the URL)

Oh, this is the *easy* way, which I purposely avoided.

I wanted to be "correct" by searching from the other direction,
thereby always finding the repository with the shortest path, not the
longest.

Yes, I know, we agreed that we will never allow nested repositories.
I guess I was being paranoid and trying to emulate Apache's search
methods. :)

Do you think it's worth re-writing?

----------------------------------------------------------------------
From: Greg Stein
Subject: Re: CVS update: subversion/subversion/libsvn_ra_local \
 Makefile.am ra_local.h ra_plugin.c split_url.c
To: dev@subversion.tigris.org
Date: Fri, 9 Feb 2001 19:19:57 -0800

On Fri, Feb 09, 2001 at 08:06:42PM -0600, Ben Collins-Sussman wrote:
> Greg Stein writes:
>...
> > >... [ split_url ]
> >
> > This function would be a *lot* simpler, if you:
> >
> > 1) make a dup of the URL
> > 2) strip the leading file:// portion (leaving "/abs/path/foo/bar")
> > 3) loop:
> >    a) try to open FS with the path. succeed: break
> >    b) chop the last component off
> > 4) fs_path is what remains, repos_path is &URL[strlen(fs_path) + 7]
> >    (the +7 is to account for "file://" at the start of the URL)
>
> Oh, this is the *easy* way, which I purposely avoided.
>
> I wanted to be "correct" by searching from the other direction,
> thereby always finding the repository with the shortest path, not the
> longest.

Ah. Right. Sorry...

But still a simple change. Search from the left for '/'; replace with
'\0'; test for an FS; if not found, then put the '/' back and look for
the next '/' (repeat).

>...
> Do you think it's worth re-writing?
Yes, given how much simpler it could be, I think it would be a STACK
item that anybody could pick up. It works now, but for long-term
maintenance, it would be nice to have a simplified version.

Cheers,
-g

----------------------------------------------------------------------
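
For whoever picks up the split_url rewrite: below is a minimal sketch
of the left-to-right search Greg describes in the last mail (find the
next '/', drop in a '\0', test for an FS, put the '/' back, repeat).
It is not the real split_url.c. The names split_url_sketch(),
fs_probe_fn and probe_equals() are made up for illustration, the URL
is a plain C string rather than an svn string, and the "test for an
FS" step is a caller-supplied callback standing in for whatever
filesystem-open call actually gets used. It also assumes the file URL
uses '/' separators throughout, which is exactly the open question
raised in the first mail.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical probe: return nonzero if PATH is the root of a
   repository.  Stands in for the real "try to open the FS" step. */
typedef int (*fs_probe_fn) (const char *path, void *baton);

/* Demo probe (an assumption, for testing only): BATON names the one
   path that counts as a repository. */
int
probe_equals (const char *path, void *baton)
{
  return strcmp (path, (const char *) baton) == 0;
}

/* Split URL into the SHORTEST leading path accepted by PROBE
   (*FS_PATH, malloc'd, caller frees) and the remainder (*REPOS_PATH,
   which points into URL).  Return 1 on success; return 0 if URL is
   not a file:// URL, has an empty path, or no prefix is accepted. */
int
split_url_sketch (const char *url, fs_probe_fn probe, void *baton,
                  char **fs_path, const char **repos_path)
{
  static const char scheme[] = "file://";
  const size_t scheme_len = sizeof (scheme) - 1;  /* the "+7" */
  char *dup, *slash;

  assert (url != NULL && probe != NULL);

  if (strncmp (url, scheme, scheme_len) != 0 || url[scheme_len] == '\0')
    return 0;

  /* The one string dup: holds the path we temporarily truncate. */
  dup = malloc (strlen (url + scheme_len) + 1);
  if (dup == NULL)
    return 0;
  strcpy (dup, url + scheme_len);

  /* Start past the first character so a leading '/' is skipped. */
  slash = dup;
  for (;;)
    {
      slash = strchr (slash + 1, '/');
      if (slash != NULL)
        *slash = '\0';                  /* truncate at this '/' */

      if (probe (dup, baton))
        {
          *fs_path = dup;
          *repos_path = url + scheme_len + strlen (dup);
          return 1;
        }

      if (slash == NULL)
        break;                          /* whole path tried, no luck */
      *slash = '/';                     /* put the '/' back */
    }

  free (dup);
  return 0;
}
```

With probe_equals() and a baton of "/abs/path/repos", the URL
file:///abs/path/repos/dir/file splits into an fs_path of
"/abs/path/repos" and a repos_path of "/dir/file". Because the search
runs from the left, a nested repository further down the path would
never be reached, which is the behaviour Ben was after.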