The Problem We're Solving
-------------------------

Subversion users typically edit sets of files in their working copy.
If a working copy contains a set of edited files which represents a
single logical change, then commands like 'svn diff', 'svn status',
'svn revert' and 'svn commit' automatically discover the edited files
and act on them.

A common problem, however, is that users often work on more than one
set of logical changes at a time. The user is required to remember
which edited file belongs to which set, and carefully run 'diff',
'revert', or 'commit' commands only on lists of files which belong
together.

One workaround for this problem is to check out multiple working
copies, and have one task per working copy. Of course, this uses a
lot of disk space, and it's sometimes inconvenient to move around
between working copies.

The simple solution we're proposing here is to teach the svn client
(and working copy) to do some simple management of local, human-named
sets of files, known as 'changelists'. The goal is to allow users to
create, view, and manipulate sets of files in a working copy by
referring to them by name.


Doesn't Perforce Do This?
-------------------------

Perforce performs changelist management, and it's a large motivation
for this new feature. But there's no way to emulate Perforce's
feature exactly; it has a different network model than Subversion.
So instead, we'll examine the use-cases that Perforce enables, and
discuss how to solve those same use-cases in Subversion.


Non-Problems
------------

Here are problems/features that are NOT in our list of goals:

* Server management of changelists

  Subversion prides itself on being disconnected; that's why it
  scales so well. A changelist is an ephemeral thing created by a
  single user in a single working copy, whose only purpose is to make
  it easier to manipulate a change-in-progress. It's not a "named
  revision", or a long-lived object in the repository. That's what
  global revision numbers are for.
  Some people aren't happy with the way tags work in Subversion, and
  have asked for the ability to identify repository revisions by
  human name. While everybody wants to see the ability to search
  over revprops (for many reasons!), that whole issue is out of scope
  for this changelist feature. It's been suggested that when a
  changelist gets committed, it become a searchable revprop; sounds
  fine, but let's get changelists and searchable revprops implemented
  independently first!

* Enforcement of groupings

  Changelists don't exist as a prescriptive SCM process. Some have
  suggested that the client not allow people to commit individual
  files in a changelist, or that it do some sort of server-side
  process enforcement revolving around changelists. This is
  definitely not in the Subversion spirit, which allows teams to
  create whatever policies they wish. The only purpose of
  changelists is to provide some convenient bookkeeping to the user.

* Overlapping changelists

  A number of people ask "but what if two different changes within a
  single file belong to different logical changes?" My reply is:
  either "tough luck" or "don't do that" or "check out a separate
  working copy". My feeling is that trying to create a UI to
  manipulate individual diff-hunks within a file is a HUGE can of
  worms, probably best suited for a GUI. While I wouldn't rule it
  out as a future *enhancement* to a changelist feature, it's
  certainly not worth the initial effort in the first draft of
  changelist management. Overlapping changelists do occasionally
  happen, but they're rare enough that it's not worth spending 90% of
  our time on a 10% case -- at least not in the beginning.

* "Shelving" of changes

  Distributed version control systems don't have this sort of
  problem; one could just do a 'local commit' of each
  changelist-in-progress, create local branches, and magically swap
  patches in and out as needed.
  To that end, many have talked about making Subversion working
  copies into "deep" objects containing some degree of history, or
  about writing a nice 'svn patch' command to read custom 'svn diff'
  output. My response is: nice ideas, and those sorts of really
  advanced designs are certainly things that simple changelist
  management can grow to take advantage of, but they aren't
  prerequisites for tackling this problem.


Use-Cases
---------

A. Define a changelist by explicitly adding/removing paths to it.

B. See all existing changelist names (and their member paths).

C. Destroy a changelist definition all at once.

D. Examine all edits within a changelist (svn diff)

E. Revert all edits within a changelist (svn revert)

F. Commit all edits within a changelist (svn commit)

G. Receive server changes only on paths within a changelist
   (svn update)

H. See the history of all paths within a changelist (svn log)

I. Fetch or set props on every path within a changelist
   (svn pl/ps/pe/pg/pd)


How Perforce Tackles the Use-Cases
----------------------------------

A. Defining changelists

The Perforce server tracks each and every working copy, as well as
every changelist within every working copy. All working copy files
are read-only until the user declares the intent to edit ('p4 edit')
one. The server then makes the file read-write and places it into a
changelist with the name 'default'.

Users aren't allowed to invent their own names for changelists, as
this might lead to namespace overlaps. (This is a side effect of
having the server track all changelists.) 'p4 change' creates a new
changelist by prompting the user for a log message, at which point
the server yanks the 'next' global revision number and assigns it as
a name for the changelist.

The server not only tracks the changelist via some number, but also
tracks the log-message-in-progress for the list. ('p4 describe' can
show the log message attached to a changelist.)

B.
Viewing changelists

At any time, the 'p4 opened' command shows all files that are being
edited, and which changelists they belong to. It's quite similar to
the 'svn status' command, except that the output is somewhat harder
to read, due to non-aligned columns. The response time is also quite
fast, since p4 doesn't need to crawl the working copy to discover
edited files. On the other hand, p4 doesn't scale so well when the
server tries to track thousands of users.

C. Destroying changelists

'p4 change -d' will delete a changelist, but only if the edited files
within the changelist have been reverted.

D. Viewing edits in a changelist

'p4 diff' shows contextual diffs for all edited files. This is
actually a bit weak, as it shows diffs for *all* changelists in a
working copy. Subversion should improve on this by allowing one to
'diff' just a single changelist.

E. Reverting a changelist

'p4 revert -c NNN' reverts all edited files within changelist #NNN.
Note that it's also possible to revert single files
('p4 revert foo.c'). If a single file within a changelist is
reverted, its path is removed from the changelist.

F. Committing a changelist

'p4 submit -c NNN' atomically commits changelist #NNN to the
repository. If the commit succeeds, a *new* global revision number
is assigned to the final commit, and the old 'NNN' number is
discarded. (This means that p4 actually burns through global revnums
at twice the speed of Subversion!) After the commit, the working
copy no longer has any record of the changelist.

G. Updating a changelist

'p4 sync' is equivalent to 'svn up'. Like Subversion, 'p4 sync' can
be restricted to specific path targets, but amazingly it cannot be
restricted to the set of paths that make up a changelist. This may
be something Subversion can improve upon.

H. Examining the history of changelist members

'p4 changes' is the closest thing to 'svn log'. With no arguments,
it shows all changelists ever submitted.
With specific path arguments, it limits the response to showing only
changelists that affected those paths. Again, a changelist number
cannot be supplied, which is surprising.

I. Propgets/sets on a changelist

Perforce has no versioned metadata.


Proposal for Subversion's Tackling of Use-Cases
-----------------------------------------------

A. Defining changelists

Subversion's changelist feature will be entirely client-side
bookkeeping. The purpose is to allow users to 'talk about' a set of
local paths via a convenient name, often restricting subcommands to
operate only on those paths.

The 'svn changelist' command allows a user to define a changelist
with an arbitrary UTF-8 name, as well as add member paths. (At the
moment, a --remove flag is used to remove member paths.) Unversioned
items may not be added to changelists.

  $ svn changelist mychange foo.c bar.c
  Path 'foo.c' is now part of changelist 'mychange'.
  Path 'bar.c' is now part of changelist 'mychange'.

  $ svn changelist bar.c --remove
  Path 'bar.c' is no longer associated with a changelist.

### Open question: should we add a UI which allows the working copy
    to manage a log-message-in-progress for each changelist, the way
    p4 does? This could be something stored in the ~/.subversion/
    area.

B. Viewing changelists

'svn status' currently shows changelist definitions by crawling the
working copy. Output is much more readable than Perforce's, because
we're still preserving column alignment.

  $ svn st
  ?      1.2-backports.txt
  M      notes/wc-improvements

  --- Changelist 'status-cleanup':
  M      subversion/svn/main.c
         subversion/svn/revert-cmd.c
  M      subversion/svn/info-cmd.c

  --- Changelist 'status-printing':
  M      subversion/svn/status-cmd.c

Note that unlike Perforce, changelist membership is orthogonal to
whether or not the file has local modifications. So it's possible
for 'svn status' to show a changelist containing unmodified files.
Conversely, it's possible for a file to be modified, but unassociated
with any changelist.
'svn status' considers changelist membership to be inherently
"interesting enough" to justify displaying a path, regardless of
whether it's modified.

Note that merely upgrading Subversion won't break scripts that parse
'svn status' output. Such scripts might break *only* if users begin
to use the new changelist feature. This is a good balance between
allowing Subversion's development to progress and not automatically
punishing users for upgrading. (Either way, the "---" characters
should prevent scripts from accidentally detecting conflicts with
"^C" regular expressions.)

### Open question: at the moment, changelists are implemented by
    simply storing a new attribute in the .svn/entries file. Rather
    than having the svn client crawl and 'discover' changelists,
    should we take a hint from p4 and have them centrally managed in
    the ~/.subversion/ area?

    Pros:
      - much faster than crawling
      - whole changelist definition available, regardless of CWD

    Cons:
      - breaks the 'portable WC' ideal. (If WC moves to another box,
        changelist definition is lost.)

### Open question: should 'svn status' be able to restrict its output
    to a single changelist, a la 'svn status --changelist mychange'?

C. Destroying changelists

Commands can be restricted to operate only on changelist members by
specifying the "--changelist NAME" flag. (Perhaps it can be
shortened to '--cl' also?)

To destroy a changelist, one would need to remove all member-paths
from it. There's no good UI for this yet, other than to use
'svn changelist --remove path1 path2 path3 ...'.

### Improve this?

D. Viewing edits in a changelist

Improve on Perforce by allowing 'svn diff' to restrict its output to
only members of a certain changelist:

  $ svn diff --changelist mychange
  [...]

E. Reverting a changelist

Allow 'svn revert' to restrict its effect just to members of a
changelist:

  $ svn revert --changelist mychange
  [...]

Again, note that this won't destroy the changelist.
The changelist would now contain just a set of unmodified paths, and
'svn status' would continue to display them. (This differs from
Perforce, whereby local edits are intimately tied to changelist
membership.)

F. Committing a changelist

'svn commit' should be able to commit only changelist members, just
as if the paths had been typed on the command line individually:

  $ svn commit --changelist mychange
  Modifying foo.c
  Adding bar.c
  [...]
  Committed revision YYY.

After the commit succeeds, the committed files are NO LONGER
associated with the changelist, and so the changelist definition
ceases to exist. (Note: we probably want to have a switch to
'preserve the changelist' after a commit, similar to the way in which
the '--no-unlock' switch preserves locks after a commit.)

If the user chooses to commit just a single member of a changelist,
that member is removed from the changelist after the commit.

G. Updating a changelist

### Open question: is this a useful use-case? Perforce doesn't have
    it, and I've never missed it. I always want to update the entire
    working copy, not just some small set of files.

H. Examining the history of changelist members

'svn log' should be able to restrict its history retrieval to only
revisions which affected members of the changelist. So running

  $ svn log --changelist mychange

...should produce output equivalent to

  $ svn log member1 member2 member3 ...

'svn log' already knows not to print log messages more than once
(i.e. it prints the union of all revisions).

Note that this feature would be an improvement over Perforce, which
allows multiple targets on the command line, but no changelist
shorthand for them.

I. Propgets/sets on a changelist

'svn proplist', 'svn propget', 'svn propset', 'svn propdel' should
all work with the --changelist switch as well, so that a user can
quickly perform metadata operations on a whole set of files.

### Open question: should we also allow 'svn lock/unlock' to operate
    on changelists?
    It might be just as convenient in certain scenarios.

----------------

### Open UI question:

If one's CWD is deep within a working copy, how should

  $ svn subcommand --changelist mychange

...behave? Should it operate on *all* members of the changelist, or
only those members within the CWD (and recursively "below")?

--> malcolmr and dlr believe that it's perfectly fine to use only
    parts of changelists 'below' the target path.

--------------------

==> Finished items:

  * svn changelist [--remove]
  * svn status shows grouped changelists
     - 'svn status --changelist' works too
  * 'svn info' shows changelists
  * svn commit --changelist
  * svn revert --changelist
  * svn log --changelist
  * svn diff --changelist (wc-wc and wc-repos cases)
  * svn update --changelist
  * svn lock/unlock --changelist
  * svn propget/propset --changelist

###  * svn proplist/propdel --changelist

==> TO-DO:

  * make --cl the same as --changelist, for convenience?
  * questions about commits:
     - how does 'svn ci --changelist' interact with nonrecursive
       commits?
     - how does it interact with a list of specific targets?
     - how does it deal with a schedule-delete folder?

----------------------------

Commandline UI use-cases:

1. add path(s) to a CL:

     svn cl CLNAME foo.c bar.c baz.c

2. remove path(s) from whatever CLs they each belong to.

     svn cl --remove foo.c bar.c baz.c

3. move path(s) from CL1 to CL2.

     svn cl CL2 foo.c

4. undefine a CL all at once (by removing all members)

     svn cl --remove --changelist CLNAME

5. rename a CL

     svn cl NEWNAME --changelist OLDNAME

==================================================================

Feature Revamp: sussman and cmpilato.

Goal: changelists should be treated as 'filters' everywhere, not as a
way to just add targets to a command line.

The basic syntax of commands will be:

   svn subcommand target1 target2 ... targetN \
       --changelist foo1 --changelist foo2 ... --changelist fooM

The CLI parses the targets as usual: possibly inserting an implicit
'.' target, canonicalizing the list, etc.
The CLI now passes a list of changelist-names down into each
svn_client_subcommand() routine as a "bunch of filters" to apply
while working.

If svn_client_subcommand() decides to process a target -- either one
it got explicitly, or one it discovered through recursion -- it first
checks that the target is a member of one of the changelists. If
not, it skips the target and keeps going. (This is the way
'svn commit' currently works: harvest_committables() only harvests
things that are committable *and* a member of the passed-in
changelist.)

This means that the UI use-cases listed above change slightly:

4. undefine a CL all at once (by removing all members)

     svn cl TARGET --remove --changelist CLNAME

   (TARGET might be implicit '.' or not, and depth is empty by
   default; use --depth to override.)

5. rename a CL

     svn cl NEWNAME TARGET --changelist OLDNAME

   (TARGET might be implicit '.' or not, and depth is empty by
   default; use --depth to override.)

TO-DO list:

  [X] allow multiple --changelist args
  [X] svn status should display grouped changelists
  [X] 'svn info' should display a target's changelist field
  [X] rename --keep-changelist option to --keep-changelists
  [X] fix --changelist and allow multiple changelists in subcommands:

      No-problem subcommands:
        [X] svn changelist --changelist
        [X] svn commit --changelist
        [X] svn diff --changelist (only wc-wc and wc-repos cases)
        [X] svn info --changelist
        [X] svn propget --changelist
        [X] svn proplist --changelist
        [X] svn propset --changelist
        [X] svn propdel --changelist
        [X] svn revert --changelist
        [X] svn status --changelist

      Problem subcommands (see below):
        [X] svn update --changelist
        [X] svn lock --changelist    ### removed changelist support
        [X] svn log --changelist     ### removed changelist support
        [X] svn unlock --changelist  ### removed changelist support

  [ ] ensure that the bindings implementations of these APIs are up
      to snuff
  [X] write tests!
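
The "changelists as filters" rule from the revamp above can be
sketched roughly as follows. This is an illustrative Python sketch,
not Subversion's C implementation: the name filter_targets and the
'membership' mapping are hypothetical, and the sketch assumes each
path belongs to at most one changelist.

```python
from typing import Dict, Iterable, List


def filter_targets(
    candidates: Iterable[str],
    membership: Dict[str, str],
    changelists: Iterable[str],
) -> List[str]:
    """Keep only the candidates that pass the changelist filter.

    candidates  -- paths the subcommand would otherwise process
                   (explicit targets plus anything found by recursion)
    membership  -- hypothetical map of path -> changelist name
    changelists -- names given via repeated --changelist options
    """
    wanted = set(changelists)
    if not wanted:
        # No --changelist given: nothing is filtered out.
        return list(candidates)
    # A path with no changelist maps to None, which is never wanted.
    return [path for path in candidates if membership.get(path) in wanted]
```

Under this sketch, passing several changelist names behaves like a
union: a path is processed if it belongs to any of the named
changelists, and skipped otherwise.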
Problem Subcommands:

Using a definition of --changelist as a filter means that subcommands
which are, by default, non-recursive in nature, have a somewhat odd
interface. For example, 'svn info --changelist FOO' (which
ultimately translates to 'svn info . --depth empty --changelist FOO')
will either return exactly one info result, or exactly none,
depending on whether or not the current working directory is in
changelist FOO. This is trivially worked around by deepening the
invocation: 'svn info -R --changelist FOO'. But what about
subcommands for which there is no --depth support, such as 'lock',
'log', 'unlock'? Do we lose the changelist support, or grow some
sort of depth-crawling ability for these things?

[RESOLUTION: We've removed changelist support from 'lock', 'unlock',
and 'log'.]

'svn update' presents an interesting challenge, too. The public
svn_client_update3() API takes a list of paths, and returns a list of
revision numbers to which those paths were updated. Each path is
treated as, effectively, a separate update -- complete with an output
line that notes the updated-to revision. So, if we do changelist
expansion outside the API, we might turn a single-target operation
into a multi-target one, and the user sees N full update processes
happen. If we push 'changelists' down into the API, we can fake a
single update with notification tricks. But that starts to get nasty
when we look at non-file changelist support later and the
interactions with externals and such. And if we push 'changelists'
all the way down into the update editor, then we've got a mess of a
whole 'nuther type, downloading tons of server data we won't use, and
so on.

[RESOLUTION: Let the command-line client do the changelist path
expansion.]
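
To make the 'svn update' resolution concrete, here is a rough Python
sketch of what "changelist path expansion" in the command-line client
might look like. It is only an illustration under stated
assumptions: expand_changelist_targets, is_within, and the
'membership' mapping are hypothetical names, paths are assumed to be
relative and '/'-separated, and the real client is written in C.

```python
from typing import Dict, Iterable, List


def is_within(path: str, target: str) -> bool:
    """True if 'path' is 'target' itself or lies somewhere below it."""
    if target == ".":
        return True
    target = target.rstrip("/")
    return path == target or path.startswith(target + "/")


def expand_changelist_targets(
    targets: Iterable[str],
    changelists: Iterable[str],
    membership: Dict[str, str],
) -> List[str]:
    """Replace the given targets with the changelist members below them.

    membership is a hypothetical map of path -> changelist name, as the
    client might collect by crawling the working copy.
    """
    wanted = set(changelists)
    targets = list(targets)
    return sorted(
        path
        for path, name in membership.items()
        if name in wanted and any(is_within(path, t) for t in targets)
    )
```

The expanded list is what would then be handed to the update API,
which is why a single-target invocation can show up to the user as
one update per changelist member.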