CollabNet Merge Tracking Summit

On Tuesday, January 17th, 2006, CollabNet held a merge tracking summit for some of their Subversion-using customers to solicit use cases and requirements for the enterprise. The results are below; see also the pre-summit survey questionnaire and responses.

(Note that this material is still being integrated with the rest of the stuff in the Subversion merge-tracking area.)

Summit Results

Summary

Enterprise users have branches which last longer than a typical open source project (e.g. more than a decade). They tend to do merges more often, involving more files. They may have dedicated staff and tool suites for performing merges and answering inquires about merges. Fundamentally, however, these are differences of scale, not kind. There were no real surprises in terms of functional requirements: it appears that enterprise users' needs are of the same nature as the needs of open source communities, individuals, and small- to medium-sized shops. Enterprise users also have a greater need for auditability/traceability, largely due to compliance requirements (e.g. FDA, Sarbanes Oxley, etc.). Based on the notes below and elsewhere in the merge-tracking area, it does not appear that this would change what meta-data gets recorded, but it does imply more sophisticated data-mining and reporting features than the average user needs.

Details

  • Even basic "merge memory" would be incredibly helpful in shops where minimal, inconsistent process prevails. Users are often very average developers/maintenance programmers.

  • Very long-lived branches (e.g. 10+ years) containing very large binaries, tons of merges between them, and subject to duplicate bug reports should be handled.

  • The ability to manually adjust any merge meta data is crucial. For instance, someone may hand-edit one or more resources to achieve the same result as a merge, or hand-edit the result of a merge in a way which significantly alters the change. Either way, one might still want to represent that the change does still exist in the target line. In general, humans who claim that they know better than the tool should be trusted.

  • The question "What branches contain this exact version of file X?" is important. (In Subversion, that would be "What paths contain this exact version of X?") This may imply exposing some sort of unique object identifier, more than we offer so far. This desire was also noted at the EuroOSCON 2005 Version Control BOF session, search for "What branches/tags is this version of this file present in?".

  • Same as above, but for changes: "What branches include change C?"

  • The ability to push a given change out to multiple branches. Selection mechanisms for destinations should be sophisticated. For example, select all branches that have a common ancestor with the source branch of the change, all branches except an exclusion set, all branches matching a regular expression, etc. The medical systems manufacturer does this type of operation daily.

    The exact same desire was expressed at the EuroOSCON 2005 Version Control BOF session, see the item containing the string "find my descendants" in that document.

  • Surprisingly, svn blame improvements were not a high priority. The current output (which presumably would show the revision in which a merge took place, rather than the original source revision), was deemed generally okay. Possibly, revisions which are marked as merges should be displayed as such, so that the user knows to drill down further.

  • svn log on a merged revision should be able to automatically provide both who made the original change and who performed the merge.

  • A portable, human-readable change format for review would be useful. Programmatic parsability was not a high priority, however. An extended patch program was not considered crucial. The desire for a "Smart patch format" was also noted at the EuroOSCON 2005 Version Control BOF session.

  • Visual interfaces to branch/merge management are very useful. Such a tool could graphically show which branch/revision have been merged where, and the ancestry of a branch. Araxis Merge was cited as an excellent example of a tool providing the merge managment portion of this type of UI. The need for GUI interfaces to branch topology was also discussed at the EuroOSCON 2005 Version Control BOF session, search for "display topology". TODO: Follow-up sessions via WebEx will add more detail here.

  • The ability to do foreign branch is very good (for similar reasons to those in the open source world -- many commercial organizations are adopting open source-style practices internally).

  • Partial change merging: The merge meta data itself should be atomic. We record that you either applied a whole change, or not. Of course, hand edits and the log message may tell a more detailed story.

  • Edited merges should be disentanglable and viewable as separate pieces, i.e. [merge] + [edits]. However, this was a nice-to-have than a must-have.

  • An option to merging to say "If anything conflicted, then show all merged regions in a conflicted style, even those which did not conflict."

  • TortoiseSVN merge is apparently good (it has the Araxis Merge-style "take my version or their version"), but needs a built-in text editor for more involved conflict resolution.

  • Format-specific merge tools is an important need.

  • The ability to automate merges (e.g. from a stable branch to a development branch), including interfaces for resolving conflicts and handling other errors, is important. Customers who do multi-thousand file merges stress this.

  • Questions users want to ask: "Is this version of foo.c the 'latest' version? Are there changes out there which are applicable to foo.c, that have not been applied? What are they?"

  • Any action that can be done to a tag/branch (e.g. merge a change, remove a change, etc.) should be hook-protected. Merging itself should be a first-class hook operation (e.g. commit, revprop-change, etc.)

  • The ability to "shadow" versioned resources: Make a branch, but have most of the branch follow the original source, while only a few files are selected for branch development (e.g. "unshadowed"). Is this different from systems which allow you to branch individual resources in the first place? Can this be achieved with a finer-grained svn switch? Note: This is related to the shared file storage issue in Subversion's own issue tracker, issue #2286. It was also expressed at the EuroOSCON 2005 Version Control BOF session, search for "The One-File-In-Many-Branches Problem" there, which discusses how p4 and ClearCase do this.

  • Roll-back should be convenient, and should be recorded as a subtraction event, visible as such in the meta data. Roll-back was common at a few shops, and more rare at others.

  • Asking change C to where it has been ported was not a strong need (e.g. reporting).

  • One group (the financial information company) wanted merge resolution to be distributable across the developer team, rather than limited to one person's working copy. The same group said they do most of their merging in GUIs, and that the merge process saves reports (when they use the commandline tools, they save the logfiles similarly). The longest they've looked back is a few weeks, though.

  • The financial information company also stressed the importance of merge previews, dry runs that would allow them to see conflicts and "non-trivial, non-conflicting" (NTNC) changes in advance. These previews should be exportable and parseable, so they can be passed around to others.