Merge tracking functional specification. Describes Subversion 1.5.0, except where noted as unimplemented. This is a living specification, which will change as features are added or refined.
Merge operations involving a single source URL (e.g.
svn merge -cN URL) allow the revision range and source URL
parameters to be optional. The revision range defaults to "all
unmerged revisions", while the source URL is inferred using a
combination of merge info and copy history. When a revision range is
not provided, merge operations are not able to "revert" changes
(e.g.
svn merge -c -7 URL).
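The default of "all unmerged revisions" can be pictured as a set difference between the revisions available on the merge source and those already recorded as merged. The sketch below is illustrative only: the function names and the plain "start-end" rangelist strings are assumptions made for this example, not Subversion's internal API, and non-inheritable markers are ignored.

```python
def parse_rangelist(text):
    """Parse a rangelist such as "4-6,9" into a set of revision numbers."""
    revs = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-")
            revs.update(range(int(start), int(end) + 1))
        else:
            revs.add(int(part))
    return revs


def to_ranges(revs):
    """Collapse a set of revisions into sorted, inclusive (start, end) pairs."""
    ranges = []
    for rev in sorted(revs):
        if ranges and rev == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], rev)
        else:
            ranges.append((rev, rev))
    return ranges


def unmerged_ranges(source_revs, merged_rangelist):
    """Revisions on the source that are not yet recorded as merged."""
    return to_ranges(set(source_revs) - parse_rangelist(merged_rangelist))


# Hypothetical data: the source changed in r4 through r10,
# and r4-6 and r9 are already recorded as merged on the target.
print(unmerged_ranges(range(4, 11), "4-6,9"))
```

With this made-up history the merge would be broken into the ranges r7-8 and r10, which matches the idea of skipping revisions that the target already contains.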
See the repeated merge section below for discussion of various merge algorithms, and details on the merge algorithm used.
Merge info is not taken into consideration for three-way
merges, that is, merge operations which do not specify identical "from"
and "to" URLs (e.g.
svn merge FROM_URL@REV1 TO_URL@REV2).
In the future, Subversion will likely support this, but currently
lacks sufficient history and merge info between the repository and
client to perform this operation in a reasonable manner. The primary
use case this will impact is vendor branches.
Output is shown the same as pre-Merge Tracking, except for:
Copy and move operations handle two types of merge info:
svn:mergeinfo property on the source path.
Copy/move operations which contact the repository include:
These operations always propagate both explicit and implicit merge info. Other than the inclusion of merge info, operation is effectively the same as pre-Merge Tracking.
Pre-Merge Tracking, WC to WC operations occurred offline (i.e. with no repository access). This is typical behavior for refactoring tools (e.g. IDEs like Eclipse), and is very useful when offline (e.g. on an airplane or subway, or at a cafe).
However, to propagate merge info during copy/move operations, access to both a path's comprehensive merge info and its history is necessary. To preserve offline operation, the Merge Tracking implementation supports two modes:
This behavior is comparable to the difference between
svn status and svn status -u.
While some state indicating delayed merge info retrieval and handling could instead be stored in WC to preserve offline operation, there are complications with this when subsequent uncommitted revert operations should change the merge info (we'd have to store negative merge info in the WC).
When merging to a WC with sparsely populated directories, non-inheritable mergeinfo for the merge is set on the deepest directories present.
[JAF] The above may be the actual behaviour but sounds too simplistic to be the desired behaviour. Although a simple depth attribute is recorded for each dir in the WC, the ambient depth of a dir in the WC is not simply one of empty/files/immediates/infinity, but rather is a tree in which different branches are populated to different depths. Surely the merge ought to respond to the ambient depth rather than the simple depth attribute.
[JAF] It's not clear whether an incoming addition should be honoured if the added node would fall outside the ambient depth. The sparse-directories design doesn't seem to address this case for updates, let alone merges. It would be silly if a merge could delete such a node (that was present in the WC despite being outside its parent's 'depth' attribute) and could not then re-add a node of the same name in order to perform both halves of an incoming replacement. Issue #4164 "inconsistencies in merge handling of adds vs. edits in shallow targets" is related.
Switched paths are treated as the root of a working copy regarding mergeinfo inheritance, recording, and elision. Specifically, for a merge target with an arbitrary switched subtree:
      WC-Root
         |
       Target
         |
        SSP
         | \
         |  \
        SS   SSS

    Target - The WC target of the merge operation (may be same as WC-Root)
    SSP    - Switched Subtree Parent (may be same as Target)
    SSS    - Switched Subtree Sibling (zero or more)
    SS     - Switched Subtree
Note: If SS is itself the target of the merge, then no special handling is needed; the merge takes place as if SS is the root of the WC.
      WC-Root
         |
       Target
         |
         V
        SSP     SSRP..
         | \       \
         |  \       V
         x   V
        SS   SSS

    SSRP - Switched subtree's repository parent
    -->  Mergeinfo inheritance
    --x  No mergeinfo inheritance
      WC-Root
         |
       Target ('/SRC:N')
         |
        SSP('/SRC:N*')
         | \
         |  \
         |   SSS('/SRC:N')
         |
        SS('/SRC:N' and possibly the mergeinfo inherited from SSRP
           if SS had no pre-existing explicit mergeinfo)
      WC-Root
         ^
         |
       Target
         ^
         |
        SSP
         x  ^
         |   \
         |    \
        SS    SSS

    -->  Mergeinfo elision possible
    --x  No mergeinfo elision possible
Issue #4163 "merged deletion of switched subtrees records non-inheritable mergeinfo": If a merge deletes the path SS, the desired behaviour is currently undefined and the actual behaviour is that a commit will delete both SS (from SSRP) and SS@BASE (from SSP).
Why does merging work this way with switched subtrees?
If a subtree (SS) is switched, that means the user has chosen for the time being to work with a substitute for the original subtree (SS@BASE), knowing that any modifications made in SS can be committed only to the repository location of SS and the original subtree SS@BASE remains hidden and unaffected.
The general semantics of a merge is to apply local modifications to the working copy and record the merge as having been applied to the tree that is represented by the working copy.
Merge tracking should ensure that the subtree of the merge that goes into SS is recorded as being applied to SS, while the subtree SS@BASE should be recorded as not having received that merge.
Since the working copy represents parts of two different branches, two parts of the merge are thus applied to the two different branches, and recorded as such when the user commits the result.
If the user is doing a merge that may affect SS, it is reasonable to assume that SS is an alternative variant of SS@BASE rather than some totally unrelated item. So, in terms of Subversion's loose branching semantics, SS is a 'branch' of SS@BASE. If the user chooses to merge when the assumption is false and SS doesn't have a sensible branching relationship with SS@BASE, the result will be nonsensical or, in concrete terms, there will be merge conflicts.
Note: Many typical branching policies would forbid committing to two branches at once, let alone committing merges to two branches at once. However, the user may have reasons for doing this merge without intending to commit the result as-is.
Property changes from propset and
propdel operations can be used to change merge info.
However, as these operations do not attempt to address merge info
inheritance, changes to merge info on a directory affect merge info
on any child paths.
Merge info set on a working copy "child" path as a result of a merge, switch, or update, may fully/partially elide to the path's nearest working copy or repository ancestor with fully/partially equivalent merge info. Elision is attempted as part of any merge/switch/update:
    Properties on '/A_COPY_2':
      svn:mergeinfo : /A:4-9
                      /A_COPY:3
    Properties on '/A_COPY_2/B/E':
      svn:mergeinfo : /A/B/E:4-9
                      /A_COPY/B/E:3

The merge info on 'A_COPY_2/B/E' elides to 'A_COPY_2' because the only difference between the merge source paths on each is 'B/E', which is the same as the relative path difference between 'A_COPY_2/B/E' and 'A_COPY_2'.
    Properties on '/A_COPY_2':
      svn:mergeinfo : /A:4-9
                      /A_COPY:
    Properties on '/A_COPY_2/B/E':
      svn:mergeinfo : /A/B/E:4-9

    Properties on '/A_COPY_2':
      svn:mergeinfo : /A:4-9
    Properties on '/A_COPY_2/B/E':
      svn:mergeinfo : /A/B/E:4-9
                      /A_COPY/B/E:

In both of the above examples the merge info on 'A_COPY_2/B/E' elides to 'A_COPY_2'.
    Properties on '/A_COPY_2':
      svn:mergeinfo : /A:4-6
    Properties on '/A_COPY_2/B/E':
      svn:mergeinfo : /A/B/E:5
                      /A_COPY/B/E:

The empty revision range merge info from 'A_COPY/B/E' on 'A_COPY_2/B/E' elides, leaving:
    Properties on '/A_COPY_2':
      svn:mergeinfo : /A:4-6
    Properties on '/A_COPY_2/B/E':
      svn:mergeinfo : /A/B/E:5
The above rules apply only to mergeinfo without non-inheritable revision ranges. Mergeinfo with non-inheritable revision ranges cannot elide or be elided to.
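The elision examples above can be approximated with a small comparison routine. This is a simplified sketch, not Subversion's implementation: mergeinfo is modelled as a hypothetical dict from merge source path to a rangelist string, rangelists are compared textually, empty ranges are simply dropped before comparison, and any non-inheritable ("*") range disables elision altogether.

```python
def drop_empty(mergeinfo):
    """Remove entries whose revision range is empty."""
    return {path: ranges for path, ranges in mergeinfo.items() if ranges}


def inherited_view(parent, rel_path):
    """Mergeinfo a child at rel_path would inherit from its parent."""
    return {path.rstrip("/") + "/" + rel_path: ranges
            for path, ranges in parent.items()}


def elide(child, parent, rel_path):
    """Return the mergeinfo left on the child after attempting elision.

    An empty dict means the child's mergeinfo elided completely.
    """
    all_ranges = list(child.values()) + list(parent.values())
    if any("*" in ranges for ranges in all_ranges):
        return dict(child)  # non-inheritable ranges: no elision
    remaining = drop_empty(child)
    if remaining == drop_empty(inherited_view(parent, rel_path)):
        return {}
    return remaining


parent = {"/A": "4-9", "/A_COPY": "3"}
child = {"/A/B/E": "4-9", "/A_COPY/B/E": "3"}
print(elide(child, parent, "B/E"))
```

The first example from the text elides fully here, and the partial case (parent '/A:4-6', child '/A/B/E:5' plus an empty '/A_COPY/B/E:' entry) leaves only '/A/B/E:5' on the child.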
Merge Tracking meta data is stored in housekeeping properties
While direct manipulation of housekeeping properties can be used to change merge info, commands to manipulate this information are also provided. Either style of operation supports adjusting merge info when manual merges occur, and can also be used to block changes that are undesired for merge (later, this might be better addressed by a separate housekeeping property).
merge --record-only adds (or subtracts, if a reversed revision range is supplied) merge info for a path without performing the actual merge.
propset changes merge info for a path.
propdel removes merge info for a path.
The Commutative Author and Revision Reporting feature has been implemented, and will be included in 1.5.0.
These features may or may not be completed for 1.5.0.
Show changesets available for merge/already merged from one or more
merge source(s). The command-line client's default output format
should be equivalent to that of
svn log, and allow for
XML-formatted output (for machine parsing). Blue sky, the
command-line could also produce an output format equivalent to that of
Show where a changeset has been merged from/merged to, providing merging revision, URL, and rangelist. The command-line client should allow for XML-formatted output (for machine parsing).
The Find Paths containing Specific Incarnation of Versioned Resource portion of this feature is not yet scheduled for implementation.
There are two general schemes for solving the repeated merge problem. Subversion 1.5 uses the Most Recent Common Ancestor (MRCA) approach. If a later version of Subversion (e.g. 2.0) overhauls the Merge Tracking implementation, it'll likely use the Ancestry Set (AS) approach.
In this scheme, an optional set of merge sources is recorded in each
node-revision. When asked to do a merge with only one source (that
is, svn merge URL, with no second argument), you
compute the most recent common ancestor and do a three-way merge between the
common ancestor, the given URL, and the WC.
To compute the most recent ancestor, you chain off the immediate predecessors of each node-revision. The immediate predecessors are the direct predecessor (the most recent node-revision within the node) and the merge sources. An interleaved breadth-first search should find the most recent common ancestor.
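The interleaved breadth-first search described above might look roughly like the following. It is a minimal sketch under stated assumptions: the predecessor graph is a hypothetical dict mapping each node-revision name to its immediate predecessors (direct predecessor plus merge sources), and the function returns the first common ancestor the two searches meet on, which is the behaviour the text expects of the search rather than a verified property of this code.

```python
from collections import deque


def common_ancestor(preds, a, b):
    """Interleaved BFS from a and b over immediate predecessors."""
    if a == b:
        return a
    seen = ({a}, {b})
    queues = (deque([a]), deque([b]))
    while queues[0] or queues[1]:
        for side in (0, 1):
            if not queues[side]:
                continue
            node = queues[side].popleft()
            for pred in preds.get(node, ()):
                if pred in seen[1 - side]:
                    return pred  # reached by both searches
                if pred not in seen[side]:
                    seen[side].add(pred)
                    queues[side].append(pred)
    return None  # no shared history


# Hypothetical history: trunk t1 -> t2 -> t3; the branch starts at t1,
# and b2 records t2 as a merge source.
preds = {
    "t2": ["t1"],
    "t3": ["t2"],
    "b1": ["t1"],
    "b2": ["b1", "t2"],
}
print(common_ancestor(preds, "t3", "b2"))
```

Because b2 lists t2 as a merge source, the search stops at t2 instead of walking back to the branch point t1, so a repeated merge would only need to cover the changes made after t2.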
In this scheme, you record the full ancestry set for each node-revision -- that is, the set of all changes which are accounted for in that node-revision. (How you store this ancestry set is unimportant; the point is, you need a reasonably efficient way of determining it when asked.) If you are asked to "svn merge URL", you apply the changes present in URL's ancestry but absent in WC's ancestry. Note that this is not a single three-way merge; you may have to apply a large number of disjoint changes to the WC.
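In the Ancestry Set scheme the work to be done is a set difference. The sketch below assumes a hypothetical in-memory mapping from a line of development to the set of change identifiers it already contains; how such sets would actually be stored or computed is left open, as in the text.

```python
def changes_to_merge(ancestry, source, target):
    """Changes present in the source's ancestry but absent from the target's."""
    return sorted(ancestry[source] - ancestry[target])


def record_merge(ancestry, source, target):
    """After merging, the target's ancestry also contains the source's changes."""
    ancestry[target] = ancestry[target] | ancestry[source]


# Hypothetical ancestry sets, keyed by made-up names.
ancestry = {
    "branch": {1, 2, 4, 6, 7},
    "wc": {1, 2, 3, 6},
}
print(changes_to_merge(ancestry, "branch", "wc"))
record_merge(ancestry, "branch", "wc")
print(changes_to_merge(ancestry, "branch", "wc"))
```

The first call reports changes 4 and 7, which are not contiguous, illustrating why this approach may require applying several disjoint changes rather than one three-way merge. After the merge is recorded, nothing remains to be merged from the branch.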
Make 'hunks' of contextually-merged text sensitive to ancestry.
A high-resolution version of repeated merge. Rather than tracking whole changesets, we track the lineage of specific lines of code within a file. The basic idea is that when re-merging a particular hunk of code, the contextual-merging process is aware that certain lines of code already represent the merging of particular lines of development. Jack Repenning has a great example of this from ClearCase (see ASCII diagram below).
See the variance
adjusted patching document for an extended discussion of how to
implement this by composing diffs; see
svn_diff_diff4() for an implementation of same. We
may be closer to ancestry-sensitive merging than we think.
Here's an example demonstrating how individual lines of code can be tracked. In this diagram, we're drawing the lineage of a single file, with time flowing downwards. The file begins life with three lines of text, "1\n2\n3\n". The file then splits into two lines of development.
               1
               2
               3
             /   \
            /     \
           /       \
        one         1
        two         2.5
        three       3
         |  \       |
         |   \      |
         |    \     |
         |     \    |
         |      \  one                ## This node is a human's
         |         two-point-five     ## merge of two sides.
         |         three
         |          |
         |          |
         |          |
        one        one
        Two        two-point-five
        three      newline
           \       three
            \       |
             \      |
              \     |
               \    |
                \   |
                 \  |
                  \ |
                   one                ## This node is a human's
                   Two-point-five     ## merge of the changes
                   newline            ## since the last merge.
                   three
It's the second merge that's important here.
In a system like Subversion, the second merge of the left branch to the right will fail miserably: the whole file's contents will be placed within conflict markers. That's because it's trying to dumbly apply a patch that changes "1\n2\n3" to "one\nTwo\nthree", and the target file has no matching lines at all.
A smarter system (like Clearcase) would remember that the previous merge had happened, and specifically notice that the lines "one" and "three" are the results of that previous merge. Therefore, it would ask the human only to deal with the "Two" versus "two-point-five" conflict; the earlier changes ("1\n2\n3" to "one\ntwo\nthree") would already be accounted for.
AS allows you to merge changes from a branch out of order, without doing any bookkeeping. MRCA requires you to merge changes from a branch in order.
MRCA is simpler to implement, since it results in a three-way merge (which is well-understood by Subversion). However, it may not handle all edge cases. For instance, it may break down faster if the merging topology is not hierarchical.
MRCA may be easier for users to understand, even though AS is probably simpler to a mathematician.
Consistency with other modern version control systems is desirable.
If a user asks to merge a directory, should we apply MRCA or AS to each subdirectory and file to determine what ancestor(s) to use? Or should we apply MRCA or AS just once, to the directory itself? The latter approach seems simpler and more efficient, but will break down quickly if the user wants to merge subdirectories of a branch in advance of merging in the whole thing.
Merging inevitably produces conflicts which cannot be resolved by an algorithm alone. In such a case, human intervention is required to resolve the conflicts. The merge algorithm used by Subversion's Merge Tracking implementation makes this problem worse, since it breaks a requested merge range into several merges to avoid repeating merges which have already been applied to a merge target or its children. After a conflict is encountered, merges of subsequent revision ranges must be aborted, since tree conflicts or previous content conflicts cannot be reliably merged into (e.g. you can't merge into a file that either isn't there or which you could potentially merge inside one side of a conflict marker).
To help alleviate the pain of conflict resolution, a merge conflict resolution callback can be employed by Subversion clients. This callback is invoked whenever merge conflicts are encountered, and can take steps like launching a graphical merge tool (for interactive conflict resolution), or following a pre-specified directive like "always use the version from my merge source". This last implementation can be used to support the SCM automated merge use case.
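The callback idea can be illustrated with a small, self-contained model. Everything in it is hypothetical (the Conflict record, the Choice values and the function names are invented for this sketch and are not the Subversion client API); it only shows how a pluggable policy such as "always use the version from my merge source" differs from postponing resolution.

```python
from dataclasses import dataclass
from enum import Enum


class Choice(Enum):
    POSTPONE = "postpone"
    MINE = "mine"
    THEIRS = "theirs"


@dataclass
class Conflict:
    path: str
    mine: str    # content on the merge target
    theirs: str  # content coming from the merge source


def always_theirs(conflict):
    """Pre-specified directive: always take the merge source's version."""
    return Choice.THEIRS


def postpone(conflict):
    """Leave the conflict in place for later, manual resolution."""
    return Choice.POSTPONE


def resolve_all(conflicts, callback):
    """Invoke the callback once per conflict and collect the outcomes."""
    resolved, postponed = {}, []
    for conflict in conflicts:
        choice = callback(conflict)
        if choice is Choice.MINE:
            resolved[conflict.path] = conflict.mine
        elif choice is Choice.THEIRS:
            resolved[conflict.path] = conflict.theirs
        else:
            postponed.append(conflict.path)
    return resolved, postponed


conflicts = [Conflict("a.c", "local text", "incoming text")]
print(resolve_all(conflicts, always_theirs))
print(resolve_all(conflicts, postpone))
```

An interactive client would supply a callback that prompts the user or launches a merge tool instead of returning a fixed choice.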
The command-line client includes a merge conflict resolution
callback which behaves much like svk: when in interactive
mode, it prompts for how to resolve each conflicted file or property
value. When in non-interactive mode (or configured to disallow
interactive conflict resolution via
interactive-conflicts = no), conflict resolution is postponed
until post-merge (as in pre-1.5 releases). See the 1.5 release notes for an
In a post-1.5 release, the command-line client will provide an interactive conflict resolution option to display some context for each conflict in a path or property value, and prompt for how to resolve it. The merge algorithm will attempt to continue applying more of the requested merge after conflict is encountered, merging what it can around the conflicted area of the WC, and possibly supporting an option to complete the remainder of an unfinished merge operation after conflicts have been resolved manually.
Related discussion from the dev@ mailing list can be found here:
Issue #2022 is loosely related.
No explicit facility is provided for distribution of conflict resolution. To support this use case, developers can co-ordinate with each other to resolve merge conflicts on portions of a tree, and trade patches.
No explicit steps are necessary to migrate the content of a pre-Merge Tracking repository. Only an upgrade to Subversion 1.5.0 is necessary.
TODO: Merge meta data from svnmerge.py. Dan Berlin has written
Python code to perform this migration; it needs to be made available
in the tools/server-side/ area of the distribution.
Executive summary for client/repository inter-op:
Gory detail for client/repository inter-op:
Subversion dump files continue to be fully portable between pre- and post-Merge Tracking versions of Subversion.