This section describes how the Obliterate Transaction works, using the model of obliteration in which an obliterated node-rev is deleted from its revision (and not replaced with other content). The functional spec diagrams for this model are in <notes/obliterate/fspec-dd1/>.
Let us obliterate the file node-rev called "tuna" in r2, where the content said "Fried". We don't want anyone to see that content any longer.
More precisely, let us obliterate the "tuna" entry in the directory known as "fish" in r2. If a copy of the file node whose content is "Fried", and that was called "tuna" in r2, also exists elsewhere (in a tag, a branch, a copy or another revision), let us choose to let all such copies remain in the repository for now. (We can obliterate them separately if we wish.) In this example, there are no such copies of the file node.
______________________________________________________ |___1_______2________3________4________5_________6_____... | \ | \_____ ___|____ __\_____ |D | |D | Key: | | | | Each rectangle with 'D' or | A | | A | 'F' in top-left corner is | \ | | \ | a separate node - a dir or | B \ | | B \ | file node respectively. |__/___\_| |__/___\_| Text shown in a node is its / \ / \ content, not its name. | ___\________/ \ | / \ \ ___|__/ ___\____ ___\____ |D | |D | |D | | | | | | | | | | fish | | fish | |_______| |___\____| |___\____| \ \ \ \ ___\____ ___\____ |D | |D | | | | | | tuna | | tuna | <- entry to obliterate |___\____| |___\____| \ \ \ \ ___\____ ___\____ |F | |F | | | | | |"Fresh" | |"Fried" | |________| |________|
We construct a transaction containing a new version of the r2 tree. It is like the old r2 tree except we have a new node to hold the directory that no longer contains tuna, and a new node for each of its parent directories up to the root.
We will do this in two steps. First, a begin_obliteration_txn() function will create a new transaction that is a mutable clone of the old r2. Second, we will use the normal transaction editing functions to delete the "tuna" entry.
______________________________________________________ |___1_______2________3________4________5_________6_____... | \ | \_____ ___|____ __\_____ ________ |D | |D | |D | | | | | | | <- new r2 tree | A | | A | | A | | \ | | \ | | \ | | B \ | | B \ | | B \ | |__/___\_| |__/___\_| |__/___\_| / ______________/_____\__________/ \ | / ___\________/ \ \ | / / \ \ \ ___|/_/ ___\____ ___\____ ___\____ |D | |D | |D | |D | | | | | | | | | | | | fish | | fish | | fish | |_______| |___\____| |___\____| |___\____| \ \ \ \ \ \ ___\____ ___\____ ___\____ |D | |D | |D | | | | | | | | tuna | | tuna | | | |___\____| |___\____| |________| \ \ \ \ ___\____ ___\____ |F | |F | | | | | |"Fresh" | |"Fried" | |________| |________|
We link the new transaction in as r2, replacing the old r2. The old r2 tree does not yet go away; its nodes are still present and other revisions and transactions may be referring to them. This operation is analogous to the "finalize" stage of a commit, but it is not the final operation in an obliteration.
______________________________________________________ |___1_______2________3________4________5_________6_____... | \______________________ | \ ___|____ ________ __\_____ |D | |D | |D | <- r2 is now this tree | | | | | | | A | | A | | A | | \ | | \ | | \ | | B \ | | B \ | | B \ | |__/___\_| |__/___\_| |__/___\_| / ______________/_____\__________/ \ | / ___\________/ \ \ | / / \ \ \ ___|/_/ ___\____ ___\____ ___\____ |D | |D | |D | |D | | | | | | | | | | | | fish | | fish | | fish | |_______| |___\____| |___\____| |___\____| \ \ \ \ \ \ ___\____ ___\____ ___\____ |D | |D | |D | | | | | | | | tuna | | tuna | | | |___\____| |___\____| |________| \ \ \ \ ___\____ ___\____ |F | |F | | | | | |"Fresh" | |"Fried" | |________| |________|
Now the old r2 tree is orphaned. Some of its nodes are no longer referenced and can therefore be deleted. In particular, the old file-rev containing "Fried" is no longer referenced through r2 - and in our example it is not referenced from any other revision either.
We could delete the orphaned nodes right away, or we could leave them in place for now and at some later time trawl through the list of all nodes looking for orphans and deleting them. The latter scheme is known as "garbage collection".
Let us assume for now that we will delete the orphaned nodes right away. In fact, we will do it during the "finalization" so as not to leave any orphaned nodes in the tree. (The "no dead nodes" rule in <www/design.html#server.fs.struct>.)
Starting from the root directory node of the old r2, we traverse that tree, deleting orphaned nodes from the top down. First we delete that root directory node. (This is just an illustration; the implementation can delete them in whatever order or manner it chooses, as long as it is transactional.)
______________________________________________________ |___1_______2________3________4________5_________6_____... | \______________________ | \ ___|____ __\_____ |D | * * * * |D | | | * GONE! * | | | A | * * * * | A | | \ | | \ | | B \ | | B \ | |__/___\_| |__/___\_| / _______________________________/ \ | / \ \ | / \ \ ___|/__ ___\____ ________ ___\____ |D | |D | |D | |D | | | | | | | | | | | | fish | | fish | | fish | |_______| |___\____| |___\____| |___\____| \ \ \ \ \ \ ___\____ ___\____ ___\____ |D | |D | |D | | | | | | | | tuna | | tuna | | | |___\____| |___\____| |________| \ \ \ \ ___\____ ___\____ |F | |F | | | | | |"Fresh" | |"Fried" | |________| |________|
After deleting the root directory node, we leave the directory node known as "B" in place because it still has other parents, but delete that known as "A" because it becomes orphaned. Similarly we delete the dir known as "fish" and the file known as "tuna".
______________________________________________________ |___1_______2________3________4________5_________6_____... | \______________________ | \ ___|____ __\_____ |D | |D | | | | | | A | | A | | \ | | \ | | B \ | | B \ | |__/___\_| |__/___\_| / _______________________________/ \ | / \ \ | / \ \ ___|/__ ___\____ ___\____ |D | |D | |D | | | | | | | | | | fish | | fish | |_______| |___\____| |___\____| \ \ \ \ ___\____ ___\____ |D | |D | | | | | | tuna | | | |___\____| |________| \ \ ___\____ |F | | | |"Fresh" | |________|
That is the end of a simple obliteration, in which there were no other references in the repository to the obliterated node or its parents.
Note that there is no restriction on references to other sub-trees of the original r2. For example, existing or future references to the node called "B" are unaffected, even if a transaction using such a reference is being constructed while this obliteration is happening.
Consider a case in which a normal commit transaction is being constructed at the same time as the above obliteration is carried out. The new commit transaction includes a pointer to the node called "fish" in old r2, which exists at the time of construction. By the time we try to finalize this commit, the obliteration has been finalized and the node pointed to no longer exists. We cannot allow a transaction with a broken link to be committed.
Finalization must include a check that all referenced existing nodes do in fact still exist. This check must be performed both in normal commit finalization and in obliterate finalization.
That check could be costly if implemented naïvely, but there are ways to make it cheap. First, ensure no invalid node is added during construction. Second, detect whether any obliteration has been performed in the repository since the transaction was started. ### TODO: "Obliteration Serial Number".
We don't mean to obliterate the "node" but rather the directory entry that points to the node. If another directory entry in this same revision or in another revision points to the node, that is no problem: we will simply re-write the directory so as to remove the entry, but we will not delete the node itself because it will not be orphaned.
Diagram of a schema for MYSQL by Edmund Horner, which is like the BDB schema except with exploding of skels into table columns - transaction_props, transaction_copies, representation_windows.
FSFS structure doc
FS-BDB structure doc: nodes, node-revs, keys, skels, reps, deltifying, txns, changes, copies, locks, merge rules, uuid, summary (structure syntax), misc-table (forward-delta support)
FS-BDB fs-history doc: DAG node-revs, node-rev-ids, copies, copy-ids