= Internals of Aries Transaction Manager
:toc:
:icons: font
== Transaction log configuration
Geronimo transaction-manager component uses http://howl.ow2.org/[HOWL Logger] to manage transaction log.
Transaction log is critical part of transaction management when 2PC protocol is involved. The log
stores prepared and not yet completed transactions so recovery process is possible.
TODO
== Transaction log reference
Transaction logs are stored in binary _files_ consisting of fixed size _blocks_. Each _transaction_ is
stored inside single block. If size of transaction data exceeds blocks size, an exception is thrown like
this:
.Exception thrown when logging large transaction (12 branches) when block size = 1KB
----
java.lang.IllegalStateException
at org.apache.geronimo.transaction.log.HOWLLog.prepare(HOWLLog.java:295)
...
Caused by: org.objectweb.howl.log.LogRecordSizeException: maximum user data record size: 935
at org.objectweb.howl.log.BlockLogBuffer.put(BlockLogBuffer.java:215)
at org.objectweb.howl.log.LogBufferManager.put(LogBufferManager.java:691)
at org.objectweb.howl.log.Logger.put(Logger.java:207)
at org.objectweb.howl.log.xa.XALogger.putCommit(XALogger.java:420)
at org.apache.geronimo.transaction.log.HOWLLog.prepare(HOWLLog.java:290)
... 31 more
----
Before describing the structure of log file, let's have a look at 3 important parameters:
* `maxLogFiles` – number of transaction log files (default: `2`). These are created upfront and their number doesn't change.
* `maxBlocksPerFile` – number of blocks that may be stored in each file (default: `-1`, which means `0x7fffffff` blocks). Mind that
when using default value, 2+++nd+++ transaction log file will be used *only* after writing `2+++31+++-1` transaction
records!
* `bufferSizeKBytes` – a size of single block in kilobytes (default: `4`)
[NOTE]
====
.XIDs
A _transaction_ stored inside a block of a log file is generally an opaque data structure dependant on the
transaction manager and logger used.
====
Now let's have a look at how transaction log file is structured.
When transaction log is completely empty, the first record (block) written may be related to the call
to `javax.transaction.TransactionManager.commit()` or `javax.transaction.UserTransaction.commit()`. Internally,
when using 2PC, `org.apache.geronimo.transaction.manager.TransactionImpl.internalPrepare()` is called
and first log record is stored. This is done *after* calling `javax.transaction.xa.XAResource.prepare()`.
Each _block_ (with size = `bufferSizeKBytes`) may contain more than one _data records_. For example, when
_data record_ related to `prepare()` (`XACOMMIT`) will reside in first _block_ of a file, the first
_data record_ in this _block_ will be of `FILE_HEADER` type.
Here's the structure of each _block_, where the dotted fragment contains arbitrary data:
----
00000000 48 4f 57 4c 00 00 00 01 00 00 04 00 00 00 01 1f |HOWL............|
00000010 95 2b 2d bd 00 00 01 5b e7 37 86 8a 0d 0a .. .. |.+-....[.7.... |
........
000003e0 .. .. .. .. .. .. .. .. .. .. .. .. .. .. 4c 57 | LW|
000003f0 4f 48 00 00 00 01 00 00 01 5b e7 37 86 8a 0d 0a |OH.......[.7....|
00000400
----
* `48 4f 57 4c` is `HOWL` identifier, which is block header magic number
* `00 00 00 01` is the block number (1)
* `00 00 04 00` is the block size (`bufferSizeKBytes`) in bytes. `0x0400` = 1KB
* `00 00 01 1f` indicates end of _data records_ inside _block_. If there's at least 4 bytes remaining before
_block's_ footer, `EOB\n` is stored at this position
* `95 2b 2d bd` is the checksum
* `00 00 01 5b e7 37 86 8a` is the timestamp
* `0d 0a` is supposed to make log easier to investigate in text editor (`\r\n`)
* `...` is the data inside a block, which may consist of several _data records_
* `4c 57 4f 48` is `LWOH` identifier, which is block footer magic number
* `00 00 00 01` is again the block number (1)
* `00 00 01 5b e7 37 86 8a` is the same timestamp as in header
* `0d 0a` again
Each _data record_ is just a list of byte arrays. Before each byte array is written, two shorts have to
be written:
* data type
* data length
Each byte array in a list is written directly, prepended with a short indicating a size of single array.
So a length of entire _data record_ is: `2 + 2 + [(2 + length of byte array)]*` bytes.
When _data records_ are read, the list of arrays is filled up to the point where _data record_ length is reached.
For example, if the _block_ is first block inside transaction file, the first _data record_ is of type `FILE_HEADER`:
----
00000010 .. .. .. .. .. .. .. .. .. .. .. .. .. .. 48 00 | H.|
00000020 00 25 00 23 00 00 00 00 00 01 00 00 00 00 00 00 |.%.#............|
00000030 00 01 00 00 00 00 00 01 5b e7 37 86 8b 00 00 00 |........[.7.....|
00000040 02 7f ff ff ff 0d 0a .. .. .. .. .. .. .. .. .. |....... |
----
* `48 00` is `org.objectweb.howl.log.LogRecordType.FILE_HEADER`
* `00 25` is the size of entire `FILE_HEADER` data without 2 bytes for `48 00` and for the length itself
* `00 23` is the size of first array in list, so the first array of bytes starts at offset `0x24` and
ends at (including) offset `0x46`.
So the single byte array inside _data record_ of `FILE_HEADER` type is:
* `00` means no _auto mark_
* `00 00 00 00 01 00 00 00` is the value of _activeMark_, which is the log key for the oldest active entry in the log
* `00 00 00 00 01 00 00 00` is the log key for beginning of new block sequence number as high mark for current file
* `00 00 01 5b e7 37 86 8b` timestamp for file
* `00 00 00 02` is the number of files of entire transaction log (`maxLogFiles`)
* `7f ff ff ff` is the `maxBlocksPerFile` parameter
* `0d 0a`
_Data record_ written to transaction log is created by HOWL itself.
Here's sample `XACOMMIT` _data record_, which is created by Geronimo Transaction Manager during _prepare_
phase of 2PC:
----
00000040 .. .. .. .. .. .. .. 40 80 00 d4 00 04 47 65 52 | @.....GeR|
00000050 6f 00 40 22 86 37 e7 5b 01 00 00 6f 72 67 2e 61 |o.@".7.[...org.a|
00000060 70 61 63 68 65 2e 61 72 69 65 73 2e 74 72 61 6e |pache.aries.tran|
00000070 73 61 63 74 69 6f 6e 00 00 00 00 00 00 00 00 00 |saction.........|
00000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000090 00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00 |....@...........|
000000a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000000c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000000d0 00 00 00 00 00 00 40 01 00 00 00 22 86 37 e7 5b |......@....".7.[|
000000e0 01 00 00 61 70 61 63 68 65 2e 61 72 69 65 73 2e |...apache.aries.|
000000f0 74 72 61 6e 73 61 63 74 69 6f 6e 00 00 00 00 00 |transaction.....|
00000100 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000110 00 00 00 00 00 00 00 00 06 72 65 73 2d 30 31 .. |.........res-01 |
----
* `40 80` means `org.objectweb.howl.log.LogRecordType.XACOMMIT`
* `00 d4` is a lenght of all byte arrays inside data created by Geronimo Transaction Manager plus 2x number of byte arrays
So we have the following byte arrays stored:
* 4 bytes `47 65 52 6f` is `GeRo`
* 64 bytes starting with `22 86 37 e7` at offset 0x53
* 64 bytes starting with `00 00 00 00` at offset 0x95
* 64 bytes starting with `01 00 00 00` at offset 0xd7
* 6 bytes `72 65 73 2d 30 31 is `res-01`
And this array of byte arrays (sizes: 4, 64, 64, 64, 6) is exactly:
* javax.transaction.xa.Xid.getFormatId()
* javax.transaction.xa.Xid.getGlobalTransactionId()
* javax.transaction.xa.Xid.getBranchQualifier()
* javax.transaction.xa.Xid.getBranchQualifier() of transaction branch 1
* resource name of transaction branch 1 (`res-01`)
Each additional transaction branch (representing another transactional resource enlisted in transaction)
adds two more byte arrays (`Xid.getBranchQualifier()` and resource name).
Where does this `org.apache.aries.transaction` bytes come from inside _data record_ of `XACOMMIT`?
When `org.apache.geronimo.transaction.manager.XidFactory` instance is created, it is passed some
_transaction manager identifier_ which is arbitrary byte array of maximum 56 bytes size.
Each XID produced by such `XidFactory` uses global transaction id with these bytes:
* 8 bytes of transaction id, which is increasing 32-bit number starting from `System.currentTimeMillis()`
written in little endian (e.g., `22 86 37 e7 5b 01 00 00` == `0x015be7378622`)
* 56 bytes of _transaction manager identifier_
Each transaction branch created by such `XidFactory` is based on globack transaction id:
* 4 bytes if branch number written in little endian (e.g., `01 00 00 00` == `0x01`)
* 8 bytes of `System.currentTimeMillis()` from `XidFactory` initialization (little endian)
* 52 bytes from _transaction manager identifier_ starting from byte 4 (that's why global Id of XID contains
`org.apache.aries.transaction` and branch Ids of XID contain `apache.aries.transaction`
When Geronimo Transaction Manager commits 2PC transaction, two _data records_ are written. First, there's
`USER` record (`org.apache.geronimo.transaction.log.HOWLLog.commit()`):
----
00000430 .. .. .. .. .. .. .. 00 00 00 8d 00 01 02 00 04 | .........|
00000440 47 65 52 6f 00 40 b9 b1 9a e7 5b 01 00 00 6f 72 |GeRo.@....[...or|
00000450 67 2e 61 70 61 63 68 65 2e 61 72 69 65 73 2e 74 |g.apache.aries.t|
00000460 72 61 6e 73 61 63 74 69 6f 6e 00 00 00 00 00 00 |ransaction......|
00000470 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000480 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 00 |.......@........|
00000490 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000004a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000004b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000004c0 00 00 00 00 00 00 00 00 .. .. .. .. .. .. .. .. |........ |
----
* `00 00` means `org.objectweb.howl.log.LogRecordType.USER`
* `00 8d` is length
* 1 byte array with `02` which means `org.apache.geronimo.transaction.log.HOWLLog.COMMIT`
* XID's 4 bytes with `47 65 52 6f` which is `GeRo`, XID's format id
* XID's 64 bytes with global transaction ID (8 bytes of little endian of ID and _transaction manager identifier_)
* XID's 64 bytes with branch qualifier - all zeros, because branches are stored in transaction branches.
And then, there's `XADONE` record:
----
000004c0 .. .. .. .. .. .. .. .. 40 40 00 10 00 08 00 00 | @@......|
000004d0 00 00 01 00 00 47 00 04 00 00 00 00 .. .. .. .. |.....G...... |
----
* `40 40` means `org.objectweb.howl.log.LogRecordType.XADONE`
* `00 10` is length
* 8 bytes array with `00 00 00 00 01 00 00 47` - `org.objectweb.howl.log.xa.XACommittingTx.logKeyBytes`
* 4 bytes array with `00 00 00 00` - `org.objectweb.howl.log.xa.XACommittingTx.indexBytes`
The `{ logKeyBytes, indexBytes }` arrays reference existing `XACOMMIT` record.
`logKey` is `((long)bsn << 24) | buffer.position()`, so we have (see the hexdump of the above `XACOMMIT` _data record_):
* block sequence number = `1`
* position inside this block = `0x47`