class TxnMonitorTask extends RetryTask implements TransactionConstants, TimeConstants
The retry mechanism is subtle, so bear with me. The purpose is to ensure that if any activity is being blocked by a given transaction, that transaction will be tested at some point in the future (if necessary, i.e., if it is still thought to be active). We assume it to be rare that a transaction that the space thinks is active is, in fact, aborted, so the algorithm is designed to guarantee the detection without a lot of overhead, specifically without a lot of RMI calls.
Each task has three values: a nextQuery time, a mustQuery boolean that forces the next query to be made, and deltaT, the interval added to nextQuery to schedule the following query. When the task is awakened at its nextQuery time, it checks whether it must make an actual query to the transaction manager, which it will do if either mustQuery is true or we know about any in-progress queries on the space that are blocked on the transaction. Whether or not an actual query is made, deltaT is added to nextQuery to get the new nextQuery time, deltaT is doubled, and the mustQuery boolean is set to false.
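The wake-up step just described can be sketched as follows. This is a minimal illustration, not the class's actual code: the class name QueryScheduleSketch, the wakeUp method, and the hasBlockedQueries parameter are hypothetical, and the sketch ignores the MAX_DELTA_T cap listed in the field summary.

```java
/** Hypothetical, simplified model of the three scheduling values. */
final class QueryScheduleSketch {
    long nextQuery;     // time of the next wake-up, in milliseconds
    long deltaT;        // interval added to nextQuery at each wake-up
    boolean mustQuery;  // forces a real query at the next wake-up

    QueryScheduleSketch(long nextQuery, long deltaT, boolean mustQuery) {
        this.nextQuery = nextQuery;
        this.deltaT = deltaT;
        this.mustQuery = mustQuery;
    }

    /**
     * Models one wake-up at the nextQuery time.
     *
     * @param hasBlockedQueries assumed input: whether any in-progress
     *        query on the space is still blocked on the transaction
     * @return true if a real query to the transaction manager would be made
     */
    boolean wakeUp(boolean hasBlockedQueries) {
        boolean query = mustQuery || hasBlockedQueries;
        // The bookkeeping below happens whether or not a query is made.
        nextQuery += deltaT;
        deltaT *= 2;
        mustQuery = false;
        return query;
    }
}
```

Keeping the bookkeeping unconditional means the wake-up interval keeps growing even when no remote call is made, which is what keeps the number of RMI calls low.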
There are two kinds of requests with which a transaction can cause a conflict: those with long timeouts (such as blocking reads and takes) and those with short timeouts (such as reads and takes with zero-length timeouts). We will treat them separately at several points of the algorithm. A short timeout is any query whose expiration time is sooner than the nextQuery time. Any other timeout is long. If a short query arrives, mustQuery is set to true.
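A minimal sketch of the short/long distinction, under stated assumptions: the class ConflictSketch, its queryArrived method, and the trackedLongQueries counter (a stand-in for the queries map) are hypothetical names, and recording long queries in this way is an assumption drawn from the description of the queries field.

```java
/** Hypothetical model of how an arriving query is classified. */
final class ConflictSketch {
    long nextQuery;          // time of the next scheduled wake-up
    boolean mustQuery;       // forces a real query at the next wake-up
    int trackedLongQueries;  // stand-in for the map of blocked queries

    ConflictSketch(long nextQuery) {
        this.nextQuery = nextQuery;
    }

    /** A timeout is short if the query expires before the next wake-up. */
    boolean isShort(long expiration) {
        return expiration < nextQuery;
    }

    /** Records a query on the space that conflicts with the transaction. */
    void queryArrived(long expiration) {
        if (isShort(expiration)) {
            // The query will be gone by the next wake-up, so force the poll.
            mustQuery = true;
        } else {
            // Assumption: long queries are remembered and checked at wake-up.
            trackedLongQueries++;
        }
    }
}
```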
The result is that any time a transaction causes a conflict, if the query on the space has not ended by the nextQuery time, we will attempt to poll the transaction manager. We will also poll the transaction manager if any conflict occurred on a query on the space with a short timeout.
The first time a transaction causes a conflict, we schedule a time in the future at which we will poll its status. We do not poll right away because often a transaction will complete on its own before we get to that time, making the check unnecessary. An instant poll is, therefore, unnecessarily aggressive, since giving an initial grace time will usually mean no poll is made at all. So if the first conflict occurs at T0, the nextQuery value will be T0+INITIAL_GRACE, the mustQuery boolean will be true to force that poll to happen, and deltaT will be set to INITIAL_GRACE.
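To make the resulting schedule concrete, the sketch below lists the first few wake-up times starting from T0. It is illustrative only: the class GraceScheduleSketch and its parameters are hypothetical, the actual values of INITIAL_GRACE and MAX_DELTA_T are not given here, and capping deltaT with a simple minimum is an assumption based on the description of MAX_DELTA_T as the largest value deltaT will reach.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that enumerates wake-up times after a first conflict. */
final class GraceScheduleSketch {
    /**
     * @param t0           time of the first conflict
     * @param initialGrace stand-in for INITIAL_GRACE
     * @param maxDeltaT    stand-in for MAX_DELTA_T
     * @param count        number of wake-up times to list
     */
    static List<Long> wakeUpTimes(long t0, long initialGrace,
                                  long maxDeltaT, int count) {
        List<Long> times = new ArrayList<>();
        long nextQuery = t0 + initialGrace; // first poll after the grace period
        long deltaT = initialGrace;
        for (int i = 0; i < count; i++) {
            times.add(nextQuery);
            nextQuery += deltaT;
            // Assumed cap: deltaT doubles but never exceeds maxDeltaT.
            deltaT = Math.min(deltaT * 2, maxDeltaT);
        }
        return times;
    }
}
```

With t0 = 0, initialGrace = 1000 and maxDeltaT = 4000, the first six wake-up times are 1000, 2000, 4000, 8000, 12000 and 16000: the gaps double until they reach the cap and then stay constant.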
See Also: TxnMonitor
Modifier and Type | Field and Description |
---|---|
private static long | BETWEEN_EXCEPTIONS: The retry time when we have encountered an exception |
private long | deltaT: next value added to nextQuery |
private int | failCnt: count of RemoteExceptions |
private static long | INITIAL_GRACE: The initial grace period before the first query. |
private static Logger | logger: Logger for logging transaction related information |
private static long | MAX_DELTA_T: The largest value that deltaT will reach. |
private static int | MAX_FAILURES: The maximum number of failures allowed in a row before we simply give up on the transaction and consider it aborted. |
private TxnMonitor | monitor: the monitor we were made by |
private boolean | mustQuery: When we're given an opportunity to poll the transaction manager for the txn's state, do so. |
private long | nextQuery: The next time we need to poll the transaction manager to get txn's actual state. |
private Map | queries: All the queries on the space (not queries to the transaction manager) waiting for txn to be resolved. |
private Txn | txn: transaction being monitored |
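Fields inherited from interface TransactionConstants: ABORTED, ACTIVE, COMMITTED, NOTCHANGED, PREPARED, VOTING
Fields inherited from interface TimeConstants: DAYS, HOURS, MINUTES, SECONDS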
Constructor and Description |
---|
TxnMonitorTask(Txn txn, TxnMonitor monitor, TaskManager manager, WakeupManager wakeupMgr): Create a new TxnMonitorTask. |
Modifier and Type | Method and Description |
---|---|
(package private) void | add(QueryWatcher query): Add in a resource. |
(package private) void | addSibling(Txn txn): Add a "sibling" transaction, one that is now blocking progress on one of the same entries. |
private void | logUnpackingFailure(String exceptionDescription, Level level, boolean terminal, Throwable t): Log failed unpacking attempt |
long | retryTime(): Return the time of the next query, bumping deltaT as necessary for the next iteration. |
boolean | runAfter(List tasks, int size): We can run in parallel with any task, so just return false. |
boolean | tryOnce(): Try to see if this transaction should be aborted. |
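private final Txn txn
transaction being monitored

private final TxnMonitor monitor
the monitor we were made by

private Map queries
All the queries on the space (not queries to the transaction manager) waiting for txn to be resolved. null until we have at least one. Represented by QueryWatcher objects.

private int failCnt
count of RemoteExceptions

private long nextQuery
The next time we need to poll the transaction manager to get txn's actual state.

private boolean mustQuery
When we're given an opportunity to poll the transaction manager for the txn's state, do so.

private long deltaT
next value added to nextQuery

private static final long INITIAL_GRACE
The initial grace period before the first query.

private static final long BETWEEN_EXCEPTIONS
The retry time when we have encountered an exception

private static final long MAX_DELTA_T
The largest value that deltaT will reach.

private static final int MAX_FAILURES
The maximum number of failures allowed in a row before we simply give up on the transaction and consider it aborted.

private static final Logger logger
Logger for logging transaction related information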
TxnMonitorTask(Txn txn, TxnMonitor monitor, TaskManager manager, WakeupManager wakeupMgr)
Create a new TxnMonitorTask.

public long retryTime()
Return the time of the next query, bumping deltaT as necessary for the next iteration. If the transaction has voted PREPARED or the manager has been giving us a RemoteException, we should retry on short times; otherwise we back off quickly.

public boolean runAfter(List tasks, int size)
We can run in parallel with any task, so just return false.
Specified by: runAfter in interface TaskManager.Task
Parameters:
tasks - the tasks to consider. A read-only List, with all elements instanceof Task.
size - elements with index less than size should be considered

void addSibling(Txn txn)
Add a "sibling" transaction, one that is now blocking progress on one of the same entries. For example, if a client is blocked on a read, another transaction can read the same entry, thereby also blocking that same client. This means that the transaction for the second read must be watched, too. The list of queries for the second transaction might be smaller than the list of those in this transaction, but the process of figuring out the subset is too expensive, since we have tried to make the checking process itself cheap, anyway. So we add all queries this task is currently monitoring to the task monitoring the second transaction. If there are no queries, then the blocking occurred because of a short query or all the queries have expired, in which case the second transaction isn't blocking the way of anything currently, so this method does nothing.
Of course, in order to avoid blocking the thread that is calling this (which is trying to perform a read, after all), we simply add each lease in this task to the monitor's queue.
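The hand-off described for addSibling can be sketched roughly as follows. Everything here is a simplified stand-in rather than the real API: SiblingSketch, its nested MonitorQueue, and the use of strings for transactions and queries are hypothetical, and an empty set replaces the null-until-first-entry queries map.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of a task passing its queries to a sibling's task. */
final class SiblingSketch {
    /** Stand-in for the monitor's queue of (transaction, query) pairs. */
    static final class MonitorQueue {
        final List<String> pending = new ArrayList<>();

        void enqueue(String txnId, String queryId) {
            pending.add(txnId + ":" + queryId);
        }
    }

    private final Set<String> queries = new LinkedHashSet<>();
    private final MonitorQueue monitor;

    SiblingSketch(MonitorQueue monitor) {
        this.monitor = monitor;
    }

    /** Remembers a query on the space blocked by this task's transaction. */
    void add(String queryId) {
        queries.add(queryId);
    }

    /**
     * Hands every query this task monitors to the monitor's queue on behalf
     * of the sibling transaction. Only enqueues, so the caller is not blocked.
     */
    void addSibling(String siblingTxnId) {
        if (queries.isEmpty()) {
            return; // nothing is currently blocked, so there is nothing to watch
        }
        for (String queryId : queries) {
            monitor.enqueue(siblingTxnId, queryId);
        }
    }
}
```

Copying the whole set rather than computing the exact subset mirrors the trade-off stated above: the sibling may watch a few queries it does not block, but the calling thread does no expensive work.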
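public boolean tryOnce()
Try to see if this transaction should be aborted. This returns true (don't repeat the task) if it knows that the transaction is no longer interesting to anyone.

void add(QueryWatcher query)
Add in a resource. In some cases this sets mustQuery to true.

Copyright 2007-2013, multiple authors.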
Licensed under the Apache License, Version 2.0, see the NOTICE file for attributions.