It can be difficult to determine why some operations are taking longer than expected. For example, you may be looking up items with very low latency, but sometimes the lookups take much longer. Determining the cause of the delay is difficult because the system is distributed, and the typical lookup is fast.
To provide insight into what accumulo is doing during your scan, you can turn on tracing before you do your operation:
DistributedTrace.enable(instance, zooReader, hostname, "myApplication"); Trace scanTrace = Trace.on("client:scan"); BatchScanner scanner = conn.createBatchScanner(...); // Configure your scanner for (Entryentry : scanner) { } Trace.off();
Accumulo has been instrumented to record the time that various operations take when tracing is turned on. The fact that tracing is enabled follows all the requests made on behalf of the user throughout the distributed infrastructure of accumulo, and across all threads of execution.
These time spans will be inserted into the trace accumulo table. You can browse recent traces from the accumulo monitor page. You can also read the trace table directly.
Tracing is supported in the shell. For example:
root@test> createtable test root@test test> insert a b c d root@test test> trace on root@test test> scan a b:c [] d root@test test> trace off Waiting for trace information Waiting for trace information Waiting for trace information Trace started at 2011/03/16 09:20:31.387 Time Start Service@Location Name 3355+0 shell@host2 shell:root 1+1 shell@host2 client:listUsers 1+1434 tserver@host2 getUserAuthorizations 1+1434 shell@host2 client:getUserAuthorizations 10+1550 shell@host2 scan 9+1551 shell@host2 scan:location 7+1552 shell@host2 client:startScan 6+1553 tserver@host2 startScan 5+1553 tserver@host2 tablet read ahead 11 1+1559 shell@host2 client:closeScan 1+1561 shell@host2 client:listUsers
Here we can see that the shell is getting the list of users (which is used for tab-completion) after every command. While unexpected, it is a fast operation. In fact, all the requests are very fast, and most of the time is spent waiting for the user to make a request while tracing is turned on.
Spans are added to the trace table asynchronously. The user may have to wait several seconds for all requests to complete before the trace information is complete.
You can extract the trace data out of the trace table. Each span is a stored as a column in a row named for the trace id. The following code will print out a trace:
String table = AccumuloConfiguration.getSystemConfiguration().get(Property.TRACE_TABLE); Scanner scanner = shellState.connector.createScanner(table, auths); scanner.setRange(new Range(new Text(Long.toHexString(scanTrace.traceId())))); TraceDump.printTrace(scanner, new Printer() { void print(String line) { System.out.println(line); } });