This directory contains a suite of scripts for placing continuous query and ingest load on Accumulo. The purpose of these scripts is twofold: first, to place continuous load on Accumulo to see if it breaks; second, to collect statistics in order to understand how Accumulo behaves under that load.

To run these scripts, copy all of the .example files and modify them. These scripts rely on pssh. Before running any script, you may need to use pssh to create the log directory on each machine if you want it kept locally. Also, create the table "ci" before running. You can run org.apache.accumulo.server.test.continuous.GenSplits to generate split points for a continuous ingest table. (Example commands for these setup steps appear at the end of this README.)

The following ingest scripts insert data into Accumulo that will form a random graph.

  start-ingest.sh
  stop-ingest.sh

The following query scripts randomly walk the graph created by the ingesters. Each walker produces detailed statistics on query/scan times.

  start-walkers.sh
  stop-walkers.sh

The following scripts start and stop batch walkers.

  start-batchwalkers.sh
  stop-batchwalkers.sh

In addition to placing continuous load, the following scripts start and stop a service that continually collects statistics about Accumulo and HDFS.

  start-stats.sh
  stop-stats.sh

Optionally, start the agitator to periodically kill random servers.

  start-agitator.sh
  stop-agitator.sh

Start all three of these services (ingest, walkers, and stats) and let them run for a few hours. Then run report.pl to generate a simple HTML report containing plots and histograms showing what has transpired.

A map reduce job to verify all data created by continuous ingest can be run with the following command. Before running the command, modify the VERIFY_* variables in continuous-env.sh if needed. Do not run ingest while running this command; doing so will cause erroneous reporting of UNDEFINED nodes, since the map reduce job can scan a reference after it has scanned (and missed) the definition.

  run-verify.sh

Each entry inserted by continuous ingest, except for the first batch of entries, references a previously flushed entry. Since we are referencing flushed entries, they should always exist. The map reduce job checks that all referenced entries exist; if it finds any that do not, it increments the UNDEFINED counter and emits the referenced but undefined node. The map reduce job produces two other counts: REFERENCED and UNREFERENCED. It is expected that both counts are nonzero. REFERENCED counts nodes that are defined and referenced. UNREFERENCED counts nodes that are defined but unreferenced; these are the latest nodes inserted.

To stress Accumulo, run the following script, which starts a map reduce job that reads from and writes to your continuous ingest table. This map reduce job will write out an entry for every entry in the table (except for entries created by the map reduce job itself). Stop ingest before running this map reduce job, and do not run more than one instance of this job concurrently against a table.

  run-moru.sh
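
Example commands follow. First, a minimal sketch of the pssh setup step described above; the host file name (walkers.txt) and the log directory path are assumptions, so substitute the host lists and log directory configured in your continuous-env.sh:

  # create the local log directory on every machine listed in the host file
  pssh -h walkers.txt "mkdir -p /var/log/accumulo/continuous"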
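
Next, a sketch of creating the "ci" table with pre-generated split points. The GenSplits argument shown (a desired number of splits written to stdout) and the shell credentials are assumptions; check the class usage and your instance's credentials before running. The shell's addsplits -sf option reads split points from a file:

  # generate split points for the continuous ingest table
  # (the numeric argument, the number of splits, is an assumption)
  ./bin/accumulo org.apache.accumulo.server.test.continuous.GenSplits 100 > /tmp/ci-splits.txt

  # create the table and apply the splits from the Accumulo shell
  ./bin/accumulo shell -u root -p secret -e "createtable ci"
  ./bin/accumulo shell -u root -p secret -e "addsplits -t ci -sf /tmp/ci-splits.txt"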
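
Finally, a sketch of a complete run, from starting load through producing the report and verifying the data. All script names come from this README; the ordering is one reasonable sequence, not the only one:

  # start continuous load and statistics collection
  ./start-ingest.sh
  ./start-walkers.sh
  ./start-batchwalkers.sh
  ./start-stats.sh
  ./start-agitator.sh        # optional: periodically kills random servers

  # ... let everything run for a few hours ...

  ./stop-agitator.sh
  ./stop-batchwalkers.sh
  ./stop-walkers.sh
  ./stop-ingest.sh
  ./stop-stats.sh

  # generate the HTML report of what transpired
  ./report.pl

  # verify the data; ingest must be stopped before this point
  ./run-verify.sh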