//-----------------------------------------------------------------------------
//
// Statistics Processor
//
// Last Update: 04/11/2001
//
//-----------------------------------------------------------------------------

//-----------------------------------------------------------------------------
// Design
//-----------------------------------------------------------------------------

This statistics processor is a Cougar II (Tsunami) feature.
It is designed to replace StatAggregate.cc and part of WebOverview.cc
for reasons of scalability, maintainability, and customizability.

The statistics processor aggregates and calculates traffic server statistics
based on an XML configuration file. At the processor's initialization, the
configuration file is read and parsed. Each set of calculations is then stored
as a C++ object, called StatObject, and each StatObject is linked into a list,
called StatObjectList.

The mgmt2/traffic_manager threads invoke StatObjectList->Eval() to perform
the statistics aggregation and calculation within their event loop. When
Eval() is called, each StatObject in StatObjectList is evaluated.
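
The evaluation flow described above can be sketched as follows. This is only
an illustration, not the actual StatObject code: StatTable, the compute member,
demoEval(), and the proxy.*.http.hits variable names are all hypothetical, and
a std::function stands in for the parsed expression.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the record store that holds proxy.* variables.
using StatTable = std::map<std::string, double>;

// One calculation taken from the configuration file (simplified).
struct StatObject {
  std::string dst;                                   // variable that receives the result
  std::function<double(const StatTable &)> compute;  // stands in for the parsed expression

  void Eval(StatTable &stats) const { stats[dst] = compute(stats); }
};

// The list that is evaluated on each pass of the event loop.
struct StatObjectList {
  std::vector<StatObject> objects;

  // Evaluates every StatObject in order; returns how many were evaluated.
  int Eval(StatTable &stats) const {
    int count = 0;
    for (const StatObject &object : objects) {
      object.Eval(stats);
      ++count;
    }
    return count;
  }
};

// Example: copy a (hypothetical) process counter to its node-level variable.
StatTable demoEval() {
  StatTable stats{{"proxy.process.http.hits", 42.0}};
  StatObjectList list;
  list.objects.push_back({"proxy.node.http.hits",
                          [](const StatTable &s) { return s.at("proxy.process.http.hits"); }});
  list.Eval(stats);
  return stats;
}
```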

Recall: StatAggregate.cc aggregates/calculates/copies proxy.process.* TS
variables to proxy.node.* variables. Similarly, WebOverview.cc aggregates
proxy.node.* variables to their corresponding proxy.cluster.* variables.

So there are two types of calculations in the statistics processor: NodeEval
and ClusterEval. As their names imply, they aggregate node-based statistics
and cluster-based statistics, respectively. We call the basis of a
statistics aggregation its "scope". (See "Destination Attributes")

In the cluster-based statistics, the aggregation is further divided into two
types: sum and re-calculate. Sum refers to calculating the proxy.cluster.*
variable by simply summing the required proxy.node.* variables from all nodes
in the cluster. Re-calculate refers to summing, across the cluster, every
proxy.node.* variable used in the calculation before performing the
calculation. For example, the number of open connections in the cluster is a
sum, whereas the average hit rate in the cluster is re-calculated.
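
The difference between the two cluster operations can be sketched as below.
The NodeStats struct and its field names are hypothetical, and returning 0
when nothing has been served is an assumption made for this sketch only.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-node snapshot; the field names are illustrative only.
struct NodeStats {
  double openConnections;  // stands in for a proxy.node.* gauge
  double docHits;          // documents served from cache
  double docServed;        // documents served in total
};

// "sum": the cluster value is the plain sum of the node values.
double clusterOpenConnections(const std::vector<NodeStats> &nodes) {
  double total = 0.0;
  for (const NodeStats &n : nodes) total += n.openConnections;
  return total;
}

// "re-calculate": sum each input across the cluster first, then apply the
// formula to the sums.
double clusterHitRate(const std::vector<NodeStats> &nodes) {
  double hits = 0.0;
  double served = 0.0;
  for (const NodeStats &n : nodes) {
    hits += n.docHits;
    served += n.docServed;
  }
  // ASSUMPTION: report 0 when nothing was served.
  return served > 0.0 ? hits / served : 0.0;
}
```

For two nodes with 3 hits out of 4 served and 1 hit out of 12 served, the
re-calculated hit rate is 4/16 = 0.25. Averaging the two per-node rates would
instead weight both nodes equally, regardless of how much each one served.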

//-----------------------------------------------------------------------------
// Destination Attributes
//-----------------------------------------------------------------------------
	
	"scope" 
	- "node"
	- "cluster"

	"operation"
	- "sum"
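		sums the corresponding node variable across all nodes in the cluster.
	- "re-calculate"
		sums each node variable used in the calculation across all nodes in
		the cluster, then performs the calculation on the sums.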

	"define"
	- "custom"
	- "built-in"

//-----------------------------------------------------------------------------
// Predefined Constants and Functions
//-----------------------------------------------------------------------------
	
	Predefined Constants

	. BYTES_TO_MB_SCALE (1/(1024*1024.0))
	  - convert bytes to mega-bytes

	. MBIT_TO_KBIT_SCALE (1000.0)
	  - convert mega-bits to kilo-bits

	. SECOND_TO_MILLISECOND_SCALE (1000.0)
	  - convert seconds to milliseconds

	. PCT_TO_INTPCT_SCALE (100.0)
	  - convert ratios to percentage

	. HRTIME_SECOND
	  - convert milliseconds to seconds

	. BYTES_TO_MBIT_SCALE (8/1000000.0)
	  - convert bytes to mega-bits

	Predefined Functions
	. DIFFTIME
	  - the number of milliseconds since the last update. Usually used in
        combination with HRTIME_SECOND to compute the number of seconds
        since the last update.
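
As a rough illustration of how these scale factors combine, the sketch below
uses the constant values listed above. ASSUMED_MS_PER_SECOND is a hypothetical
stand-in for HRTIME_SECOND, whose value is not given here, and the helper
function names are made up for this example.

```cpp
#include <cassert>

// Values copied from the list of predefined constants.
constexpr double BYTES_TO_MB_SCALE   = 1 / (1024 * 1024.0);
constexpr double BYTES_TO_MBIT_SCALE = 8 / 1000000.0;
constexpr double MBIT_TO_KBIT_SCALE  = 1000.0;
constexpr double PCT_TO_INTPCT_SCALE = 100.0;

// ASSUMPTION: illustrative stand-in for HRTIME_SECOND.
constexpr double ASSUMED_MS_PER_SECOND = 1000.0;

double bytesToMB(double bytes) { return bytes * BYTES_TO_MB_SCALE; }
double mbitToKbit(double mbit) { return mbit * MBIT_TO_KBIT_SCALE; }
double toPercent(double ratio) { return ratio * PCT_TO_INTPCT_SCALE; }

// Throughput in megabits per second, from a byte delta and a DIFFTIME-style
// elapsed time in milliseconds.
double throughputMbitPerSec(double bytesDelta, double diffTimeMs) {
  assert(diffTimeMs > 0.0);
  double seconds = diffTimeMs / ASSUMED_MS_PER_SECOND;
  return bytesDelta * BYTES_TO_MBIT_SCALE / seconds;
}
```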

//-----------------------------------------------------------------------------
// Unit test plan
//-----------------------------------------------------------------------------

The statistics processor is designed to replace StatAggregate.cc and part of
WebOverview.cc. The first step in testing StatProcessor is to comment out the
aggregateNodeRecords() and doClusterAg() calls in proxy/mgmt2/Main.cc.

The next step is to replace the above function calls with a call to
StatProcessor::processStat().

The statistics processor is a rather complicated module in traffic manager,
so it cannot easily be tested as a whole. We divide the tests into multiple
sections.

1) Node-based Simple Aggregation
	- performs aggregations that are node-based and are computed every time
      processStat() is invoked.
	  E.g.: hit rate = doc. hit / doc. served.

2) Node-based Time-Delta Aggregation
	- performs aggregations that are node-based but are computed only at a
      regular interval AND in which one or more variables in the calculation
      are obtained as the difference between the last updated value and the
      current value. E.g. average connections per second is calculated by
      subtracting the connection count of 10 seconds ago from the current
      connection count and dividing the difference by 10.
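
A time-delta aggregation of this kind could be sketched as below. DeltaRate is
a hypothetical helper written for this example, not a class from the
statistics processor, and returning 0 on the first call is an assumption.

```cpp
#include <cassert>

// Remembers the previous sample so that the next update can compute a rate.
class DeltaRate {
public:
  explicit DeltaRate(double intervalSeconds) : interval_(intervalSeconds) {
    assert(intervalSeconds > 0.0);
  }

  // Call once per interval with the current counter value. Returns the
  // average change per second since the previous call.
  // ASSUMPTION: the first call only records a baseline and returns 0.
  double update(double current) {
    double rate = hasLast_ ? (current - last_) / interval_ : 0.0;
    last_ = current;
    hasLast_ = true;
    return rate;
  }

private:
  double interval_;
  double last_ = 0.0;
  bool hasLast_ = false;
};
```

With a 10-second interval, a connection count that goes from 1000 to 1250
between two updates gives (1250 - 1000) / 10 = 25 connections per second.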

Repeat the above 2 tests with cluster-based variables. So we have at least
4 test cases.

Developing a PASS/FAIL unit test for the statistics processor is not
cost-efficient. The approach we are going to use is to display the input
values before the calculation and the output values after it.

Let's subdivide the tests into two stages:

Stage 1 : Synthetic Data
------------------------
We control the testing environment by setting the input values. This will test
the correctness of the statistics processor in a controlled environment. PASS/
FAIL is determined by matching the input/output values.

Stage 2 : Load Data
-------------------
Submit network traffic through traffic server with load tools like jtest and
ftest, dump the statistics to a text file periodically, and examine the
resulting values.

//-----------------------------------------------------------------------------
// For QA Engineer
//-----------------------------------------------------------------------------
The question that concerns QA engineers most is "how can I tell if the
Statistics Processor is working correctly?"

Recall that the new Statistics Processor is meant to replace StatAggregate.cc
and part of WebOverview.cc. In essence, you should not see any apparent
change.

If you ever see a value of -9999.0 (or -9999), then there is an error in
computing that value.


<expr>
    - %d
    - %f
    - %k

<dst>
    - specifies the variable that stores the result value.
    - ATTRIBUTE: type
        built-in: variables that are built-in/defined in traffic server
        custom: variables that are introduced as temporary storage or
                variables that are introduced by the client.
    - default attributes:
	type = built-in  

<src>
    - specifies a variable needed to compute the <dst>
    - ATTRIBUTE: type
	node: this is a proxy.node.* variable
	cluster: this is a proxy.node.* variable, summed over all
                 nodes in the cluster.
    - default attributes:
	type = node

<min>
    - specifies the smallest possible value for <dst>. For values
      smaller than <min>, the <default> is used.

<max>
    - specifies the largest possible value for <dst>. For values
      larger than <max>, the <default> is used.

<default>
    - specifies the value to be assigned to <dst> if the resulting <dst>
      value is smaller than <min> or larger than <max>
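
The interaction of <min>, <max>, and <default> can be sketched as below. The
Bounds struct and applyBounds() are hypothetical names used only for this
illustration. Note that an out-of-range value is replaced by the default; it
is not clamped to the nearest bound.

```cpp
#include <cassert>

// Bounds for one <dst>; the member names mirror the tags described above.
struct Bounds {
  double min;
  double max;
  double defaultValue;
};

// Returns the value unchanged when it lies within [min, max]; otherwise
// returns the default.
double applyBounds(double value, const Bounds &b) {
  if (value < b.min || value > b.max) {
    return b.defaultValue;
  }
  return value;
}
```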

RULES: (some of these are enforced by the DTD anyway)
- all operators and operands in <expr> MUST BE separated by a single space.
- the order of the tags matters
- the order of each entry matters
- each statistics entry has at most 1 <dst>
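
To illustrate why the single-space rule matters, here is a sketch of a naive
tokenizer that splits an expression on the space character. This is an
assumption about how such a parser might behave, not the processor's actual
parsing code; splitExpr() and the variable names in the comments are
hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Splits an expression on single spaces. Two consecutive spaces produce an
// empty token, which a strict parser would have to reject or skip.
std::vector<std::string> splitExpr(const std::string &expr) {
  std::vector<std::string> tokens;
  std::istringstream in(expr);
  std::string token;
  while (std::getline(in, token, ' ')) {
    tokens.push_back(token);
  }
  return tokens;
}

// "proxy.node.a / proxy.node.b"  -> 3 tokens: operand, operator, operand
// "proxy.node.a  / proxy.node.b" -> 4 tokens, the second one empty
```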


DEFINED CONSTANTS (in alphabetical order)
* _BYTES_TO_MB_SCALE
	* Origin: utils/WebMgmtUtils.h
	* Value:  (1/(1024*1024.0))

* _HRTIME_SECOND
	* Origin:
	* Value: 

* _MBIT_TO_KBIT_SCALE
	* Origin: utils/WebMgmtUtils.h
	* Value:  (1000.0)

* _PCT_TO_INTPCT_SCALE
	* Origin: utils/WebMgmtUtils.h
	* Value:  (100.0)

* _SECOND_TO_MILLISECOND_SCALE
	* Origin: utils/WebMgmtUtils.h
	* Value:  (1000.0)


            __DIFFTIME