Package org.apache.calcite.profile
Class SimpleProfiler
java.lang.Object
    org.apache.calcite.profile.SimpleProfiler
Nested Class Summary
Nested Classes
- (package private) static class SimpleProfiler.Run
  A run of the profiler.
- (package private) static class SimpleProfiler.Space
  Work space for a particular combination of columns.
Nested classes/interfaces inherited from interface org.apache.calcite.profile.Profiler
- Profiler.Column, Profiler.Distribution, Profiler.FunctionalDependency, Profiler.Profile, Profiler.RowCount, Profiler.Statistic, Profiler.Unique
Constructor Summary
Constructors
- SimpleProfiler()
Method Summary
Methods
- Profiler.Profile profile(java.lang.Iterable<java.util.List<java.lang.Comparable>> rows, java.util.List<Profiler.Column> columns, java.util.Collection<ImmutableBitSet> initialGroups)
  Creates a profile of a data set.
- static double surprise(double expected, double actual)
  Returns a measure of how much an actual value differs from expected.
Method Detail
profile
public Profiler.Profile profile(java.lang.Iterable<java.util.List<java.lang.Comparable>> rows, java.util.List<Profiler.Column> columns, java.util.Collection<ImmutableBitSet> initialGroups)
Description copied from interface: Profiler
Creates a profile of a data set.
Specified by:
- profile in interface Profiler
Parameters:
- rows - List of rows. Can be iterated over more than once (maybe not cheaply)
- columns - Column definitions
- initialGroups - List of combinations of columns that should be profiled early, because they may be interesting
Returns:
- A profile describing relationships within the data set

surprise
public static double surprise(double expected, double actual)
Returns a measure of how much an actual value differs from expected. The formula is abs(expected - actual) / (expected + actual).
Examples:
- surprise(e, a) is always between 0 and 1;
- surprise(e, a) is 0 if e = a;
- surprise(e, 0) is 1 if e > 0;
- surprise(0, a) is 1 if a > 0;
- surprise(5, 0) is 100%;
- surprise(5, 3) is 25%;
- surprise(5, 4) is 11%;
- surprise(5, 5) is 0%;
- surprise(5, 6) is 9%;
- surprise(5, 16) is 52%;
- surprise(5, 100) is 90%;
Parameters:
- expected - Expected value
- actual - Actual value
Returns:
- Measure of how much expected deviates from actual
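
The formula and examples above can be checked with a small standalone sketch. It does not call Calcite: the class name SurpriseSketch is hypothetical, and the method body is reconstructed only from the documented formula, so the actual SimpleProfiler.surprise implementation may handle edge cases differently. The early return for equal arguments is this sketch's own choice, added so that the documented property "surprise(e, a) is 0 if e = a" also holds when both values are 0 (where the bare formula would divide 0 by 0).

```java
// Hypothetical standalone class; not part of org.apache.calcite.profile.
class SurpriseSketch {
    /**
     * Re-implements the documented formula:
     * abs(expected - actual) / (expected + actual).
     * Assumes both arguments are non-negative, as in the documented examples.
     */
    static double surprise(double expected, double actual) {
        // Sketch-only guard: equal values give 0, which also avoids 0 / 0.
        if (expected == actual) {
            return 0d;
        }
        return Math.abs(expected - actual) / (expected + actual);
    }

    public static void main(String[] args) {
        // The "actual" values used in the documented examples, with expected = 5.
        double[] actuals = {0, 3, 4, 5, 6, 16, 100};
        for (double actual : actuals) {
            // Print each result as a whole-number percentage.
            System.out.printf("surprise(5, %.0f) = %.0f%%%n",
                actual, 100 * surprise(5, actual));
        }
    }
}
```

Running main prints one line per example, for comparison against the percentages listed above. The measure is symmetric in its two arguments, so surprise(3, 5) equals surprise(5, 3).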