Class SimpleProfiler

  • All Implemented Interfaces:
    Profiler

    public class SimpleProfiler
    extends java.lang.Object
    implements Profiler
    Basic implementation of Profiler.
    • Constructor Detail

      • SimpleProfiler

        public SimpleProfiler()
    • Method Detail

      • profile

        public Profiler.Profile profile(java.lang.Iterable<java.util.List<java.lang.Comparable>> rows,
                                        java.util.List<Profiler.Column> columns,
                                        java.util.Collection<ImmutableBitSet> initialGroups)
        Description copied from interface: Profiler
        Creates a profile of a data set.
        Specified by:
        profile in interface Profiler
        Parameters:
        rows - Rows of the data set. The iterable may be traversed more than once, though doing so may not be cheap
        columns - Column definitions
        initialGroups - Combinations of columns that should be profiled early, because they may be interesting
        Returns:
        A profile describing relationships within the data set
      • surprise

        public static double surprise(double expected,
                                      double actual)
        Returns a measure of how much an actual value differs from expected. The formula is abs(expected - actual) / (expected + actual).

        Examples:

        • surprise(e, a) is always between 0 and 1 for non-negative e and a;
        • surprise(e, a) is 0 if e = a;
        • surprise(e, 0) is 1 if e > 0;
        • surprise(0, a) is 1 if a > 0;
        • surprise(5, 0) is 100%;
        • surprise(5, 3) is 25%;
        • surprise(5, 4) is 11%;
        • surprise(5, 5) is 0%;
        • surprise(5, 6) is 9%;
        • surprise(5, 16) is 52%;
        • surprise(5, 100) is 90%.
        Parameters:
        expected - Expected value
        actual - Actual value
        Returns:
        Measure of how much actual deviates from expected
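
        The formula documented for surprise can be sketched in plain Java as follows. This is an illustration only, not necessarily the library's implementation: the class name SurpriseSketch is hypothetical, and the handling of equal inputs (returning 0, which also avoids 0 / 0 when both are 0) and of a non-positive denominator (returning 1) is an assumption chosen to agree with the listed properties.

```java
final class SurpriseSketch {
  private SurpriseSketch() {}

  /**
   * Computes abs(expected - actual) / (expected + actual).
   * Intended for non-negative inputs, where the result lies in [0, 1].
   */
  public static double surprise(double expected, double actual) {
    if (expected == actual) {
      // Assumption: equal values carry no surprise, including 0 and 0,
      // where the raw formula would evaluate 0 / 0.
      return 0d;
    }
    final double sum = expected + actual;
    if (sum <= 0d) {
      // Assumption: a non-positive denominator is treated as maximal surprise.
      return 1d;
    }
    return Math.abs(expected - actual) / sum;
  }

  public static void main(String[] args) {
    // The (expected, actual) pairs from the examples in the method description.
    double[][] cases = {{5, 0}, {5, 3}, {5, 4}, {5, 5}, {5, 6}, {5, 16}, {5, 100}};
    for (double[] c : cases) {
      System.out.printf("surprise(%.0f, %.0f) = %.0f%%%n",
          c[0], c[1], 100 * surprise(c[0], c[1]));
    }
  }
}
```

        Running main prints one line per example pair, with each value rounded to a whole percentage, so the output can be compared against the list in the method description.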