Package org.apache.calcite.profile
Class ProfilerImpl
- java.lang.Object
  - org.apache.calcite.profile.ProfilerImpl
Nested Class Summary
Nested Classes (modifier and type, class: description)
- static class ProfilerImpl.Builder: Builds a ProfilerImpl.
- (package private) static class ProfilerImpl.Collector: Collects values of a column or columns.
- (package private) static class ProfilerImpl.CompositeCollector: Collector that collects two or more column values in a tree set.
- (package private) static class ProfilerImpl.HllCollector: Collector that collects two or more column values into a HyperLogLog sketch.
- (package private) static class ProfilerImpl.HllCompositeCollector: Collector that collects two or more column values into a HyperLogLog sketch.
- (package private) static class ProfilerImpl.HllSingletonCollector: Collector that collects one column value into a HyperLogLog sketch.
- (package private) class ProfilerImpl.Run: A run of the profiler.
- (package private) static class ProfilerImpl.SingletonCollector: Collector that collects values of a single column.
- (package private) static class ProfilerImpl.Space: Work space for a particular combination of columns.
- (package private) static class ProfilerImpl.SurpriseQueue: A priority queue of the last N surprise values.
Nested classes/interfaces inherited from interface org.apache.calcite.profile.Profiler
Profiler.Column, Profiler.Distribution, Profiler.FunctionalDependency, Profiler.Profile, Profiler.RowCount, Profiler.Statistic, Profiler.Unique
-
-
Field Summary
Fields (modifier and type, field: description)
- private int combinationsPerPass: The number of combinations to consider per pass.
- private int interestingCount: The minimum number of combinations considered "interesting".
- private java.util.function.Predicate<Pair<ProfilerImpl.Space,Profiler.Column>> predicate: Whether a successor is considered interesting enough to analyze.
-
Constructor Summary
Constructors (constructor: description)
- ProfilerImpl(int combinationsPerPass, int interestingCount, java.util.function.Predicate<Pair<ProfilerImpl.Space,Profiler.Column>> predicate): Creates a ProfilerImpl.
-
Method Summary
Methods (modifier and type, method: description)
- static ProfilerImpl.Builder builder()
- Profiler.Profile profile(java.lang.Iterable<java.util.List<java.lang.Comparable>> rows, java.util.List<Profiler.Column> columns, java.util.Collection<ImmutableBitSet> initialGroups): Creates a profile of a data set.
-
-
-
Field Detail
-
combinationsPerPass
private final int combinationsPerPass
The number of combinations to consider per pass. The number is limited by available memory; a value of 1,000 is typical. Each combination needs one sketch, and each sketch needs 2 KB of memory.
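As a rough worked example of this sizing rule, the sketch below estimates sketch memory for one pass. It assumes the 2 KB figure means exactly 2,048 bytes and counts only the sketches themselves. The class and method names are hypothetical and are not part of Calcite.

```java
/** Hypothetical helper for sizing a pass; not part of Calcite. */
final class PassMemoryEstimate {
  /** Assumption: the documented 2 KB per sketch, taken as 2,048 bytes. */
  static final long BYTES_PER_SKETCH = 2_048L;

  /** Estimated bytes held by sketches in one pass (one sketch per combination). */
  static long bytesPerPass(int combinationsPerPass) {
    return combinationsPerPass * BYTES_PER_SKETCH;
  }

  /** Largest number of combinations whose sketches fit in the given budget. */
  static int maxCombinations(long budgetBytes) {
    return (int) Math.min(Integer.MAX_VALUE, budgetBytes / BYTES_PER_SKETCH);
  }

  public static void main(String[] args) {
    // The typical value of 1,000 combinations needs roughly 2 MB of sketches.
    System.out.println(bytesPerPass(1_000));
    // Combinations that fit in a 64 MiB sketch budget.
    System.out.println(maxCombinations(64L << 20));
  }
}
```

Under these assumptions, the typical setting of 1,000 combinations costs about 2 MB per pass, so memory becomes the limiting factor only for much larger pass sizes.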
-
interestingCount
private final int interestingCount
The minimum number of combinations considered "interesting". Beyond that number, a combination is considered "interesting" only if its surprise is greater than the median surprise.
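The rule described here can be sketched as follows. This is a simplified illustration, not the logic of ProfilerImpl.SurpriseQueue: it assumes the median is taken over every surprise value seen so far (the real queue keeps only the last N), and it uses the upper median for even counts. The class name InterestingRule is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of the rule; not Calcite's SurpriseQueue. */
final class InterestingRule {
  private final int interestingCount;
  /** Surprise values seen so far, kept in ascending order. */
  private final List<Double> seen = new ArrayList<>();

  InterestingRule(int interestingCount) {
    this.interestingCount = interestingCount;
  }

  /** Records a surprise value and reports whether it counts as interesting. */
  boolean offer(double surprise) {
    // The first interestingCount values always qualify; later ones must
    // beat the median of the values recorded before them.
    boolean interesting = seen.isEmpty()
        || seen.size() < interestingCount
        || surprise > median();
    // Insert at the position that keeps the list sorted.
    int i = Collections.binarySearch(seen, surprise);
    seen.add(i >= 0 ? i : -i - 1, surprise);
    return interesting;
  }

  /** Median of the recorded values (upper median for even counts). */
  private double median() {
    return seen.get(seen.size() / 2);
  }
}
```

With interestingCount set to 3, the first three values are accepted unconditionally, and a fourth value is accepted only if it exceeds the median of those three.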
-
predicate
private final java.util.function.Predicate<Pair<ProfilerImpl.Space,Profiler.Column>> predicate
Whether a successor is considered interesting enough to analyze.
-
-
Constructor Detail
-
ProfilerImpl
ProfilerImpl(int combinationsPerPass, int interestingCount, java.util.function.Predicate<Pair<ProfilerImpl.Space,Profiler.Column>> predicate)
Creates a ProfilerImpl.
Parameters:
- combinationsPerPass - Maximum number of columns (or combinations of columns) to compute each pass
- interestingCount - Minimum number of combinations considered interesting
- predicate - Whether a successor is considered interesting enough to analyze
-
-
Method Detail
-
builder
public static ProfilerImpl.Builder builder()
-
profile
public Profiler.Profile profile(java.lang.Iterable<java.util.List<java.lang.Comparable>> rows, java.util.List<Profiler.Column> columns, java.util.Collection<ImmutableBitSet> initialGroups)
Description copied from interface: Profiler
Creates a profile of a data set.
Specified by:
- profile in interface Profiler
Parameters:
- rows - List of rows. Can be iterated over more than once (maybe not cheaply)
- columns - Column definitions
- initialGroups - List of combinations of columns that should be profiled early, because they may be interesting
Returns:
- A profile describing relationships within the data set
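To make the shapes of rows and of a column combination concrete, the sketch below counts the distinct value tuples for one combination in a single pass over the rows, using a tree set in the spirit of the CompositeCollector description. It is an illustration only, not Calcite's implementation. It uses java.util.BitSet as a stand-in for ImmutableBitSet, assumes no null values, and assumes the values in each column are mutually comparable. The class name DistinctCountSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration; not Calcite's implementation. */
final class DistinctCountSketch {
  /**
   * Counts distinct value tuples over the columns whose ordinals are set
   * in {@code group}, making one pass over {@code rows}.
   */
  static int cardinality(Iterable<List<Comparable>> rows, BitSet group) {
    Comparator<List<Comparable>> byTuple = DistinctCountSketch::compareTuples;
    Set<List<Comparable>> distinct = new TreeSet<>(byTuple);
    for (List<Comparable> row : rows) {
      // Project the row onto the columns in the combination.
      List<Comparable> key = new ArrayList<>();
      for (int i = group.nextSetBit(0); i >= 0; i = group.nextSetBit(i + 1)) {
        key.add(row.get(i));
      }
      distinct.add(key);
    }
    return distinct.size();
  }

  /** Compares two equal-length tuples element by element. */
  @SuppressWarnings("unchecked")
  private static int compareTuples(List<Comparable> a, List<Comparable> b) {
    for (int i = 0; i < a.size(); i++) {
      int c = a.get(i).compareTo(b.get(i));
      if (c != 0) {
        return c;
      }
    }
    return 0;
  }
}
```

Each additional combination needs another pass or another set held in memory, which is why rows must support repeated iteration.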
-
-