public class HyperLogLogUtils extends Object
Modifier and Type | Field and Description |
---|---|
static byte[] | MAGIC |
Constructor and Description |
---|
HyperLogLogUtils() |
Modifier and Type | Method and Description |
---|---|
static HyperLogLog | deserializeHLL(byte[] buf): Deserializes a serialized HyperLogLog from a byte array. |
static HyperLogLog | deserializeHLL(InputStream in): Refer to serializeHLL() for the serialization format. |
static long | getEstimatedCountFromSerializedHLL(InputStream in): Gets the estimated cardinality without deserializing the HLL. |
static float | getRelativeError(long actualCount, long estimatedCount): Returns the relative error between the actual and estimated cardinality. |
static void | serializeHLL(OutputStream out, HyperLogLog hll): Serializes a HyperLogLog using the format described under serializeHLL. |
public static void serializeHLL(OutputStream out, HyperLogLog hll) throws IOException
HyperLogLog is serialized using the following format:

    |-4 byte-|------varlong----|varint (optional)|----------|
    ---------------------------------------------------------
    | header | estimated-count | register-length | register |
    ---------------------------------------------------------

The 4-byte header is encoded as follows:
3 bytes - HLL magic string to identify the serialized stream
4 bits - p (number of bits to be used as register index)
1 bit - spare bit (not used)
3 bits - encoding (000 - sparse, 001..110 - n-bit packing, 111 - no bit packing)

The header is followed by the 3 fields required to reconstruct the HyperLogLog:
Estimated count - variable-length long that stores the last computed estimated count. It allows a quick lookup without deserializing the registers.
Register length - number of entries in the register (required only for the sparse representation; for bit packing, the register length can be derived from p).
Register - the register data.
Parameters:
out - output stream to write to
hll - hyperloglog that needs to be serialized
Throws:
IOException
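The bit layout of the fourth header byte can be illustrated with a small sketch. It assumes the fields are laid out in the order they are listed in the serializeHLL description, from the most significant bit down (p in the top 4 bits, the spare bit next, the encoding in the low 3 bits); this page does not state the bit order explicitly. The class and method names below are hypothetical and are not part of HyperLogLogUtils.

```java
// Hypothetical helper illustrating the fourth header byte; not part of the library.
final class HllHeaderByteSketch {
    static final int ENCODING_SPARSE = 0b000;          // 000 - sparse
    static final int ENCODING_NO_BIT_PACKING = 0b111;  // 111 - no bit packing

    // Packs p (4 bits) and the encoding (3 bits) into one byte value.
    // Assumption: p occupies the high nibble; the spare bit (bit 3) stays 0.
    static int pack(int p, int encoding) {
        if (p < 0 || p > 15) {
            throw new IllegalArgumentException("p must fit in 4 bits: " + p);
        }
        if (encoding < 0 || encoding > 7) {
            throw new IllegalArgumentException("encoding must fit in 3 bits: " + encoding);
        }
        return (p << 4) | encoding;
    }

    // Extracts p from the high nibble.
    static int unpackP(int headerByte) {
        return (headerByte >>> 4) & 0x0f;
    }

    // Extracts the 3-bit encoding from the low bits.
    static int unpackEncoding(int headerByte) {
        return headerByte & 0x07;
    }

    public static void main(String[] args) {
        int b = pack(14, ENCODING_SPARSE);
        System.out.println("p=" + unpackP(b) + " encoding=" + unpackEncoding(b));
    }
}
```

Under this assumed layout, an encoding value between 001 and 110 would carry the bit width used for n-bit packing, as the format description indicates.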
public static HyperLogLog deserializeHLL(InputStream in) throws IOException
Parameters:
in - input stream
Throws:
IOException
public static HyperLogLog deserializeHLL(byte[] buf)
Parameters:
buf - byte array to deserialize
public static long getEstimatedCountFromSerializedHLL(InputStream in) throws IOException
Parameters:
in - serialized HLL
Throws:
IOException
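Because the serialization format places the estimated count directly after the 4-byte header, the count can be read without touching the registers. The following is a minimal sketch of that idea, not the library's implementation. It assumes the varlong is an unsigned base-128 encoding (7 data bits per byte, least significant group first, high bit set on every byte except the last); this page does not specify the varlong encoding. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch; not part of HyperLogLogUtils.
final class EstimatedCountPeekSketch {

    // Skips the 4-byte header, then decodes the varlong that follows it.
    static long peekEstimatedCount(InputStream in) throws IOException {
        // Header: 3 magic bytes plus 1 byte holding p and the encoding.
        for (int i = 0; i < 4; i++) {
            if (in.read() == -1) {
                throw new EOFException("stream ended inside the header");
            }
        }
        long result = 0;
        int shift = 0;
        int b;
        do {
            b = in.read();
            if (b == -1) {
                throw new EOFException("stream ended inside the estimated count");
            }
            // Assumption: low 7 bits are data, least significant group first.
            result |= (long) (b & 0x7f) << shift;
            shift += 7;
        } while ((b & 0x80) != 0); // high bit set means another byte follows
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Four placeholder header bytes, then 300 encoded as 0xAC 0x02.
        byte[] data = {0, 0, 0, 0, (byte) 0xAC, 0x02};
        System.out.println(peekEstimatedCount(new ByteArrayInputStream(data)));
    }
}
```

The sketch does not validate the magic bytes; a real reader would check them before trusting the rest of the stream.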
public static float getRelativeError(long actualCount, long estimatedCount)
Parameters:
actualCount - actual count
estimatedCount - estimated count

Copyright © 2022 The Apache Software Foundation. All rights reserved.