Class ArrayCountingBloomFilter
- All Implemented Interfaces:
BitMapExtractor
,BloomFilter
,CellExtractor
,CountingBloomFilter
,IndexExtractor
Any operation that results in negative counts or integer overflow of
counts will mark this filter as invalid. This transition is not reversible.
The operation is completed in full, no exception is raised and the state is
set to invalid. This allows the cells for the filter immediately prior to the
operation that created the invalid state to be recovered. See the documentation
in isValid()
for details.
All the operations in the filter assume the cells are currently valid,
for example cardinality
or contains
operations. Behavior of an invalid
filter is undefined. It will no longer function identically to a standard
Bloom filter that is the merge of all the Bloom filters that have been added
to and not later subtracted from the counting Bloom filter.
The maximum supported number of items that can be stored in the filter is
limited by the maximum array size combined with the Shape
. For
example an implementation using a Shape
with a false-positive
probability of 1e-6 and Integer.MAX_VALUE
bits can reversibly store
approximately 75 million items using 20 hash functions per item with a memory
consumption of approximately 8 GB.
- Since:
- 4.5
- See Also:
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.commons.collections4.bloomfilter.CellExtractor
CellExtractor.CellPredicate
-
Field Summary
Fields inherited from interface org.apache.commons.collections4.bloomfilter.BloomFilter
SPARSE
-
Constructor Summary
ConstructorDescriptionArrayCountingBloomFilter
(Shape shape) Constructs an empty counting Bloom filter with the specified shape. -
Method Summary
Modifier and TypeMethodDescriptionboolean
add
(CellExtractor other) Adds the specified CellExtractor to this Bloom filter.int[]
Return a copy of the IndexExtractor data as an int array.int
Gets the cardinality (number of enabled bits) of this Bloom filter.int
Returns the characteristics of the filter.void
clear()
Resets the filter to its initial, unpopulated state.boolean
contains
(BitMapExtractor bitMapExtractor) Returnstrue
if this filter contains the bits specified in the bit maps produced by the bitMapExtractor.boolean
contains
(IndexExtractor indexExtractor) Returnstrue
if this filter contains the indices specified IndexExtractor.copy()
Creates a new instance of the CountingBloomFilter with the same properties as the current one.int
Returns the maximum allowable value for a cell count in this Counting filter.int
getMaxInsert
(CellExtractor cellExtractor) Determines the maximum number of times the Cell Extractor could have been added.getShape()
Gets the shape that was used when the filter was built.boolean
isValid()
Returnstrue
if the internal state is valid.boolean
processBitMaps
(LongPredicate consumer) Each bit map is passed to the predicate in order.boolean
processCells
(CellExtractor.CellPredicate consumer) Performs the given action for eachcell
where the cell count is non-zero.boolean
processIndices
(IntPredicate consumer) Each index is passed to the predicate.boolean
subtract
(CellExtractor other) Adds the specified CellExtractor to this Bloom filter.Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.commons.collections4.bloomfilter.BitMapExtractor
asBitMapArray, processBitMapPairs
Methods inherited from interface org.apache.commons.collections4.bloomfilter.BloomFilter
contains, contains, estimateIntersection, estimateN, estimateUnion, isEmpty, isFull
Methods inherited from interface org.apache.commons.collections4.bloomfilter.CountingBloomFilter
getMaxInsert, getMaxInsert, getMaxInsert, getMaxInsert, merge, merge, merge, merge, remove, remove, remove, remove, uniqueIndices
-
Constructor Details
-
ArrayCountingBloomFilter
Constructs an empty counting Bloom filter with the specified shape.- Parameters:
shape
- the shape of the filter
-
-
Method Details
-
add
Description copied from interface:CountingBloomFilter
Adds the specified CellExtractor to this Bloom filter.Specifically all cells for the indexes identified by the
other
will be incremented by their corresponding values in theother
.This method will return
true
if the filter is valid after the operation.- Specified by:
add
in interfaceCountingBloomFilter
- Parameters:
other
- the CellExtractor to add.- Returns:
true
if the addition was successful and the state is valid- See Also:
-
asIndexArray
Description copied from interface:IndexExtractor
Return a copy of the IndexExtractor data as an int array.Indices ordering and uniqueness is not guaranteed.
The default implementation of this method creates an array and populates it. Implementations that have access to an index array should consider returning a copy of that array if possible.
- Specified by:
asIndexArray
in interfaceIndexExtractor
- Returns:
- An int array of the data.
-
cardinality
Description copied from interface:BloomFilter
Gets the cardinality (number of enabled bits) of this Bloom filter.This is also known as the Hamming value or Hamming number.
- Specified by:
cardinality
in interfaceBloomFilter
- Returns:
- the cardinality of this filter
-
characteristics
Description copied from interface:BloomFilter
Returns the characteristics of the filter.Characteristics are defined as bits within the characteristics integer.
- Specified by:
characteristics
in interfaceBloomFilter
- Returns:
- the characteristics for this bloom filter.
-
clear
Description copied from interface:BloomFilter
Resets the filter to its initial, unpopulated state.- Specified by:
clear
in interfaceBloomFilter
-
contains
Description copied from interface:BloomFilter
Returnstrue
if this filter contains the bits specified in the bit maps produced by the bitMapExtractor.- Specified by:
contains
in interfaceBloomFilter
- Parameters:
bitMapExtractor
- theBitMapExtractor
to provide the bit maps.- Returns:
true
if this filter is enabled for all bits specified by the bit maps
-
contains
Description copied from interface:BloomFilter
Returnstrue
if this filter contains the indices specified IndexExtractor.Specifically this returns
true
if this filter is enabled for all bit indexes identified by theIndexExtractor
.- Specified by:
contains
in interfaceBloomFilter
- Parameters:
indexExtractor
- the IndexExtractor to provide the indexes- Returns:
true
if this filter is enabled for all bits specified by the IndexExtractor
-
copy
Description copied from interface:CountingBloomFilter
Creates a new instance of the CountingBloomFilter with the same properties as the current one.- Specified by:
copy
in interfaceBloomFilter
- Specified by:
copy
in interfaceCountingBloomFilter
- Returns:
- a copy of this CountingBloomFilter
-
processBitMaps
Description copied from interface:BitMapExtractor
Each bit map is passed to the predicate in order. The predicate is applied to each bit map value, if the predicate returnsfalse
the execution is stopped,false
is returned, and no further bit maps are processed.If the extractor is empty this method will return true.
Any exceptions thrown by the action are relayed to the caller.
- Specified by:
processBitMaps
in interfaceBitMapExtractor
- Parameters:
consumer
- the function to execute- Returns:
true
if all bit maps returnedtrue
,false
otherwise.
-
processCells
Description copied from interface:CellExtractor
Performs the given action for eachcell
where the cell count is non-zero.Some Bloom filter implementations use a count rather than a bit flag. The term
Cell
is used to refer to these counts.Any exceptions thrown by the action are relayed to the caller. The consumer is applied to each cell. If the consumer returns
false
the execution is stopped,false
is returned, and no further pairs are processed.- Specified by:
processCells
in interfaceCellExtractor
- Parameters:
consumer
- the action to be performed for each non-zero cell.- Returns:
true
if all cells return true from consumer,false
otherwise.
-
processIndices
Description copied from interface:IndexExtractor
Each index is passed to the predicate. The predicate is applied to each index value, if the predicate returnsfalse
the execution is stopped,false
is returned, and no further indices are processed.Any exceptions thrown by the action are relayed to the caller.
Indices ordering and uniqueness is not guaranteed.
- Specified by:
processIndices
in interfaceCellExtractor
- Specified by:
processIndices
in interfaceIndexExtractor
- Parameters:
consumer
- the action to be performed for each non-zero bit index.- Returns:
true
if all indexes return true from consumer,false
otherwise.
-
getMaxCell
Description copied from interface:CountingBloomFilter
Returns the maximum allowable value for a cell count in this Counting filter.- Specified by:
getMaxCell
in interfaceCountingBloomFilter
- Returns:
- the maximum allowable value for a cell count in this Counting filter.
-
getMaxInsert
Description copied from interface:CountingBloomFilter
Determines the maximum number of times the Cell Extractor could have been added.- Specified by:
getMaxInsert
in interfaceCountingBloomFilter
- Parameters:
cellExtractor
- the extractor of cells.- Returns:
- the maximum number of times the CellExtractor could have been inserted.
-
getShape
Description copied from interface:BloomFilter
Gets the shape that was used when the filter was built.- Specified by:
getShape
in interfaceBloomFilter
- Returns:
- The shape the filter was built with.
-
isValid
Returnstrue
if the internal state is valid.This flag is a warning that an addition or subtraction of cells from this filter resulted in an invalid cell for one or more indexes. For example this may occur if a cell for an index was set to negative following a subtraction operation, or overflows the value specified by
getMaxCell()
following an addition operation.A counting Bloom filter that has an invalid state is no longer ensured to function identically to a standard Bloom filter instance that is the merge of all the Bloom filters that have been added to and not later subtracted from this counting Bloom filter.
Note: The change to an invalid state may or may not be reversible. Implementations are expected to document their policy on recovery from an addition or removal operation that generated an invalid state.
Implementation note
The state transition to invalid is permanent.
This implementation does not correct negative cells to zero or integer overflow cells to
Integer.MAX_VALUE
. Thus the operation that generated invalid cells can be reversed by using the complement of the original operation with the same Bloom filter. This will restore the cells to the state prior to the invalid operation. Cells can then be extracted usingprocessCells(CellPredicate)
.- Specified by:
isValid
in interfaceCountingBloomFilter
- Returns:
true
if the state is valid
-
subtract
Description copied from interface:CountingBloomFilter
Adds the specified CellExtractor to this Bloom filter.Specifically all cells for the indexes identified by the
other
will be decremented by their corresponding values in theother
.This method will return true if the filter is valid after the operation.
- Specified by:
subtract
in interfaceCountingBloomFilter
- Parameters:
other
- the CellExtractor to subtract.- Returns:
true
if the subtraction was successful and the state is valid- See Also:
-