Redis HyperLogLog

Redis added the HyperLogLog data structure in version 2.8.9.

Redis HyperLogLog is an algorithm for cardinality estimation. Its advantage is that even when the number or volume of input elements is very large, the space needed to compute the cardinality is fixed and very small.

In Redis, each HyperLogLog key takes only 12 KB of memory and can estimate the cardinality of close to 2^64 distinct elements. This is in sharp contrast to a set, which consumes more memory the more elements it holds.
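
The 12 KB figure can be checked with a little arithmetic. The sketch below assumes the layout of Redis's dense HyperLogLog encoding (16384 registers of 6 bits each, not counting a small header), which is not stated in the text above, and uses the usual HyperLogLog error approximation of 1.04 / sqrt(m):

```python
import math

REGISTERS = 2 ** 14        # assumed: 16384 registers in the dense encoding
BITS_PER_REGISTER = 6      # assumed: each register is a 6-bit counter

# Total register storage, converted from bits to bytes
total_bytes = REGISTERS * BITS_PER_REGISTER // 8
print(total_bytes)          # 12288
print(total_bytes / 1024)   # 12.0

# Approximate standard error of HyperLogLog with m registers
std_error = 1.04 / math.sqrt(REGISTERS)
print(f"{std_error:.2%}")   # 0.81%
```

The size depends only on the number of registers, not on how many elements have been added, which is why the memory cost stays fixed.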

However, because HyperLogLog only computes the cardinality from the input elements and does not store the elements themselves, it cannot return the individual input elements the way a set can.
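
To make these properties concrete, here is a toy HyperLogLog in Python. It is a minimal sketch, not Redis's implementation: the class name TinyHLL is hypothetical, SHA-1 is used as the hash purely for illustration (Redis uses its own hash function), and each register occupies a full byte rather than 6 packed bits. It shows that only a fixed array of small counters is kept, so the original elements cannot be recovered.

```python
import hashlib
import math


class TinyHLL:
    """Toy HyperLogLog for illustration only (hypothetical, not Redis's code)."""

    def __init__(self, p=14):
        self.p = p                          # number of hash bits used as register index
        self.m = 1 << p                     # number of registers (16384 when p = 14)
        self.registers = bytearray(self.m)  # one byte per register, for simplicity

    def add(self, item):
        # Take a 64-bit hash from the first 8 bytes of SHA-1 (an illustrative choice).
        digest = hashlib.sha1(str(item).encode()).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                  # top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank            # keep only the maximum rank seen

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)          # bias-correction constant for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small cardinalities: fall back to linear counting on empty registers.
            return round(m * math.log(m / zeros))
        return round(raw)


hll = TinyHLL()
for name in ["redis", "mongodb", "mysql", "redis"]:  # "redis" is added twice
    hll.add(name)
print(hll.count())          # estimated number of distinct items
print(len(hll.registers))   # 16384, regardless of how many items were added
```

Adding the same element twice leaves the registers unchanged, so duplicates do not affect the estimate. The structure holds only per-register maximum ranks, which is why there is no way to list the elements that were added.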

What is cardinality?

For example, given the data set {1, 3, 5, 7, 5, 7, 8}, the set of its distinct elements is {1, 3, 5, 7, 8}, so its cardinality (the number of non-repeating elements) is 5. Cardinality estimation computes this value quickly, within an acceptable margin of error; for Redis's HyperLogLog the standard error is 0.81%.
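
For comparison, the exact cardinality of the same data set can be computed in Python with a set. This is only an illustration of the definition; unlike HyperLogLog, it stores every distinct element:

```python
data = [1, 3, 5, 7, 5, 7, 8]

distinct = set(data)          # duplicates collapse into a single entry
cardinality = len(distinct)   # number of distinct elements

print(sorted(distinct))       # [1, 3, 5, 7, 8]
print(cardinality)            # 5
```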

Examples

The following example demonstrates how HyperLogLog works:

redis 127.0.0.1:6379> PFADD w3bigkey "redis"

(integer) 1

redis 127.0.0.1:6379> PFADD w3bigkey "mongodb"

(integer) 1

redis 127.0.0.1:6379> PFADD w3bigkey "mysql"

(integer) 1

redis 127.0.0.1:6379> PFCOUNT w3bigkey

(integer) 3