This note starts with me dictating with Claude Sonnet 4.6 doing touch ups and the actual math.


Bloom Filter

We want a Hash Set, but that’s too large. If we can tolerate a little false positive rate, we can have a very memory-efficient approximate membership structure: a Bloom Filter.

Core Idea

A Bloom Filter starts with a common trick: hash a value and perform a mod operation to get a small index. You then have a bit array of size and simply set that bit to 1.

Then you do some math and figure: “Hey, that actually has a pretty high error rate.” A single hash function means a single collision causes a false positive.

So, why not use multiple hash functions? When testing whether a value belongs to the set, ask: for all hash functions, are all bits set to 1? A false positive now requires all positions to be coincidentally occupied — much rarer.

Key insight

You don’t even need separate bit arrays. You can combine them into one big bit array of size . Each hash function maps into the same array. It can be shown mathematically that small arrays of size versus one big array of size are equivalent in false positive rate.

Operations

Insert a key :

Query for key :

Note

  • No false negatives: if was inserted, all its bits are set, so it always returns true.
  • False positives possible: another set of keys may have coincidentally set all positions.
  • No deletion: clearing a bit might unset a bit shared by another key.

False Positive Rate

Setup: bits, hash functions, elements inserted.

Step 1 — Probability a specific bit is still 0 after inserting one element:

Each hash function sets one bit uniformly at random. The probability a given bit is not set by one hash function on one insertion is:

After hash functions and insertions ( total bit-set operations):

Step 2 — Apply the standard limit for large :

So the probability a bit is 1 (occupied by some prior insertion) is:

Step 3 — False positive rate. A false positive occurs when all probed bits happen to be 1:

Optimal Number of Hash Functions

For fixed and , minimize over . Taking gives:

Substituting back, the false positive rate at optimal simplifies to:

Or equivalently, to achieve a target false positive rate , the required bits per element is:

Rule of thumb

For : bits per element, hash functions. For : bits per element, hash functions.