This note started as my dictation, with Claude Sonnet 4.6 doing the touch-ups and the actual math.

Bloom Filter

We want a hash set, but that takes too much memory. If we can tolerate a small false positive rate, we can use a very memory-efficient approximate membership structure: a Bloom Filter.

Core Idea

A Bloom Filter starts with a common trick: hash a value and take the result mod $m$ to get a small index into a bit array of size $m$, then simply set that bit to 1.

Then you do some math and figure: “Hey, that actually has a pretty high error rate.” A single hash function means a single collision causes a false positive.

So, why not use multiple hash functions? When testing whether a value belongs to the set, ask: for all $k$ hash functions, are all $k$ bits set to 1? A false positive now requires all $k$ positions to be coincidentally occupied — much rarer.

Key insight

You don’t even need $k$ separate bit arrays. You can combine them into one big bit array of size $m$, with each hash function mapping into the same array. For large $m$, $k$ small arrays of size $m / k$ and one big array of size $m$ have approximately the same false positive rate.
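A short sketch of why the two layouts behave alike, under the usual assumption that hash outputs are uniform and independent. In the partitioned layout, each hash function owns a slice of $m / k$ bits and sets one bit in it per insertion, so after $n$ insertions a given bit is 1 with probability

$$
1 - \left(1 - \frac{k}{m}\right)^{n} \approx 1 - e^{-kn/m}
$$

In the shared layout, every bit sees all $kn$ bit-set operations, so a given bit is 1 with probability

$$
1 - \left(1 - \frac{1}{m}\right)^{kn} \approx 1 - e^{-kn/m}
$$

Either way a false positive needs $k$ such bits to be 1, which gives $(1 - e^{-kn/m})^{k}$ in the limit. The agreement is only asymptotic: since $(1 - \frac{k}{m})^{n} \le (1 - \frac{1}{m})^{kn}$, the partitioned layout has a slightly higher false positive rate for finite $m$.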

Operations

Insert a key $x$:

$$
\text{for } i = 1, \dots, k: \quad B[h_{i}(x) \bmod m] \leftarrow 1
$$

Query for key $x$:

$$
\text{return } \bigwedge_{i = 1}^{k} \left( B[h_{i}(x) \bmod m] = 1 \right)
$$

Note

No false negatives: if $x$ was inserted, all its bits are set, so it always returns true.

False positives possible: another set of keys may have coincidentally set all $k$ positions.

No deletion: clearing a bit might unset a bit shared by another key.
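The insert and query operations above can be sketched in Python as follows. This is a minimal illustration, not a tuned implementation: the class name `BloomFilter` and the way the $k$ hash functions are simulated (salting SHA-256 with the index $i$) are assumptions made for this example, not something the note prescribes.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch: one shared array of m bits, k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        if m <= 0 or k <= 0:
            raise ValueError("m and k must be positive")
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)  # bit array B, initially all zeros

    def _indexes(self, key: str):
        # Assumption: h_i(x) is simulated by hashing "i:key" with SHA-256.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, key: str) -> None:
        # B[h_i(x) mod m] <- 1 for every i
        for idx in self._indexes(key):
            self.bits[idx // 8] |= 1 << (idx % 8)

    def query(self, key: str) -> bool:
        # True only if all k probed bits are set
        return all(
            self.bits[idx // 8] & (1 << (idx % 8)) for idx in self._indexes(key)
        )


bf = BloomFilter(m=1024, k=7)
for word in ("apple", "banana", "cherry"):
    bf.insert(word)

# Inserted keys always report True (no false negatives).
print(all(bf.query(w) for w in ("apple", "banana", "cherry")))
```

A query for a key that was never inserted usually returns False, but it may return True if other keys happened to set all $k$ of its positions.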

False Positive Rate

Setup: $m$ bits, $k$ hash functions, $n$ elements inserted.

Step 1: Probability a specific bit is still 0 after inserting all $n$ elements.

Each hash function sets one bit uniformly at random. The probability a given bit is not set by one hash function on one insertion is:

$$
1 - \frac{1}{m}
$$

After $k$ hash functions and $n$ insertions ($kn$ total bit-set operations, treated as independent):

$$
P(\text{bit} = 0) = \left(1 - \frac{1}{m}\right)^{kn}
$$

Step 2: Apply the standard limit $(1 - \frac{1}{m})^{m} \approx e^{-1}$ for large $m$:

$$
P(\text{bit} = 0) = \left[\left(1 - \frac{1}{m}\right)^{m}\right]^{kn/m} \approx e^{-kn/m}
$$

So the probability a bit is 1 (occupied by some prior insertion) is:

$$
P(\text{bit} = 1) \approx 1 - e^{-kn/m}
$$

Step 3: False positive rate. A false positive occurs when all $k$ probed bits happen to be 1:

$$
ε = \left(1 - e^{-kn/m}\right)^{k}
$$
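To get a feel for the numbers, here is a small Python sketch that evaluates both the finite-$m$ expression from Step 1 and the exponential approximation from Step 3. The function names are made up for this example, and both formulas rely on the same independence assumption as the derivation.

```python
import math


def fpr_finite_m(m: int, n: int, k: int) -> float:
    """False positive rate using P(bit = 0) = (1 - 1/m)^(kn)."""
    p_one = 1.0 - (1.0 - 1.0 / m) ** (k * n)
    return p_one ** k


def fpr_approx(m: int, n: int, k: int) -> float:
    """False positive rate using the approximation (1 - e^(-kn/m))^k."""
    return (1.0 - math.exp(-k * n / m)) ** k


m, n = 9600, 1000
for k in (1, 7):
    print(k, fpr_finite_m(m, n, k), fpr_approx(m, n, k))
```

With $m = 9600$ and $n = 1000$, a single hash function gives a rate of roughly 0.099, while $k = 7$ brings it down to roughly 0.010, and the two formulas agree closely at this size.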

Optimal Number of Hash Functions

For fixed $m$ and $n$, minimize $ε$ over $k$. Taking $\frac{d ε}{d k} = 0$ gives:

$$
k^{*} = \frac{m}{n} \ln 2
$$
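One way to see where this comes from without grinding through the derivative (a sketch, treating $k$ as a continuous variable): let $p = e^{-kn/m}$ be the probability that a bit is still 0, so that $k = -\frac{m}{n} \ln p$. Then

$$
\ln ε = k \ln(1 - p) = -\frac{m}{n} \ln p \, \ln(1 - p)
$$

The product $\ln p \, \ln(1 - p)$ is symmetric under $p \leftrightarrow 1 - p$ and is largest at $p = \frac{1}{2}$, so $\ln ε$ is smallest there. Setting $e^{-kn/m} = \frac{1}{2}$ gives

$$
k^{*} = \frac{m}{n} \ln 2
$$

In other words, the optimum is reached when about half the bits in the array are set.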

Substituting back, the false positive rate at optimal $k$ simplifies to:

$$
ε^{*} = \left(\frac{1}{2}\right)^{k^{*}} = \left(\frac{1}{2}\right)^{(m / n) \ln 2}
$$

Or equivalently, to achieve a target false positive rate $ε$, the required number of bits per element is:

$$
\frac{m}{n} = -\frac{\log_{2} ε}{\ln 2} \approx -1.44 \log_{2} ε
$$

Rule of thumb

For $ε = 1\%$: $m / n \approx 9.6$ bits per element, $k^{*} \approx 7$ hash functions. For $ε = 0.1\%$: $m / n \approx 14.4$ bits per element, $k^{*} \approx 10$ hash functions.
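These figures can be checked numerically with a few lines of Python. The helper names `bits_per_element` and `optimal_k` are made up for this sketch; they apply the sizing formulas from this section directly, for target rates of 0.01 and 0.001.

```python
import math


def bits_per_element(eps: float) -> float:
    """Required m/n for a target false positive rate eps, at optimal k."""
    return -math.log2(eps) / math.log(2)


def optimal_k(bits_per_elem: float) -> float:
    """Optimal number of hash functions, k* = (m/n) ln 2 (before rounding)."""
    return bits_per_elem * math.log(2)


for eps in (0.01, 0.001):
    b = bits_per_element(eps)
    print(f"eps={eps}: m/n = {b:.2f}, k* = {optimal_k(b):.2f}")
```

Since $k$ has to be an integer in practice, $k^{*}$ is rounded to the nearest whole number, which is where 7 and 10 come from.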

Yanda's Random Notes
