A model-free, unbiased prediction method.

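Recall that the value function of a policy $\pi$ is, by definition, the expected return $G_t$ obtained from a state $s$ when following $\pi$:

$$v_\pi(s) = \mathbb{E}_\pi\left[G_t \mid S_t = s\right]$$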

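We can simply replace that expectation with a sample average: run episodes under the policy, record the return observed from each state, and average those returns to estimate the state's value.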
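The sample-average idea can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `mc_value_estimate`, the episode format (a list of `(state, reward)` pairs), the discount factor `gamma`, and the choice to count only the first visit to a state in each episode are all assumptions made for the example.

```python
from collections import defaultdict


def mc_value_estimate(episodes, gamma=1.0):
    """Estimate v_pi(s) by averaging sampled returns (first-visit variant).

    episodes: list of episodes sampled under the policy. Each episode is a
        list of (state, reward) pairs, where reward is the reward received
        after leaving that state. This format is hypothetical.
    gamma: discount factor (assumed; 1.0 means undiscounted).
    """
    returns_sum = defaultdict(float)
    visit_count = defaultdict(int)

    for episode in episodes:
        # Scan backwards so each return is G_t = r_t + gamma * G_{t+1}.
        g = 0.0
        returns = []
        for state, reward in reversed(episode):
            g = reward + gamma * g
            returns.append((state, g))
        returns.reverse()

        # Count only the first visit to each state within this episode.
        seen = set()
        for state, g in returns:
            if state in seen:
                continue
            seen.add(state)
            returns_sum[state] += g
            visit_count[state] += 1

    # The sample average of returns replaces the expectation.
    return {s: returns_sum[s] / visit_count[s] for s in returns_sum}


# Two tiny hand-made episodes, purely for illustration.
episodes = [
    [("A", 1.0), ("B", 0.0)],
    [("A", 0.0), ("B", 2.0)],
]
values = mc_value_estimate(episodes)
print(values)
```

With these two episodes, the returns sampled from `A` are 1 and 2, and those from `B` are 0 and 2, so the estimates are their averages. No transition or reward model is used anywhere, only sampled experience.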