A model-free, unbiased prediction method.
Recall that the value function of a given policy $\pi$ is, by definition, the expected return from a state: $v_\pi(s) = \mathbb{E}_\pi[G_t \mid S_t = s]$.
We can simply replace that expectation with a sample average of the returns observed under the policy to obtain a value estimate.
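
As a rough illustration of the sample-average idea, here is a minimal first-visit sketch in Python. The function name `first_visit_mc`, the episode format (a list of `(state, reward)` pairs, where `reward` is what is received after leaving that state), and the discount `gamma` are assumptions made for this example, not part of the notes above.

```python
from collections import defaultdict


def first_visit_mc(episodes, gamma=1.0):
    """Estimate v(s) by averaging sampled returns (first-visit variant).

    episodes: list of episodes, each a list of (state, reward) pairs
              (hypothetical format chosen for this sketch).
    gamma:    discount factor (assumed; 1.0 means undiscounted).
    """
    return_sums = defaultdict(float)
    visit_counts = defaultdict(int)

    for episode in episodes:
        # Compute the return G_t for every time step, working backwards:
        # G_t = r_t + gamma * G_{t+1}.
        returns = [0.0] * len(episode)
        g = 0.0
        for t in range(len(episode) - 1, -1, -1):
            _, reward = episode[t]
            g = reward + gamma * g
            returns[t] = g

        # Record only the return from the first visit to each state.
        seen = set()
        for t, (state, _) in enumerate(episode):
            if state not in seen:
                seen.add(state)
                return_sums[state] += returns[t]
                visit_counts[state] += 1

    # The sample average of returns replaces the expectation.
    return {s: return_sums[s] / visit_counts[s] for s in return_sums}


if __name__ == "__main__":
    sample_episodes = [
        [("A", 1.0), ("B", 2.0)],
        [("A", 3.0)],
    ]
    print(first_visit_mc(sample_episodes))
```

No transition or reward model is used here: the estimate relies only on episodes sampled by following the policy, which is what makes the method model-free.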