Review of Storey's procedure for FDR control.
The False Discovery Rate (FDR) is defined as the expected proportion of false positives (incorrectly rejected null hypotheses) among all rejected hypotheses.
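\[\text{FDR} = \mathbb{E}\left[ \frac{V}{R} \right] = \mathbb{E}\left[ \frac{V}{V+S} \right]\]
where $V$ is the number of false rejections (true nulls that are rejected), $S$ is the number of true rejections, and $R = V + S$ is the total number of rejections. The ratio is taken to be $0$ when $R = 0$.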
Controlling the FDR at level $\alpha$ (e.g., 0.05) ensures that the expected proportion of false discoveries among the rejections is at most $\alpha$ ($5\%$ in this example).
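To make the definition concrete, here is a minimal sketch that computes the false discovery proportion $V/R$ for a single experiment; the FDR is the expectation of this quantity over repeated experiments. The function name `false_discovery_proportion` and the toy labels are illustrative assumptions, not part of the original notes. In practice the null labels are unknown, so this is only usable in simulations.

```python
import numpy as np

def false_discovery_proportion(rejected, is_null):
    """Return V / R for one experiment, using the convention 0 / 0 = 0."""
    rejected = np.asarray(rejected, dtype=bool)
    is_null = np.asarray(is_null, dtype=bool)
    R = int(rejected.sum())                # total number of rejections
    V = int((rejected & is_null).sum())    # rejected hypotheses that are true nulls
    return V / R if R > 0 else 0.0

# Toy example (made-up labels): five hypotheses, the first three are true nulls.
is_null = [True, True, True, False, False]
rejected = [True, False, False, True, True]
fdp = false_discovery_proportion(rejected, is_null)  # one false rejection out of three
print(fdp)
```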
A key limitation of the original Benjamini-Hochberg (BH) procedure is its conservatism. The BH procedure guarantees:
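\[\text{FDR} \le \pi_0 \alpha\]
where $\pi_0$ is the proportion of true null hypotheses among all hypotheses tested.
Since $\pi_0 \le 1$, the achieved FDR falls strictly below the target $\alpha$ whenever some hypotheses are non-null, so the procedure is less powerful than the nominal level allows. Storey’s procedure improves upon this by estimating $\pi_0$ from the data, which can gain power while maintaining FDR control.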
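Recall the empirical-process viewpoint of BH: with $\hat{F}(t)$ the empirical CDF of the p-values, BH rejects every p-value at most $t^*$, the largest threshold $t$ satisfying
\[\frac{\hat{\pi}_0 \, t}{\hat{F}(t)} \le \alpha\]
with the conservative choice $\hat{\pi}_0 = 1$. For independent p-values, we were able to characterize the FDR of the BH procedure exactly as
\[\text{FDR}_{\text{BH}} = \pi_0 \alpha\]
What if we could estimate $\pi_0$?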
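As a reference point for that question, the BH rule in its threshold form can be sketched as below. This is a minimal illustration, assuming the candidate thresholds are the sorted p-values $p_{(k)}$, at which the empirical CDF equals $k/n$; the function name `bh_reject` and the example p-values are made up for illustration.

```python
import numpy as np

def bh_reject(pvals, alpha):
    """Boolean rejection mask for BH, written as a threshold rule."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    sorted_p = p[order]
    k = np.arange(1, n + 1)
    # At t = p_(k) the empirical CDF is k / n, so t / F_hat(t) <= alpha
    # is the same as p_(k) <= alpha * k / n.
    ok = sorted_p <= alpha * k / n
    reject = np.zeros(n, dtype=bool)
    if ok.any():
        k_star = np.nonzero(ok)[0].max()    # largest sorted index that passes
        reject[order[: k_star + 1]] = True  # reject everything up to that threshold
    return reject

# Made-up p-values for illustration.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bh_mask = bh_reject(pvals, alpha=0.05)
print(int(bh_mask.sum()), "rejections")
```

Only the two smallest p-values pass here, since $p_{(3)} = 0.039$ already exceeds $0.05 \cdot 3 / 10$ and no larger index recovers.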
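Let $\pi_0 = n_0 / n$ be the fraction of nulls, where $n_0$ of the $n$ hypotheses are true nulls. Pick $\lambda \in (0, 1)$ and compute
\[\hat{\pi}_0 = \frac{\sum_{i=1}^n \mathbb{I}(p_i > \lambda)}{n(1-\lambda)}\]
Why is this estimate sensible? Null p-values are uniform on $[0, 1]$, so about $n_0 (1 - \lambda)$ of them exceed $\lambda$, while p-values from alternatives concentrate near zero and rarely do. The numerator is therefore roughly $n_0 (1 - \lambda)$, and dividing by $n (1 - \lambda)$ gives roughly $\pi_0$. Any alternatives that land above $\lambda$ only push the estimate up, which errs on the conservative side.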
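A small simulation can illustrate how the estimator behaves. The sketch below implements the formula above directly; the function name `storey_pi0`, the sample sizes, and the Beta distribution used for the alternative p-values are assumptions chosen for illustration, not something specified in the notes.

```python
import numpy as np

def storey_pi0(pvals, lam=0.5):
    """Estimate pi_0 as #{p_i > lam} / (n * (1 - lam))."""
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    p = np.asarray(pvals, dtype=float)
    # The estimate is not capped, so it can exceed 1 in small samples.
    return np.sum(p > lam) / (p.size * (1.0 - lam))

rng = np.random.default_rng(0)
n, n0 = 10_000, 8_000                      # true pi_0 = 0.8 (assumed for the demo)
null_p = rng.uniform(size=n0)              # null p-values are uniform on [0, 1]
alt_p = rng.beta(0.1, 5.0, size=n - n0)    # hypothetical alternatives, mass near 0
pi0_hat = storey_pi0(np.concatenate([null_p, alt_p]), lam=0.5)
print(pi0_hat)                             # should land close to 0.8
```

Larger $\lambda$ reduces the upward bias from alternatives but uses fewer p-values, so the estimate becomes noisier.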
We only consider thresholds $t \le \lambda$, because the estimate $\hat{\pi}_0$ implicitly treats every p-value above $\lambda$ as null, and rejecting such a p-value would contradict that assumption.
If the p-values are independent, then for any $\alpha \in (0, 1)$ Storey’s procedure controls the FDR at level $\alpha$:
\[\text{FDR} \le \alpha\]
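Putting the pieces together, the procedure can be sketched as follows: estimate $\pi_0$, then reject every p-value at most the largest threshold $t$ with $\hat{\pi}_0 t / \hat{F}(t) \le \alpha$. This is a minimal sketch rather than a reference implementation. It assumes candidate thresholds are the sorted p-values and restricts them to $t \le \lambda$, in line with treating p-values above $\lambda$ as null. The function name `storey_reject` and the `add_one` flag are our own additions; the flag adds 1 to the numerator of $\hat{\pi}_0$, a variant commonly used when stating finite-sample guarantees, and it is off by default so that the code matches the formula in these notes.

```python
import numpy as np

def storey_reject(pvals, alpha, lam=0.5, add_one=False):
    """Return (rejection mask, pi0_hat) for a Storey-style adaptive rule."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    # pi_0 estimate from the notes; add_one=True is an optional variant (assumption).
    count = np.sum(p > lam) + (1 if add_one else 0)
    pi0_hat = count / (n * (1.0 - lam))
    sorted_p = np.sort(p)
    k = np.arange(1, n + 1)
    # At t = p_(k) the empirical CDF is k / n, so the condition
    # pi0_hat * t / F_hat(t) <= alpha becomes pi0_hat * p_(k) * n / k <= alpha.
    ok = (pi0_hat * sorted_p * n / k <= alpha) & (sorted_p <= lam)
    if not ok.any():
        return np.zeros(n, dtype=bool), pi0_hat
    t_star = sorted_p[np.nonzero(ok)[0].max()]  # largest admissible threshold
    return p <= t_star, pi0_hat

# Made-up p-values for illustration.
pvals = [0.001, 0.004, 0.012, 0.02, 0.03, 0.04, 0.7, 0.9]
storey_mask, pi0_est = storey_reject(pvals, alpha=0.05, lam=0.5)
print(pi0_est, int(storey_mask.sum()))
```

On this toy input $\hat{\pi}_0 = 0.5$, so the effective level is doubled relative to BH: the six smallest p-values are rejected, whereas the BH condition $p_{(k)} \le 0.05\, k / 8$ fails at $k = 6$ and yields five rejections.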