FDR control: Storey's procedure

Review of Storey's procedure for FDR control.

False Discovery Rate

The False Discovery Rate (FDR) is defined as the expected proportion of false positives (incorrectly rejected null hypotheses) among all rejected hypotheses.

\[\text{FDR} = \mathbb{E}\left[ \frac{V}{R} \right] = \mathbb{E}\left[ \frac{V}{V+S} \right]\]

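where $V$ is the number of false discoveries (true null hypotheses that are rejected), $S$ is the number of true discoveries (false null hypotheses that are rejected), and $R = V + S$ is the total number of rejections. The ratio $V/R$ is taken to be $0$ when $R = 0$.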

Controlling the FDR at level $\alpha$ (e.g., $\alpha = 0.05$) ensures that the expected proportion of false discoveries among the rejections is at most $\alpha$ (here, $5\%$).

Improving BH

A key limitation of the original Benjamini-Hochberg (BH) procedure is its conservatism. The BH procedure guarantees:

\[\text{FDR} \le \pi_0 \alpha\]

where $\pi_0$ is the true proportion of null hypotheses.

Since $\pi_0 \le 1$, BH controls the FDR at level $\pi_0 \alpha$ rather than the nominal $\alpha$, so whenever $\pi_0 < 1$ it is less powerful than it could be. Storey’s procedure improves on this by estimating $\pi_0$ from the data, which can gain power while maintaining FDR control.

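Recall the empirical-process viewpoint of BH. Writing $\hat{F}(t) = \frac{1}{n} \sum_{i=1}^n \mathbb{I}(p_i \le t)$ for the empirical CDF of the p-values, BH rejects every p-value below the largest threshold $t$ satisfying

\[\frac{\hat{\pi}_0 t}{\hat{F}(t)} \le \alpha\]

with the conservative choice $\hat{\pi}_0 = 1$.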
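As a minimal sketch of this viewpoint, the BH cutoff can be found by scanning the sorted p-values, because $\hat{F}$ only changes at observed p-values. The function name `bh_threshold` and the example p-values are made up for illustration. The sketch plugs in $1$ for the null proportion, as BH does, and returns the largest observed p-value that meets the condition, which gives the same rejection set as the supremum.

```python
def bh_threshold(pvals, alpha):
    """Largest observed p-value t with t / F_hat(t) <= alpha, or None if none.

    F_hat(t) = #{p_i <= t} / n is the empirical CDF of the p-values.
    """
    n = len(pvals)
    cutoff = None
    for k, p in enumerate(sorted(pvals), start=1):
        # At t = p_(k) we have F_hat(t) = k / n (ignoring ties), so the
        # condition t / F_hat(t) <= alpha becomes p_(k) * n / k <= alpha.
        if p * n / k <= alpha:
            cutoff = p
    return cutoff


# Illustrative (made-up) p-values.
pvals = [0.001, 0.002, 0.004, 0.009, 0.012, 0.02, 0.03, 0.041, 0.18, 0.27,
         0.35, 0.44, 0.52, 0.58, 0.66, 0.71, 0.79, 0.85, 0.93, 0.97]
t_bh = bh_threshold(pvals, alpha=0.1)
bh_rejected = [p for p in pvals if t_bh is not None and p <= t_bh]
```

On these p-values with $\alpha = 0.1$, the sketch stops at the seventh smallest p-value, so BH makes seven rejections.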

We were able to characterize the FDR of the BH procedure exactly (for independent p-values) as

\[\text{FDR}_{\text{BH}} = \pi_0 \alpha\]

What if we could estimate $\pi_0$?

An estimate of $\pi_0$

Let $\pi_0 = n_0 / n$ be the fraction of nulls. Pick $\lambda \in (0, 1)$ and compute

\[\hat{\pi}_0 = \frac{\sum_{i=1}^n \mathbb{I}(p_i > \lambda)}{n(1-\lambda)}\]
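The estimator is a one-liner in code. The following is a minimal sketch; the function name `estimate_pi0` and the example p-values are hypothetical and chosen only for illustration.

```python
def estimate_pi0(pvals, lam=0.5):
    """Estimate the null proportion as #{p_i > lam} / (n * (1 - lam))."""
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    n = len(pvals)
    n_above = sum(1 for p in pvals if p > lam)
    return n_above / (n * (1.0 - lam))


# Illustrative (made-up) p-values: 8 of the 20 exceed lam = 0.5.
pvals = [0.001, 0.002, 0.004, 0.009, 0.012, 0.02, 0.03, 0.041, 0.18, 0.27,
         0.35, 0.44, 0.52, 0.58, 0.66, 0.71, 0.79, 0.85, 0.93, 0.97]
pi0_hat = estimate_pi0(pvals, lam=0.5)
```

Here `pi0_hat` is $8 / (20 \cdot 0.5) = 0.8$. Note that nothing forces the estimate to stay below $1$: if most p-values exceed $\lambda$, it can come out larger than $1$.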

Why is this estimate sensible?
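One way to see it, sketched under the standard assumption that null p-values are uniform on $[0, 1]$ (an assumption not spelled out above): each null p-value exceeds $\lambda$ with probability $1 - \lambda$, while each non-null p-value exceeds $\lambda$ with some probability that is at least $0$. Writing $\mathcal{H}_1$ for the set of non-null indices (notation introduced here for illustration),

\[\mathbb{E}\left[ \sum_{i=1}^n \mathbb{I}(p_i > \lambda) \right] = n_0 (1 - \lambda) + \sum_{i \in \mathcal{H}_1} \mathbb{P}(p_i > \lambda) \ge n_0 (1 - \lambda)\]

Dividing by $n(1-\lambda)$ gives

\[\mathbb{E}[\hat{\pi}_0] \ge \frac{n_0}{n} = \pi_0\]

So $\hat{\pi}_0$ is conservatively biased: it tends to overestimate $\pi_0$, and the bias is small when non-null p-values rarely exceed $\lambda$.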

Storey’s procedure

  1. Pick $\lambda \in [0, 1)$ (typically $1/2$).
  2. Estimate the null proportion:
\[\hat{\pi}_0(\lambda) = \frac{1 + \sum_{i=1}^n \mathbb{I}(p_i > \lambda)}{n(1-\lambda)}\]
  3. As in BH, construct the cutoff, restricting $t$ to $[0, \lambda]$:
\[\text{storey}_{\lambda} = \sup \left\{ t \in [0, \lambda] : \frac{\hat{\pi}_0(\lambda) t}{\hat{F}(t)} \le \alpha \right\}\]

Hypotheses with $p_i \le \text{storey}_{\lambda}$ are rejected. We only consider $t \le \lambda$ because the estimate $\hat{\pi}_0$ implicitly treats p-values above $\lambda$ as null.
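The three steps can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation: the function name `storey_procedure` and the example p-values are made up, and the code assumes that candidate cutoffs are restricted to $t \le \lambda$, so that only p-values at or below $\lambda$ can be rejected. As with BH, it is enough to check the condition at the observed p-values.

```python
def storey_procedure(pvals, alpha=0.05, lam=0.5):
    """Return (pi0_hat, cutoff, rejected_indices) for Storey's procedure.

    cutoff is the largest observed p-value t <= lam with
    pi0_hat * t / F_hat(t) <= alpha, or None if no p-value qualifies.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    n = len(pvals)

    # Step 2: estimate the null proportion (note the +1 in the numerator).
    pi0_hat = (1 + sum(1 for p in pvals if p > lam)) / (n * (1.0 - lam))

    # Step 3: scan sorted p-values; at t = p_(k), F_hat(t) = k / n.
    cutoff = None
    for k, p in enumerate(sorted(pvals), start=1):
        if p > lam:  # assumption of this sketch: only thresholds t <= lam
            break
        if pi0_hat * p * n / k <= alpha:
            cutoff = p

    if cutoff is None:
        return pi0_hat, None, []
    rejected = [i for i, p in enumerate(pvals) if p <= cutoff]
    return pi0_hat, cutoff, rejected


# Illustrative (made-up) p-values.
pvals = [0.001, 0.002, 0.004, 0.009, 0.012, 0.02, 0.03, 0.041, 0.18, 0.27,
         0.35, 0.44, 0.52, 0.58, 0.66, 0.71, 0.79, 0.85, 0.93, 0.97]
pi0_hat, cutoff, rejected = storey_procedure(pvals, alpha=0.1, lam=0.5)
```

On these p-values with $\alpha = 0.1$ and $\lambda = 1/2$, the sketch gives $\hat{\pi}_0(\lambda) = 9/10$ and rejects the eight smallest p-values. The same scan with $\hat{\pi}_0$ replaced by $1$ (that is, BH) would reject only seven.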

Theorem (Storey 2004)

If the p-values are independent, then for any $\alpha \in (0, 1)$ Storey’s procedure controls the FDR at level $\alpha$:

\[\text{FDR} \le \alpha\]
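A small Monte Carlo experiment can illustrate, though of course not prove, this guarantee. The sketch below rests on several assumptions that are not part of the notes: null p-values are uniform, non-null p-values are drawn as $U^6$ with $U$ uniform (an arbitrary way to concentrate them near $0$), $\pi_0 = 0.5$, and Storey's cutoff is restricted to $t \le \lambda$. All function names are hypothetical.

```python
import random


def step_up_cutoff(pvals, alpha, pi0_hat=1.0, t_max=1.0):
    """Largest observed p-value t <= t_max with pi0_hat * t / F_hat(t) <= alpha."""
    n = len(pvals)
    cutoff = None
    for k, p in enumerate(sorted(pvals), start=1):
        if p > t_max:
            break
        if pi0_hat * p * n / k <= alpha:
            cutoff = p
    return cutoff


def storey_pi0(pvals, lam):
    """Storey's estimate of the null proportion, with the +1 correction."""
    return (1 + sum(1 for p in pvals if p > lam)) / (len(pvals) * (1.0 - lam))


def simulate(n=200, pi0=0.5, alpha=0.1, lam=0.5, reps=500, seed=0):
    """Return {method: (average FDP, average number of rejections)}."""
    rng = random.Random(seed)
    n0 = int(pi0 * n)
    totals = {"bh": [0.0, 0], "storey": [0.0, 0]}
    for _ in range(reps):
        nulls = [rng.random() for _ in range(n0)]          # uniform nulls
        alts = [rng.random() ** 6 for _ in range(n - n0)]  # assumed alternatives
        pvals = nulls + alts
        cutoffs = {
            "bh": step_up_cutoff(pvals, alpha),
            "storey": step_up_cutoff(pvals, alpha, storey_pi0(pvals, lam), t_max=lam),
        }
        for name, t in cutoffs.items():
            if t is None:
                continue  # no rejections: FDP is 0 by convention
            v = sum(1 for p in nulls if p <= t)      # false discoveries
            r = v + sum(1 for p in alts if p <= t)   # all discoveries (>= 1 here)
            totals[name][0] += v / r
            totals[name][1] += r
    return {name: (fdp / reps, r / reps) for name, (fdp, r) in totals.items()}


results = simulate()
fdr_bh, rejections_bh = results["bh"]
fdr_storey, rejections_storey = results["storey"]
```

Under these assumptions one would expect the average false discovery proportion of BH to sit near $\pi_0 \alpha = 0.05$, and that of Storey's procedure to be larger but still at most $\alpha = 0.1$ up to simulation noise, with Storey's procedure making at least as many rejections on average.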