Paper review: Improving randomized controlled trial analysis via data-adaptive borrowing

A deep dive into how machine learning and adaptive lasso can enhance RCTs by selectively borrowing information from external controls.

Core Assumptions

Assumption 2 (External control compatibility). Suppose that (i) $E\{Y(0) \mid X = x, R = 0\} = E\{Y(0) \mid X = x, R = 1\}$ and…

The Detection Mechanism

The adaptive lasso penalty detects incomparable external controls by recasting the challenge of identifying compatible subjects as a model selection problem based on individual bias. Here is how the detection mechanism works:

By using this penalty, the framework achieves “selection consistency” (Lemma 1). This means that as long as the initial estimator is high quality and the tuning parameters are chosen properly, the adaptive lasso will consistently and reliably pinpoint the zero-bias subjects, naturally filtering out incomparable external controls that could otherwise skew the trial’s results.
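Stated a little more formally, selection consistency in the usual adaptive lasso sense can be sketched as follows. The set notation $\mathcal{A}$ is introduced here for illustration and is not taken from the review; the exact statement and conditions of Lemma 1 should be checked against the paper. Let $b_{i,0}$ be the true bias of external control subject $i$, let $\tilde{b}_{i}$ be its penalized estimate, and let $\mathcal{A} = \{i \in E : b_{i,0} = 0\}$ be the set of comparable external controls. Then, roughly,
\[P\left( \{ i \in E : \tilde{b}_{i} = 0 \} = \mathcal{A} \right) \to 1 \quad \text{as } N \to \infty,\]
that is, with probability tending to one the subjects whose estimated bias is exactly zero are precisely the comparable ones.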

Practical Implementation

To learn the bias parameter for each external control subject, the framework first defines the target quantity and then estimates it in two stages, machine learning predictions followed by a penalized optimization:

  1. Defining the True Bias: The true subject-level bias, $b_{i,0}$, is defined mathematically as the difference between the expected conditional outcome for an external control subject ($\mu_{0,E}(X_{i})$) and a concurrent trial control subject ($\mu_{0}(X_{i})$), which is expressed as $b_{i,0} = \mu_{0,E}(X_{i}) - \mu_{0}(X_{i})$.
  2. Calculating an Initial Estimate: An initial, unpenalized estimator, $\hat{b}_{i}$, is constructed by calculating the difference between the estimated outcome means for both groups: $\hat{b}_{i} = \hat{\mu}_{0,E}(X_{i}) - \hat{\mu}_{0}(X_{i})$. In practice, these conditional outcome means ($\hat{\mu}_{0,E}$ and $\hat{\mu}_{0}$) are estimated using off-the-shelf machine learning algorithms that possess guaranteed convergence rates.
  3. Applying the Adaptive Lasso Penalty: Finally, a refined bias estimator, $\tilde{b}$, is computed by solving a penalized least-squares optimization problem. The framework finds the vector of biases $b$ that minimizes the following objective:
\[\tilde{b} = \text{argmin}_{b} \{ (\hat{b}-b)^{T} \hat{\Sigma}_{b}^{-1} (\hat{b}-b) + \lambda_{N} \sum_{i \in E} p(|b_{i}|) \}\]
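
The steps above can be sketched in a few lines of Python. This is a minimal illustration under two assumptions that are not confirmed by the review: that $\hat{\Sigma}_{b}$ is diagonal, so the problem separates by subject, and that the penalty has the standard adaptive lasso form $p(|b_{i}|) = |b_{i}| / |\hat{b}_{i}|^{\nu}$. Under those assumptions the minimizer is a soft-thresholding rule. The function names, the exponent `nu`, and all input values are hypothetical; in a real analysis the fitted means would come from a machine learning model.

```python
import math

def initial_bias(mu_hat_ext, mu_hat_trial):
    """Step 2: b_hat_i = mu_hat_{0,E}(X_i) - mu_hat_0(X_i)."""
    return [m_e - m_t for m_e, m_t in zip(mu_hat_ext, mu_hat_trial)]

def adaptive_lasso_bias(b_hat, var_hat, lam, nu=1.0):
    """Step 3, assuming a diagonal Sigma_b with entries var_hat (assumption).

    Assumed penalty: p(|b_i|) = |b_i| / |b_hat_i|**nu. Each coordinate then
    solves a scalar lasso problem whose solution is soft-thresholding at
    lam * var_i / (2 * |b_hat_i|**nu).
    """
    b_tilde = []
    for b, v in zip(b_hat, var_hat):
        if b == 0.0:
            # The adaptive weight is infinite, so the estimate stays at zero.
            b_tilde.append(0.0)
            continue
        threshold = lam * v / (2.0 * abs(b) ** nu)
        shrunk = max(abs(b) - threshold, 0.0)
        b_tilde.append(math.copysign(shrunk, b) if shrunk > 0.0 else 0.0)
    return b_tilde

# Hypothetical fitted outcome means for four external control subjects.
mu_hat_ext = [1.05, 0.90, 3.00, -0.50]    # stands in for mu_hat_{0,E}(X_i)
mu_hat_trial = [1.00, 1.00, 1.00, 1.00]   # stands in for mu_hat_0(X_i)
var_hat = [0.04, 0.04, 0.04, 0.04]        # assumed diagonal of Sigma_b

b_hat = initial_bias(mu_hat_ext, mu_hat_trial)
b_tilde = adaptive_lasso_bias(b_hat, var_hat, lam=1.0, nu=1.0)

# Subjects whose refined bias is exactly zero are treated as comparable.
selected = [i for i, b in enumerate(b_tilde) if b == 0.0]
print(b_tilde)
print(selected)
```

With these made-up inputs, the first two subjects have small initial estimates and are shrunk to exactly zero, while the last two keep a nonzero bias and would be excluded from borrowing.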

Breaking down the components:

Because the initial estimate $\hat{b}_{i}$ acts as the denominator in the penalty term $p(|b_{i}|)$, subjects who are truly comparable (and thus have an initial bias estimate close to zero) receive an exceedingly large penalty. This shrinks their refined bias estimate ($\tilde{b}_{i}$) to exactly zero, allowing the framework to pinpoint them and select them for borrowing.
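
To see why a near-zero initial estimate ends up at exactly zero, consider a simplified worked case. It assumes, for illustration only, that $\hat{\Sigma}_{b}$ is diagonal with entries $\hat{\sigma}_{i}^{2}$ and that the penalty has the standard adaptive lasso form $p(|b_{i}|) = |b_{i}| / |\hat{b}_{i}|^{\nu}$ with $\nu > 0$; the symbols $\hat{\sigma}_{i}^{2}$ and $\nu$ are notation introduced for this sketch. The objective then separates by subject, and each coordinate has a closed-form soft-thresholding solution:
\[\tilde{b}_{i} = \text{argmin}_{b_{i}} \left\{ \frac{(\hat{b}_{i} - b_{i})^{2}}{\hat{\sigma}_{i}^{2}} + \lambda_{N} \frac{|b_{i}|}{|\hat{b}_{i}|^{\nu}} \right\} = \text{sign}(\hat{b}_{i}) \max\left( |\hat{b}_{i}| - \frac{\lambda_{N} \hat{\sigma}_{i}^{2}}{2 |\hat{b}_{i}|^{\nu}},\; 0 \right)\]
So $\tilde{b}_{i} = 0$ exactly when $|\hat{b}_{i}|^{1+\nu} \le \lambda_{N} \hat{\sigma}_{i}^{2} / 2$. As $\hat{b}_{i}$ approaches zero the threshold grows without bound, whereas for a subject with a large initial estimate the threshold becomes negligible and $\tilde{b}_{i}$ stays close to $\hat{b}_{i}$.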

central limit diffusion

fluid market size lambda t

previous setting: lambda and n together go to infinity; static price approach (dynamic pricing is not needed in the seminal dynamic pricing paper); uses CLT; iterative reoptimization heuristic; the core logic is that even though we use CLT, we don't actually need a large number

this talk focuses on the large market regime; core: same CLT small-number argument

the core message: what matters is the ratio, not which one is fixed

plot: the optimal policy in the large market regime is not the static policy (high price w