Variance Reduction for Heavy-Tailed Monetization Metrics in Ranking Experiments via Post-Stratification
Title: Enhancing Ranking Experiment Reliability for Heavy-Tailed Monetization Metrics Using Post-Stratification
Abstract:
Evaluating ranking and retrieval systems in online environments frequently depends on downstream monetization indicators, such as application revenue or creator income. These metrics often exhibit heavy-tailed distributions, in which a minority of users account for most of both the mean and the variance. This results in low statistical power and unreliable conclusions in A/B testing, particularly when traffic volume is constrained. To address this, we introduce a practical framework for reducing variance in online experiments by integrating CUPED with post-stratification. By using pre-experiment covariates, our method increases the sensitivity of monetization assessments without requiring additional traffic. Implemented at ShareChat for ranking-driven monetization studies, this technique significantly lowers variance and stabilizes decision-making, delivering the same level of statistical confidence with approximately 45% less traffic than conventional metrics. Additionally, we outline practical design considerations, safety measures, and constraints, and offer recommendations on when post-stratification is suitable for real-world information retrieval and recommendation systems.
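The abstract gives no implementation details, so the following is only a minimal sketch of how a CUPED adjustment and a post-stratified estimator might be combined. It is not the ShareChat implementation. The function names, the simulated lognormal pre-period spend, the stratum cut points, and the choice to stratify on the same pre-experiment covariate used for CUPED are all assumptions made for illustration.

```python
import random
import statistics
from collections import defaultdict


def cuped_adjust(y, x):
    """CUPED: y_adj = y - theta * (x - mean(x)), with theta = cov(x, y) / var(x).

    x and y are assumed to be pooled over both arms so that one theta is shared.
    """
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:  # a constant covariate carries no information
        return list(y)
    theta = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return [yi - theta * (xi - mx) for xi, yi in zip(x, y)]


def post_stratified_mean(values, strata, weights):
    """Weighted sum of per-stratum means, using fixed stratum weights."""
    buckets = defaultdict(list)
    for v, s in zip(values, strata):
        buckets[s].append(v)
    missing = [s for s in weights if s not in buckets]
    if missing:
        raise ValueError(f"no observations in strata: {missing}")
    return sum(w * statistics.fmean(buckets[s]) for s, w in weights.items())


def simulate(n=2000, effect=0.5, seed=0):
    """Hypothetical heavy-tailed data: lognormal pre-period spend, additive effect."""
    rng = random.Random(seed)
    pre = [rng.lognormvariate(0.0, 1.5) for _ in range(n)]
    arm = [rng.random() < 0.5 for _ in range(n)]  # True = treatment
    post = [0.8 * p + rng.gauss(0.0, 1.0) + (effect if a else 0.0)
            for p, a in zip(pre, arm)]
    return pre, arm, post


def estimate_effect(pre, arm, post, cuts=(0.5, 2.0)):
    """Treatment minus control, on CUPED-adjusted values with post-stratification."""
    adj = cuped_adjust(post, pre)
    # Stratum index = number of cut points the pre-period value exceeds (0, 1, or 2).
    strata = [sum(p > c for c in cuts) for p in pre]
    n = len(pre)
    # Stratum weights come from the pooled sample, not from either arm alone.
    weights = {s: strata.count(s) / n for s in set(strata)}

    def arm_mean(flag):
        idx = [i for i in range(n) if arm[i] == flag]
        return post_stratified_mean(
            [adj[i] for i in idx], [strata[i] for i in idx], weights
        )

    return arm_mean(True) - arm_mean(False)


if __name__ == "__main__":
    pre, arm, post = simulate()
    adj = cuped_adjust(post, pre)
    print("raw variance:     ", statistics.pvariance(post))
    print("adjusted variance:", statistics.pvariance(adj))
    print("estimated effect: ", estimate_effect(pre, arm, post))
```

In this sketch the stratum weights are taken from the pooled sample, so both arms are reweighted to the same population mix. The CUPED coefficient is likewise estimated on pooled data, which leaves the overall mean unchanged while removing the variance explained by the pre-period covariate.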
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC





