Abstract
Randomization is a foundational assumption in A/B testing. In practice, however, randomized experiments can still produce biased estimates under realistic data-collection conditions. We use simulation to demonstrate how bias can emerge despite correct random assignment, and we show that visualization is an effective diagnostic tool for detecting these issues before results are interpreted causally.
Introduction
A/B testing is widely used to estimate the causal impact of product changes. Users are randomly assigned to control (C) or treatment (T), and differences in outcomes are attributed to the treatment. When assignment occurs at the user level, randomization is intended to balance user characteristics across groups. However, even with correct random assignment, the observed segment mix can differ between variants because real experiments are often analyzed on a filtered or triggered subset of users. Eligibility rules, exposure conditions, logging behavior, and data availability can vary by variant due to trigger logic, instrumentation loss, device or browser differences, and latency. As a result, treatment and control may represent different effective populations, as illustrated in the sketch below.
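To make the mechanism concrete, the following minimal sketch simulates a user-level randomization with zero true treatment effect and a hypothetical trigger rule whose exposure rate depends on both the variant and a latent engagement covariate. The variable names, trigger probabilities, and effect sizes are illustrative assumptions, not values from any real experiment.

import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Latent user engagement drives the outcome; the true treatment effect is zero.
engagement = rng.normal(0, 1, n)
variant = rng.integers(0, 2, n)                          # 0 = control, 1 = treatment
outcome = 5.0 + 2.0 * engagement + rng.normal(0, 1, n)   # no causal effect of variant

# Hypothetical trigger rule: treatment exposes the feature to more
# low-engagement users, so the *analyzed* subsets differ in composition.
p_trigger = np.where(
    variant == 1,
    0.9 - 0.3 * (engagement < 0),   # treatment: 0.9 if high engagement, 0.6 if low
    0.9 - 0.6 * (engagement < 0),   # control:   0.9 if high engagement, 0.3 if low
)
triggered = rng.random(n) < p_trigger

# Naive comparison on the triggered subset is biased even though assignment was random.
t_mean = outcome[(variant == 1) & triggered].mean()
c_mean = outcome[(variant == 0) & triggered].mean()
print(f"Triggered-subset difference (true effect = 0): {t_mean - c_mean:+.3f}")

# Comparison over all randomized users recovers a difference near zero.
all_diff = outcome[variant == 1].mean() - outcome[variant == 0].mean()
print(f"All-users difference: {all_diff:+.3f}")

Under these assumed trigger rates, the triggered-subset comparison shifts away from zero while the all-users comparison stays near zero, mirroring how variant-dependent filtering can make treatment and control represent different effective populations.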