Selection bias when estimating average treatment effects in the M and butterfly structures
Abstract: Due to a phenomenon known as selection bias, the estimator of the average treatment effect (ATE) of a treatment variable on some outcome may be biased. Selection bias, caused by the exclusion of possible units from the studied data, is a major obstacle to valid statistical and causal inference. It is hard to detect in experimental or observational studies and is introduced when a sample is conditioned on a common collider of the treatment and response variables. A certain type of selection bias known as M-bias occurs when conditioning on a pretreatment variable that is part of a particular variable structure, the M structure. In this structure, the collider has no direct causal association with the treatment and outcome variables, but it is indirectly associated with both through ancestors. In this thesis, scenarios where potential M-bias arises were examined in a simulation study. The percentage of bias relative to the true ATE was estimated for each of the scenarios. A continuous collider variable was used, and samples were conditioned to include only units with values of the collider variable above a certain cutoff value. The cutoff value was varied to explore the relationship between the collider and the resulting bias. A variation of the M structure known as the butterfly structure was also studied in a similar fashion. The butterfly structure is known to result in confounding bias when the collider is not adjusted for, but in selection bias when it is. The results show that selection bias is relatively small compared to the bias originating from confounding in the butterfly structure. Increasing the cutoff level in this structure substantially decreases the overall bias of the ATE in almost all of the explored scenarios. The bias was smaller in the M structure than in the butterfly structure in nearly all scenarios. For the M structure, the bias was generally smaller for higher cutoff values and insubstantial in some scenarios.
This occurred because in most of the studied scenarios, a large proportion of the variance of the collider was explained by binary ancestors of said collider. When these ancestors are the primary causes of the collider, increasing the cutoff to a high enough value effectively adjusts for the ancestors. Adjusting for these ancestors will in turn d-separate the treatment and the outcome, which results in an unbiased estimator of the ATE. When conducting studies in practice, the possibility of selection bias should be taken into consideration. Even though this type of bias is usually small, even when the causal effects between the involved variables are strong, it can still be significant, and an unbiased estimator cannot be taken for granted in the presence of sample selection.
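The simulation design described in the abstract can be illustrated with a minimal sketch of the M structure. This is not the code used in the thesis: the function name simulate_m_bias, the sample size, all coefficients and probabilities, and the choice of an unadjusted difference in means as the ATE estimator are assumptions made here for illustration only. Two binary ancestors U1 and U2 cause a continuous collider C; U1 also causes the treatment T and U2 also causes the outcome Y. The sample is restricted to units with C above a cutoff, and the bias is reported as a percentage of the true ATE.

```python
import numpy as np


def simulate_m_bias(cutoff, n=200_000, true_ate=1.0, seed=0):
    """Percent bias of a difference-in-means ATE estimate after keeping
    only units whose collider value exceeds `cutoff`.

    All parameter values below are hypothetical and chosen for illustration.
    """
    rng = np.random.default_rng(seed)

    # Binary ancestors of the collider (assumed independent, P = 0.5 each).
    u1 = rng.binomial(1, 0.5, n)
    u2 = rng.binomial(1, 0.5, n)

    # Treatment depends only on U1; outcome depends on T and U2.
    t = rng.binomial(1, 0.2 + 0.6 * u1)
    y = true_ate * t + 2.0 * u2 + rng.normal(0.0, 1.0, n)

    # Continuous collider: caused by U1 and U2, not by T or Y.
    c = 2.0 * u1 + 2.0 * u2 + rng.normal(0.0, 1.0, n)

    # Sample selection: condition on the collider being above the cutoff.
    keep = c > cutoff
    estimate = y[keep & (t == 1)].mean() - y[keep & (t == 0)].mean()

    return 100.0 * (estimate - true_ate) / true_ate


if __name__ == "__main__":
    # Vary the cutoff to see how the relative bias changes.
    for cutoff in (-np.inf, 1.0, 2.0, 3.0, 5.0):
        print(f"cutoff={cutoff}: bias = {simulate_m_bias(cutoff):.1f}% of true ATE")
```

With a cutoff of minus infinity no units are excluded, so the estimate should be close to unbiased. Under these assumed parameters, a moderate cutoff induces a negative association between U1 and U2 in the selected sample and therefore a biased estimate, while a very high cutoff keeps almost only units with U1 = U2 = 1, which mirrors the explanation above that a high cutoff effectively adjusts for the binary ancestors.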