Randomized controlled trials (RCTs) are considered the gold standard for generating sound evidence to support treatment decisions. Trials conducted under ideal conditions are called explanatory trials to distinguish them from trials completed under real-world conditions. The four most prevalent types of bias (selection, performance, attrition, and detection bias) can be avoided, and the internal validity of a study can be increased, if all required quality criteria are met. External validity, however, can be neither investigated nor confirmed by randomized trials. Yet confirming external validity is as important as confirming internal validity, because knowledge generated in RCTs is valuable only if it can be successfully applied to patients under real-world conditions. To confirm external validity, the four types of bias mentioned above have to be avoided. In addition, it has to be confirmed that the individuals from whom the evidence was derived are comparable to the individuals to whom the evidence is to be applied. Violation of this seemingly simple requirement is called 'sampling bias'. A two-step procedure seems useful for confirming internal as well as external validity. In the first step, the efficacy of a therapeutic principle may be confirmed under ideal study conditions in an explanatory trial, without demanding confirmation of external validity. In the second step, the benefit for the investigated group of patients is examined under real-world conditions (pragmatic trial). The design of these studies and established methods for evaluating them are discussed. The two-step approach offers three advantages. It reduces the risk of over-interpreting the results of RCTs, because explanatory trials can only demonstrate efficacy under ideal conditions. The benefit required by the authorities can be demonstrated only by pragmatic trials, which take external validity into account.
Progress can probably be achieved only if controlled pragmatic trials are used that can compare the influence of the intended intervention (the specific treatment effect) with that of unintended interventions (confounders). Examples of suitable methods are propensity score matching and structural equation models.
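
To illustrate how propensity score matching can separate the specific treatment effect from a confounder in non-randomized, real-world data, the following is a minimal sketch and not a method taken from the studies discussed here. Everything in it is an assumption made for illustration: the simulated data, the single confounder named `severity`, the true effect of 2.0, the caliper of 0.05, and all function names. The propensity score is estimated with a small hand-written logistic regression so that the example needs only the Python standard library.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_propensity_model(X, t, lr=0.5, epochs=300):
    """Logistic regression of treatment on covariates (batch gradient descent)."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for x, ti in zip(X, t):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) - ti
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def propensity_scores(w, X):
    """Estimated probability of receiving treatment for each individual."""
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) for x in X]


def matched_effect(X, t, y, caliper=0.05):
    """1:1 nearest-neighbour matching with replacement on the propensity score.

    Returns the mean outcome difference over matched pairs (an estimate of
    the effect in the treated) and the number of pairs within the caliper.
    """
    ps = propensity_scores(fit_propensity_model(X, t), X)
    controls = [i for i, ti in enumerate(t) if ti == 0]
    diffs = []
    for i, ti in enumerate(t):
        if ti != 1:
            continue
        j = min(controls, key=lambda c: abs(ps[c] - ps[i]))
        if abs(ps[j] - ps[i]) <= caliper:  # discard poor matches
            diffs.append(y[i] - y[j])
    return sum(diffs) / len(diffs), len(diffs)


def naive_effect(t, y):
    """Unadjusted difference in mean outcome between treated and controls."""
    treated = [yi for yi, ti in zip(y, t) if ti == 1]
    control = [yi for yi, ti in zip(y, t) if ti == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def simulate(n=1000, seed=0, true_effect=2.0):
    """Hypothetical observational data: severity drives treatment AND outcome."""
    rng = random.Random(seed)
    X, t, y = [], [], []
    for _ in range(n):
        severity = rng.gauss(0.0, 1.0)  # assumed confounder
        treated = 1 if rng.random() < sigmoid(severity) else 0
        outcome = true_effect * treated + 3.0 * severity + rng.gauss(0.0, 1.0)
        X.append([severity])
        t.append(treated)
        y.append(outcome)
    return X, t, y


X, t, y = simulate()
naive = naive_effect(t, y)
att, n_pairs = matched_effect(X, t, y)
print(f"naive difference: {naive:.2f}")
print(f"matched estimate: {att:.2f} from {n_pairs} pairs (simulated true effect 2.0)")
```

In this simulation the unadjusted comparison mixes the treatment effect with the effect of severity, whereas the matched comparison contrasts individuals with a similar probability of being treated. Matching can only balance confounders that were measured and included in the propensity model, so it does not replace randomization.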