We consider the problem of evaluating the performance of a decision policy using past
observational data. The outcome of a policy is measured in terms of a loss (a.k.a. disutility
or negative reward) and the main problem is making valid inferences about its out-of-sample
loss when the past data was observed under a different and possibly unknown policy. The
data used include sensitive individual-level records, e.g. drug dispensations from the
prescribed drug registry.