Couple of important points you're making here.
On your first point, instead of using a single prior distribution I could use a weighted combination of multiple distributions. There are two ways to do this: either make the prior itself a mixture distribution, or compute a separate posterior under each distribution and take their weighted average. I'm not sure which one correctly handles this uncertainty. I haven't done the math, but I'd expect that either way, a prior with 90% weight on log-normal and 10% on Pareto would give much more credence to high cost-effectiveness estimates than a pure log-normal. I don't believe assigning small probability to distributions with thinner tails than log-normal (e.g. normal or exponential) would change the results much.
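To make the "not sure which one correctly handles this" concrete, here's a quick numerical sketch of both options on a grid. All the specifics (shape parameters, a single noisy 30x measurement, the noise scale) are made up for illustration, not taken from anyone's actual model:

```python
# Sketch: compare (1) a mixture prior updated by evidence with
# (2) a fixed-weight average of two separately computed posteriors.
# Shape parameters and the observed value are illustrative only.
import numpy as np
from scipy import stats

x = np.linspace(0.01, 100, 20000)        # grid over effectiveness
dx = x[1] - x[0]
prior_ln = stats.lognorm.pdf(x, s=1.0)   # log-normal component
prior_par = stats.pareto.pdf(x, b=1.5)   # Pareto component (support x >= 1)
w_ln, w_par = 0.9, 0.1

# Likelihood of one noisy (log-scale) measurement suggesting ~30x
lik = stats.norm.pdf(np.log(30.0), loc=np.log(x), scale=0.5)

# Option 1: mixture prior -- the evidence reweights the components,
# because the fatter-tailed component explains the observation better.
post_mix = (w_ln * prior_ln + w_par * prior_par) * lik
post_mix /= post_mix.sum() * dx

# Option 2: separate posteriors averaged with the *fixed* prior weights.
post_ln = prior_ln * lik
post_ln /= post_ln.sum() * dx
post_par = prior_par * lik
post_par /= post_par.sum() * dx
post_avg = w_ln * post_ln + w_par * post_par

mean_mix = (x * post_mix).sum() * dx
mean_avg = (x * post_avg).sum() * dx
```

The two answers differ: under the mixture prior, the Pareto component's higher marginal likelihood for a large observation pulls its weight up, so the posterior mean comes out higher than the fixed-weight average. Option 1 is the one that treats the choice of distribution as part of the uncertainty being updated.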
On your second point, yeah I'm including some extra information in the prior, which is kinda wishy-washy. I realize this is suboptimal, but it's better than anything else I've come up with, and probably better than not using a quantitative model at all. Do you know a better way to handle this?
I mean that your prior probability density is given by $P(X) = w_{\text{Pareto}} P_{\text{Pareto}}(X) + w_{\text{lognorm}} P_{\text{lognorm}}(X)$ for weights $w$. (You can read LaTeX, right?)
Sure. I think a better approach (which I think is what Carl is suggesting) is to have a joint prior over x (the effectiveness of a randomly chosen intervention) and interventionDistribution (a categorical distribution over the different shapes you think the space of interventions might have). So P(x, 'Pareto') = P('Pareto') P(x | 'Pareto') = w_{Pareto} P_{Pareto}(x), and likewise P(x, 'logNormal') = P('logNormal') P(x | 'logNormal') = w_{logNormal} P_{logNormal}(x). Then, for the first intervention you see, your prior density over effectiveness is indeed P(x) = w_{Pareto} P_{Pareto}(x) + w_{logNormal} P_{logNormal}(x), but after measuring a bunch of interventions, you can update your beliefs about the empirical distribution of effectivenesses.
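The "update your beliefs about the empirical distribution" step can be sketched as a Bayesian update on the categorical interventionDistribution variable, treating measured effectivenesses as i.i.d. draws given the family. The family parameters and observations below are invented for illustration:

```python
# Sketch of the hierarchical update: P(family | data) is proportional to
# prior weight times the likelihood of all observations under that family.
# Shape parameters and observations are illustrative, not from the post.
import numpy as np
from scipy import stats

families = {
    "logNormal": stats.lognorm(s=1.0),
    "Pareto": stats.pareto(b=1.5),   # support x >= 1
}
prior_weights = {"logNormal": 0.9, "Pareto": 0.1}

def update_weights(weights, observations):
    """Return posterior weights over families given observed effectivenesses."""
    # Work in log space for numerical stability.
    log_post = {
        name: np.log(weights[name]) + dist.logpdf(observations).sum()
        for name, dist in families.items()
    }
    m = max(log_post.values())
    unnorm = {name: np.exp(lp - m) for name, lp in log_post.items()}
    z = sum(unnorm.values())
    return {name: u / z for name, u in unnorm.items()}

# A few heavy-tailed measurements shift belief sharply toward Pareto:
new_weights = update_weights(prior_weights, np.array([2.0, 5.0, 40.0, 300.0]))
```

So each measured intervention sharpens your belief about the shape of the whole space, which in turn changes the prior you apply to the next intervention you evaluate.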