Peter_Hurford comments on An Effective Altruist Message Test - Effective Altruism Forum

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (20)

You are viewing a single comment's thread. Show more comments above.

Comment author: nikvetr 01 April 2017 07:56:38PM *  3 points [-]

Yay for Bayesian regression (binomial, I'm guessing? You re-binned your attitude and donations responses? I think an ordered logit would be more appropriate here and result in less of a loss in resolution, or even a dirichlet, but then you'd lose yer ordering)! Those posteriors look decently tight, though I do have some questions!

I'm a little confused on what your control was, exactly. You have both points and distributions in your posterior plots, but you don't have any control paragraph blurb in you google doc questionnaire. How did you evaluate your control? Did you give them a paragraph entirely unrelated to EA? These plots are the posterior estimates for p_binomial when each dummy variable for treatment is 0? Is "average treatment effect" some posterior predictive difference from the control p (i.e. why it's exactly 0)?

On a related (and elucidatory) note, could you more explicitly clarify which models you fitted, exactly? Did you do any model comparison or averaging, or evaluate model adequacy? You mention "controlling for other variables in the survey" but I don't see any e.g. demographic questions in your questionnaire. You said you "examined these relationships overall and among the critical subgroup of those with at least a bachelor’s degree" -- did you do this by excluding everyone without a bachelor's, or by modeling the effects of educational attainment and then doing model comparison to test the legitimacy of those effects (I'd think looking at the posterior for the interaction between your paragraph and education dummies would be the clearest test)? Did you use diffuse, "uninformative" priors (and hyperpriors)? Which ones, exactly?

I assume that since this is a hierarchical analysis you used MCMC (HMC?) to do the fitting. Are your posterior distributions smoothed substantially, e.g. with a kernel density estimator? Or did you just get fantastic performance? What diagnostics did you run to ensure MCMC health? How many chains did you run? Did you use stopping rules? In my experience, hierarchical regression models can be pretty finicky to fit as they get more complex.

Kudos on not just using some wackily inappropriate out-of-the-box frequentist test!

edit: also, what are the boxplot-looking things? 95% HPDIs? CIs? Some other %? Ah wait they're the sd of your marginal samples?

Comment author: Peter_Hurford  (EA Profile) 01 April 2017 08:23:28PM 2 points [-]

On a related (and elucidatory) note, could you more explicitly clarify which models you fitted, exactly?

It would be cool to provide the code, for both learning and verification purposes.

Comment author: Michael_S 01 April 2017 09:04:13PM 0 points [-]

Unfortunately, because I used proprietary survey data/a proprietary R package to run this analysis, I don't think I'll be able to share the data and code.

Comment author: nikvetr 01 April 2017 09:18:19PM 0 points [-]

Ah, interesting! What package? I've never heard of something like that before. Usually in the cold, mechanical heart of every R package is the deep desire to be used and shared as far as possible. If it's just someone's personal interface code, why not use something more publicly available? Can you write out your basic script in pseudocode (or just math/words?)? Especially the model and MCMC specification bits?

Comment author: Michael_S 01 April 2017 09:39:50PM 0 points [-]

Sure, in an ideal world, software would all be free for everyone; alas, we do not live in such a world :p. I used the proprietary package because it did exactly what I needed and doesn't require writing STAN code or anything myself. I'd rather not re-invent the wheel. I felt the tradeoff of transparency for efficiency and confidence in its accuracy was worth it, especially since I wouldn't be able to share the data either way (such are the costs of getting these questions on a 1200 person survey without paying a substantial amount).

But the basic model was just a multilevel binomial model predicting the dependent variable using the treatments and questions asked earlier in the survey as controls.

Comment author: nikvetr 01 April 2017 09:47:59PM *  0 points [-]

Of course (though wheel reinvention can be super helpful educationally), but there are great free public R packages that interface to STAN (I use "rethinking" for my hierarchical Bayesian regression needs but I think Rstan would work, too), so going with someone's unnamed, private code isn't necessary imo. How much did the survey cost (was it a lot longer than the included google doc, then? e.g. Did you have screening questions to make sure people read the paragraph?). And model+mcmc specification can have lots of fiddly bits that can easily lead us astray, I'd say

Comment author: Michael_S 01 April 2017 10:07:30PM *  -1 points [-]

Yeah, the survey was a lot longer. Typically general public surveys will cost over 10 dollars a complete, so getting 1200 cases for a survey like this can cost thousands of dollars.

I agree that model specification can be tricky, which is a reason I felt it well worth it to use the proprietary software I had access to that has been thoroughly vetted and code reviewed and is used frequently to run similar analyses rather than trying to construct my own.

I did not make sure people read the paragraph. I discussed the issue a bit in my discussion section, but one way a web survey might understate the effect is if people would pay closer attention and respond better to a friend delivering the message. OTOH, surveys do have some potentual vulnerability to the hawthorne effect, though that didn't seem to express itself in the donations question.

Comment author: nikvetr 01 April 2017 08:28:46PM 0 points [-]

Yep, and alongside it, of course, the raw data!