Comment author: nikvetr 01 April 2017 09:47:59PM *  0 points [-]

Of course (though wheel reinvention can be super helpful educationally), but there are great free public R packages that interface to STAN (I use "rethinking" for my hierarchical Bayesian regression needs but I think Rstan would work, too), so going with someone's unnamed, private code isn't necessary imo. How much did the survey cost (was it a lot longer than the included Google Doc, then? e.g. did you have screening questions to make sure people read the paragraph?). And model+MCMC specification can have lots of fiddly bits that can easily lead us astray, I'd say.

Comment author: Michael_S 01 April 2017 10:07:30PM *  -1 points [-]

Yeah, the survey was a lot longer. Typically general public surveys will cost over 10 dollars a complete, so getting 1200 cases for a survey like this can cost thousands of dollars.

I agree that model specification can be tricky, which is a reason I felt it well worth it to use the proprietary software I had access to that has been thoroughly vetted and code reviewed and is used frequently to run similar analyses rather than trying to construct my own.

I did not make sure people read the paragraph. I discussed the issue a bit in my discussion section, but one way a web survey might understate the effect is if people would pay closer attention and respond better to a friend delivering the message. OTOH, surveys do have some potential vulnerability to the Hawthorne effect, though that didn't seem to express itself in the donations question.

Comment author: nikvetr 01 April 2017 09:39:32PM *  0 points [-]

Ah, I guess that's better than no control, and presumably paying attention to a paragraph of text doesn't make someone substantially more or less generous. Did you fit a bunch of models with different predictors and test for a sufficient improvement of fit with each? Might do to be wary of overfitting in those regards maybe... though since those aren't focal, Bayes tends to be pretty robust there, imo, so long as you used sensible priors.

"I used a multilevel model to estimate the effects among those with and without a bachelor's degree. So, the bachelor's estimate borrows power from those without a degree, reducing problems with overfitting."

If I'm understanding correctly, you had a hyperprior on the effect of education level? With just two options? IDK that that would help you much (if you had more: e.g. HS, BA/S, MS, PhD, etc. it might, but I'd try to preserve ordering there, myself).

"These models used STAN, which handles these multilevel models well. Convergence was assessed with Gelman-Rubin statistics."

STAN's great, but certainly not magic or perfect, and though idk them personally I'm sure its authors would strongly advocate paranoia about its output. So you got convergence with multiple (2?) chains from a random (hopefully) starting value? R_hats were all 1? That's good! Did all the other cheap diagnostics turn up ok (e.g. trace plots, autocorrelation times/ESS, marginal histograms, quick within-chain metrics, etc.)?
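For anyone following along, here is a minimal sketch of what the R_hat check is measuring. It assumes the classic (non-split) Gelman-Rubin formula and equal-length chains for a single parameter; the function name and the simulated chains are made up for illustration, and STAN's own implementation is more refined than this.

```python
import random
from statistics import mean, variance


def gelman_rubin(chains):
    """Classic (non-split) potential scale reduction factor for one parameter.

    `chains` is a list of equal-length lists of posterior draws.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least 2 chains of equal length >= 2")
    # W: average of the within-chain sample variances
    w = mean(variance(c) for c in chains)
    # B / n: sample variance of the chain means
    b_over_n = variance([mean(c) for c in chains])
    # Pooled estimate of the posterior variance
    var_hat = (n - 1) / n * w + b_over_n
    return (var_hat / w) ** 0.5


# Hypothetical draws: four chains sampling the same distribution,
# versus four chains where two are stuck around a different mode.
rng = random.Random(0)
mixed = [[rng.gauss(0, 1) for _ in range(1000)] for _ in range(4)]
stuck = [[rng.gauss(mu, 1) for _ in range(1000)] for mu in (0, 0, 3, 3)]

print("well-mixed R_hat:", round(gelman_rubin(mixed), 3))
print("stuck R_hat:", round(gelman_rubin(stuck), 3))
```

Values near 1 say the chains agree with each other; they don't say the chains have explored the whole posterior, which is why the other diagnostics are still worth a look.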

Comment author: Michael_S 01 April 2017 09:53:57PM -1 points [-]

No; I did not fit multiple models. Lasso regression was used to fit a propensity model using the predictors.

Using bachelor's vs. non-bachelor's has advantages in interpretability, so I think this was the right move for my purposes.

I did not spend an exorbitant amount of time investigating diagnostics, for the same reason I used a proprietary package which has been built for running these tests at a production level and has been thoroughly code reviewed. I don't think it's worth the time to construct an overly customized analysis.

Comment author: nikvetr 01 April 2017 09:18:19PM 1 point [-]

Ah, interesting! What package? I've never heard of something like that before. Usually in the cold, mechanical heart of every R package is the deep desire to be used and shared as far as possible. If it's just someone's personal interface code, why not use something more publicly available? Can you write out your basic script in pseudocode (or just math/words?)? Especially the model and MCMC specification bits?

Comment author: Michael_S 01 April 2017 09:39:50PM -1 points [-]

Sure, in an ideal world, software would all be free for everyone; alas, we do not live in such a world :p. I used the proprietary package because it did exactly what I needed and doesn't require writing STAN code or anything myself. I'd rather not re-invent the wheel. I felt the tradeoff of transparency for efficiency and confidence in its accuracy was worth it, especially since I wouldn't be able to share the data either way (such are the costs of getting these questions on a 1200 person survey without paying a substantial amount).

But the basic model was just a multilevel binomial model predicting the dependent variable using the treatments and questions asked earlier in the survey as controls.
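For readers trying to picture this, here is one possible reading of that description, written out as a sketch. The actual specification was not shared, so everything below is an assumption: the symbols, the treatment-by-education term, and the (unspecified) priors are illustrative only.

```latex
\begin{aligned}
y_i &\sim \mathrm{Bernoulli}(p_i) \\
\operatorname{logit}(p_i) &= \alpha + \mathbf{x}_i^{\top}\boldsymbol{\beta} + \tau_{t[i]} + \delta_{t[i],\,e[i]} \\
\delta_{t,e} &\sim \mathrm{Normal}(0, \sigma_{\delta}) \\
\tau_{\text{control}} &= 0, \qquad \delta_{\text{control},\,e} = 0
\end{aligned}
```

Here y_i is the binary response, x_i holds the earlier survey questions used as controls, t[i] is the message respondent i saw, and e[i] indicates whether they hold a bachelor's degree. The shared scale sigma_delta is what would let the subgroup estimates borrow strength from each other.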

Comment author: Peter_Hurford  (EA Profile) 01 April 2017 08:23:28PM 2 points [-]

On a related (and elucidatory) note, could you more explicitly clarify which models you fitted, exactly?

It would be cool to provide the code, for both learning and verification purposes.

Comment author: Michael_S 01 April 2017 09:04:13PM -1 points [-]

Unfortunately, because I used proprietary survey data/a proprietary R package to run this analysis, I don't think I'll be able to share the data and code.

Comment author: nikvetr 01 April 2017 07:56:38PM *  3 points [-]

Yay for Bayesian regression (binomial, I'm guessing? You re-binned your attitude and donations responses? I think an ordered logit would be more appropriate here and result in less of a loss in resolution, or even a dirichlet, but then you'd lose yer ordering)! Those posteriors look decently tight, though I do have some questions!

I'm a little confused on what your control was, exactly. You have both points and distributions in your posterior plots, but you don't have any control paragraph blurb in your Google Doc questionnaire. How did you evaluate your control? Did you give them a paragraph entirely unrelated to EA? These plots are the posterior estimates for p_binomial when each dummy variable for treatment is 0? Is "average treatment effect" some posterior predictive difference from the control p (i.e. why it's exactly 0)?

On a related (and elucidatory) note, could you more explicitly clarify which models you fitted, exactly? Did you do any model comparison or averaging, or evaluate model adequacy? You mention "controlling for other variables in the survey" but I don't see any e.g. demographic questions in your questionnaire. You said you "examined these relationships overall and among the critical subgroup of those with at least a bachelor’s degree" -- did you do this by excluding everyone without a bachelor's, or by modeling the effects of educational attainment and then doing model comparison to test the legitimacy of those effects (I'd think looking at the posterior for the interaction between your paragraph and education dummies would be the clearest test)? Did you use diffuse, "uninformative" priors (and hyperpriors)? Which ones, exactly?

I assume that since this is a hierarchical analysis you used MCMC (HMC?) to do the fitting. Are your posterior distributions smoothed substantially, e.g. with a kernel density estimator? Or did you just get fantastic performance? What diagnostics did you run to ensure MCMC health? How many chains did you run? Did you use stopping rules? In my experience, hierarchical regression models can be pretty finicky to fit as they get more complex.

Kudos on not just using some wackily inappropriate out-of-the-box frequentist test!

edit: also, what are the boxplot-looking things? 95% HPDIs? CIs? Some other %? Ah wait they're the sd of your marginal samples?

Comment author: Michael_S 01 April 2017 09:02:29PM -1 points [-]

Yup, binomial.

The respondents in a treatment were each shown a message and asked how compelling they thought it was. The control was shown no message.

Yeah; the plots are the predicted values for those given a particular treatment, and Average Treatment Effect is the difference from the control.
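As a rough sketch of that calculation (not the proprietary package's code), the effect can be summarised draw by draw from posterior predicted probabilities. The function name, the Beta-distributed stand-in draws, and the simple percentile interval are all assumptions for illustration; in a real fit the treated and control values would come paired from the same posterior draws rather than being simulated independently.

```python
import math
import random
from statistics import mean


def average_treatment_effect(treated_draws, control_draws, level=0.95):
    """Return (mean, lower, upper) of treated-minus-control differences."""
    if not treated_draws or len(treated_draws) != len(control_draws):
        raise ValueError("need two non-empty, equal-length lists of draws")
    # One difference per posterior draw
    diffs = sorted(t - c for t, c in zip(treated_draws, control_draws))
    tail = (1 - level) / 2
    last = len(diffs) - 1
    # Simple percentile interval, rounding outward to the nearest draw
    lower = diffs[math.floor(tail * last)]
    upper = diffs[math.ceil((1 - tail) * last)]
    return mean(diffs), lower, upper


# Hypothetical posterior draws of the predicted probability in each arm
rng = random.Random(42)
control = [rng.betavariate(30, 70) for _ in range(4000)]
treated = [rng.betavariate(36, 64) for _ in range(4000)]

estimate, lower, upper = average_treatment_effect(treated, control)
print(f"ATE: {estimate:.3f} (95% interval {lower:.3f} to {upper:.3f})")

# The control compared with itself is zero in every draw
print(average_treatment_effect(control, control))
```

The last line shows why the control sits at exactly 0 in plots like these: it is the baseline every other arm is differenced against.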

I did not include every control used in the provided questionnaire. There was a mix of demographic/attitudinal/behavioral questions asked in the survey that I also used. These controls, particularly previous donations, were important for decreasing variance.

I used a multilevel model to estimate the effects among those with and without a bachelor's degree. So, the bachelor's estimate borrows power from those without a degree, reducing problems with overfitting.

These models used STAN, which handles these multilevel models well. Convergence was assessed with Gelman-Rubin statistics.


An Effective Altruist Message Test

I decided to run an Effective Altruist message on a full population survey I have access to, use Bayesian message testing software to analyze the results, and share the results with the EA community on the forum. I tested several EA themed messages aimed at increasing respondents’ interest in...
Comment author: Michael_S 01 April 2017 01:21:38PM 5 points [-]

I agree that the modal outcome of a Trump presidency is that he changes little and the Democrats come out stronger at the end of his presidency than they entered. However, I still think it would have been better that Clinton had won (even if we assume the same congress).

The most important reason is tail risk. As others have commented, the risk of nuclear war may be greater under Trump than it would have been under Clinton. So far, he seems to be pursuing a more conventional foreign policy than I feared, but I still believe the risk is higher than with Clinton. Additionally, I'm worried that the Trump presidency is increasing the salience of Russian hostility among Democrats and could increase the chance of conflict in the future even when a Democrat takes office.

Another area of concern is pandemics. Trump has expressed anti-vaccine sentiments and submitted budgets which cut pandemic preparedness. Furthermore, the overall level of incompetence in his administration and many of his appointees leaves me worried that the response of the US to a major pandemic could be diminished.

None of the above is likely to happen, but I'd much rather play it safe with a Clinton presidency. Additionally, even the modal outcome of a Trump presidency isn't all good for liberals. Most notably, he'll almost certainly be able to move at least one conservative onto the Supreme Court and has a high chance of moving at least one more. If Trump replaces a liberal with a conservative on the court, the court will move to the right and it will likely be quite a while until Democrats retake it. With a Clinton presidency, liberals would have been able to achieve a majority on the court that would likely have lasted a long time itself.

Comment author: Michael_S 26 February 2017 03:20:06PM 6 points [-]

Thanks for the write-up. I think you make a compelling case that this is more effective than canvassing, which can cost over 1000 dollars per vote at the margin in a competitive election like 2016. I do think there are a few ways your estimate may be an overestimate though.

Of those who claimed they would follow through with vote trading, some may not have. You mention that there wouldn't have been much value to defecting. However, much of the value of a vote for an individual comes from tribal loyalties rather than affecting the outcome. That's why turnout is higher in safe presidential states in a presidential election than in midterm elections, even when the midterm election is competitive. Some individuals may still have defected because of this.

Secondly, many of the 3rd party folks who made the trade could have voted for Clinton anyway. People who sign up for these sites are necessarily strategic thinkers. If they wanted more total votes for Stein/Johnson, but recognized that a vote for Clinton was more important in a swing state, they might have signed up for the site to gain the Stein/Johnson voter, but planned to vote for Clinton even if they didn't get a match. Additionally, even if they were acting in good faith when they signed up, they may have changed their mind as the election approached. 3rd parties are historically overestimated in polling compared to the election results, and 2016 was no exception: http://www.realclearpolitics.com/epolls/2016/president/us/general_election_trump_vs_clinton_vs_johnson_vs_stein-5952.html.

I don't think these problems are enough to reduce the value by an order of magnitude, but it is worth keeping in mind.

Additionally, while vote trading may be high EV now, I am skeptical that it is easy to scale. It's even more difficult to apply outside of presidential elections, so, unlike other potential political interventions, it will mostly be confined to every 4 years in one race. Furthermore, the individuals who signed up now may be lower cost to acquire than additional potential third party traders. They are likely substantially more strategic than the full population of 3rd party voters; in many years, the full population isn't that large to begin with. The cost per additional vote may be larger than your current estimates.

Nevertheless, I agree that right now it's probably more valuable than traditional canvassing and I'm glad people are putting resources into it.

Comment author: Michael_S 08 January 2017 04:54:57PM *  4 points [-]

This sounds really great to me. I love the idea of having more RCTs in the EA sphere. I would definitely record how much they are giving 1 year later.

I also think it's worth having a hold out set. People can pre-register the list of friends, then a random number generator can be used to randomly select some friends not to make an explicit GWWC pitch to. It's possible many of the friends/contacts who join GWWC and start donating are those who have already been exposed to EA ideas before over a long period of time, and the effect size of the direct GWWC pitch isn't as large as it would appear. Having a hold out set would account for this. With a hold out set, CEA wouldn't have to worry about who they contact. The holdout set would take care of this and make the estimate of the treatment effect unbiased.
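To make the randomization step concrete, here is a minimal sketch of how a pre-registered list could be split. The function name, the 20% default hold out fraction, and the placeholder friend labels are all assumptions for illustration, not anything CEA or GWWC has specified.

```python
import random


def assign_holdout(friends, holdout_fraction=0.2, seed=None):
    """Randomly split a pre-registered list into hold out and pitch groups."""
    if not 0 <= holdout_fraction <= 1:
        raise ValueError("holdout_fraction must be between 0 and 1")
    rng = random.Random(seed)  # a recorded seed makes the split reproducible
    shuffled = list(friends)   # copy so the registered list is left untouched
    rng.shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    return {
        "holdout": sorted(shuffled[:n_holdout]),  # no explicit pitch
        "pitch": sorted(shuffled[n_holdout:]),    # gets the GWWC pitch
    }


# Hypothetical pre-registered list
friends = [f"friend_{i}" for i in range(10)]
groups = assign_holdout(friends, holdout_fraction=0.3, seed=2017)
print(groups["holdout"])
print(groups["pitch"])
```

The important part is that the list is fixed before the seed is drawn, so nobody can choose who ends up in which group. Donations a year later can then be compared between the two groups.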

Comment author: kbog  (EA Profile) 11 November 2016 03:37:25PM *  1 point [-]

That's not true at all.

It is true. Romney got 61 million votes and McCain got 60 million. Obama got 69 million and 66 million in 2008 and 2012 respectively. This year, Trump got 60 million votes and Hillary got 61 million.

There were several instances that fall under the same pattern: the email story, the Access Hollywood tapes, the debates, probably the Apprentice tapes if they had appeared, and potentially the WikiLeaks emails, though it's much harder to gauge their effect size.

Well, depending on how early before the election you want to consider. The debates for instance were all more than a week before the election. Again, it's basically impossible to put effort into making things like this happen, and the best way to do so might simply be conventional ways of building political clout and awareness.

Comment author: Michael_S 12 November 2016 12:04:46AM *  1 point [-]

You can't look at aggregate turnout numbers being different and assume the composition of turnout was different. You're making the assumption that there was 0 movement from Obama to Trump or from Romney to Clinton; both of which are definitely incorrect as evidenced by polling.

Secondly, turnout is much higher than it appears; many more votes will come in from California, Washington, Oregon and Colorado. It always takes these states forever to report. So the turnout numbers now are misleading.
