jimrandomh commented on LLM Evaluators Recognize and Favor Their Own Generations 12m ago

LLM Evaluators Recognize and Favor Their Own Generations

· 2h ago

This is a linkpost for http://tiny.cc/llm_self_recognition

Self-evaluation using LLMs is used in reward modeling, model-based benchmarks like GPTScore and AlpacaEval, self-refinement, and constitutional AI. LLMs have been shown to be accurate at approximating human annotators on some tasks.

But these methods are threatened by self...

jimrandomh · 12m ago · 2

Interesting. I think I can tell an intuitive story for why this would be the case, but I'm unsure whether that intuitive story would predict all the details of which models recognize and prefer which other models.

As an intuition pump, consider asking an LLM a subjective multiple-choice question, then taking that answer and asking a second LLM to evaluate it. The evaluation task implicitly asks the evaluator to answer the same question, then cross-check the results. If the two LLMs are instances of the same model, their answers will be more strongly cor... (read more)

MichaelStJules commented on Repugnance and replacement 1h ago

Repugnance and replacement

MichaelStJules

· 7d ago · 10m read

Summary

Many views, including even some person-affecting views, endorse the repugnant conclusion (and very repugnant conclusion) when set up as a choice between three options, with a benign addition option.
Many consequentialist(-ish) views, including many person-affecting

...

MichaelStJules

Dasgupta's view makes ethics about what seems unambiguously best first, and then about affecting persons second. It's still person-affecting, but less so than necessitarianism and presentism. It could be wrong about what's unambiguously best, though, e.g. we should reject full aggregation, and prioritize larger individual differences in welfare between outcomes, so A+' (and maybe A+) looks better than Z. Do you think we should be indifferent in the nonidentity problem if we're person-affecting? I.e. between creating a person with a great life and a different person with a marginally good life (and no other options). For example, we shouldn’t care about the effects of climate change on future generations (maybe after a few generations ahead), because future people's identities will be different if we act differently. But then also see the last section of the post.

Kaspar Brandner

In the non-identity problem we have no alternative which doesn't affect a person, since we don't compare creating a person with not-creating it, but creating a person vs creating a different person. Not creating one isn't an option. So we have non-present but necessary persons, or rather: a necessary number of additional persons. Then even person-affecting views should arguably say, if you create one anyway, then a great one is better than a marginally good one. But in the case of comparing A+ and Z (or variants) the additional people can't be treated as necessary because A is also an option.

MichaelStJules · 1h ago · 2

Then, I think there are ways to interpret Dasgupta's view as compatible with "ethics being about affecting persons", step by step:

Step 1 rules out options based on pairwise comparisons within the same populations, or same number of people. Because at this step we never compare existence to nonexistence (we only compare the same people, or the same number of people as in nonidentity), this step is arguably about affecting persons.
Step 2 is just necessitarianism on the remaining options. Definitely about affecting persons.
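
As a rough illustration of the two steps above, here is a minimal Python sketch. It rests on several assumptions that are not in the comment: options are modelled as hypothetical person-to-welfare mappings, step 1 compares same-sized populations by total welfare, "necessary" people are those who exist in every option of the original choice set, and the welfare numbers for A, A+ and Z are invented purely for illustration.

```python
from itertools import permutations


def step1_survivors(options):
    """Step 1: drop any option beaten by another option with the same number of people.

    ASSUMPTION: "beaten" means lower total welfare; the comment does not fix a criterion.
    """
    ruled_out = set()
    for x, y in permutations(options, 2):
        same_size = len(options[x]) == len(options[y])
        if same_size and sum(options[x].values()) < sum(options[y].values()):
            ruled_out.add(x)
    return {name: pop for name, pop in options.items() if name not in ruled_out}


def step2_necessitarian(options, survivors):
    """Step 2: among survivors, keep the options best for the necessary people.

    ASSUMPTION: necessary people are those present in every original option.
    """
    necessary = set.intersection(*(set(pop) for pop in options.values()))

    def necessary_welfare(pop):
        return sum(pop[person] for person in necessary)

    best = max(necessary_welfare(pop) for pop in survivors.values())
    return sorted(name for name, pop in survivors.items()
                  if necessary_welfare(pop) == best)


def two_step(options):
    return step2_necessitarian(options, step1_survivors(options))


# Hypothetical welfare levels, chosen only to make the structure visible.
OPTIONS = {
    "A": {"a1": 10, "a2": 10},
    "A+": {"a1": 10, "a2": 10, "b1": 1, "b2": 1},
    "Z": {"a1": 6, "a2": 6, "b1": 6, "b2": 6},
}

print(two_step(OPTIONS))
```

With these made-up numbers, step 1 removes A+ (Z has the same number of people and higher total welfare), and step 2 then favours A over Z because the necessary people a1 and a2 fare better in A. A different step 1 criterion could change which options survive.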

These other views also seem compatible... (read more)

DominikPeters commented on Future of Humanity Institute 2005-2024: Final Report 1h ago

149

Future of Humanity Institute 2005-2024: Final Report

Pablo

· 9h ago · 6m read

This is a linkpost for https://static1.squarespace.com/static/660e95991cf0293c2463bcc8/t/661a3fc3cecceb2b8ffce80d/1712996303164/FHI+Final+Report.pdf

Anders Sandberg has written a “final report” released simultaneously with the announcement of FHI’s closure. The abstract and an excerpt follow.

Normally manifestos are written first, and then hopefully stimulate actors to implement their vision. This document is the reverse

...

DominikPeters · 1h ago · 1

From Bostrom's website, an updated "My Work" section reads:

... That’s why I founded the Future of Humanity Institute at Oxford University in 2005. FHI brought together an interdisciplinary bunch of brilliant (and eccentric!) minds, and sought to shield them as much as possible from the pressures of regular career academia; and thus were laid the foundations for exciting new fields of study.

Those were heady years. FHI was a unique place - extremely intellectually alive and creative - and remarkable progress was made. FHI was also quite fertile, spawning a n

... (read more)

Arepo

That's sad. For anyone interested in why they shut down (I'd thought they had an indefinitely sustainable endowment!), the archived version of their website gives some info:

AnotherAnonymousFTXAccount

On your second point, FHI had at least ~£10m sitting in the bank in 2020 (see below, from the report). So the fundraising freeze, while unusual, wasn't terminal. A rephrasing of your question is "What administrative and organisational problems at FHI could possibly have prompted the Faculty to take the unusual step of a hiring and fundraising freeze in 2020, and why could it not be resolved over the next two to three years?"

Marisa posted What's the evidence for and against modern psychiatry? 2h ago

What's the evidence for and against modern psychiatry?

Marisa

· 2h ago · 1m read

Super broad question, I know.

I've been going down the rabbit hole of critical psychiatry lately and I'm finding it fascinating. Parts of it seem convincing and anecdotally align with my (admittedly extensive) interactions with the psychiatric system. But the evidence in both directions seems very cherry-picked and I haven't found an overall balanced view of both sides of the argument, and I'd like to be better informed, both to make personal decisions on the matter and to potentially advocate for evidence-based mental health policies.

Has anyone done (or is anyone interested enough to do) a somewhat thorough literature review on the effectiveness of things like medication, psychiatric hospitalizations, etc.?

Things I've been looking at that I'd be interested in a critical evaluation of:

...

Open Philanthropy posted Day in the Life: Alex Bowles 2h ago

Day in the Life: Alex Bowles

Open Philanthropy

· 2h ago · 2m read

This is a linkpost for https://www.openphilanthropy.org/research/day-in-the-life-alex-bowles/

Open Philanthropy’s “Day in the Life” series showcases the wide-ranging work of our staff, spotlighting individual team members as they navigate a typical workday. We hope these posts provide an inside look into what working at Open Phil is really like. If you’re interested in joining our team, we encourage you to check out our open roles.

Alex Bowles is a Senior Program Associate on Open Philanthropy’s Science and Global Health R&D team^[1], and a member of the Global Health and Wellbeing Cause Prioritization team. His responsibilities include estimating the cost-effectiveness of research and development grants in science and global health, identifying and assessing new strategic areas for the team, and investigating new Open Phil cause areas within global health and wellbeing.

*Alex and his wife, Kim, canoeing in Maine*

Day in the Life

I’m part of the ~70% of Open Phil staff who work...

Arepo commented on On building Omelas for shrimp; the implications of diversity-oriented theories of moral value on factory farming 2h ago

On building Omelas for shrimp; the implications of diversity-oriented theories of moral value on factory farming

Isaac King

· 20d ago · 6m read

Identity

In theory of mind, the question of how to define an "individual" is complicated. If you're not familiar with this area of philosophy, see Wait But Why's introduction.

I think most people in EA circles subscribe to the computational theory of mind, which means that...

Isaac King

Creating identical copies of people is not claimed to sum to less moral worth than one person. It's claimed to sum to no more than one person. Torturing one person is still quite bad.

Arepo · 2h ago · 2

By inference, if you are one of those copies, the 'moral worth' of your own perceived torture will therefore be 1/10-billionth of its normal level. So, selfishly, that's a huge upside - I might selfishly prefer being one of 10 billion identical torturees as long as I uniquely get a nice back scratch afterwards, for example.

MichaelStJules commented on A simple argument for the badness of human extinction 2h ago

A simple argument for the badness of human extinction

Matthew Rendall

· 13h ago · 2m read

How bad would it be to cause human extinction? ‘If we do not soon destroy ourselves’, write Carl Sagan and Richard Turco, ‘but instead survive for a typical lifetime of a successful species, there will be humans for another 10 million years or so. Assuming that our lifespan...

MichaelStJules

Maybe I'm misunderstanding, but if

* we totally discounted what happens to future/additional people (even stronger than no reason to create them), and only cared about present/necessary people, and
* killing everyone/extinction means killing all present/necessary people (extinction now, not extinction in the future) and no one else ever existing,

then, conditional on the given virus mutating

1. the first virus kills 7 billion + possibly several million more people who presently/necessarily exist, but less than 8 billion present/necessary people
2. the second virus kills everyone, 8 billion present/necessary people

2 kills more present/necessary people, so we'd want to prevent it.

EDIT: It looks like you pointed out something similar here.

Matthew Rendall

Thanks--that's very helpful. On a wide person-affecting view, A would be worse, but if we limit our analysis to present/necessary people, then outcome B would be worse. That had not occurred to me, probably because I find narrow person-affecting views so implausible.

However, it doesn't seem very damaging to my argument. If we take a hardcore narrow person-affecting view, the extra ten billion deaths shouldn’t count at all in our assessment. But surely that's very hard to believe. Alternatively, if we adopt what Parfit calls a 'two-tier view', then we’d give some weight to the deaths of the contingent people in scenario A, but less than to the deaths of present/necessary people. Even if we discounted them by a factor of five, however, scenario A would still be worse than scenario B.

What is more, we can adjust the numbers:

Scenario A: Seven billion necessary people die immediately and ten million die annually for the next 10,000 years for a total of 107 billion. Most of the future people are contingent.
Scenario B: Eight billion die at once. All are necessary people.

On the two-tier view, deaths of necessary people would have to be more than a hundred times as bad as those of contingent ones for B to be worse. That is hard to believe.

Bottom line:

1. Plausible person-affecting views will judge A better than B.
2. That A is better than B is, however, implausible.
3. ∴ No otherwise plausible person-affecting view renders a plausible judgement about this case.
4. ∴ Person-affecting views do not provide a convincing rationale for rejecting my argument against the Intuition of Neutrality.
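
The adjusted-numbers comparison in the comment above can be checked with a few lines of Python. This is only a sketch: the function names and the single `weight` parameter (how many times worse a necessary person's death is than a contingent person's) are assumptions made for illustration, and the sketch treats all 100 billion future deaths in Scenario A as contingent, whereas the comment says only that most are.

```python
# Death counts taken from Scenario A and Scenario B in the comment.
NECESSARY_DEATHS_A = 7_000_000_000
CONTINGENT_DEATHS_A = 10_000_000 * 10_000  # ten million a year for 10,000 years
NECESSARY_DEATHS_B = 8_000_000_000
TOTAL_DEATHS_A = NECESSARY_DEATHS_A + CONTINGENT_DEATHS_A


def weighted_badness(necessary, contingent, weight):
    """Badness in units of one contingent death.

    ASSUMPTION: a necessary death counts `weight` times a contingent death.
    """
    return necessary * weight + contingent


def b_is_worse(weight):
    """True if Scenario B is strictly worse than Scenario A at this weight."""
    bad_a = weighted_badness(NECESSARY_DEATHS_A, CONTINGENT_DEATHS_A, weight)
    bad_b = weighted_badness(NECESSARY_DEATHS_B, 0, weight)
    return bad_b > bad_a


# B is worse only when 8e9 * w > 7e9 * w + 100e9, i.e. w > 100.
BREAK_EVEN_WEIGHT = CONTINGENT_DEATHS_A / (NECESSARY_DEATHS_B - NECESSARY_DEATHS_A)

for w in (1, 5, 100, 101):
    print(w, b_is_worse(w))
```

Under these assumptions the break-even weight is 100, which matches the claim that necessary deaths would have to be more than a hundred times as bad as contingent ones for B to come out worse.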

MichaelStJules · 2h ago · 2

Do you intend for the population to recover in B, or extinction with no future people? In the post, you write that the second virus "will kill everybody on earth". I'd assume that means extinction.

If B (killing 8 billion necessary people) does mean extinction and you think B is better than A, then you prefer extinction to extra future deaths. And your argument seems general, e.g. we should just go extinct now to prevent the deaths of future people. If they're never born, they can't die. You'd be assigning negative value to additional deaths, but no positiv... (read more)

aaronhamlin posted a Quick Take 3h ago

aaronhamlin · 3h ago · 1

AI safety

Ethical Implications of AI in Military Operations: A Look at Project Nimbus

Recently, 'Democracy Now' highlighted Google’s involvement in Project Nimbus, a $1.2 billion initiative to provide cloud computing services to the Israeli government, including military applications. Google employees have raised concerns about the use of AI in creating 'kill lists' with minimal human oversight, as well as the usage of Google Photos to identify and detain individuals. This raises ethical questions about the role of AI in warfare and surveillance.

Despite a sit-in and retaliation against those speaking against the project, there has been little visible impact on the continuation of the contract. The most recent protesters faced arrest. What does this suggest about the power of AI in the hands of governments and the efficacy of public dissent in influencing such high-stakes deployments of AI?

Effective Altruism Forum