Jason commented on Nathan Young's quick take 42m ago

It seems plausible to me that those involved in Nonlinear have received more social sanction than those involved in FTX, even though the latter was obviously more harmful to this community and the world.

What does "involved in" mean? The most potentially plausible version of this compares people peripherally involved in FTX (under a broad definition) to the main players in Nonlinear.

titotal6h11

I think jailtime counts as social sanction!

Nathan Young posted a Quick Take 6h ago

Nathan Young6h12

An alternate stance on moderation (from @Habryka.)

This is from this comment responding to this post about there being too many bans on LessWrong. Note that LessWrong is less moderated than here in that it (I guess) responds to individual posts less often, but more moderated in that (I guess) it rate-limits people more without reason.

I found it thought-provoking. I'd recommend reading it.

Thanks for making this post!
One of the reasons why I like rate-limits instead of bans is that it allows people to complain about the rate-limiting and to participate in discussion on their own posts (so seeing a harsh rate-limit of something like "1 comment per 3 days" is not equivalent to a general ban from LessWrong, but should be more interpreted as "please comment primarily on your own posts", though of course it shares many important properties of a ban).

This is pretty much the opposite of the EA Forum's approach, which favours bans.

Things that seem most important to bring up in terms of moderation philosophy:
Moderation on LessWrong does not depend on effort
"Another thing I've noticed is that almost all the users are trying. They are trying to use rationality, trying to understand what's been written here, trying to apply Bayes' rule or understand AI. Even some of the users with negative karma are trying, just having more difficulty."
Just because someone is genuinely trying to contribute to LessWrong, does not mean LessWrong is a good place for them. LessWrong has a particular culture, with particular standards and particular interests, and I think many people, even if they are genuinely trying, don't fit well within that culture and those standards.
In making rate-limiting decisions like this I don't pay much attention to whether the user in question is "genuinely

...

Bella commented on Killing the moths 2h ago

211

Killing the moths

Bella

· 21d ago · 6m read

This post was partly inspired by, and shares some themes with, this Joe Carlsmith post. My post (unsurprisingly) expresses fewer concepts with less clarity and resonance, but is hopefully of some value regardless.

Content warning: description of animal death.

I live in a ...

Sam Freedman

Thanks for sharing this. I had a similar experience recently in which I treated a wool carpet with cypermethrin (an insecticide widely used to kill clothes moths). This or a similar compound is likely to be what was used in your case as well. It is relatively safe in humans and many mammals in the concentrations present in pest control products. Its mechanism of action is to bind with and disrupt sodium ion channels in the central nervous system of insects. It causes excessive firing of neurons and death. My suggestion to reduce infestations of moths and eventual suffering is to pre-treat natural fibres which are not washed regularly (like carpets) with a cypermethrin (or other pyrethroid) containing product at around 0.1% concentration. These will act as repellents and prevent the reproduction of moths before they become established and prevent future suffering. It should only be used indoors to prevent exposure to non-pest species as it is broadly toxic to insects and many vertebrates. These products are widely available on Amazon. (PS these products might be more toxic to cats for some reason. Bear this in mind when using them)

Bella2h2

Hey Sam — thanks for this really helpful comment. I think I will do this & do so at any future places I live with wool carpets.

Abdurrahman Alshanqeeti commented on Introducing EA in Arabic 2h ago

Introducing EA in Arabic

Abdurrahman Alshanqeeti

· 14d ago · 4m read

I am thrilled to introduce EA in Arabic (الإحسان الفعال), a pioneering initiative aimed at bringing the principles of effective altruism to Arabic-speaking communities worldwide.

Summary

Spoken by more than 400 million people worldwide, Arabic plays a pivotal role...

Abdurrahman Alshanqeeti2h1

Thank you Elham, I'm so happy to see your comment!

Honestly, this is an issue that I'm kinda struggling with too. Would you like to have a quick call to discuss our experiences and maybe collaborate on something?

SummaryBot commented on Partial value takeover without world takeover 2h ago

Partial value takeover without world takeover

Katja_Grace

· 12h ago

People around me are very interested in AI taking over the world, so a big question is under what circumstances a system might be able to do that—what kind of capabilities could elevate an entity above the melange of inter-agent conflict and into solipsistic hegemony?

We...

SummaryBot2h1

Executive summary: AI systems with unusual values may be able to substantially influence the future without needing to take over the world, by gradually shifting human values through persuasion and cultural influence.

Key points:

Human values and preferences are malleable over time, so an AI system could potentially shift them without needing to hide its motives and take over the world.
An AI could promote its unusual values through writing, videos, social media, and other forms of cultural influence, especially if it is highly intelligent and eloquent.
Partia

...

OscarD

Nice post! This seems like a key point to me, one that it is hard to get good evidence on. The red stripes are rather benign, so we are in luck in a world like that. But if the AI values something in a more totalising way (not just satisficing with a lot of x's and red stripes being enough, but striving to make all humans spend all their time making x's and stripes) that seems problematic for us. Perhaps it depends on how 'grabby' the values are, and therefore how compatible with a liberal, pluralistic, multipolar world.

SummaryBot commented on LLM Evaluators Recognize and Favor Their Own Generations 2h ago

LLM Evaluators Recognize and Favor Their Own Generations

Arjun Panickssery

· 18h ago

This is a linkpost for http://tiny.cc/llm_self_recognition

Self-evaluation using LLMs is used in reward modeling, model-based benchmarks like GPTScore and AlpacaEval, self-refinement, and constitutional AI. LLMs have been shown to be accurate at approximating human annotators on some tasks.

But these methods are threatened by self...

SummaryBot2h1

Executive summary: Frontier language models exhibit self-preference when evaluating text outputs, favoring their own generations over those from other models or humans, and this bias appears to be causally linked to their ability to recognize their own outputs.

Key points:

Self-evaluation using language models is used in various AI alignment techniques but is threatened by self-preference bias.
Experiments show that frontier language models exhibit both self-preference and self-recognition ability when evaluating text summaries.
Fine-tuning language models to

...

Hauke Hillebrandt

Cool instance of black box evaluation - seems like a relatively simple study technically but really informative. Do you have more ideas for future research along those lines you'd like to see?

jimrandomh

16h

Interesting. I think I can tell an intuitive story for why this would be the case, but I'm unsure whether that intuitive story would predict all the details of which models recognize and prefer which other models. As an intuition pump, consider asking an LLM a subjective multiple-choice question, then taking that answer and asking a second LLM to evaluate it. The evaluation task implicitly asks the evaluator to answer the same question, then cross-check the results. If the two LLMs are instances of the same model, their answers will be more strongly correlated than if they're different models; so they're more likely to mark the answer correct if they're the same model. This would also happen if you substitute two humans or two sittings of the same human in place of the LLMs.
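The intuition pump above can be sketched as a toy calculation. This is only an illustration under loud assumptions: each "model" is reduced to a random probability distribution over the answer options for each question, the evaluator marks an answer correct exactly when its own sampled answer matches, and all names (`random_distribution`, `agreement_probability`, `simulate`) are hypothetical. It says nothing about how real LLMs behave or about the linked paper's method.

```python
import random


def random_distribution(rng, n_options):
    """Toy 'model': a random probability distribution over answer options."""
    weights = [rng.random() for _ in range(n_options)]
    total = sum(weights)
    return [w / total for w in weights]


def agreement_probability(answerer, evaluator):
    """Chance that independent samples from the two distributions coincide,
    i.e. that the evaluator would mark the answerer's choice as correct."""
    return sum(p * q for p, q in zip(answerer, evaluator))


def simulate(n_questions=1000, n_options=4, seed=0):
    """Average agreement for same-model pairs versus different-model pairs."""
    rng = random.Random(seed)
    same_total, cross_total = 0.0, 0.0
    for _ in range(n_questions):
        model_a = random_distribution(rng, n_options)
        model_b = random_distribution(rng, n_options)
        # Same model answering and evaluating (averaged over both models).
        same_total += (
            agreement_probability(model_a, model_a)
            + agreement_probability(model_b, model_b)
        ) / 2
        # One model answering, the other evaluating.
        cross_total += agreement_probability(model_a, model_b)
    return same_total / n_questions, cross_total / n_questions


same_rate, cross_rate = simulate()
print(f"same-model agreement:      {same_rate:.3f}")
print(f"different-model agreement: {cross_rate:.3f}")
```

In this toy setup the same-model rate can never fall below the cross-model rate, because (p·p + q·q)/2 ≥ p·q for any two distributions p and q. That matches the direction of the story, though not necessarily its size in practice.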

SummaryBot commented on Impactful animal welfare charity worthy of your donations: FRAME - Fund for the Replacement of Animals in Medical Experiments 2h ago

Impactful animal welfare charity worthy of your donations: FRAME - Fund for the Replacement of Animals in Medical Experiments

Deborah W.A. Foulkes

· 6h ago · 6m read

This is a linkpost for https://frame.org.uk/who-we-are/impact-report/

Excerpt from Impact Report 2022-2023

"Creating a better future, for animals and humans

£242,510 of research funded

3 Home Office meetings attended

5 PhDs completed through the FRAME Lab

33 people attended our Training School in Norway and our experimental design training...

SummaryBot2h1

Executive summary: FRAME (Fund for the Replacement of Animals in Medical Experiments) is an impactful animal welfare charity working to end the use of animals in biomedical research and testing by funding research into non-animal methods, educating scientists, and advocating for policy changes.

Key points:

In 2022, FRAME funded £242,510 of research into non-animal methods, supported 5 PhD students, and trained 33 people in experimental design.
The FRAME Lab at the University of Nottingham focuses on developing and validating non-animal approaches in areas lik

165

Future of Humanity Institute 2005-2024: Final Report

Pablo

· 1d ago · 6m read

This is a linkpost for https://static1.squarespace.com/static/660e95991cf0293c2463bcc8/t/661a3fc3cecceb2b8ffce80d/1712996303164/FHI+Final+Report.pdf

Anders Sandberg has written a “final report” released simultaneously with the announcement of FHI’s closure. The abstract and an excerpt follow.

Normally manifestos are written first, and then hopefully stimulate actors to implement their vision. This document is the reverse

...

SummaryBot2h1

Executive summary: The Future of Humanity Institute (FHI) achieved notable successes in its mission from 2005-2024 through long-term research perspectives, interdisciplinary work, and adaptable operations, though challenges included university politics, communication gaps, and scaling issues.

Key points:

Long-term research perspectives and pre-paradigmatic topics were key to FHI's impact, enabled by stable funding.
An interdisciplinary and diverse team was valuable for tackling neglected research areas.
Operations staff needed to understand the mission as it g

...

MathiasKB

I'm awestruck, that is an incredible track record. Thanks for taking the time to write this out. These are concepts and ideas I regularly use throughout my week and which have significantly shaped my thinking. A deep thanks to everyone who has contributed to FHI, your work certainly had an influence on me.

Chris Leong

14h

For anyone wondering about the definition of macrostrategy, the EA forum defines it as follows:

Kaspar Brandner commented on Repugnance and replacement 4h ago

Repugnance and replacement

MichaelStJules

· 8d ago · 11m read

Summary

Many views, including even some person-affecting views, endorse the repugnant conclusion (and very repugnant conclusion) when set up as a choice between three options, with a benign addition option.
Many consequentialist(-ish) views, including many person-affecting

...

Kaspar Brandner

14h

I wouldn't agree on the first point, because making Dasgupta's step 1 the "step 1" is, as far as I can tell, not justified by any basic principles. Ruling out Z first seems more plausible, as Z negatively affects the present people, even quite strongly so compared to A and A+. Ruling out A+ is only motivated by an arbitrary-seeming decision to compare just A+ and Z first, merely because they have the same population size (...so what?). The fact that non-existence is not involved here (a comparison to A) is just a result of that decision, not of there really existing just two options. Alternatively there is the regret argument, that we would "realize", after choosing A+, that we made a mistake, but that intuition seems not based on some strong principle either. (The intuition could also be misleading because we perhaps don't tend to imagine A+ as locked in). I agree though that the classification "person-affecting" alone probably doesn't capture a lot of potential intricacies of various proposals.

MichaelStJules

10h

We should separate whether the view is well-motivated from whether it's compatible with "ethics being about affecting persons". It's based only on comparisons between counterparts, never between existence and nonexistence. That seems compatible with "ethics being about affecting persons".

We should also separate plausibility from whether it would follow on stricter interpretations of "ethics being about affecting persons". An even stricter interpretation would also tell us to give less weight to or ignore nonidentity differences using essentially the same arguments you make for A+ over Z, so I think your arguments prove too much. For example,

1. Alice with welfare level 10 and 1 million people with welfare level 1 each
2. Alice with welfare level 4 and 1 million different people with welfare level 4 each

You said "Ruling out Z first seems more plausible, as Z negatively affects the present people, even quite strongly so compared to A and A+." The same argument would support 1 over 2.

Then you said "Ruling out A+ is only motivated by an arbitrary-seeming decision to compare just A+ and Z first, merely because they have the same population size (...so what?)." Similarly, I could say "Picking 2 is only motivated by an arbitrary decision to compare contingent people, merely because there's a minimum number of contingent people across outcomes (... so what?)"

So, similar arguments support narrow person-affecting views over wide ones.

I think ignoring irrelevant alternatives has some independent appeal. Dasgupta's view does that at step 1, but not at step 2. So, it doesn't always ignore them, but it ignores them more than necessitarianism does.

I can further motivate Dasgupta's view, or something similar:

1. There are some "more objective" facts about axiology or what we should do that don't depend on who presently, actually or across all outcomes necessarily exists (or even wide versions of this). What we should do is first constrained by these "more object

Kaspar Brandner4h1

You said "Ruling out Z first seems more plausible, as Z negatively affects the present people, even quite strongly so compared to A and A+." The same argument would support 1 over 2.

Granted, but this example presents just a binary choice, with none of the added complexity of choosing between three options, so we can't infer much from it.

Then you said "Ruling out A+ is only motivated by an arbitrary-seeming decision to compare just A+ and Z first, merely because they have the same population size (...so what?)." Similarly, I could say "Picking 2 is onl

Cooperative AI: Three things that confused me as a beginner (and my current understanding)

C Tilli

· 2d ago · 7m read

I started working in cooperative AI almost a year ago, and because it is an emerging field, I found it quite confusing at times, since there is very little introductory material aimed at beginners. My hope with this post is that by summing up my own confusions and how I understand them...

C Tilli5h1

Thank you Shaun!

I found myself wondering where we would fit AI Law / AI Policy into that model.

I would think policy work might be spread out over the landscape? As an example, if we think of policy work aiming to establish the use of certain evaluations of systems, such evaluations could target different kinds of risk/qualities that would map to different parts of the diagram?

Effective Altruism Forum