[Add answers] What can each Manifold user be convinced of in the next month? (Debate market)
Mini · 20 · 4.0k · resolved May 25
Resolved YES: 75% or more of the debate answers in this market resolve NO
Resolved YES: @SaviorofPlant can be convinced that P(doom) < 99%
Resolved NO: @SaviorofPlant can be convinced to stop using caffeine for a month or more
Resolved NO: Two debate answers added by separate users in this market resolve YES
Resolved NO: @SaviorofPlant can be convinced that this is a stupid market idea
Resolved NO: @TimothyJohnson5c16 can be convinced to vote for any Republican for a California statewide office in the November 2024 election
Resolved NO: @SaviorofPlant can be convinced to spend 100 dollars on anything they wouldn't have otherwise bought
Resolved NO: @SaviorofPlant can be convinced to sell a 100 mana or more position in any market they wouldn't have otherwise sold
Resolved NO: @SaviorofPlant can be convinced that LLM chatbots are not meaningfully agentic
Resolved NO: @SaviorofPlant can be convinced that the chance of GPT-5 causing doom is <1%
Resolved NO: @NBAP can be convinced to accept an eternity in limbo (empty white void) in the afterlife in exchange for some earthly reward in life.
Resolved NO: @dglid can be convinced that the Manifold Pivot is a bad business decision by the Manifold team
Resolved NO: @SaviorofPlant can be convinced to stop using Manifold
Resolved NO: Any single user causes at least 3 answers in this market to resolve YES by winning debates
Resolved NO: @JamesF can be convinced that P(doom) >50%
Resolved NO: Any comment in this market will have more than 100 replies engaging in meaningful debate
Resolved NO: @NBAP can be convinced that an eternity in a heavenly afterlife is as desirable (or more) than an eternity in a hellish afterlife is undesirable.
Resolved NO: @RobertCousineau can be convinced to take 21 days off nicotine products
Resolved NO: @Bayesian can be convinced not to donate his remaining nw to charity
Resolved NO: @Bayesian can be convinced that digital computers can never become conscious

This is an experimental market where you can add a belief you have as an answer, and users can bet on whether anyone on the site can convince you otherwise, in the comments of this market or in Manifold DMs. (You can also add answers about other users, but I'd make sure they're willing to participate first.) It's meant to be a rough equivalent to /r/CMV. The idea is for users to scope out others' beliefs, even if they don't particularly intend on debating them, and bet accordingly, creating a rough ordering of which users and subjects are likely to be easier or harder to convince.

To resolve an answer YES, simply comment or DM me that you have been convinced and who convinced you. If you naturally change your mind without talking about it with anyone on the site, I will resolve the answer NO. Meta answers about the market itself, like "Two debate answers added by separate users in this market resolve YES", are allowed. I think it's ideal if users do not bet on their own answers, since that adds perverse incentives, but I won't try to stop people from doing so.

The market close date will not change, and all remaining open answers resolve NO at that time.

@SaviorofPlant can be convinced that LLM chatbots are not meaningfully agentic

LLMs can only do things that they have been shown how to do

https://x.com/eshear/status/1790093549909753907

https://x.com/svpino/status/1791156005331665085

For example, they incorrectly answer this easy question because they cannot think/reason but only memorize

@ismellpillows

1) I disagree strongly

2) This has nothing to do with agency, which I define roughly as "having preferences and taking actions to advance those preferences"

@SaviorofPlant do you mean for example “LLMs prefer to answer things correctly, and will try to answer things correctly”?

@ismellpillows Similar type of statement, but substitute "give certain types of responses" for "answer things correctly". For example Claude has a robust preference towards flattering the user, and most chatbots have a summarization preference for many types of prompts

@SaviorofPlant How do you differentiate between having preferences and taking actions to advance them? Like, what makes it true that Claude has a preference towards flattering the user? Is it that it does it often?

@ismellpillows Yes, that's the best way to measure it in current systems. Because LLMs and LLM chatbots have relatively simple decision theory, their preferences and actions are deeply entangled. The preferences of an LLM-based agent are almost entirely determined by the likelihood values it assigns to tokens. If the decision theory were more complex and less myopic, this wouldn't necessarily be true (the model might do things often that it does not prefer but values instrumentally in advancing towards a preferred world state).

On your earlier point, which I didn't respond to besides saying "I disagree": it is true that LLMs learn functions which approximate the training distribution. However, these functions can correctly extrapolate out of distribution (a simple example is a neural network learning to add 3 digit numbers and ending up with an algorithm to add numbers of arbitrary length). This can result in behavior that they have not been shown how to do, and I expect generalization out of distribution to improve with scale.
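A minimal sketch of the distinction being drawn here, using plain Python stand-ins rather than an actual neural network: a lookup table that only memorizes 3-digit sums fails outside its training range, while a small digit-wise carry procedure handles numbers of any length. All names and data below are invented for illustration.

```python
# Toy contrast: memorization vs. a general algorithm for addition.
# This illustrates the extrapolation argument above; it is not a model of how an LLM works.

import random

# "Memorization": a lookup table built only from 3-digit + 3-digit examples.
train_pairs = [(a, b) for a in range(100, 1000) for b in range(100, 1000) if random.random() < 0.01]
lookup = {(a, b): a + b for a, b in train_pairs}

def memorized_add(a, b):
    """Returns None for any pair not seen during 'training'."""
    return lookup.get((a, b))

# "Learned function": digit-wise addition with carry, constant description length,
# works for numbers of arbitrary length.
def carry_add(a, b):
    digits_a = [int(d) for d in str(a)][::-1]
    digits_b = [int(d) for d in str(b)][::-1]
    result, carry = [], 0
    for i in range(max(len(digits_a), len(digits_b))):
        da = digits_a[i] if i < len(digits_a) else 0
        db = digits_b[i] if i < len(digits_b) else 0
        carry, digit = divmod(da + db + carry, 10)
        result.append(digit)
    if carry:
        result.append(carry)
    return int("".join(str(d) for d in result[::-1]))

print(memorized_add(123456, 654321))   # None: out of distribution for the lookup table
print(carry_add(123456, 654321))       # 777777: the general procedure extrapolates
```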

@SaviorofPlant I agree models tend to default to certain behaviors, like summarization. When their prompts include instructions to, for example, add a snarky comment at the end of their responses, they will comply and their behavior changes. Would that imply that they don't actually hold the preferences, and are like other tools that respond to user input?

I thought of some systems and I'm curious what you think classifies them as agentic or not:

  • a self-driving car system whose goal is to get to the destination while optimizing for safety, comfort, and efficiency. Is it relevant how capable the system is? Does it matter that the destination is chosen by a user (not the car)?

  • a person who only tells lies (or a similarly quirky character)

(throwing things out here)

Also, I see your point that LLMs extrapolate behavior. If they learn an algorithm for 3-digit numbers, they may apply it to numbers of arbitrary length. I also expect this ability to improve with scale. I think this is a measure of roughly how good the model is at predicting the correct next token following what they were taught, through memorizing associations with certain tokens (which is why it scales). I think this is different from how well they can "think" of things they weren't taught, e.g. coming up with the algorithm itself without seeing it in the prompt or in training, which they can't do despite knowing everything

@ismellpillows
Just found some good examples of this

https://x.com/colin_fraser/status/1632598172571942920

The LLM is trained on chess games and can generate them, but doesn't punish a queen blunder, because the punishing move isn't probable. This doesn't seem to improve with scale, based on a simple number game. Suggests LLMs are functions on user inputs and not agentic?

@ismellpillows This gets into what I call "dynamic agency" - some preferences are global, while others are localized. In an animal or human at any fixed point in time, certain preferences are prioritized over others; someone who usually values eating food may not act on that preference when they feel full. LLM chatbots function the same way, where they have broad global preferences (like being helpful and following instructions), but also a wide variety of prompt-dependent localized preferences.

My definition of agency is very simple: a system is agentic if it has some set of preferences and a space of possible decisions to make, and makes decisions to advance those preferences. This includes extremely simple systems - I would argue a thermostat is agentic, because it has a defined preference (target temperature, as perceived by sensors) and takes actions to move the worldstate towards those preferences. In a similar vein, the self driving car and lying person are both agentic.
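A minimal sketch of the thermostat example under this definition of agency (class and variable names are invented for illustration): a fixed preference, a small decision space, and decisions chosen to move the sensed state toward the preference.

```python
# Thermostat as a minimal agent: a preference (target temperature), a decision
# space {"heat", "cool", "idle"}, and decisions that push the sensed world state
# toward the preference.

class Thermostat:
    def __init__(self, target_temp: float, tolerance: float = 0.5):
        self.target_temp = target_temp   # the system's "preference"
        self.tolerance = tolerance

    def decide(self, sensed_temp: float) -> str:
        """Choose the action that moves the sensed temperature toward the target."""
        if sensed_temp < self.target_temp - self.tolerance:
            return "heat"
        if sensed_temp > self.target_temp + self.tolerance:
            return "cool"
        return "idle"

agent = Thermostat(target_temp=21.0)
for temp in (18.0, 21.2, 23.5):
    print(temp, "->", agent.decide(temp))   # heat, idle, cool
```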

Your chess example is not illustrating what you think it is. The raw model outputs likely identify those queen moves as significantly higher probability than other moves, but not as high probability as the generic moves that are selected by the sampler. This is fundamentally due to how the model is trained (it values likely continuations instead of "good" ones), and training a much smaller model to generate winning chess game continuations (or simple RLHF training for this task) would lead to selecting the correct move. It's misalignment, not lack of capabilities - instructing Bing to generate "winning moves" does not override its preferences towards high likelihood tokens. The model is still agentic - its preferences are just not what you expect them to be.
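A toy sketch of the sampling argument above. The moves and their probabilities are invented for illustration; the point is that a sampler whose preference is for likely continuations will usually pass over a winning capture that the raw distribution rates as unusual, which is a statement about what the system prefers, not about whether it has preferences at all.

```python
# Toy illustration: a sampler that prefers high-likelihood continuations will
# mostly play "generic" moves even when a winning capture has nontrivial
# probability mass. The logits below are made up for the example.

import math
import random

# Hypothetical next-move distribution from a chess-trained language model,
# in a position where the opponent has just blundered their queen.
move_logits = {
    "Nf3 (generic developing move)": 2.0,
    "e4 (generic pawn push)": 1.8,
    "Bxd8 (capture the blundered queen)": 0.5,
}

def softmax_sample(logits, temperature=1.0):
    """Sample a move in proportion to exp(logit / temperature)."""
    weights = {m: math.exp(v / temperature) for m, v in logits.items()}
    total = sum(weights.values())
    r = random.uniform(0, total)
    cumulative = 0.0
    for move, w in weights.items():
        cumulative += w
        if r <= cumulative:
            return move
    return move  # fallback for floating-point edge cases

random.seed(0)
picks = [softmax_sample(move_logits) for _ in range(1000)]
for move in move_logits:
    # The capture shows up sometimes, but the likely continuations dominate.
    print(move, picks.count(move) / len(picks))
```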

Are you familiar with the grokking literature? (e.g. https://www.lesswrong.com/posts/N6WM6hs7RQMKDhYjB/a-mechanistic-interpretability-analysis-of-grokking). This work suggests that early in language model training, models tend to lean heavily on memorization, but if they can find generic functions that generate the memorized data, these circuits will replace the memorized behavior. I would argue many LLM capabilities involve learning functions over input tokens, which require significantly fewer parameters than memorizing associations between tokens.

@SaviorofPlant I’ll think about this; thanks for the link.

LLMs’ overarching preference is probable outputs, according to training data / RLHF, and they can choose from a space of possible outputs, so it seems they are agentic by definition here. Does this sound right?

@ismellpillows That's correct. In retrospect I suppose this was not a great candidate for an option in this market, you'd have to convince me to change my definition of agency. (I really just wanted an excuse to talk about this subject, so thanks for engaging...)

@dglid can be convinced that the Manifold Pivot is a bad business decision by the Manifold team

Convince me that the Pivot is bad!

@dglid The main thing that bothers me about the pivot is that the removal of loans created tons of volatility in long-term markets as people liquidated positions. Manifold used to be excellent for predicting things years in the future; good predictors now have little incentive to participate in these markets, since you can still earn points in short-term markets.

@SaviorofPlant Agree, it's likely the pivot will make long-term markets less active. However, a potential future is that the pivot attracts so many new users that there's just as much activity as there is now. If not, then I'd argue that maybe that's a good reason for Metaculus to stick around - Manifold and Metaculus can serve truly different parts of the market rather than competing as they more or less do now.

@dglid What would you put the probability of that potential future at? It seems fairly unlikely to me - I don't think the prospect of winning pennies of real money is the main constraint on Manifold's growth.

@TimothyJohnson5c16 can be convinced to vote for any Republican for a California statewide office in the November 2024 election

My voting philosophy is similar to Scott Alexander's - I vote for Democrats by default, but I'm occasionally willing to vote for a Republican as a protest if they seem well qualified and reasonable.

I can't remember for certain, but I believe I voted for exactly one Republican in 2022. I haven't checked any details yet about the candidates running this year.

@TimothyJohnson5c16 I should add, I won't bet on this, but my initial prior is that it's a little more likely than not - somewhere around 60%.

@TimothyJohnson5c16 There don't seem to be many statewide elections this year: https://ballotpedia.org/California_elections,_2024

What is your opinion on Adam Schiff?

@SaviorofPlant I know basically nothing about Adam Schiff. I voted for Katie Porter in the primary - I don't always agree with her, but the way she uses her whiteboard seems to bring something valuable to discussions that are often lacking in hard data.

@TimothyJohnson5c16 I'm gonna be honest, I don't think I'm going to be able to convince you to vote for Steve Garvey. If only there was a gubernatorial race this year...

@NBAP can be convinced to accept an eternity in limbo (empty white void) in the afterlife in exchange for some earthly reward in life.

@NBAP do you believe in an afterlife?

@shankypanky As in P(Afterlife) > 0.5? If so, no. But for the sake of the hypothetical, assume the existence of an afterlife were undeniably demonstrated to me before I make my choice.

it would be absolutely insane to accept an eternity in limbo for literally anything that is not similarly infinite

@Bayesian It depends on your stance on infinite utilities, I would think.

@NBAP right, any stance on infinite utilities that isn't the correct one is absolutely insane

@Bayesian Can you elaborate on what the correct stance on infinite utilities is?

@NBAP that they're infinitely more important than finite utility

@Bayesian That seems fairly straightforward, but in order for me to accept that, I would first need to accept that the concept of infinite utility is even meaningful, which I'm not yet convinced of.

@NBAP if you care about some finite amount of time, and the degree to which you care about some bounded amount of time doesn't decrease at a sufficient rate that it converges after an infinite amount of time, then you necessarily care an infinite amount about an infinite amount of time

@Bayesian It might converge after an infinite amount of time, for all I know.

@NBAP that is not a meaningful sentence. "after an infinite amount of time" doesn't make sense

@Bayesian I took the exact wording from you. How would you word it more meaningfully?

@NBAP lol good point. yeah ig i should have said converge over an infinite amount of time. but yeah it's possible that it does, and for any epsilon of caring you can go far enough in the future that you don't care that amount about some bounded amount of time or wtv

@Bayesian It seems to me to be possible, in principle, that the quality of being in limbo might be such that an eternity of limbo converges to some finite disutility which might be lesser than the utility of some very idyllic lifetime on Earth. I'm not convinced that this is true, to be clear (hence why I would need to be convinced), but it does seem to me to be possible in principle.
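One way to make this convergence point concrete, assuming a geometric-discounting model of how much each successive period of limbo is weighted (the discount factor $\gamma$ is an assumption of this sketch, not something either side committed to): if each period of limbo carries disutility $d > 0$ but period $t$ is weighted by $\gamma^{t}$ with $0 < \gamma < 1$, the total disutility is finite,

$$\sum_{t=1}^{\infty} \gamma^{t}\, d \;=\; \frac{\gamma\, d}{1-\gamma},$$

and can in principle be smaller than the utility of a very idyllic earthly life. With no discounting ($\gamma = 1$), the sum $\sum_{t=1}^{\infty} d$ diverges, which is the "infinitely more important than finite utility" case argued above.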

@NBAP This seems plausible to me. You'd go crazy for a while, but eventually your brain would adapt. I've heard (questionable) stories about how people in solitary confinement for long enough start to hallucinate, and I can only imagine what would happen after 10,000 years.

I would expect your memories of life to slowly be replaced by memories entirely generated by your brain after a certain point. The description that makes the most sense would be something like an endless dream, although enough time with no stimulus might eventually destroy the ability of your brain to have any thoughts.

@SaviorofPlant I’m not sure I consider this a particularly compelling approach. I regard the prospect of losing all of my memories from my lifetime on Earth (including of the earthly rewards being traded for) to be a significant disutility in its own right.

@NBAP Compared to the alternative, you are spending like a hundred times longer with those memories. There's much more time to appreciate them if that's what you value.

@SaviorofPlant Conversely, losing memories over a prolonged period might very plausibly be a much greater disutility than losing them all abruptly (as in the case of a normal death).