Will the project "Feeling machines: Brain-inspired artificial general intelligence" receive any funding from the Clearer Thinking Regranting program run by ClearerThinking.org?
Remember, betting in this market is not the only way you can have a shot at winning part of the $13,000 in cash prizes! As explained here, you can also win money by sharing information or arguments that change our mind about which projects to fund or how much to fund them. If you have an argument or public information for or against this project, share it as a comment below. If you have private information or information that has the potential to harm anyone, please send it to clearerthinkingregrants@gmail.com instead.
Below, you can find some selected quotes from the public copy of the application. The text beneath each heading was written by the applicant. Alternatively, you can click here to see the entire public portion of their application.
Why the applicant thinks we should fund this project
Implementing machines with computational analogues of affective feelings (valence/emotion/moods) presents unique challenges to AI safety that are currently underrepresented and not well understood. This project should be funded to increase awareness and understanding of this area of AI alignment research, and to foster more research in AI alignment that connects directly to work in computational neuroscience.
Project Activities
The project involves translating academic literature on how aspects of mentality underwrite general intelligence, and the implications of this for AI safety, into a maximally accessible and didactic format (mostly, long-read posts for the Alignment Forum).
The first step is to make existing academic research (including my own) on how self-modelling serves adaptive and intelligent behaviour in biological agents accessible to a broad readership in the alignment community. This part of the project identifies the role played by affective feelings in facilitating domain-general intelligence. Specifically, it will consider key factors identified by AI researchers as lacking in current systems, and provide an account of how biological agents overcome these challenges. The focus is on how feeling i) guides the balance between exploration and exploitation; ii) supports the flexible formation and (goal-directed) manipulation of abstractions; and iii) underwrites our capacity for metareasoning and attentional allocation through mental action (selecting which computations to perform). The second step is a new strand of research, which uses this lens on general intelligence to consider particular issues in AI alignment.
How much funding are they requesting?
$50,000
Here you can review the entire public portion of the application (which contains a lot more information about the applicant and their project):
https://docs.google.com/document/d/1qJ70dKpK-tzdySzMxgC42wjEeBRqEvu6vVBl1aOGZEk/edit#
While some of the points are reasonable (such as the potential usefulness of more complex state-value attributions for systems like the AlphaZero line), I think the author over-attributes these patterns to brain-inspired computing in particular (for instance, "affective coding" that focuses on human-like emotion-analogues, rather than just the general idea of hard-coding categorization patterns into an AI's initial structure).
Creative thought and knowledge of past AI systems whose ideas haven't yet been ported to modern paradigms are more than sufficient to generate some (I'm inclined to say most) of the general information-processing patterns present in psychological/neurological work but not in ML, without requiring a deep dive into a complex sub-field of biology.
For instance, affective coding, which is the example the application focuses on, is easy to think of merely by trying to relate old symbolic AI work to MCTS RL agents, and applying symbolic/hard-coded AI patterns to ML is a paradigm shift that is already in progress, without any clear benefit from trying to analogize neuroscientific concepts (which are generally based on systems produced evolutionarily for a very specific environment).
Additionally, there is no reason for the details of how an AI system assigns intermediate state-value-encodings to relate to human affect; rather, it seems that the optimal configuration would be determined by the structure of the agent itself, and the type of problems it is geared towards.
Similarly, priority-based simulations of environment characteristics (a concept which seems to cover all salient aspects of "goal-oriented imagination", which was mentioned in the project plan) are already in the limelight e.g. via the structure of MCTS in MuZero Reanalyze and EfficientZero, and I don't think a neuroscientific approach has much to add in this regard.
I am less familiar with predictive processing as a subfield of neuroscience, but from what I can understand of the application itself, it seems to be relevant mainly via a relation to affective coding, namely that the different types of valence are ways of compressing predictive information about relevant aspects of the world-state by structuring the present-model to have values which the system is predisposed to use for storing said information.
As before, this runs into the issue that while control-theoretic concepts are useful to integrate into decision-making AI systems, the type of information the system needs to keep track of has no reason to be correlated with the type of information tracked by human emotions (except in communication, where some industrial AI systems, many admittedly non-ML, already try to match human emotions for purposes like conversation/customer service and sentiment analysis).
In addition, assuming that neuroscience is applicable to this topic, the application's treatment suggests it would mainly serve to enhance the effectiveness of AI systems in ways that encourage power-seeking (by encouraging them to seek homeostasis on factors that approximately track aspects of the world-state). That seems contrary to the goal of ensuring that powerful AI systems are aligned to human interests (which, even outside EA and the AGI movement, is a widely acknowledged ethical issue with AI). It is possible that the way emotions act simultaneously on multiple brain regions could help with subagent alignment, but as before, I am not inclined to expect neuroscience in particular to help with this, considering that the human brain isn't especially good at subagent alignment in the first place (or rather, at de-compartmentalization, which interacts with the aspects of human-internal alignment concerning synchronization among brain regions, i.e. exactly the portions I would expect emotion-based design patterns to help with).
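To make the "seek homeostasis on factors that approximately track aspects of the world-state" point concrete, here is a minimal sketch of a drive-reduction-style objective. This is my own illustration (the variable names are made up), not anything taken from the application; maximizing such a reward over long horizons gives an agent an instrumental incentive to control whatever in the environment moves the tracked variables.

```python
def homeostatic_reward(tracked_state, setpoints, weights):
    """Drive-reduction-style reward: zero when every tracked variable sits
    at its setpoint, increasingly negative as variables drift away.
    Maximizing this over time pushes the agent to control whatever parts
    of the environment influence these variables (the power-seeking worry)."""
    return -sum(
        w * (tracked_state[k] - setpoints[k]) ** 2
        for k, w in weights.items()
    )

# Toy usage with made-up variables standing in for "factors that
# approximately track aspects of the world-state".
state     = {"energy": 0.4, "temperature": 0.9, "social_standing": 0.2}
setpoints = {"energy": 0.8, "temperature": 0.7, "social_standing": 0.6}
weights   = {"energy": 1.0, "temperature": 0.5, "social_standing": 2.0}
print(homeostatic_reward(state, setpoints, weights))  # approx. -0.5
```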
As such, while there are ideas in neuroscience applicable to ML, I don't think that an actual attempt to link neuroscience to ML will accelerate AGI-related development at all; and if it does, it is likely to be in ways that increase capabilities without increasing alignment, which is rather undesirable. So I doubt this particular project will be funded.
I had a look at this applicant’s recently-published paper, “Machines That Feel and Think: The Role of Affective Feelings and Mental Action in (Artificial) General Intelligence,” and came away thinking that the applicant isn’t focused enough on the AI safety implications of his work.
In the article, it seems that he is trying to pave the way for the development of AGI (he also argues that the capacity for affective experiences will be needed for AGI to be developed at all), but, in the article at least, he gives no indication that he is focused on the safety implications of his arguments.
I didn’t find any mention in this applicant’s paper of the need to make sure that AGI is developed safely. This makes me a bit concerned about the degree to which this applicant would be focusing on the safety implications of his arguments (as opposed to merely trying to increase the probability that AGI is reached in the first place).
Just to give you an idea, here's his closing line (which, like the rest of his article, focuses on the next steps to AGI rather than the next steps to making sure that AGI develops safely): "The next steps in understanding how to realise similar mechanisms in machines may come from clearer understanding as to the role phylogeny and ontogeny play in realising these mechanisms in biological agents."
I don't know what the outcome will be, but if my concerns with the article are not misplaced, I hope that this doesn’t get funded by CTR.
@mukimameaddict Ok yeah re-reading the abstract I agree this is concerning. The applicant mentions concerns like this in the proposal:
The only way I can conceive of this being harmful is if it increases the likelihood of brain-inspired AGI through highlighting current progress in computational neuroscience that is directly relevant to the creation of AGI. I am a bit concerned about this but on balance I think it’s better that we start considering the safety issues to these technologies as soon as possible, as it is likely to happen anyway (DeepMind explicitly draw on neuroscience-inspired AI).
But I'm still not very excited about work that argues that X will be needed for AGI, especially if it goes into some level of detail about X, even if I think X probably isn't needed.
@EliLifland I’m still betting up to ~25% because I think it’s possible this issue is assuaged to some extent via conversations with the applicant and references, e.g. if the applicant credibly intends to pivot in a much more safety-focused direction. But I now think significantly under 50% is appropriate.
@MichaelDickens Depends on presence of limit orders, and how close you are to 0 or 1 (the closer, the more it takes to move it yet closer).
It looks like both were in play there: they were moving it to 17%, and there was a limit order that got fully bought.
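For intuition on the "closer to 0 or 1" point: with an automated market maker, pushing the probability a fixed distance costs far more near the extremes. Manifold uses its own CPMM rather than Hanson's LMSR, so the numbers below are only illustrative, but the qualitative effect is the same. A rough sketch:

```python
import math

def cost_to_move(p_from, p_to, b=100.0):
    """Cost of buying YES shares in a binary LMSR market with liquidity
    parameter b until the price rises from p_from to p_to (requires
    p_to > p_from). Follows from the LMSR cost function
    C(q) = b * log(exp(q_yes/b) + exp(q_no/b))."""
    return b * math.log((1 - p_from) / (1 - p_to))

print(cost_to_move(0.50, 0.60))  # ~22
print(cost_to_move(0.90, 0.99))  # ~230: same-sized move, far costlier near 1
```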
Aim of the project:
What are the activities that your project involves?
The project involves translating academic literature on how aspects of mentality underwrite general intelligence, and the implications of this for AI safety, into a maximally accessible and didactic format (mostly, long-read posts for the Alignment Forum).
A theme from the comments:
That said, as a layperson I can confirm that the author has a serious issue making ideas accessible.
Hard to have confidence in the translational impact when clarity is a key obstacle.
_____
I'm thinking of the $20k in AI Safety Bounties, where winners get $2,000 each (up to 10 winners max, so it assumes some filtering for quality). Say we go with that rate: at the minimum amount Clearer Thinking is offering, $10,000, we'd get about 5 posts. By that point it should be clear whether this is worth pursuing further, even by the stated indicators ("engagement and discussion with the posts on the Alignment Forum").
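A quick back-of-the-envelope at that assumed $2,000-per-post rate (my own restatement of the numbers above, nothing more):

```python
rate_per_post = 2_000  # AI Safety Bounties payout per winner, used as a benchmark

for funding in (10_000, 50_000):  # CTR minimum vs. the requested amount
    print(f"${funding:,} -> ~{funding // rate_per_post} posts at the bounty rate")
```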
Staying conservative on this, I'm very much not the target audience. The actual contents could be brilliant or utter garbage and I'd have no idea.
On the positive side, it's a cheap project, and the subject area is popular. Clearer Thinking seemed engaged with the concept in their follow-up questions in the grant, which is a plus too. He has a position with the Digital Minds Project, which, at least to an outsider like me, gives some reassurance that this isn't secretly written by an academic BS generator.
This project likely fits under the 'AI Ethics' section of FTX's project ideas on their website. The examples given there are more ambitious, though: "new textbooks in this area and/or integration with major AI labs" is far beyond this project. This would need to be seen as "building up capacity, expertise, and reputation in AI ethics." It seems like a weak fit, but it is technically a fit, and most of the alternatives in this competition aren't great either.
That said, as a layperson I can confirm that the author has a serious issue making ideas accessible. It's stated as a weakness in the app, but they really need to work on it. I agree with @BTE: this is utterly impenetrable. I want to learn more about this area but have no intention of reading this.
The prediction question is whether it will receive any funds, which helps a lot. I can see this getting the minimum amount to encourage more thoughtful discussions on the EA forum, but the requested $50k feels very high. When the odds are 35%, I'm willing to buy a little.
Every single thing they say about reinforcement learning is so wrong they must not have read a single paper in the area. And they apparently have access to no one who could keep them from looking foolish.
Their entire experience is writing about how psychedelics helped them achieve selflessness—and making up “theor[ies] of consciousness”
Leaving aside that their only goal is to make online posts
“[OPTIONAL] In the future, what will be the most important indicators that your project is succeeding?
Concrete indicators of the success of this project would be engagement and discussion with the posts on the Alignment Forum.”
They don’t know what they’re talking about.
Leaving aside that their only goal is to make online posts
Online posts can be very influential
I think previous commenters are being way too harsh on this proposal. While I haven't looked into all the details, my case that this should likely be funded is:
Making AI go better is the most important priority for improving the long-term future (on the margin), by a substantial amount.
While brain-like AGI safety isn't the agenda I'm most excited about, there's a significant (I'd say ~15%) probability that we'll build brain-like AGI (via e.g. brain-inspired RL; I think WBE is unlikely). So it seems reasonable to have at least a few people working on brain-like AGI safety, and Steven Byrnes (and perhaps a few people aiding him?) is the only person I know of who is currently working on this.
The applicant is working with Yoshua Bengio and Nick Bostrom, who both should be able to provide very useful direction on the project.
@EliLifland I should add that I've read through the application and while I'd guess I'm not as excited about the direction as the applicant, it doesn't strike me as obviously unreasonable, and the applicant obviously has some level of relevant expertise. I would be curious to hear an opinion from someone with more expertise in this area though.
@EliLifland One thing I am a bit confused about is why the applicant applied to Clearer Thinking rather than LTFF
Highly unpopular opinion: Nick Bostrom is a fake philosopher who writes science fiction and has NOTHING useful to contribute to someone trying to take a novel approach to AI.
This proposal amounts to “I would like to turn my blogging into a job, please pay me $50k to do what I am going to do anyway because you are entertained by it.”
I think they should start a Patreon.
@EliLifland If by that you mean "entertaining", sure, it was. But if you mean "good philosophy", then no, not at all.
“it is untenable to scale this approach up to behaviour as complex as human behaviour, where “a theoretical maximum of about six thousand actuations per second” would let an algorithm like AlphaZero peer “only a few seconds into the future,” (Russell, 2019, p. 476) as the action space is vastly larger than the confined space of a Go board. Russell argues that without metareasoning—inference about which computations to perform—much of the computational power is wasted on considering fruitless future scenarios….”
This is utter garbage.
The brain has hundreds of trillions (or more) of activations per second; Go engines are exceptional even at 300 playouts per second (see the KataGo paper); and they don't even know the basics of MCTS.
The claim that there is no "inference about which computations to perform" shows they can't have understood even the abstract/outline of any RL paper of the last two decades.
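For what it's worth, the selection step in AlphaZero-style MCTS already is a (hard-coded) form of deciding which computations to perform: each simulation is spent on the child node maximizing a PUCT score that trades off estimated value against prior-weighted uncertainty. A minimal sketch of that standard rule (my own illustration, not code from any system cited here):

```python
import math

def puct_score(child, parent_visits, c_puct=1.5):
    """PUCT rule used in AlphaZero-style MCTS: exploit high value
    estimates while still exploring children with high prior probability
    and few visits."""
    q = child["value_sum"] / child["visits"] if child["visits"] else 0.0
    u = c_puct * child["prior"] * math.sqrt(parent_visits) / (1 + child["visits"])
    return q + u

def select_child(children, parent_visits):
    """Pick which branch (i.e. which computation) the next simulation is
    spent on: the allocation-of-compute step at issue in the quote."""
    return max(children, key=lambda c: puct_score(c, parent_visits))

# Toy usage: three candidate moves with priors and running statistics.
children = [
    {"prior": 0.6, "visits": 10, "value_sum": 5.5},
    {"prior": 0.3, "visits": 2,  "value_sum": 1.4},
    {"prior": 0.1, "visits": 0,  "value_sum": 0.0},
]
print(select_child(children, parent_visits=12))
```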
@Gigacasting I definitely update further towards thinking that this should not be approved unless unrelated academics confirm that the applicant's work is considered central in the area and would form a good summary of it for non-domain experts.
A grant for a literature review covering the work of academics in the area broadly is something I think would be pretty interesting, but I'm currently predicting that this doesn't qualify.