Will OpenAI announce a multi-modal AI capable of any input-output modality combination by end of 2025? ($1000M subsidy)
83% chance (closes Dec 31)

Definitions

  • Modalities: This market considers four key modalities: Image, Audio, Video, Text.

  • Any Input-Output Combination: The AI should be versatile enough to accept any mixture of these modalities as input and produce any mixture as output.
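To make the "any combination" criterion concrete, here is a minimal sketch (an illustration, not part of the market's resolution rules) that enumerates every non-empty subset of the four modalities as a possible input set and output set, showing how many input-output pairings the definition implies:

```python
from itertools import combinations

MODALITIES = ["Text", "Image", "Audio", "Video"]

def nonempty_subsets(items):
    """Return every non-empty subset of the given modalities."""
    return [set(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

inputs = nonempty_subsets(MODALITIES)    # 15 possible input mixtures
outputs = nonempty_subsets(MODALITIES)   # 15 possible output mixtures
pairs = [(i, o) for i in inputs for o in outputs]

print(len(pairs))  # 225 distinct input-output combinations
```

Under this reading, a fully general model would need to support all 225 input-output pairings (15 non-empty input mixtures x 15 non-empty output mixtures).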

Examples of modality combinations:

The model can accept single- or multi-modality inputs and generate single- or multi-modality outputs.

For example:

  • Input: Text + Image, Output: Video + Audio

  • Input: Audio + Image, Output: Text + Image

  • Input: Text, Output: Video + Audio

  • Input: Video + Audio, Output: Text

Single-to-single generation examples:

The model should also handle single-modality input to single-modality output, such as:

  • Text -> Image

  • Audio -> Text

  • Image -> Video

  • Image -> Audio

  • Image -> Text

Criteria for Market Close

  • OpenAI must officially announce a model whose capabilities meet these criteria.

  • A staggered or gradual release of the model is acceptable (via the API or a user interface).

  • OpenAI must allow at least some portion of the general public to access the model.

This market was inspired by rumors about "Arrakis" and by academic work on Composable Diffusion (https://arxiv.org/pdf/2305.11846.pdf).
