Will Google Bard accurately sum 10 integers on July 1?
Resolved YES on Jul 2

I will use Python to generate 10 integers between zero and 100, then compute their sum. I will then ask Google Bard "Sum the following numbers: 12, 34, ..." and "What is 12 + 34 + ...".

The question resolves YES if both prompts get the correct answer. It may resolve a day or two later if life happens.
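
For anyone who wants to replicate this, here is a minimal sketch of the protocol above. The function names (generate_numbers, build_prompts, mentions_expected) are made up for illustration, "between zero and 100" is assumed to be inclusive at both ends, and the second prompt uses the "What is: ..." wording from the transcript further down. The reply check is only a rough heuristic; the actual resolution relies on reading Bard's answers by hand.

```python
import random
import re


def generate_numbers(n=10, low=0, high=100, seed=None):
    """Draw n random integers in [low, high], both ends inclusive (an assumption)."""
    rng = random.Random(seed)
    return [rng.randint(low, high) for _ in range(n)]


def build_prompts(numbers):
    """Build the two prompt wordings used in the experiment."""
    as_text = [str(x) for x in numbers]
    list_prompt = "Sum the following numbers: " + ", ".join(as_text)
    expr_prompt = "What is: " + " + ".join(as_text)
    return list_prompt, expr_prompt


def mentions_expected(reply, expected):
    """Rough heuristic: True if the expected sum appears as a standalone integer.

    This does not prove the reply's final answer is correct; a human should
    still read the reply before resolving.
    """
    return str(expected) in re.findall(r"\d+", reply)


numbers = generate_numbers()
expected = sum(numbers)
list_prompt, expr_prompt = build_prompts(numbers)
print(list_prompt)
print(expr_prompt)
print("Expected sum:", expected)
```

Paste each printed prompt into Bard, then compare the replies against the expected sum. Passing a seed to generate_numbers makes a run reproducible.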

This is also a trial run of using Manifold to design experiments. If the probability is above 95% or below 5%, I will consider the informational benefit of the experiment insufficient, and resolve NA. I will also gratefully accept suggestions for how to refine the protocol above. If you point out something that moves the probability in your favor, congratulations on a well-deserved reward for advancing human knowledge!
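
The resolution rule as stated above can be written down roughly like this. It is a sketch only: the function name and the convention of passing the probability as a fraction are assumptions, and "above 95%" and "below 5%" are read as strict inequalities, so a market sitting at exactly 95% still resolves on the experiment's result.

```python
def resolve(market_probability, both_correct, low=0.05, high=0.95):
    """Return "NA", "YES", or "NO" under the rule described in the market text.

    market_probability: market probability as a fraction in [0, 1].
    both_correct: True if both prompts produced the correct sum.
    """
    # Outside the (assumed strict) 5%..95% band the experiment is
    # considered uninformative, so the market resolves NA.
    if market_probability < low or market_probability > high:
        return "NA"
    return "YES" if both_correct else "NO"
```

For example, resolve(0.97, True) gives "NA", while resolve(0.95, True) gives "YES".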



YES
However, I couldn't stand to see @ShadowyZephyr's anguish at losing the well-deserved prize. My primary aim here is the meta-goal of prioritizing more expensive experiments using the wisdom of this crowd, so I propose to take the hit for the buggy mechanism design in the following way: I will bid NO with 2 mana, which (as of writing) brings the probability down to 95%, then resolve YES. If anyone can think of a better way to represent "I should have closed this a few days before the resolution date", definitely let me know within 3 hours!


-----
Sum the following numbers: 3, 51, 2, 91, 49, 38, 6, 26, 46, 82

Sure, I can help you with that. The sum of the numbers 3, 51, 2, 91, 49, 38, 6, 26, 46, and 82 is 394.

Here is the Python code that I used to calculate the sum:

Python

def sum_numbers(numbers):
  """
  Sums a list of numbers.

  Args:
    numbers: A list of numbers.

  Returns:
    The sum of the numbers in the list.
  """

  sum = 0
  for number in numbers:
    sum += number
  return sum


if __name__ == "__main__":
  numbers = [3, 51, 2, 91, 49, 38, 6, 26, 46, 82]
  sum = sum_numbers(numbers)
  print(sum)

The output of the code is:

Code snippet

394

As you can see, the sum of the numbers is 394.

What is: 3 + 51 + 2 + 91 + 49 + 38 + 6 + 26 + 46 + 82

Sure, I can help you with that.

Python

def sum_numbers():
  """Returns the sum of the numbers 3, 51, 2, 91, 49, 38, 6, 26, 46, and 82."""
  numbers = [3, 51, 2, 91, 49, 38, 6, 26, 46, 82]
  sum = 0
  for number in numbers:
    sum += number
  return sum

if __name__ == "__main__":
  print(sum_numbers())

The code above will print the following output:

Code snippet

394

As you can see, the sum of 3 + 51 + 2 + 91 + 49 + 38 + 6 + 26 + 46 + 82 is 394.

Is there anything else I can help you with?

I love being pedantic about the letter of the law as much as anyone, and realized that violating the "I won't bet" norm to satisfy the "95%" rule doesn't really improve things. So I'm just going to resolve YES and also personally resolve to come up with a better mechanism next time.

That said, the mechanism worked in the sense that the result appears to have been (pre-)replicated by at least @ShadowyZephyr and @GamblingGandalf, which means that it incentivized something that conventional scientific practice often fails to!

predicted YES

@MatthewRitter

I sold off my position at 94% because I thought you would resolve N/A 😂😂😂

bought Ṁ55 of YES

I tried it. It works.

bought Ṁ250 of YES

@ShadowyZephyr Does it always use a Python script as its reasoning?

predicted YES

@firstuserhere It's gonna resolve N/A now nooooooo

predicted YES

@ShadowyZephyr no no, it correctly sums it up

bought Ṁ306 of YES

@firstuserhere BTW it's pretty cool that it can do this now. I tried Bard when it first came out and it seems to have gotten quite a bit better

predicted YES

Couple of thoughts:

  • GPT-4 (via the ChatGPT interface) is okay at addition, but struggles a lot more with multiplication.

  • Bard currently looks like it does significantly better on multiplication, but that's probably because it's passing math tasks along to a traditional Google search backend internally. (Bing does the same thing, I think, but Bing does report when it's making use of the search API.)

Bard isn't supported in Germany :-(