Short description. Given two walks, one "truly random" and one generated by myself, determine which one has been generated by me. Resolves YES if the green walk has been created by me and NO if the red one has been.
More steps will be revealed as time goes on. For the raw data, see the comment section.
(Huge thanks to @roma for creating this embed.)
--------
Detailed description:
I have generated two "walks".
One of the strings was created by me by pressing the arrow keys (up, right, down, left) on my keyboard. I produced a string of length 300. This took me roughly 40 seconds.
Another string was generated by using a "true" source of randomness, each of the four directions being equally likely and independent of each other.
I will successively reveal more steps as time progresses. These new characters, together with the old ones, will be posted to the comment section below. [This is not the cleanest solution, but was the easiest way to automate regular updates.] I reserve the right to modify the pace of new information.
The market will resolve at the latest when all of the 300 characters have been revealed and people have made their final bets.
I have practiced this.
I may answer further questions.
I won't bet on this market.
See also: How long will this take?
For further reference, here is the final data (300 steps).
RED: UURRRULDDURRURUUULLDRRULUUDLLRLRUULDUDDULLRULLDRLLRDDURUUUDLLRULLRLDRUUDDLDDRLUDLLRDRRULDDLLUDLLDUUDDULDDDUDRDRULDDRULLLRDDDRRLDRLURDLRUDRDRUUDDDRULLLURRDDRUDDURRURRUURUDLRRLRDDUDUUDRLRRULRDURRUUDLRDDRLRLURRUDLDRLLRUUURRUUULRRRLRRULLRUURDLRRLDDLDUUDRLLUDDRRLLLUURURLLLRRLUUUDRRLUUDDLLDLDDRLLDRULDDRRL
GREEN: LDLDRLRUDDDRLDDRLULDDDRLDURURDUDDDLRLDULDLLRLDDRRUULLDURRLLDUDURLRLUUDLULLRDRLUDLLLRDDRUDLLLLDUUDLDLDLDDRUURRRLURLLRRULRRDLUDUUULDRRRRLDURDLLLLURDDURDDLLUUUURLDDULLRDLDDDUULLLDUDRUURLLLUDUDRDLDRLRLLULRDUUDLUDUDURUDUULDRRULDLRULLDLLRLRDRUDDLLDLDRURUULUULUDLDULRLUDLLLRDURULLDDUDDRUUUDLLRRDDDRLRULDLLRL
Thank you for this market! It was fun!
I'm hoping to see more puzzles like this from you.
The only thing that I would like to see different next time is more predictable times for when the new data arrives. It was hard for me to plan my work on this market because I was not sure when it would finish.
@bessarabov Thanks for the feedback! Yeah, this is something I've been struggling with. I'll think about how to improve this - one idea is to simply have a fixed-but-slow rate. The market will then last longer, but if this is known in advance, then maybe people can take that into account and not burn themselves out in the first couple of days. If anyone has other ideas, I'd appreciate those.
@Loppukilpailija I also felt that the data came out too fast. For me, the ideal would be a couple of times a day over a week.
@bessarabov Yep, great market!
I agree that it would be nice to have predictable times, like 2-4 times a day or so, so that we have fixed times for discussion. And how much new data per event could be adaptive and maybe announced in the batch before that.
For me this fast pace was actually nice since I already had allocated much free time here. But for even more participation we could stretch this over (more than) one week and we would participate less intensely, but with similar total effort.
I think @JosephNoonan currently tries a really slow pace with the random GPT market.
I still wonder if the proposed statistic (not a metric, darn it!) with the 3-thingies and the rotation was legit, no matter how small the p-value. Maybe that could be checked? OP, maybe you have some more fake data lying around from when you were practicing - then you could post it and we'd see if those sets also produce the small p-value when analyzed in the same way as this set.
@Tasty_Y Hmm, I don't think I actually have such data.
But there's one reason that makes me think that it was legit: when practicing, I had difficulty hitting each 3-length block and often had many that didn't occur in 300 blocks. I tried to fix that, but it seems reasonable that in some real sense my 3-length block frequencies were off, and this pops up if you inspect them and maybe apply some reasonable equivalence class to them.
@Tasty_Y sorry for "metric". I think I introduced that word in the bitstring market and somehow it stuck.
@nanob0nus In math context metric is always that thing: https://en.wikipedia.org/wiki/Metric_space It generalizes the notion of a distance and has to satisfy some simple axioms.
A number you calculate based on a sample to learn something about it is a statistic https://en.wikipedia.org/wiki/Statistic. It can possibly also be a metric, but it may not be.
@Tasty_Y Yeah I know, that is the second definition. You had me convinced that the non-strict meaning does not exist in English. Whether it is appropriate depends on whether we are in a math context here or not. For me as a physicist the most famous metric is the metric from general relativity, and it violates positivity. We use "norm" if we want positivity or subadditivity (the triangle inequality).
I was fed up with statistical strategies after the last market, and tried a strategy that relied on my intuition. When you put your hands on the arrows, you usually put them on up, left, and right. So I expected that the fake walk would, at least at the start, walk up. I can't say if I was lucky or smart, but it worked.
"You never know where Reality's knife will hit you from"
RETROSPECTIVE
This went great! I thought the problem was of suitable difficulty.
I mentioned that I had practiced. Elaborating on this a little:
I spent at least 2 hours actively training on the task, plus plenty more testing and thinking. I came up with various hypotheses of how I could fail, including the classic "your frequencies for length 2 blocks are way off". More seriously, I basically optimized against the following ~~metrics~~ statistics:
- Frequencies of blocks of lengths 1-3
- Frequencies of specific type of patterns of length 3 and 4 (e.g. "xxyy" is such a pattern)
- Project U, D -> V, R, L -> H and consider the block frequencies.
- Discard horizontal steps, consider the resulting U/D-string and its block frequencies (and same for vertical)
- Long streaks of predictable patterns (e.g. those matching the regexes x* or (x|y)* for some x, y in {U, R, D, L}).
- Some macrolevel structure about how far I'm at step n compared to n - 20, say.
Had all kinds of fancy Bayes-factor computations there. And I think I actually optimized quite well against those!
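For anyone who wants to replay the first few checks in the list above, here is a minimal sketch in Python. The helper names (`block_counts`, `chi_square_uniform`, `missing_blocks`, `project_axes`, `only_axis`) are made up for illustration, and a plain Pearson chi-square statistic stands in for the Bayes-factor computations, which are not spelled out here.

```python
from collections import Counter
from itertools import product

DIRECTIONS = "URDL"

def block_counts(walk: str, k: int) -> Counter:
    """Count every overlapping block of length k in the walk."""
    return Counter(walk[i:i + k] for i in range(len(walk) - k + 1))

def chi_square_uniform(walk: str, k: int, alphabet: str = DIRECTIONS) -> float:
    """Pearson chi-square statistic of the k-block counts against a uniform expectation."""
    counts = block_counts(walk, k)
    total = len(walk) - k + 1
    expected = total / len(alphabet) ** k
    return sum(
        (counts["".join(block)] - expected) ** 2 / expected
        for block in product(alphabet, repeat=k)
    )

def missing_blocks(walk: str, k: int, alphabet: str = DIRECTIONS) -> list[str]:
    """List the k-blocks over the alphabet that never occur in the walk."""
    counts = block_counts(walk, k)
    all_blocks = ("".join(block) for block in product(alphabet, repeat=k))
    return [block for block in all_blocks if block not in counts]

def project_axes(walk: str) -> str:
    """Map U, D -> V and R, L -> H."""
    return "".join("V" if step in "UD" else "H" for step in walk)

def only_axis(walk: str, keep: str) -> str:
    """Discard every step not in `keep`, e.g. keep="UD" drops the horizontal steps."""
    return "".join(step for step in walk if step in keep)
```

Usage would look like `chi_square_uniform(walk, 3)`, `chi_square_uniform(project_axes(walk), 2, alphabet="VH")` or `chi_square_uniform(only_axis(walk, "UD"), 2, alphabet="UD")`. One caveat: overlapping blocks are not independent, so for k > 1 the statistic does not follow the textbook chi-square distribution exactly; comparing against simulated random walks is the safer way to get a p-value.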
(Except, of course, for the part where I forgot to have any streak of length 4. I was entertained by capy's comment on how this was evidence for my walk being the green one.)
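The streak check is easy to run on the raw strings. A minimal sketch in Python (the function name `longest_streak` is made up for illustration):

```python
from itertools import groupby

def longest_streak(walk: str) -> tuple[str, int]:
    """Return (direction, length) of the longest run of one repeated step."""
    best_direction, best_length = "", 0
    for direction, run in groupby(walk):
        length = sum(1 for _ in run)
        if length > best_length:
            best_direction, best_length = direction, length
    return best_direction, best_length
```

For scale: in a uniform walk, any given window of four consecutive steps is a streak with probability 4 · (1/4)^4 = 1/64, so 300 steps contain about 297/64 ≈ 4.6 such windows on average.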
...but you never know where Reality's knife will hit you from. Consider frequencies of blocks of length 3, up to rotation? Sure, yes, you get a p-value of 30 per million. The interesting part is the one where the market didn't agree that this was conclusive evidence for red. (I mean, yes, traumas from the previous market, but that sure is a small p-value!) Congratulations to nanob0nus for cracking it!
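One way to read "blocks of length 3, up to rotation" is to treat two blocks as the same if a quarter-turn rotation of the whole plane maps one onto the other, which collapses the 64 possible 3-blocks into 16 classes of 4. The sketch below follows that reading; it is an assumption and not necessarily the exact procedure used in the market, and the names `canonical`, `rotation_class_counts` and `entropy_bits` are made up for illustration.

```python
from collections import Counter
from math import log2

ORDER = "URDL"  # clockwise order; a quarter turn shifts the index by one

def canonical(block: str) -> str:
    """Rotate the block so that its first step becomes 'U'."""
    shift = ORDER.index(block[0])
    return "".join(ORDER[(ORDER.index(step) - shift) % 4] for step in block)

def rotation_class_counts(walk: str, k: int = 3) -> Counter:
    """Count overlapping k-blocks after collapsing rotations of the same shape."""
    return Counter(canonical(walk[i:i + k]) for i in range(len(walk) - k + 1))

def entropy_bits(counts: Counter) -> float:
    """Shannon entropy (in bits) of the empirical distribution given by the counts."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())
```

A p-value can then be estimated by computing the same statistic on many simulated uniform walks of the same length and seeing how often they look at least as extreme.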
(And by the way, "It's easier to tap your fingers from the pinky to the index finger than the opposite side" was not something I thought about when training.)
So I think we got into a somewhat interesting territory. The first one or two or five metrics you would think to look at didn't reveal anything (because I trained for those), then you find some more obscure metric and that gives you tiny p-values and then people disagree on how important that is and oh boy.
I also think I got a bit lucky. Seemed to me like the green random walk just ran to the left, way more than your usual random walk.
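To put a number on "ran to the left", one can compute where a walk ends up. A minimal sketch in Python, assuming U/D move along y and R/L along x (the names `endpoint` and `rms_distance` are made up for illustration):

```python
from math import sqrt

STEP = {"U": (0, 1), "D": (0, -1), "R": (1, 0), "L": (-1, 0)}

def endpoint(walk: str) -> tuple[int, int]:
    """Return the (x, y) position reached after following every step from the origin."""
    x = y = 0
    for step in walk:
        dx, dy = STEP[step]
        x += dx
        y += dy
    return x, y

def rms_distance(n_steps: int) -> float:
    """Root-mean-square distance from the origin of a uniform walk after n_steps."""
    return sqrt(n_steps)
```

For a uniform walk of 300 steps the root-mean-square distance from the origin is √300 ≈ 17.3, and the x-coordinate alone has a standard deviation of √150 ≈ 12.2, which gives a yardstick for how unusual a given leftward drift is.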
I eagerly wait until the day when the game is decided by something other than naive block frequencies. Maybe next round...
@Loppukilpailija A similar idea where the path is etch-a-sketched instead of discrete might help avoid the block frequency weakness.
My original method still gives a similar result on the final data:
The y-axis is just the same thing for 4-grams, which highly correlates with the x-axis.
These are the two marginals with p-values:
Each dot is one in a million, which is the last shown digit in the percentages. Compared to the former p ≈ 0.0030% (from 126 steps), this is only a decrease by a factor of 10, so the pattern was not fully repeated since then. I have found like 10 other things with p-values around 1-3%, some for red, some for green, but nothing below 0.5%, except for these entropies of rotated n-grams, which @capybara could reproduce; he found that the walks he creates by hand give values similar to RED.
@nanob0nus Trusting the nanobonus quant fund for up to 300M of investment - will send the customary 30M management fee in case of success.
Giving up on statistical means 😭
What if OP has memorized a constant like Pi or e to a suitably large number of digits and is then generating each key press from whether the digit is even or odd, and whether it's bigger than 5? Or some such. That would be foolproof yet human generated.
Edit: Doesn't seem to match the most common constants
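A check along these lines can be sketched as follows. The details are assumptions: "bigger than 5" is read as digit ≥ 5, the two yes/no answers are combined into one of four classes, and all 24 ways of assigning the classes to directions are tried. The names `digit_class` and `matching_labellings` are made up for illustration.

```python
from itertools import permutations

# First 50 digits of pi, including the leading 3.
PI_DIGITS = "31415926535897932384626433832795028841971693993751"

def digit_class(digit: int) -> int:
    """Combine parity and size into a class 0..3 (size split at digit >= 5, an assumption)."""
    return (digit % 2) * 2 + (1 if digit >= 5 else 0)

def matching_labellings(walk: str, digits: str = PI_DIGITS) -> list[str]:
    """Return every class-to-direction assignment under which the digits reproduce the walk prefix."""
    classes = [digit_class(int(ch)) for ch in digits]
    n = min(len(walk), len(classes))
    hits = []
    for labels in permutations("URDL"):
        if all(labels[classes[i]] == walk[i] for i in range(n)):
            hits.append("".join(labels))
    return hits
```

An empty result means no assignment works for the compared prefix. Note also that this scheme would not be uniform even with perfect digits: among 0-9 the four classes contain 3, 2, 2 and 3 digits, so two directions would come up about 30% of the time and the other two about 20%.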
@nanob0nus it's not that hard to memorize a hundred pi decimals - but 300 is maybe pushing it
Linear congruential generators are indeed not so bad to do in your head. I tried the Lehmer generator with p=59, m=6, which is particularly easy mentally. I tested all starting seeds and found no match to the sequences. It would also probably show statistically - these generators are not so good, especially with the very small constants needed for mental operations.
Maybe it's a "I just use my free will"-type situation.
Edit: Also no match for p=101, m=50.
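A seed search of this kind might look like the sketch below. It assumes p is the modulus and m the multiplier of a Lehmer generator x → m·x mod p, and it maps each state to a direction via state mod 4; that mapping is purely hypothetical, since the comment does not say how states were turned into steps. The names `lehmer_walk` and `matching_seeds` are made up for illustration.

```python
def lehmer_walk(seed: int, length: int, modulus: int = 59, multiplier: int = 6) -> str:
    """Generate steps from a Lehmer generator; state mod 4 picks the direction (hypothetical mapping)."""
    steps = []
    state = seed
    for _ in range(length):
        state = state * multiplier % modulus
        steps.append("URDL"[state % 4])
    return "".join(steps)

def matching_seeds(walk: str, modulus: int = 59, multiplier: int = 6, prefix: int = 20) -> list[int]:
    """Return every nonzero seed whose output reproduces the first `prefix` steps of the walk."""
    target = walk[:prefix]
    return [
        seed
        for seed in range(1, modulus)
        if lehmer_walk(seed, len(target), modulus, multiplier) == target
    ]
```

Whatever the mapping, a generator with prime modulus 59 cycles with a period that divides 58, so a 300-step string produced this way would repeat itself several times over, which is one concrete way it would show up statistically.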
Yeah, 7.5 runs per second sounds unimaginable to me. But I'm particularly bad at doing any math mentally, and maybe it's doable for someone on the opposite end of the spectrum
One of the strings was created by me by pressing the arrow keys (up, right, down, left) on my keyboard. I produced a string of length 300. This took me roughly 40 seconds.
Did you do this in one go?
An hourly update providing new information. Number of steps revealed: 300 steps.
RED: UURRRULDDURRURUUULLDRRULUUDLLRLRUULDUDDULLRULLDRLLRDDURUUUDLLRULLRLDRUUDDLDDRLUDLLRDRRULDDLLUDLLDUUDDULDDDUDRDRULDDRULLLRDDDRRLDRLURDLRUDRDRUUDDDRULLLURRDDRUDDURRURRUURUDLRRLRDDUDUUDRLRRULRDURRUUDLRDDRLRLURRUDLDRLLRUUURRUUULRRRLRRULLRUURDLRRLDDLDUUDRLLUDDRRLLLUURURLLLRRLUUUDRRLUUDDLLDLDDRLLDRULDDRRL
GREEN: LDLDRLRUDDDRLDDRLULDDDRLDURURDUDDDLRLDULDLLRLDDRRUULLDURRLLDUDURLRLUUDLULLRDRLUDLLLRDDRUDLLLLDUUDLDLDLDDRUURRRLURLLRRULRRDLUDUUULDRRRRLDURDLLLLURDDURDDLLUUUURLDDULLRDLDDDUULLLDUDRUURLLLUDUDRDLDRLRLLULRDUUDLUDUDURUDUULDRRULDLRULLDLLRLRDRUDDLLDLDRURUULUULUDLDULRLUDLLLRDURULLDDUDDRUUUDLLRRDDDRLRULDLLRL