Is the reason GPT-4 cannot rhyme or do arithmetic well due to BPE?

I kind of want to run an experiment to see whether the reason GPT-4 cannot rhyme or do arithmetic well is Byte-Pair Encoding (BPE). I think this could be done by training gpt2-small with and without BPE. The problem is that GPT-2 small is not as capable, so the result won't be as impressive. Also, a literature search might turn up that someone has already tried this. I am buying into the YES market to incentivize myself to actually work on solving this, but I don't know if that's a good strategy.
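To make the hypothesis concrete, here's a minimal sketch of greedy BPE with a toy, hypothetical merge table (not GPT-2's actual vocabulary), showing how merges can hide both the shared suffix of rhyming words and the place-value alignment of digits:

```python
# Toy greedy BPE. The merge table below is hypothetical, for illustration
# only -- it is NOT GPT-2's real vocabulary.

def bpe_tokenize(word, merges):
    """Repeatedly merge the adjacent pair with the lowest merge rank."""
    tokens = list(word)
    while True:
        best_pair, best_i = None, None
        for i in range(len(tokens) - 1):
            pair = (tokens[i], tokens[i + 1])
            if pair in merges and (best_pair is None or merges[pair] < merges[best_pair]):
                best_pair, best_i = pair, i
        if best_pair is None:
            return tokens
        tokens[best_i:best_i + 2] = [tokens[best_i] + tokens[best_i + 1]]

# Rhyme: "breathe" and "wreathe" share the ending "-reathe", but with these
# merges one becomes a single token while the other splits as "wre" + "athe",
# so no token is shared -- the rhyme is invisible at the token level.
word_merges = {
    ("t", "h"): 0, ("th", "e"): 1, ("r", "e"): 2,
    ("b", "re"): 3, ("a", "the"): 4, ("bre", "athe"): 5,
    ("w", "re"): 6,
}
print(bpe_tokenize("breathe", word_merges))  # ['breathe']
print(bpe_tokenize("wreathe", word_merges))  # ['wre', 'athe']

# Arithmetic: the same digit string chunks differently depending on context,
# so tokens don't line up with place value.
digit_merges = {("1", "2"): 0, ("12", "3"): 1}
print(bpe_tokenize("1234", digit_merges))  # ['123', '4']
print(bpe_tokenize("234", digit_merges))   # ['2', '3', '4']
```

Training the same small model with a character-level tokenizer instead would make both the shared suffix and the digit alignment directly visible to the model, which is the comparison the experiment above is aiming at.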

The test I'll run it on is a friend's website with prompts that should trigger rhyming.

Mira:

Bing Creative is great at rhyming despite being a finetuned GPT-4. It can also do Morse code, base64, etc.

I think ChatGPT-4 is finetuned for more technical discussion, so it tends to be worse at creative writing. But the base model seems capable.

tftftftftftftftftftftftf:

@Mira Yes, I looked into this, and in fact GPT-4 can rhyme very well. I don't know what I was thinking!

NLeseul:

@tftftftftftftftftftftftf Going by the little bit of testing I've done, it's quite a bit better at rhyming (as well as meter) than 3.5 was. But it still seems unreliable at generating poems that follow a particular rhyme or meter specified in the prompt, unless it's a well-established pattern like a sonnet. I think that's mostly what the quoted tweet was getting at: they were asking for A-B-A-B patterns, and they got A-A-B-B instead.

tftftftftftftftftftftftf:

@NLeseul Yes, and this would also be impacted by BPE!