Will I have access to a program that can reliably determine pronoun and verb referents by the end of 2023?

resolved Apr 16


I want to be able to have a computer program change the gender of a person in a passage. This is a hard problem. Consider the sentence:

Alice went to the store because she was thirsty.

Changing the gender of the subject to male requires changing the pronoun "she" to "he".

Bob went to the store because he was thirsty.

Changing their gender to neutral is harder, because the verb "was" must now be conjugated differently as well.

Alex went to the store because they were thirsty.
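To make the difficulty concrete, here is a minimal sketch of the naive approach: swap the subject pronoun and re-conjugate a "was" that immediately follows it. The table names and the `regender` function are hypothetical, made up for illustration. It assumes every lowercase "she" in the passage refers to the person being changed, and it leaves the name itself alone.

```python
import re

# Hypothetical lookup tables, for illustration only.
SUBJECT_PRONOUN = {"female": "she", "male": "he", "neutral": "they"}
# Past-tense "to be" agreeing with each subject pronoun.
PAST_BE = {"she": "was", "he": "was", "they": "were"}


def regender(sentence: str, old: str, new: str) -> str:
    """Swap the subject pronoun and fix a directly adjacent was/were.

    Assumes every occurrence of the old pronoun refers to the same
    person, which is exactly the assumption that fails in general.
    """
    old_p = SUBJECT_PRONOUN[old]
    new_p = SUBJECT_PRONOUN[new]
    # Match the pronoun, optionally followed by " was" or " were".
    pattern = re.compile(rf"\b{old_p}\b( (?:was|were)\b)?")

    def repl(match: re.Match) -> str:
        # Re-conjugate only when a verb was captured right after the pronoun.
        verb = f" {PAST_BE[new_p]}" if match.group(1) else ""
        return new_p + verb

    return pattern.sub(repl, sentence)


print(regender("Alice went to the store because she was thirsty.", "female", "neutral"))
```

This handles the example above, but only because the verb sits directly next to its pronoun. It has no idea who a pronoun refers to when several people are mentioned, and it misses any verb that is separated from its subject.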

The hard part is figuring out which pronouns refer to which people (pronoun coreference). The best tool I've found for this so far is Hugging Face's NeuralCoref 4.0, which, like every contemporary AI tool tasked with a real-world problem, does just well enough to give you hope, then completely lets you down as soon as it encounters the slightest hiccup.

The other problem is similar: figuring out which verbs need to be re-conjugated and which subject each one agrees with (subject-verb agreement). In the sentence:

Alice was thirsty, so she went to the store and was disappointed that apple juice was out of stock.

The first "was" agrees with the noun "Alice", and would not need to change if Alice's gender changed to neutral. The next verb is "went", which has the same form for every subject and so doesn't need to change either. But the next "was" takes the pronoun "she" as its subject, and would need to change to "were" if that pronoun became the grammatically plural "they", as in:

Alex was thirsty, so they went to the store and were disappointed that apple juice was out of stock.

And then the last "was" agrees with the noun "apple juice", and also does not need to change.
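Once each pronoun and verb has been labeled with the person it depends on, the rendering step itself is simple. The sketch below assumes a made-up `{person:slot}` placeholder syntax and a hypothetical `render` function; neither is the format my database actually uses. Only the words that depend on the pronoun are labeled, so the first and last "was" stay as plain text.

```python
import re

# Hypothetical word forms per gender; "was" is the past tense of "to be"
# as it appears after the subject pronoun.
FORMS = {
    "female": {"subj": "she", "was": "was"},
    "male": {"subj": "he", "was": "was"},
    "neutral": {"subj": "they", "was": "were"},
}


def render(template: str, people: dict[str, tuple[str, str]]) -> str:
    """Fill {person:slot} placeholders using each person's name and gender."""

    def repl(match: re.Match) -> str:
        person, slot = match.group(1), match.group(2)
        name, gender = people[person]
        if slot == "name":
            return name
        return FORMS[gender][slot]

    return re.sub(r"\{(\w+):(\w+)\}", repl, template)


# Only the pronoun and the verb governed by it are labeled.
TEMPLATE = (
    "{A:name} was thirsty, so {A:subj} went to the store and "
    "{A:was} disappointed that apple juice was out of stock."
)

print(render(TEMPLATE, {"A": ("Alice", "female")}))
print(render(TEMPLATE, {"A": ("Alex", "neutral")}))
```

The code is the easy part. Deciding which words get a label, and for which person, is the part that requires understanding the sentence.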

Both of these problems are quite challenging. In the general case, they require a semantic understanding of the sentence, not just a syntactic one, and this sort of problem is actually used as a test of machine intelligence. (/IsaacKing/will-ai-pass-the-winograd-schema-ch)

Luckily for me, the environment in which I need this system to run is quite restricted and non-adversarial. While I still doubt there are any simple rules that can do what I need without some form of machine learning involved, I can avoid giving the system any intentionally challenging examples like Winograd schemas. I just need it to be able to handle the sort of phrases that are likely to show up in my MTG rules question database.

Will I be able to build or gain access to a system that can do this to my satisfaction by the end of 2023?

It doesn't need to be perfect; it just needs to be good enough that it saves me and the other question-writers some effort. (Right now we have to manually label each pronoun and verb with the person it refers to.)

