• haungack@lemmy.dbzer0.com
    link
    fedilink
    English
    arrow-up
    1
    ·
    4 days ago

    Likewise, instruct the AI to break the word down into letters one per line first, and then they get it right more often. I think that’s the point the post is trying to make.

    The letter counting issue is actually a fundamental problem of whole-word or subword-tokenization that’s had an obvious solution since ~2016, and i don’t get why commercial AI won’t implement a solution. Probably because it’s a lot of training code complexity (but not much compute) for solving a very small problem.