• vivendi@programming.dev
    3 days ago

    That is different. It’s because you’re interacting with token-based models: the text is chunked into multi-character tokens before the model ever sees it, so it never sees individual letters or digits. There has been new research on giving byte-level input to LLMs to solve this issue.
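    As a rough illustration of the tokenization problem (the vocabulary and greedy longest-match rule here are invented for the demo, not taken from any real model):

    ```python
    # Toy greedy subword tokenizer. Real BPE vocabularies are learned from
    # data, but the effect is the same: the model receives whole chunks,
    # not characters, which is why letter-counting questions trip it up.
    VOCAB = ["straw", "berry", "str", "aw", "ber", "ry"]

    def tokenize(text: str) -> list[str]:
        tokens, i = [], 0
        while i < len(text):
            # Try the longest vocabulary piece that matches at position i.
            for piece in sorted(VOCAB, key=len, reverse=True):
                if text.startswith(piece, i):
                    tokens.append(piece)
                    i += len(piece)
                    break
            else:
                # Fall back to a single character if nothing matches.
                tokens.append(text[i])
                i += 1
        return tokens

    print(tokenize("strawberry"))  # ['straw', 'berry'] — 2 tokens, not 10 letters
    ```

    A byte-level model would instead consume all ten characters one by one, which is what that research is after.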

    The numerical calculation aspect of LLMs and this are different.

    It would be best to couple an LLM to a tool-calling system for rudimentary numerical calculations. Right now the only way to do that yourself is to cook up a Python script with HF transformers and a finetuned model; I am not aware of any commercial model doing this. (And this is not what Microshit is doing.)
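    A minimal sketch of what such a tool-calling loop looks like. Everything here is hypothetical: `fake_model_generate` stands in for a finetuned model (e.g. loaded via HF transformers) that emits tool calls as JSON, and the calculator tool is a deliberately restricted demo:

    ```python
    import json
    import re

    def calculator(expression: str) -> str:
        """Rudimentary numeric tool: only basic arithmetic characters allowed."""
        if not re.fullmatch(r"[0-9+\-*/(). ]+", expression):
            raise ValueError("unsupported expression")
        # The restricted character set makes eval tolerable for a demo;
        # a real system would use a proper expression parser.
        return str(eval(expression))

    TOOLS = {"calculator": calculator}

    def fake_model_generate(prompt: str) -> str:
        # Placeholder for model.generate(): a finetuned model would decide
        # when to emit a tool call; here we always emit one for the demo.
        return json.dumps({"tool": "calculator",
                           "arguments": {"expression": "1234 * 5678"}})

    def run(prompt: str) -> str:
        call = json.loads(fake_model_generate(prompt))
        # Dispatch the tool call and feed the result back as the answer;
        # a real loop would hand it back to the model for phrasing.
        result = TOOLS[call["tool"]](**call["arguments"])
        return f"The result is {result}"

    print(run("What is 1234 * 5678?"))  # → The result is 7006652
    ```

    The point is that the model only has to produce the call, not the digits of the answer; the arithmetic itself runs in ordinary code.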