• Catoblepas@piefed.blahaj.zone
    4 months ago

    OCR isn’t a large language model. That’s why poor-quality scans or damaged text sometimes come out as garbled nonsense: it isn’t predicting the statistically most likely next word, it’s matching the input against possible individual characters.
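To make that distinction concrete, here is a toy sketch of per-character matching. It is not how any particular OCR engine works; the 3x5 bitmaps, the `TEMPLATES` table, and the `recognize` helper are all made up for illustration. Each glyph is classified on its own by nearest template, with no word-level context to correct it.

```python
# Hypothetical 3x5 glyph bitmaps, flattened row by row ("1" = ink).
TEMPLATES = {
    "C": "111100100100111",
    "A": "010101111101101",
    "T": "111010010010010",
    "O": "111101101101111",
}

def hamming(a, b):
    # Count the pixels where two bitmaps disagree.
    return sum(x != y for x, y in zip(a, b))

def recognize(glyphs):
    # Classify every glyph independently by its closest template.
    # Nothing here looks at neighbouring characters or whole words.
    return "".join(
        min(TEMPLATES, key=lambda ch: hamming(TEMPLATES[ch], g))
        for g in glyphs
    )

clean = [TEMPLATES[c] for c in "TACO"]
clean_out = recognize(clean)

# Simulate a smudged "C": two stray pixels close most of its right side,
# so it ends up one pixel away from "O" and two away from "C".
damaged = list(clean)
damaged[2] = "111101101100111"
damaged_out = recognize(damaged)

print(clean_out)    # TACO
print(damaged_out)  # TAOO
```

A next-word predictor would likely "fix" the smudge into a plausible word; the matcher just reports the closest shape, which is where the garbled output comes from.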

    • pkjqpg1h@lemmy.zip (deleted by creator)
      4 months ago

      I mean using LLMs for OCR, like Gemini 3 Flash or Kimi K2.5.