• Arthur Besse@lemmy.ml (OP)
      15 days ago

      By the end of long workflows

      Yes, this has been known for 10 years.

      huh? the kind of “long workflows” this paper is discussing didn’t exist two years ago, much less 10

      • kingofras@lemmy.world
        14 days ago

        it doesn’t matter. the principle is that if x is the length of your context window, then at around 0.4x the chance of hallucination starts increasing exponentially. we’re now at context windows of 1M tokens, and all that does is push the point where hallucinations set in further out, so the model ‘feels’ stronger because it takes longer before it hallucinates, but eventually it always does.