Cake day: June 4th, 2023





  • mathematically “correct” sounding output

    It’s hard to say because that’s a rather ambiguous way of describing it (“correct” could mean anything), but it is a valid way of describing its mechanisms.

    “Correct” in the context of LLMs would be a token that is likely to follow the preceding sequence of tokens. In fact, the model computes a probability for every possible token, then takes a random sample according to that distribution* to choose the next token, and it repeats that until some termination condition is met. The model is trained with what we call maximum likelihood estimation (MLE) in machine learning (ML): we’re learning a distribution that makes the training data as likely as possible. MLE is indeed the basis of a lot of ML, but not all.

    *Oversimplification.
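
To make the sampling loop concrete, here is a rough sketch in Python. Everything in it is made up for illustration: `VOCAB` is a four-token toy vocabulary and `toy_logits` is a hypothetical stand-in for a real model, which would compute its scores with a neural network. It also skips things real systems usually add, like temperature or top-k truncation.

```python
import math
import random

# Hypothetical toy vocabulary; a real LLM has tens of thousands of tokens.
VOCAB = ["the", "cat", "sat", "<eos>"]


def toy_logits(context):
    """Stand-in for a model: return one unnormalized score per vocabulary token."""
    favored = VOCAB[len(context) % 3]
    scores = [2.0 if tok == favored else 0.0 for tok in VOCAB[:-1]]
    # Make the end-of-sequence token more likely as the context grows.
    scores.append(0.5 * len(context))
    return scores


def softmax(logits):
    """Turn scores into a probability for every possible token."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generate(prompt, max_new_tokens=10, rng=None):
    """Sample one token at a time until <eos> or the length limit."""
    rng = rng or random.Random(0)
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = softmax(toy_logits(tokens))
        # Random sample according to the distribution, not just the argmax.
        next_token = rng.choices(VOCAB, weights=probs, k=1)[0]
        if next_token == "<eos>":  # termination condition
            break
        tokens.append(next_token)
    return tokens


if __name__ == "__main__":
    print(generate(["the"]))
```

The two termination conditions here (an end-of-sequence token and a maximum length) are just the common choices; the loop itself is the part that matters.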