Hallucination is Inevitable: An Innate Limitation of Large Language Models (arxiv preprint)

OmnipotentEntity@beehaw.org · 10 months ago

Hallucination is Inevitable: An Innate Limitation of Large Language Models (arxiv preprint)

TheChurn@kbin.social · 10 months ago

A token is not a concept. A token is a word or word fragment that occured often in free text and was assigned a number. Common words, prefixes, and suffixes are the vast majority of tokens, and the rest are uncommon pairs of letters.

The algorithm to generate tokens is essentially compression, there is no semantic meaning embedded in them.