Meta AI recently published pre-print research showing off a radical new “Megabyte” framework for building generative pre-trained transformer (GPT) systems.
Dubbed “promising” by OpenAI’s Andrej Karpathy, former director of artificial intelligence at Tesla, the new architecture is designed to process large volumes of data, such as images, novels and video files, without the use of a process known as tokenization.

“Promising. Everyone should hope that we can throw away tokenization in LLMs. Doing so naively creates (byte-level) sequences that are too long, so the devil is in the details. Tokenization means that LLMs are not actually fully end-to-end. There is a whole separate stage with…” Karpathy wrote. https://t.co/t240ZPxPm7

Tokenization is a lossy process that’s comparable to file compression.
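To see why naively dropping tokenization creates sequences that are “too long,” the rough sketch below (not from Meta’s paper) compares the length of the same text in raw bytes, which is what a byte-level model would process, against an estimated subword-token count, assuming the common rule of thumb of roughly four characters per token. The sample text and the four-characters-per-token ratio are illustrative assumptions only.

```python
# Illustrative sketch only: compares the sequence length a byte-level model
# would see against a rough subword-token estimate. The ~4 characters per
# token ratio is an assumed rule of thumb, not Megabyte's actual pipeline.

def sequence_lengths(text: str) -> tuple[int, int]:
    byte_len = len(text.encode("utf-8"))          # length of the raw byte sequence
    approx_tokens = max(1, round(len(text) / 4))  # assumed ~4 chars per subword token
    return byte_len, approx_tokens

sample = "Meta AI's Megabyte framework processes raw bytes instead of subword tokens."
byte_len, approx_tokens = sequence_lengths(sample)
print(f"bytes: {byte_len}, approx subword tokens: {approx_tokens}")
# The byte-level sequence is several times longer than the token sequence,
# which is why naive byte-level modeling becomes expensive for long inputs.
```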