Jarno Montonen
02/23/2023, 12:32 PMwtaysom
02/26/2023, 12:51 AMf(text) = f(text.some_part) ** f(text.some_other_part)
.
In as much as LLMs "think" and "are creative", I'd say it comes from the attempt to find the "conceptual space" defined by the training text. Here the core transformation is from some chunk of text to the next chunk plus some attention data structure that gets updated along the way.Jarno Montonen
02/27/2023, 6:54 AM