Well, it's also important to understand that "language" has many different meanings. There are the languages we speak, and the languages we use to interact with things in our daily lives (programming languages, the interactions with our pets and cars, ...), etc. Narrowing it down to human spoken language is NOT appropriate when it comes to LLMs.
LLMs don’t depend on language as meaning. They depend on language as pattern. To an LLM, English, Python code, legal text, music notation - all look similar. They’re just structured sequences of symbols. "Language" is the interface - there's no real intelligence behind it. "Large" refers to the huge quantity of structured and unstructured data acquired in training; i.e. the model has been exposed to, and has internalized, a massive amount of "patterns".
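To make the "structured sequences of symbols" point concrete, here is a minimal toy sketch (not any real LLM tokenizer - the `tokenize` function and whitespace-based splitting are illustrative assumptions): to the model, English prose, Python code, and music notation all reduce to the same kind of object, a sequence of integer IDs from a shared vocabulary.

```python
def tokenize(text, vocab):
    # Toy tokenizer: assign each new whitespace-separated symbol an
    # integer ID, and map the text to a sequence of those IDs.
    ids = []
    for symbol in text.split():
        if symbol not in vocab:
            vocab[symbol] = len(vocab)
        ids.append(vocab[symbol])
    return ids

vocab = {}
english = tokenize("the cat sat on the mat", vocab)
python_ = tokenize("for x in range ( 10 ) :", vocab)
music = tokenize("C4 E4 G4 C5", vocab)

# All three come out as the same kind of object: a list of ints.
# The model sees patterns over these sequences, not "meaning".
print(english)  # [0, 1, 2, 3, 0, 4] - note "the" repeats as ID 0
print(python_)
print(music)
```

Real tokenizers (byte-pair encoding and the like) are far more sophisticated, but the principle is the same: every domain is flattened into one symbol stream before the model ever sees it.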
Yes, of course.
Note however that you've quoted the first two sentences from my reply to @Salt who (incorrectly) argued that 'understanding ... genuine language meaning' was a prerequisite for generative LLM output. So 'language' refers to human (written) language in that instance.
