A picture may be worth a thousand words, but how many numbers is a word worth? The question may sound silly, but it happens to be the foundation that underlies large language models, or LLMs — and through them, many modern applications of artificial intelligence.
Every LLM has its own answer. In Meta’s open-source Llama 3 model, each word is represented by 4,096 numbers; for GPT-3, it’s 12,288. Individually, these long numerical lists, known as embeddings, are just inscrutable chains of digits. But in concert, they encode mathematical relationships between words that can look surprisingly like meaning.
The basic idea behind word embeddings is decades old. To model language on a computer, start by taking every word in the dictionary and making a list of its essential features — how many is up to you, as long as it’s the same for every word. “You can almost think of it like a 20 Questions game,” said Ellie Pavlick, a computer scientist studying language models at Brown University and Google DeepMind. “Animal, vegetable, object — the features can be anything that people think are useful for distinguishing concepts.” Then assign a numerical value to each feature in the list. The word “dog,” for example, would score high on “furry” but low on “metallic.” The result will embed each word’s semantic associations, and its relationship to other words, into a unique string of numbers.
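To make the 20 Questions picture concrete, here is a minimal sketch of a hand-built embedding table. The feature names, the words other than "dog," and every score are hypothetical values invented for illustration; they do not come from any real system.

```python
# Hypothetical features and scores, invented for illustration only.
FEATURES = ["furry", "metallic", "alive", "has_legs"]

# One score per feature, in the same order for every word.
HAND_EMBEDDINGS = {
    "dog":   [0.9, 0.0, 1.0, 1.0],
    "cat":   [1.0, 0.0, 1.0, 1.0],
    "chair": [0.0, 0.3, 0.0, 1.0],
    "robot": [0.0, 1.0, 0.0, 0.5],
}


def describe(word):
    """Pair each feature name with the word's score for that feature."""
    return dict(zip(FEATURES, HAND_EMBEDDINGS[word]))


# Every word must answer the same number of "questions."
assert all(len(scores) == len(FEATURES) for scores in HAND_EMBEDDINGS.values())

print(describe("dog"))
```

The only hard rule in this sketch is the one the paragraph states: the list has the same length for every word. Which features to use, and how many, is left to whoever builds the table.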
Researchers once specified these embeddings by hand, but now they’re generated automatically. For instance, neural networks can be trained to group words (or, technically, fragments of text called tokens) according to features that the network defines by itself. “Maybe one feature separates nouns and verbs really nicely, and another separates words that tend to occur after a period from words that don’t occur after a period,” Pavlick said.
The downside of these machine-learned embeddings is that unlike in a game of 20 Questions, many of the descriptions encoded in each list of numbers are not interpretable by humans. “It seems to be a grab bag of stuff,” Pavlick said. “The neural network can just make up features in any way that will help.”
But when a neural network is trained on a particular task called language modeling — predicting the next word in a sequence — the embeddings it learns are anything but arbitrary. Like iron filings lining up under a magnetic field, the values become set in such a way that words with similar associations have mathematically similar embeddings. For example, the embeddings for “dog” and “cat” will be more similar than those for “dog” and “chair.”
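One common way to put a number on "mathematically similar" is cosine similarity, which compares the directions in which two lists of numbers point. The sketch below assumes that measure (the article does not name one) and uses made-up three-number embeddings; real embeddings have thousands of values learned during training.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy values, invented for illustration; not taken from any trained model.
toy_embeddings = {
    "dog":   [0.8, 0.6, 0.1],
    "cat":   [0.7, 0.7, 0.2],
    "chair": [0.1, 0.2, 0.9],
}

dog_cat = cosine_similarity(toy_embeddings["dog"], toy_embeddings["cat"])
dog_chair = cosine_similarity(toy_embeddings["dog"], toy_embeddings["chair"])

print(f"dog vs. cat:   {dog_cat:.3f}")
print(f"dog vs. chair: {dog_chair:.3f}")
```

With these toy values, the "dog" and "cat" score comes out much closer to 1 than the "dog" and "chair" score, which is the pattern the paragraph describes.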
This phenomenon can make embeddings seem mysterious, even magical: a neural network somehow transmuting raw numbers into linguistic meaning, “like spinning straw into gold,” Pavlick said. Famous examples of “word arithmetic” — “king” minus “man” plus “woman” roughly equals “queen” — have only enhanced the aura around embeddings. They seem to act as a rich, flexible repository of what an LLM “knows.”
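The "word arithmetic" trick can be sketched in a few lines: subtract and add the lists number by number, then look for the word whose embedding lies nearest to the result. The two-number embeddings below are hypothetical and deliberately tidy (one number loosely stands for "royalty," the other for "masculinity"); trained embeddings are far messier, which is why the result is only "roughly" equal to "queen."

```python
import math

# Hypothetical 2-number embeddings: [royalty, masculinity]. Illustration only.
toy_embeddings = {
    "king":   [0.90, 0.90],
    "queen":  [0.85, 0.15],
    "man":    [0.10, 0.90],
    "woman":  [0.10, 0.10],
    "prince": [0.80, 0.90],
    "apple":  [0.00, 0.50],
}


def analogy(a, b, c, embeddings):
    """Compute a - b + c, then return the nearest word other than a, b, c."""
    target = [x - y + z for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = [word for word in embeddings if word not in (a, b, c)]
    return min(candidates, key=lambda word: math.dist(embeddings[word], target))


nearest = analogy("king", "man", "woman", toy_embeddings)
print(nearest)
```

Excluding the three input words from the search is a design choice in this sketch; without it, the nearest match to the result can be one of the inputs themselves.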
But this supposed knowledge isn’t anything like what we’d find in a dictionary. Instead, it’s more like a map. If you imagine every embedding as a set of coordinates on a high-dimensional map shared by other embeddings, you’ll see certain patterns pop up. Certain words will cluster together, like suburbs hugging a big city. And again, “dog” and “cat” will have more similar coordinates than “dog” and “chair.”
But unlike points on a map, these coordinates only refer to each other — not to any underlying territory, the way latitude and longitude numbers indicate specific spots on Earth. Instead, the embeddings for “dog” or “cat” are more like coordinates in interstellar space: meaningless, except for how close they happen to be to other known points.
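A small sketch can illustrate why such coordinates carry no meaning on their own. If every point is rotated and shifted by the same amount, each individual coordinate changes, but the distances between points do not. The two-number coordinates, the angle, and the shift below are arbitrary values chosen for illustration.

```python
import math

# Toy 2-number "coordinates," invented for illustration only.
points = {
    "dog":   (0.8, 0.6),
    "cat":   (0.7, 0.7),
    "chair": (0.1, 0.9),
}


def move(point, angle, shift):
    """Rotate a 2D point about the origin by `angle` radians, then shift it."""
    x, y = point
    rotated_x = x * math.cos(angle) - y * math.sin(angle)
    rotated_y = x * math.sin(angle) + y * math.cos(angle)
    return (rotated_x + shift[0], rotated_y + shift[1])


# Apply the same arbitrary rotation and shift to every point.
moved = {word: move(p, angle=1.234, shift=(50.0, -20.0)) for word, p in points.items()}

before = math.dist(points["dog"], points["cat"])
after = math.dist(moved["dog"], moved["cat"])

print(points["dog"], "->", moved["dog"])  # the coordinates change
print(math.isclose(before, after))        # the distance does not
```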
So why are the embeddings for “dog” and “cat” so similar? It’s because they take advantage of something that linguists have known for decades: Words used in similar contexts tend to have similar meanings. In the sequence “I hired a pet sitter to feed my ____,” the next word might be “dog” or “cat,” but it’s probably not “chair.” You don’t need a dictionary to determine this, just statistics.
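The statistics in question can be as simple as counting which words appear near which. The sketch below uses a tiny made-up corpus and a hypothetical helper, `context_counts`, that tallies the neighbors of a word within two positions on either side. It is a counting exercise meant to illustrate the idea, not the way an LLM is actually trained.

```python
from collections import Counter

# A tiny corpus, invented for illustration only.
corpus = [
    "i hired a sitter to feed my dog",
    "i hired a sitter to feed my cat",
    "i walk my dog every morning",
    "i sat on the chair",
    "she moved the chair to the corner",
    "she took the cat to the vet",
    "she took the dog to the vet",
]


def context_counts(word, sentences, window=2):
    """Count the words appearing within `window` positions of `word`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            if token == word:
                neighbors = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
                counts.update(neighbors)
    return counts


def shared_context(a, b, sentences):
    """Sum the neighbor counts that two words have in common."""
    overlap = context_counts(a, sentences) & context_counts(b, sentences)
    return sum(overlap.values())


print("dog/cat:  ", shared_context("dog", "cat", corpus))
print("dog/chair:", shared_context("dog", "chair", corpus))
```

In this toy corpus, "dog" and "cat" share more of their surroundings than "dog" and "chair" do, and no dictionary is consulted at any point.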
Embeddings — contextual coordinates, based on those statistics — are how an LLM can find a good starting point for making its next-word predictions, without having to encode meaning.
Certain words in certain contexts fit together better than others, sometimes so precisely that literally no other words will do. (Imagine finishing the sentence “The current president of France is named ____.”) According to many linguists, a big part of why humans can finely discern this sense of fitting is that we don’t just relate words to each other; we actually know what they refer to, like territory on a map. Language models can’t, because embeddings don’t work that way.
Still, as a proxy for semantic meaning, embeddings have proved surprisingly effective. It’s one reason why large language models have rapidly risen to the forefront of AI. When these mathematical objects fit together in a way that coincides with our expectations, it feels like intelligence; when they don’t, we call it a “hallucination.” To the LLM, though, there’s no difference. They’re just lists of numbers, lost in space.