Head games
Partha Niyogi would like to build a computer that could hold a conversation: a computer able to invent and parse sentences it has never heard before, to infer grammatical rules not explicitly taught, to recognize words, whether they’re spoken by a little girl standing in an open field, a grown man whispering in a reverberant hallway, a New Yorker, a Texan, a pedestrian shouting over the din of rush-hour traffic.
Children infer the complex rules of language by listening to their parents speak.
This is no mean task. In fact, it still astonishes Niyogi, a computer-science and statistics professor and artificial-intelligence engineer who’s more than dabbled in linguistics and cognitive science, that human beings learn to talk at all. “Children don’t know language,” he says. “They only get data from watching how language is used by their parents. On the basis of a finite number of sentences, a child figures out the rule system and learns to evaluate the grammaticality of new, unknown sentences, to assess whether they have any meaning.” Moreover, he says, grammar is only one aspect of language. “When you’re born you don’t even know what words are, what phonemes are, or syllables. As the person learning, you’re only exposed to acoustic utterances. When they talk, people don’t even pause between words.” Statistically speaking, he says, learning an entire language this way—by inference and extrapolation alone—is impossible, leading scientists to believe that humans have some “biological predisposition” to it. “The remarkable thing about language,” he says, “is that it only exists in our heads. The only thing that exists in the physical world is acoustic waves.”
For 15 years, Niyogi has studied the acquisition of human speech, translating syntax and sound into mathematical terms he hopes will yield an algorithm—or several—for endowing computers with the power of language. No formula so far has entirely succeeded; researchers can build a computer that interprets speech, but only if its conversation partner offers perfect diction and a quiet lab. “People are very impressed with chess,” he says. “They think you must be intelligent to learn chess. But actually, we have built a computer that can beat a human-level player at chess. However, when it comes to learning a language, which we take for granted, computers are a disaster. No one is going around saying, ‘Oh, I’m really impressed you learned your native tongue,’ even though it’s much harder. It’s so complex—and yet it’s child’s play. They learn to use language effortlessly. It turns your world upside down.”
Looking for a way into the intricate riddle of language (which he calls the “jewel in the crown of human intelligence”), Niyogi has recently turned to investigating its evolution. In a book due out next month, The Computational Nature of Language Learning and Evolution (Massachusetts Institute of Technology Press), he charts major alterations in English, French, Chinese, and Portuguese, hunting for mathematical patterns. Children learn from their parents and other adults “robustly and reliably,” he says, but not flawlessly, and the result, every eon or so, is a new language. “In linguistics, we see a language remain stable for long periods of time, and then in a short period, 200 years or something, it changes. This is why you and I cannot understand Old English, why we cannot read the Anglo-Saxon Chronicle.” Social scientists usually take a cultural or historical view of language, tracing shifts in speech to migrations, invasions, war, and economic catastrophe. Sociological upheaval can certainly contribute, Niyogi says, but he sees linguistic transition as more akin to biological evolution than to the advance of human civilization. Natural selection influences the origin and development of shared languages, discarding ineffective or unlearnable expressions. Accidental mutations find their way into common use. “The origins of bifurcation” (a linguistic split) “can be sociological,” he says, “but it becomes a function of biology, of children learning language.”
Offering a more concrete image, Niyogi compares language evolution to the drop in temperature that turns water to ice. “There is a whole range of temperature in which water can be stable as water,” he says. “And then you move the temperature a little bit, and it crosses a critical threshold. Suddenly the water changes and a new stable phase emerges.” Similarly, a language will hold its ground as long as each generation maintains a certain “learning fidelity,” or reliability. “But if the learning fidelity falls below the threshold—which could happen for sociological reasons but could also be the result of a completely random fluctuation—the mechanism kicks in, and fewer and fewer people speak the language. Eventually, a new language emerges. Water doesn’t change to ice all at once.”
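The threshold behavior Niyogi describes can be illustrated with a toy calculation. What follows is a minimal sketch under stated assumptions, not a model taken from his book: two hypothetical languages, A and B, compete; a speaker's influence on the next generation is assumed to be proportional to the share of the population that speaks the same language; and each child is assumed to pick up the parent's language with probability q (the "learning fidelity") and the other language otherwise. The function names and the numbers are illustrative only.

```python
# Toy two-language model of a "learning fidelity" threshold.
# All modeling choices here are assumptions made for illustration.


def next_share(x, q):
    """Return the share of language-A speakers in the next generation.

    x: current share of A speakers (between 0 and 1).
    q: learning fidelity, the probability that a child acquires
       the language of the parent rather than the other language.
    """
    # Assumed influence of each group: its share times the share of
    # people it can talk to (x for A speakers, 1 - x for B speakers).
    weight_a = x * x
    weight_b = (1.0 - x) * (1.0 - x)
    # Children of A speakers learn A with probability q; children of
    # B speakers end up with A by mistake with probability 1 - q.
    return (q * weight_a + (1.0 - q) * weight_b) / (weight_a + weight_b)


def final_share(q, x0=0.95, generations=500):
    """Iterate the generational update, starting with A dominant."""
    x = x0
    for _ in range(generations):
        x = next_share(x, q)
    return x


if __name__ == "__main__":
    for q in (0.95, 0.85, 0.78, 0.74, 0.70):
        share = final_share(q)
        print(f"fidelity {q:.2f} -> long-run share of language A: {share:.3f}")
```

In this particular toy model the critical fidelity works out to q = 0.75. Above it, language A stays dominant, settling at a share of (1 + sqrt(4q - 3)) / 2. Below it, dominance cannot be sustained and the population drifts to an even mix. The exact threshold is a property of these assumptions; the point is only that a small change in fidelity near the critical value changes the long-run outcome qualitatively, much like the water-to-ice picture.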
As much as anything else, Niyogi says, his book strives to be a research tool for linguists, biologists, and social scientists looking to test their own theories about particular linguistic changes. “The models I create offer a computational framework for examining scientific suppositions about why, for instance, English word order changed.” Niyogi, meanwhile, is pursuing his own hypothesis that natural data—words, syllables, grammar—have a “particular geometrical structure. The question is, can we exploit this geometric structure to make better algorithms? I think we have shown that pretty much the answer is yes.”