Using A Program To Revive Dead Languages

About Chris Buescher
"If Ignorance is bliss, then knowledge must be orgasmic." Chris is just your average atheistic, nerdy, science loving, eccentric.

The process of reverse engineering dead languages is a time-consuming one. The written word, when available, is an excellent aid. Unfortunately, the oldest known writing dates back less than 6,000 years. By that time, most of the earliest proto-languages had long since splintered into numerous other tongues. This leaves linguists with the lengthy task of comparing similar words in related languages in an attempt to find the common ancestor. But thanks to the efforts of researchers at UC Berkeley and the University of British Columbia, a new computer algorithm is being refined that will help automate much of this tedious process.

A proto-language, or Ursprache if you prefer, is the common ancestor of a group of languages. Most proto-languages have left no physical evidence of their existence. Without direct evidence of how a language sounded, linguists instead have to rely on the Comparative Method. This involves the comparison of cognates, words in related languages that share a common origin and so tend to have similar sounds and meanings. By comparing the structures of these words, along with an understanding of which sound changes are more common in each of the daughter languages, it is possible to begin the process of reviving a dead language. Compare languages that are further apart in time but on the same language tree, and the commonalities can be used to infer even further back into the language's past. Using these and other tools regularly used by linguists, Alexandre Bouchard-Côté of the University of British Columbia, along with Dan Klein, Tom Griffiths, and D. Hall of UC Berkeley, created a computer program that uses a Markov chain Monte Carlo sampling algorithm to automate as much of the search for common ancestral words as possible.
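To give a feel for what a Markov chain Monte Carlo sampler does in this setting, here is a toy Metropolis sampler in Python. It is only a sketch and not the researchers' model: the scoring rule (counting mismatched positions), the `beta` parameter, and the function names are all assumptions made for illustration, and the data is a single well-known Polynesian word for "taboo". The sampler wanders through candidate ancestor forms and spends most of its time on the ones that require the fewest sound changes to explain the daughter words.

```python
import math
import random
from collections import Counter

# Illustrative data: the word for "taboo" in four Polynesian languages.
DAUGHTERS = {"Tongan": "tapu", "Samoan": "tapu", "Maori": "tapu", "Hawaiian": "kapu"}
ALPHABET = sorted(set("".join(DAUGHTERS.values())))


def energy(candidate):
    """Total number of mismatched positions between candidate and all daughters."""
    return sum(
        sum(a != b for a, b in zip(candidate, word))
        for word in DAUGHTERS.values()
    )


def sample_ancestor(steps=5000, beta=2.0, seed=0):
    """Metropolis sampler over equal-length candidate ancestor forms (toy model)."""
    rng = random.Random(seed)
    current = DAUGHTERS["Hawaiian"]  # arbitrary starting point
    visits = Counter()
    for _ in range(steps):
        # Symmetric proposal: overwrite one random position with a random sound.
        pos = rng.randrange(len(current))
        proposal = current[:pos] + rng.choice(ALPHABET) + current[pos + 1:]
        # Accept with probability min(1, exp(-beta * change in energy)).
        delta = energy(proposal) - energy(current)
        if delta <= 0 or rng.random() < math.exp(-beta * delta):
            current = proposal
        visits[current] += 1
    return visits


visits = sample_ancestor()
best, count = visits.most_common(1)[0]
print(best, count / sum(visits.values()))
```

In this toy setup the most-visited candidate is "tapu", the form shared by three of the four daughters. A realistic model would also have to handle insertions and deletions, weight sound changes by how likely they are, and reconstruct many words across a whole family tree at once.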

As a proving ground, the researchers turned their program loose on the Austronesian language family. Austronesian is the family of tongues that includes the languages spoken natively from Malaysia and Indonesia to Australia and throughout Polynesia (as well as the oddball of human settlements, Madagascar). Using a pool of over 14,000 words, the program was left to piece together the various languages that mutated over time into the myriad we see today across the Pacific Rim. When it was done, the program had assembled more than 600 proto-languages that linked together their modern descendants. The results were then compared to the work done previously by linguists. Though not perfect, the program's results agreed with those of linguists about 85% of the time.
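The article does not say how that agreement figure was calculated. One plausible way to score a program's reconstructions against a linguist's, assumed here purely for illustration, is Levenshtein edit distance: the number of single-sound insertions, deletions, or substitutions needed to turn one form into the other. The word lists below are hypothetical and invented for the example, as are the function names.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def agreement(automatic, manual):
    """Return (share of exact matches, mean edit distance) over paired forms."""
    distances = [levenshtein(x, y) for x, y in zip(automatic, manual)]
    exact = sum(d == 0 for d in distances) / len(distances)
    return exact, sum(distances) / len(distances)


# Hypothetical reconstructions, invented for illustration only.
program_forms = ["tapu", "ika", "mata", "lima"]
linguist_forms = ["tapu", "ika", "mata", "rima"]

exact, mean_distance = agreement(program_forms, linguist_forms)
print(exact, mean_distance)  # 0.75 0.25
```

Reporting both numbers is a deliberate choice: the share of exact matches is easy to read, while the mean edit distance gives partial credit to reconstructions that are off by only a sound or two.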

While no replacement for the hard work of a well-trained linguist, this program and others modeled on similar principles may very well go a long way toward helping to uncover our species' lost linguistic heritage.

—————————————————————————————————————————————————-

A. Bouchard-Côté, D. Hall, T. L. Griffiths, D. Klein. Automated reconstruction of ancient languages using probabilistic models of sound change. Proceedings of the National Academy of Sciences, 2013; DOI: 10.1073/pnas.1204678110

Posted on February 14, 2013. Filed under SCI/TECH/OTHER STUFF.

3 Responses to Using A Program To Revive Dead Languages

  1. E.A. Blair

    February 14, 2013 at 10:20 am

    One good way to check whether the program is effective is to test it on a case where we have a large number of child languages with a very well-documented parent language.

    There are at least three dozen living Romance languages, and the ursprache is Latin, one of the best-documented dead languages in the world. If a program can take data from French, Provençal, Friulian, Occitan, Walloon, Spanish, Catalan, Romanian and so forth and come up with a good approximation of Latin, then its validity can be given some credit.

    After all, “Soli linguæ bonæ sunt linguæ mortuæ”.

    • Michael John Scott

      February 15, 2013 at 7:48 am

      Good thoughts E.A. Thanks.

      • E.A. Blair

        February 15, 2013 at 9:17 am

        Every once in a while I actually get to use my linguistics degree (unlike many others who majored in that subject).
