The Indo-European languages are a family of related languages that today are widely spoken in the Americas, Europe, and also Western and Southern Asia. Just as languages such as Spanish, French, Portuguese and Italian are all descended from Latin, Indo-European languages are believed to derive from a hypothetical language known as Proto-Indo-European, which is no longer spoken.
It is highly probable that the earliest speakers of this language originally lived around Ukraine and neighbouring regions in the Caucasus and Southern Russia, then spread to most of the rest of Europe and later down into India. The earliest possible end of Proto-Indo-European linguistic unity is believed to be around 3400 BCE.
Since the speakers of the Proto-Indo-European language did not develop a writing system, we have no direct written evidence of it. Linguists have tried to reconstruct the Proto-Indo-European language using several methods and, although a fully accurate reconstruction seems impossible, we have today a general picture of what Proto-Indo-European speakers had in common, both linguistically and culturally. In addition to the use of comparative methods, there are studies based on the comparison of myths, laws, and social institutions.
Branches of Indo-European Languages
The Indo-European languages have a large number of branches: Anatolian, Indo-Iranian, Greek, Italic, Celtic, Germanic, Armenian, Tocharian, Balto-Slavic and Albanian.
The Anatolian branch of languages was predominant in the Asian portion of Turkey and some areas in northern Syria. The most famous of these languages is Hittite. In 1906 CE, a large number of Hittite finds were made on the site of Hattusas, the capital of the Hittite Kingdom, where about 10,000 cuneiform tablets and various other fragments were found in the remains of a royal archive. These texts date back to the mid to late second millennium BCE. Luvian, Palaic, Lycian, and Lydian are other examples of languages belonging to this group.
All languages of this branch are currently extinct. This branch has the oldest surviving evidence of an Indo-European language, dated to about 1800 BCE.
The Indo-Iranian branch includes two sub-branches: Indic and Iranian. Today these languages are predominant in India, Pakistan, Iran, and their vicinity and also in areas from the Black Sea to western China.
Sanskrit, which belongs to the Indic sub-branch, is the best known among the early languages of this branch; its oldest variety, Vedic Sanskrit, is preserved in the Vedas, a collection of hymns and other religious texts of ancient India. Indic speakers entered the Indian subcontinent from Central Asia around 1500 BCE; hymn 1.131 of the Rig-Veda speaks of a legendary journey that may be considered a distant memory of this migration.
Avestan is a language that forms part of the Iranian group. Old Avestan (sometimes called Gathic Avestan), the language used in the early Zoroastrian religious texts, is the oldest preserved language of the Iranian sub-branch and the “sister” of Sanskrit. Another important language of the Iranian sub-branch is Old Persian, which is the language found in the royal inscriptions of the Achaemenid dynasty, starting in the late 6th century BCE. The earliest datable evidence of this branch dates back to about 1300 BCE.
Today, many Indic languages are spoken in India and Pakistan, such as Hindi-Urdu, Punjabi, and Bengali. Iranian languages such as Farsi (modern Persian), Pashto, and Kurdish are spoken in Iraq, Iran, Afghanistan, and Tajikistan.
Rather than a branch of languages, Greek is a group of dialects: over more than 3,000 years of written history, Greek dialects never evolved into mutually incomprehensible languages. Greek was predominant in the southern end of the Balkans, the Peloponnese peninsula, and the Aegean Sea and its vicinity. The earliest surviving written evidence of a Greek language is Mycenaean, the dialect of the Mycenaean civilization, mainly found on clay tablets and ceramic vessels on the island of Crete. Mycenaean was not written in an alphabet but in a syllabic script known as Linear B.
The first alphabetic inscriptions have been dated back to the early 8th century BCE, which is probably the time when the Homeric epics, the Iliad and the Odyssey, reached their present form. There were many Greek dialects in ancient times, but because of Athens' cultural supremacy in the 5th century BCE, it was the dialect of Athens, called Attic, that became the standard literary language during the Classical period (480-323 BCE). Therefore, the most famous Greek poetry and prose of Classical times were written in Attic: Aristophanes, Aristotle, Euripides, and Plato are just a few examples of authors who wrote in Attic.
The Italic branch was predominant in the Italian peninsula. The Italic people were not natives of Italy; they entered Italy crossing the Alps around 1000 BCE and gradually moved southward. Latin, the most famous language in this group, was originally a relatively small local language spoken by pastoral tribes living in small agricultural settlements in the centre of the Italian peninsula. The first inscriptions in Latin appeared in the 7th century BCE and by the 6th century BCE it had spread significantly.
Rome was responsible for the growth of Latin in ancient times. Classical Latin is the form of Latin used in the most famous works of Roman authors like Ovid, Cicero, Seneca, Pliny, and Marcus Aurelius. Other languages of this branch are Faliscan, Sabellic, Umbrian, South Picene, and Oscan, all of them extinct.
Today Romance languages are the only surviving descendants of the Italic branch.
The Celtic branch contains two sub-branches: Continental Celtic and Insular Celtic. By about 600 BCE, Celtic-speaking tribes had spread from what today are southern Germany, Austria, and the western Czech Republic in almost all directions, to France, Belgium, Spain, and the British Isles. By 400 BCE, they had also moved southward into northern Italy and southeast into the Balkans and even beyond. During the early 1st century BCE, Celtic-speaking tribes dominated a very significant portion of Europe. By 50 BCE, Julius Caesar had conquered Gaul (ancient France), and Britain was conquered about a century later by the emperor Claudius. As a result, this large Celtic-speaking area was absorbed by Rome, Latin became the dominant language, and the Continental Celtic languages eventually died out. The chief Continental language was Gaulish.
Insular Celtic developed in the British Isles after Celtic-speaking tribes entered around the 6th century BCE. In Ireland, Insular Celtic flourished, aided by the geographical isolation which kept Ireland relatively safe from the Roman and Anglo-Saxon invasions.
The only Celtic languages still spoken today (Irish Gaelic, Scottish Gaelic, Welsh and Breton) all come from Insular Celtic.
The Germanic branch is divided into three sub-branches: East Germanic, currently extinct; North Germanic, containing Old Norse, the ancestor of all modern Scandinavian languages; and West Germanic, containing Old English, Old Saxon, and Old High German.
The earliest evidence of Germanic-speaking people dates back to the first half of the 1st millennium BCE, when they lived in an area stretching from southern Scandinavia to the coasts of the North and Baltic Seas. During prehistoric times, the Germanic-speaking tribes came into contact with Finnic speakers in the north and also with Balto-Slavic tribes in the east. As a result of this interaction, the Germanic language borrowed several terms from Finnic and Balto-Slavic.
Several varieties of Old Norse were spoken by most Vikings. Native Nordic pre-Christian Germanic mythology and folklore have also been preserved in Old Norse, in a dialect named Old Icelandic.
Dutch, English, Frisian, and Yiddish are some examples of modern survivors of the West Germanic sub-branch, while Danish, Faroese, Icelandic, Norwegian, and Swedish are survivors of the North Germanic sub-branch.
The origins of the Armenian-speaking people are still unresolved. It is probable that the Armenians and the Phrygians belonged to the same migratory wave that entered Anatolia, coming from the Balkans around the late 2nd millennium BCE. The Armenians settled in an area around Lake Van, in present-day Turkey; this region belonged to the state of Urartu during the early 1st millennium BCE. In the 8th century BCE, Urartu came under Assyrian control and in the 7th century BCE, the Armenians took over the region. The Medes absorbed the region soon after and Armenia became a vassal state. During the time of the Achaemenid Empire, the region turned into a Persian satrapy. The Persian domination had a strong linguistic impact on Armenian, which misled many scholars in the past into believing that Armenian actually belonged to the Iranian group.
The history of the Tocharian-speaking people is still surrounded by mystery. We know that they lived in the Taklamakan Desert, located in western China. Most of the surviving Tocharian texts are translations of well-known Buddhist works, and all of these texts have been dated to between the 6th and the 8th centuries CE. None of these texts speaks about the Tocharians themselves. Two different languages belong to this branch: Tocharian A and Tocharian B. Remains of the Tocharian A language have only been found in places where Tocharian B documents have also been found, which would suggest that Tocharian A was already extinct, kept alive only as a religious or poetic language, while Tocharian B was the living language used for administrative purposes.
Many well-preserved mummies with Caucasoid features such as tall stature and red, blonde, or brown hair have been discovered in the Taklamakan Desert, dating from 1800 BCE to 200 CE. The weaving style and patterns of their clothes are similar to those of the Hallstatt culture in central Europe. Physical analysis and genetic evidence have revealed resemblances with the inhabitants of western Eurasia.
This branch is completely extinct. Among all ancient Indo-European languages, Tocharian was spoken farthest to the east.
The Balto-Slavic branch contains two sub-branches: Baltic and Slavic.
During the late Bronze Age, the Balts' territory may have stretched from around western Poland all the way across to the Ural Mountains. Afterwards, the Balts occupied a small region along the Baltic Sea. Those in the northern part of the territory occupied by the Balts were in close contact with Finnic tribes, whose language was not part of the Indo-European language family: Finnic speakers borrowed a considerable number of Baltic words, which suggests that the Balts enjoyed considerable cultural prestige in that area. Under the pressure of Gothic and Slavic migrations, the territory of the Balts was reduced towards the 5th century CE.
Archaeological evidence shows that from 1500 BCE, either the Slavs or their ancestors occupied an area stretching from near the western Polish borders towards the Dnieper River in Belarus. During the 6th century CE, the Slavic-speaking tribes expanded their territory, migrating into Greece and the Balkans: this is when they are mentioned for the first time, in Byzantine records referring to this large migration. Either some or all of the Slavs were once located further to the east, in or around Iranian territory, since many Iranian words were borrowed into pre-Slavic at an early stage. Later on, as they moved westward, they came into contact with Germanic tribes and again borrowed several additional terms.
Only two Baltic languages survive today: Latvian and Lithuanian. A large number of Slavic languages survive today, such as Bulgarian, Czech, Croatian, Polish, Serbian, Slovak, Russian, and many others.
Albanian is the last branch of Indo-European languages to appear in written form. There are two hypotheses on the origin of Albanian. The first one says that Albanian is a modern descendant of Illyrian, a language which was widely spoken in the region during classical times. Since we know very little about Illyrian, this assertion can be neither denied nor confirmed from a linguistic standpoint. From a historical and geographical perspective, however, this assertion makes sense. The other hypothesis says that Albanian is a descendant of Thracian, another lost language that was spoken farther east than Illyrian.
Today Albanian is spoken in Albania as the official language, in several other areas of the former Yugoslavia, and also in small enclaves in southern Italy, Greece, and Macedonia.
A number of other Indo-European languages do not fit clearly into any of the branches above. All languages in this group are either extinct or a former stage of a modern language. Examples of this group of languages are Phrygian, Thracian, Ancient Macedonian (not to be confused with Macedonian, a language currently spoken in Macedonia, part of the Slavic branch), Illyrian, Venetic, Messapic, and Lusitanian.
Indo-European Historical Linguistics
In ancient times it was noticed that some languages presented striking similarities: Greek and Latin are a well-known example. During classical antiquity it was noted, for example, that Greek héks “six” and heptá “seven” were similar to the Latin sex and septem. Furthermore, the regular correspondence of the initial h- in Greek to the initial s- in Latin was pointed out.
The explanation that the ancients came up with was that the Latin language was a descendant of the Greek language. Centuries later, during and after the Renaissance, the close similarities between more languages were also noted, and it was understood that certain groups of languages were related, such as Icelandic and English, and also the Romance languages. Despite all of these observations, the science of linguistics did not develop much further until the 18th century CE.
During the British colonial expansion into India, a British orientalist and jurist named Sir William Jones became familiar with the Sanskrit language. Jones was also knowledgeable in Greek and Latin and was surprised by the similarities between these three languages. During a lecture on February 2, 1786 CE, Sir William Jones expressed his new ideas:
The Sanskrit language, whatever be its antiquity, is of a wonderful structure; more perfect than the Greek, more copious than the Latin, and more exquisitely refined than either, yet bearing to both of them a stronger affinity, both in the roots of verbs and the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists; there is a similar reason, though not quite so forcible, for supposing that both the Gothic and the Celtic, though blended with a very different idiom, had the same origin with the Sanskrit; and the old Persian might be added to the same family, if this were the place for discussing any question concerning the antiquity of Persia. (Fortson, p. 9)
The idea that Greek, Latin, Sanskrit, and Persian were derived from a common source was revolutionary at that time. This was a turning point in the history of linguistics. Rather than the “daughter” of Greek, Latin was for the first time understood as the “sister” of Greek. By becoming familiar with Sanskrit, a language geographically far removed from Greek and Latin, and realizing that chance was an insufficient explanation for the similarities between these languages, Sir William Jones presented a new insight which triggered the development of modern linguistics.