Linguistic Diversity among Internet Users

The most direct effort to estimate the linguistic diversity of Internet users comes from the translation services company Global Reach. These estimates, produced every year from 1996 to 2002, are widely cited as projecting an Internet of ever-increasing linguistic diversity. They are based on International Telecommunication Union (ITU) estimates of user populations in each country; a user is therefore defined as someone who has used the Internet in the past three months.

Figure 4 presents Global Reach’s estimated user populations for different languages. The period from 2003 to 2005 is shown in dashed lines, as these are projected estimates. As expected, English, with an estimated 230 million users, had nearly four times as many users in 2001 as the nearest language, namely Chinese, with approximately 60 million users. Figure 4 shows that all of these user groups appear to be growing exponentially except for English and Japanese, whose growth appears to be slowing. In both of these language groups, about 50% of the available population is estimated to be using the Internet already.

From the Global Reach estimates one can calculate linguistic diversity indices for the global population of Internet users; these values are presented in Figure 5. Because the composition of the “other” language group is left unexplained in the Global Reach data, we have calculated minimum and maximum values for the index, based on the assumption of “other” representing a single language (the minimum diversity) or a uniform distribution across 6,000 languages (the maximum diversity).
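The bounding procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: this section does not give the formula for the diversity index, so the sketch substitutes a Shannon-entropy index (natural logarithm), and the function names and the user counts in the usage example are hypothetical placeholders rather than Global Reach data.

```python
import math


def diversity_index(populations):
    """Entropy-style diversity index over a list of user populations.

    Assumption: Shannon entropy with natural log; the index actually
    used for Figure 5 may differ by a constant factor or in form.
    """
    total = sum(populations)
    shares = [p / total for p in populations if p > 0]
    return -sum(s * math.log(s) for s in shares)


def diversity_bounds(named_users, other_users, n_other_languages=6000):
    """Return (minimum, maximum) index values for one year's estimates.

    named_users: dict mapping language name -> estimated users
    other_users: estimated users in the unexplained "other" group
    """
    named = list(named_users.values())
    # Minimum diversity: "other" is treated as a single language.
    low = diversity_index(named + [other_users])
    # Maximum diversity: "other" is spread evenly over many languages.
    per_language = other_users / n_other_languages
    high = diversity_index(named + [per_language] * n_other_languages)
    return low, high


# Hypothetical placeholder figures (millions of users), for illustration only.
example = {"English": 230.0, "Chinese": 60.0}
low, high = diversity_bounds(example, other_users=100.0)
print(f"index lies between {low:.3f} and {high:.3f}")
```

Under this assumed index, the gap between the two bounds equals the share of users in the “other” group multiplied by ln(6000), so the bounds sit close together whenever the “other” group is a small fraction of all users.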

It is striking that although there are initially large gains in the diversity index from 1996 to 1999, linguistic diversity appears to level off after 2000, in spite of the exponential growth of many of the languages. The projections for 2003 to 2005 continue this leveling trend: the projected increase in the number of Chinese-speaking users is so large that it actually dampens the increase in diversity.

Hence, the Internet has not become linguistically diverse merely by being global and interconnecting large numbers of people. Other issues need to be addressed in order to guarantee that languages of the connected peoples are represented online, and these may be highly particular to the contexts of the connected communities.

