Further opportunities to extend reach in Europe, Southeast Asia, and sub-Saharan Africa

Wikipedia is highly localized within Europe, but opportunity still exists

European midsize-language Wikipedias have grown substantially. Article counts in European national languages such as Czech, Lithuanian, and Hungarian, as well as in the more established Danish and Norwegian Wikipedias, have grown by 25% or more over the past year, with double-digit editor growth for the less established Wikipedias. A few local languages, such as Catalan in Spain, have also grown significantly: the Catalan Wikipedia has some 190,000 articles, over 65,000 of which are greater than 1.5 kb.[1] This growth, coupled with the continued use by many Europeans of pan-regional language Wikipedias (English, French, Spanish, Russian, Portuguese), has resulted in a 32% penetration of Wikipedia in midsize European countries (i.e., excluding the UK, France, Germany, Russia, and Spain), one of the highest penetration rates worldwide.[2]


However, there is still opportunity for further localization in Europe. With 64% of Western Europeans and 29% of Eastern Europeans using the Internet (some 500 million people), there is room both to reach more Europeans and to deliver content in a European Wikipedian's language of choice. Because most European countries are dominated by a single national language that is broadly spoken, used in the media, and used as the language of education, it may be necessary to provide robust Wikimedia projects in local languages in order to attract and retain loyal contributors who want to share in the sum of all knowledge. Languages with 3 to 20 million speakers and fewer than 50,000 local-language Wikipedia articles include major national languages such as Greek, Albanian, and Belarusian.[3]

Opportunities exist to continue to localize in Southeast Asia

There are over 470 million speakers of the 7 major Southeast Asian languages (Indonesian, Tagalog/Filipino, Vietnamese, Thai, Malay, Burmese, and Khmer), comprising 80% of the population of Indonesia, the Philippines, Vietnam, Thailand, Malaysia, Myanmar, Cambodia, Laos, Singapore, and Timor-Leste. Each of these languages has official status in one or more countries and is used extensively in the local print media. Internet access varies dramatically from country to country. Malaysia and Singapore have high rates of Internet use, at 63% and 69% of the population respectively. Vietnam, Thailand, and Indonesia have moderate rates of between 11% and 21% of the population, while Cambodia, Myanmar, Laos, and Timor-Leste all have rates of less than 2% of the population.[4]

The growth of Southeast Asian language Wikipedias has depended on two factors: the number of language speakers with Internet access and the degree of English-language literacy. A greater number of potential users and lower levels of English literacy are correlated with larger and more built-out Wikipedias. For example, the Indonesian, Vietnamese, and Thai Wikipedias have approximately 108,000, 93,000, and 48,000 articles respectively, of which 22,000, 28,000, and 17,000 are greater than 1.5 kb. In the countries where these languages are spoken (Indonesia, Vietnam, and Thailand), the national language is largely the medium of education, even at university level. This is in contrast to Malay and Tagalog/Filipino, which are spoken in countries with high levels of English literacy due to the extensive use of English in their educational systems. The Malay Wikipedia has 44,000 articles, 11,000 of which are greater than 1.5 kb, while the Tagalog/Filipino Wikipedia has 22,000 articles, of which only 2,000 are greater than 1.5 kb. Given the limited Internet access in Cambodia and Myanmar, it is not surprising that the Khmer and Burmese Wikipedias have only 2,000 and 1,000 articles respectively.[5]
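The article counts above can be compared more directly as a share of articles greater than 1.5 kb. The following is a minimal sketch of that arithmetic in Python; the "substantial share" metric and the names `ARTICLE_COUNTS`, `substantial_share`, and `share_table` are illustrative assumptions, not part of the original analysis, and the inputs are the approximate figures quoted in the paragraph.

```python
# Approximate article counts quoted in the paragraph above.
# Each entry: language -> (total articles, articles greater than 1.5 kb)
ARTICLE_COUNTS = {
    "Indonesian": (108_000, 22_000),
    "Vietnamese": (93_000, 28_000),
    "Thai": (48_000, 17_000),
    "Malay": (44_000, 11_000),
    "Tagalog/Filipino": (22_000, 2_000),
}


def substantial_share(total, substantial):
    """Return the percentage of articles that are greater than 1.5 kb."""
    return 100.0 * substantial / total


def share_table(counts):
    """Map each language to its share of larger articles, rounded to 0.1%."""
    return {
        lang: round(substantial_share(total, substantial), 1)
        for lang, (total, substantial) in counts.items()
    }


if __name__ == "__main__":
    for lang, share in share_table(ARTICLE_COUNTS).items():
        print(f"{lang}: {share}% of articles are greater than 1.5 kb")
```

On these figures the share ranges from roughly 9% for Tagalog/Filipino to roughly 35% for Thai. Because the inputs are rounded to the nearest thousand, the resulting percentages are indicative only.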


Wikipedia is under-localized in sub-Saharan Africa

There are over 230 million speakers of 12 major sub-Saharan African languages, comprising 28% of the population. However, many of the Wikipedias in these languages have shown very little growth. To date, only three Wikipedias (Swahili, Yoruba, and Afrikaans) have more than 5,000 articles. There are 13 Wikipedias in sub-Saharan African languages with more than 3 million speakers that have fewer than 500 articles, and these include major languages such as Hausa and Zulu, which are each spoken by more than 20 million people.[6]

Several barriers have contributed to the slow growth of African-language Wikipedias. Foremost among them is the lack of access to the Internet in sub-Saharan Africa, where only 4% of the population are Internet users.[7] However, access to the Internet is increasing rapidly: mobile phone penetration is significantly more widespread than Internet penetration, and some countries have built 3G networks that enable people to access the Internet via their phones.[8] A second barrier is that many African languages have existed as written languages for only a short time and are not the primary languages of communication among the well educated. As a result, many Africans who use Wikipedia do so in the former colonial languages of French, English, and Portuguese.


NOTE: Although the evidence is principally anecdotal and supported by only a single day's worth of page views, it is still worth noting that articles in many of these local languages, Swahili in particular, have been "seeded" by editors and organizations (e.g. Google) outside of Africa, which suggests a lack of participation from within the countries where these languages are spoken.

References

  1. Europe
  2. This estimate was derived from data from the International Telecommunication Union and ComScore
  3. Reach/Regional Analysis/Europe
  4. Information on Internet use from International Telecommunication Union 2008
  5. Reach/Regional Analysis/South-East Asia
  6. Reach/Regional Analysis/Sub-Saharan Africa
  7. Information on Internet use from International Telecommunication Union 2008
  8. http://www.nytimes.com/2009/10/06/science/06uganda.html?_r=2&hpw