The Citizendium will sometimes require the use of words (for example, names of people, and place-names) which are in languages which are not written in the Latin (Roman) alphabet used for written English. These words should appear in articles in a form that readers unfamiliar with the original languages will understand. To do this, these words must be 'transliterated' from their native script into the Latin (Roman) alphabet; this process is known as 'romanization'.
The point is not to convert names and words into English; rather, it is to convert names and words normally written in non-Roman scripts into Roman script. Very few English-speaking people can read 小泉 純一郎 and know that the former Prime Minister of Japan is being referred to. However, it is not a settled matter whether his name should be rendered as "Junichiro Koizumi" or "Koizumi Jun'ichirō", or any of the other possible options. Each language will have its own specific romanization policy on Citizendium, to guide the transliteration of words and names from that language.
We should avoid creating our own romanization systems from scratch, as this would impede a relatively painless transition from other print media to Citizendium. Innovation in romanization will cost more than it gains. For most languages written in non-Roman script, there is generally already one (or more!) romanization system, and we should prefer those, precisely for commonality with other written material (which generally will use them). Some of them take Roman letters whose usual sounds don't occur in the language in question and reuse them for related sounds in that language which have no Roman character; e.g. the 'Q' in the Chinese system Hanyu Pinyin. However, these conventions will be known to people with any familiarity with that area, and we can of course provide guides as well.
However, as noted, in many cases there are several competing standards for writing the language in romanized form. For example, there are two major ways of romanizing Japanese. Even within prescribed romanization methods, there may be variations in spelling and style. To take just one example, the Japanese word for a small police station (交番 or こうばん) could be written kouban, kōban, kooban or koban - four versions in the same 'Hepburn' romanization system. Other languages, such as Chinese (written in Hanzi script), are even more complicated, with dozens of competing romanization methods covering a large variety of dialects of Chinese.
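As a rough illustration of how closely these spellings are related, the sketch below folds the four spellings of 交番 to a single comparison key. The helper name `comparison_key` and its folding rules are assumptions made for this one example; the rules are deliberately lossy and are not a proposed romanization policy.

```python
import unicodedata

MACRON = "\u0304"  # combining macron, as in the decomposed form of "ō"

def comparison_key(romanized: str) -> str:
    """Fold spelling variants of a long 'o' to one key (hypothetical helper).

    This discards vowel length entirely, so it is only suitable for
    spotting likely duplicates, never for choosing a spelling.
    """
    # Decompose so that "ō" becomes "o" followed by a combining macron.
    decomposed = unicodedata.normalize("NFD", romanized.lower())
    # Drop the macron, leaving the bare vowel.
    bare = decomposed.replace(MACRON, "")
    # Assumed folding rules for this example: "ou" and "oo" both mark a long "o".
    return bare.replace("ou", "o").replace("oo", "o")

variants = ["kouban", "kōban", "kooban", "koban"]
keys = {comparison_key(v) for v in variants}
print(keys)
```

A key of this kind could help an editor notice that two article titles may refer to the same word, but deciding which spelling to display remains a matter for each language's policy.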
If individual authors are allowed to choose the form of Romanization independently, then confusion will quickly result. If one article talks about Guizhou (a province of China), another article mentions Kweichow (the same name in a different romanization system), and a third article discusses Kwei-chow, then the reader will be confused, and left wondering if these are all different places, or really the same place. A common standard for romanization of each language is required throughout all Citizendium articles.
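A minimal sketch of what a common standard means in practice, assuming a hand-maintained lookup table: every known variant spelling maps to the one form chosen by policy. The table `CANONICAL` and the function `canonical_name` are hypothetical names used only for this illustration, and the choice of "Guizhou" as the canonical form is likewise assumed for the example.

```python
# Hypothetical table: variant spelling (lowercased) -> the single agreed form.
CANONICAL = {
    "guizhou": "Guizhou",
    "kweichow": "Guizhou",
    "kwei-chow": "Guizhou",
}

def canonical_name(name: str) -> str:
    """Return the agreed spelling, or the input unchanged if it is not listed."""
    return CANONICAL.get(name.strip().lower(), name)

for spelling in ["Guizhou", "Kweichow", "Kwei-chow"]:
    print(spelling, "->", canonical_name(spelling))
```

With such a table, all three spellings from the example above resolve to the same article title, so the reader is never left guessing whether they name different places.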
This page provides a defined process to be used in each language to create standards to use when romanizing foreign words on the Citizendium; it also provides links to the policies (and discussion of the policies, which should take place on the appropriate Talk: page) for various languages. Once agreement is reached, those decisions will form policy.
Discussion should aim at striking a balance between what is culturally and linguistically accurate and what is easiest for a reader to understand, where the reader is not a user of the language in question. Some romanization systems favour native speakers' understanding of the distinct sounds of their language, while others distort the rendering of the 'phonemes' to show equivalences with the sounds of English. (For example, one form of Japanese romanization uses chi because the sound is similar, though not identical, to the English ch sound; however, to native speakers, the sound is more like ti, and is written as such in another system.) Whether to render words strictly accurately, or to allow potentially misleading spellings (if these make it easier for English users), is one of the factors to take into account.
Given a choice between two different systems, one of which is more accurate, and the other of which is more widely used, there is no clear answer as to which to prefer, and there is no specific general guideline in this policy document. Rather, we leave this choice up to each language group independently. In some cases, the system of romanization used for a particular language is so ubiquitous that switching to something else would do our readers a severe disservice. In other cases, publishers and writers have been bound to an imprecise system of romanization by technical constraints, and choosing a more modern system - especially if specialists in the field are moving towards it - will be preferable.
Multiple transliteration systems
A number of languages (e.g. Japanese, Chinese) have seen several different transliteration systems in use over time, with older one(s) now deprecated. To prevent unnecessary clutter, the following guidelines are provided about transliterations in these older systems:
- Don't give every possible transliteration, or even a large number of them, in the lede; if a lengthy treatment on names is required (e.g. for Beijing), it should go in a separate section.
- Don't give multiple transliterations when making a reference to a subject; give them only in the article on that subject.
- When a transliteration was at one time widely used, but is now deprecated, older books are often still seen which use the old system; therefore, names which were well-known in the older transliteration (e.g. Chungking) should be given prominently, and redirects from them must be set up.
The International Phonetic Alphabet, while useful for its intended purpose, is of little use for this kind of transliteration, because understanding it requires a knowledge of phonetics which most of our readers do not have. It is not directed toward the same goal as romanization, either: for example, the IPA is used to show the pronunciation of English words as well. The IPA would be a good system if we were looking to represent the phonology of a given non-English word, but that is not the chief goal of romanization.
This is not to say that we should ignore the pronunciation of different terms; indeed, if CZ is to ultimately become a vital resource, help with correct pronunciation would be really useful. While not perfect, IPA is the best thing out there for this, and we are in no way trying to discourage the use of IPA in Citizendium articles; indeed, its use in limited circumstances (such as at the beginning of an article, to show the correct pronunciation of the subject) is desirable.
Contributors interested in romanization issues should add their names below using three tildes (~~~). You may also indicate what languages you are interested in.
- John Stephenson
- Brian P. Long - Ancient Greek, Sanskrit, Arabic, Hebrew
- Derek Harkness - Chinese
- J. Noel Chiappa - Japanese (somewhat), Chinese (slightly), Ancient Greek (slightly)
Words from languages other than English which are written in the Latin alphabet (with or without diacritical marks) should generally be rendered as-is, with pronunciation guides. (Just because an English-speaker might not know how to pronounce "Oberpfaffenhausen" or "Bad Tölz-Wolfratshausen", we should not rewrite the name phonetically.) Only languages which do not fall within this group need romanization policies.
Please add titles and language links below as necessary, and start discussion on the new pages.
Middle Eastern/Indian Languages
East Asian languages
- CZ:Pronunciation guides
- CZ:Editorial Council Resolution 0010
- CZ forum discussion
- ALA-LC Romanization Tables: Transliteration Schemes for Non-Roman Scripts - System approved by the Library of Congress and the American Library Association