Some Problems on the Encoding of Phags-pa Script

JTC1/SC2/WG2 N2871
Date:

The 45th WG2 meeting, held in June 2004, adopted N2829, which states, "The ad hoc discussion on encoding Phags-pa was unable to resolve all of the ballot comments on Phags-pa. There were many complex and difficult questions for which much further communication and discussion will be required." The accompanying resolution reads, "With reference to changes requested to Phags-Pa encoding in PDAM 1 ballot comments, and the details given in document N2745 and associated documents N2719 and N2771, WG2 accepts the ad hoc group recommendations in document N2829. WG2 accepts the revised encoding of the 52 characters as shown on pages 79 to 81 in document N2832 for inclusion in Amendment 2 to ISO/IEC 10646: 2003 (removing it from Amendment 1). National bodies and liaison organizations are invited to provide feedback on the various open issues identified in document N2829." (Resolution M45.16).

We consider this resolution both timely and correct. In view of the problems summarized in N2829 and other related problems, experts from China and Mongolia held their second meeting on Phags-pa script encoding in Changsha on October 24-25, 2004. At that meeting we carefully examined the various opinions, and we now put forward, in the name of our two nations, our official and revised Proposal for Phags-pa Script Encoding and A Users' Agreement Related to Phags-pa Script Encoding. The following are our opinions on some problems of Phags-pa script encoding that require further exchange of views and deeper discussion.

Part One A Brief Account of the Phags-pa Script and Its Encoding

(1) Peculiarities and complexities of Phags-pa script: The Phags-pa script is a phonetic writing system created by the imperial teacher Phags-pa by a special edict of Emperor Khubilai (Shizu of the Yuan Dynasty) and promulgated in 1269 for the purpose of transcribing and writing multilingual texts across the Yuan Empire.
Materials discovered so far show that texts in Mongolian, Han (Chinese), Tibetan, Uighur and some other languages, including Sanskrit, have been written in the Phags-pa alphabet. Of these, Mongolian and Chinese texts are naturally the most frequent, inasmuch as Mongolian was the state language of the Yuan Dynasty and Chinese is used by the majority of ethnic groups in China.

As a special writing system for presenting multilingual texts, the Phags-pa script has outstanding features. Its alphabet is rich, with more letters than any single language requires, and it has compound letters as well. The script is written in vertical columns, from top to bottom, with the columns running from left to right. It differs from Mongolian in that it takes the syllable as the unit of ligature. While Tibetan uses the same punctuation mark to indicate the limits of a syllable and a word, the Phags-pa alphabet needs no special punctuation marks to show the limits of syllables or words.

In writing texts in different languages, the Phags-pa script shows different characteristics not only in the number of letters used, but also in the way each language is pronounced or spelt. Thus, Mongolian and Han (Chinese) texts are transcribed according to their sounds, whereas Tibetan and Sanskrit are in principle transliterated word for word, though certain words are rendered in accordance with their sounds. Such differences ought to be accommodated and reflected in the present encoding. For example, such letters as, and never appear in Mongolian documents, but do appear in Chinese texts. In Mongolian and Sanskrit texts, and, and, and are not distinguished from each other. But in Chinese texts, these six letters all have distinguishing functions, viz., = 禅, = 审, = 喻, = 影, = 晓, = 匣. Letters,, and are all separate letters with distinctive functions in Sanskrit, but do not appear in Mongolian texts.
So, in order to reflect such complicated details faithfully, we should treat these letters as nominal glyphs in our Phags-pa script encoding.

(2) Alphabet of the Phags-pa script: The encoding of Phags-pa script should reflect its letter system fully and correctly. It must be kept in mind that the Phags-pa script does not write texts in different languages in quite the same way. The relationships are very complicated, far from being as simple as with a single language. We maintain that the system of nominal glyphs of the Phags-pa script takes pronunciation as its content and graphic shape as its distinguishing feature, whereas the system of variant presentation glyphs is defined by sound alone, regardless of external form. We take and as two separate nominal glyphs mainly because they represent different sounds (though we notice the insignificant difference in their shapes). But we treat and as two free variants of one and the same letter mainly in accordance with their sounds (their difference in letter forms being a little greater than that between and ). The same is true of such different letters as,,,,, and which we regard as variants of the single letter, especially according to their sounds (it should be admitted that their difference in letter forms is still greater than that between the above-mentioned two sets of glyphs).

(3) Uses of the Phags-pa script: Since texts in Phags-pa script are all historical documents, their spellings are not quite consistent or standardized. Nor is Phags-pa script a current writing system, so there is neither a practical need nor a possibility to standardize it. We believe that the main purpose of the Phags-pa script encoding is to preserve and represent those historical documents and materials intact for the convenience of research.
We should by no means delete spellings or graphic symbols that do exist in historical documents; instead, we should include as many of those diverse spellings and graphic symbols as possible in our encoding. There are scholars who reject such variants as, and, all of which however are found in Mongolian, Tibetan and Sanskrit documents. Thus, there may be several ways to spell a word, of which one spelling is correct and the others are not. But all the attested spellings must be representable in the encoding in order to represent what is historically true: alongside of and alongside of. appears 123 times and 33 times in Mongolian documents. We mustn't change all their written forms into one single, must we? If so, it would violate historical truth!

One opinion holds that it should be possible to easily convert a Tibetan document in the Phags-pa alphabet back into Tibetan script. This is in fact an unrealistic objective that lies outside the scope of the UCS. First of all, one should not demand that any encoding fulfill such a function, for it is an extravagant and impractical hope. We know well that between the Mongolian encoding and Mongolian texts in the Phags-pa alphabet there is no possibility of automatic conversion in either direction. Moreover, we must emphasize that there is no such necessity! So far as we know, Tibetan documents written in Phags-pa letters contain many words that do not meet the standards of Tibetan orthography. Thus, for the Tibetan personal name Rin chen, there can be five Phags-pa spellings:,,, and ; and for the Tibetan word skor gsum, the Phags-pa script has two spellings: and ( ). How can we convert them into Tibetan letter for letter? Instead, one should rather consider the problem of how to link the Phags-pa script with its Latin transliteration. We have to point out, then, that character names in the UCS are not the same thing as a Latin transliteration of the text.

(4) Influence of Mongolian writing system on Phags-pa script: Some people think that the only possible relation between the Phags-pa script and Mongolian writing is that both are written in the same direction. However, the research of Chinese and Mongolian scholars shows that the Mongolian writing system influenced the Phags-pa script in many ways. For example, the structure of the Phags-pa letters OE and UE is based on the Mongolian ᠥ OE and ᠦ UE; the separate, word-initial, word-medial and word-final forms of vowels in Phags-pa script are also designed in accordance with the vowel harmony in Mongolian; and the Phags-pa variant forms, of the Mongolian vowels OE and UE in other than the first syllable are based on Mongolian vowel harmony, too. Besides, the Phags-pa letter A856 is also modeled on Mongolian writing. Therefore, it is only natural that the Phags-pa presentation of Mongolian came to resemble that of the Mongolian script. These similarities do not depend on our attitude toward the Mongolian writing system, nor are they our fabrication; they reflect the influence of the Mongolian writing system on the Phags-pa script as created by the great lama Phags-pa.

(5) Research on Phags-pa script: Before the 1980s, research on Phags-pa script had been conducted mainly outside China and Mongolia, and few materials in this script had been discovered. Since the 1980s, however, there has been an unprecedented upsurge in research on the script in Phags-pa's homeland, China and Mongolia. According to incomplete statistics, over 110 treatises and monographs on Phags-pa script were published in China and Mongolia during this period, while fewer than 20 works appeared in other countries. More than 50 monuments in Phags-pa script have been discovered over the past 20 years.
Scholars in China and Mongolia deepened their research on the Phags-pa script by relying on such rich materials, and published many valuable monographs and treatises in which they put forward a series of new viewpoints. The rich materials in the hands of Chinese and Mongolian scholars and their up-to-date research achievements have provided us with an important scientific basis for preparing the present encoding of Phags-pa script. However, studies of the Phags-pa script have not yet covered all the monuments in the different languages; Chinese texts in Phags-pa script in particular require serious further investigation.

Although scholars have reached a common understanding on many problems concerning Phags-pa script, great divergences remain on quite a number of others. Thus, opinions do vary in the understanding of the letters, and, and such divergences are very difficult to eliminate within a short time. The spellings of documents and sources in the Phags-pa alphabet are not quite consistent or standardized, and what is more, since Phags-pa script is not a current official writing system, there is neither need nor possibility to standardize it.

In view of this situation, we must not rigidly advocate one point of view and restrict another in preparing the Phags-pa script encoding. Instead, we have to adopt a tolerant attitude, treating all views equally wherever serious differences exist. In other words, it is undesirable to create a situation in which a certain point of view is restricted merely because of our way of encoding. Thus, there exist more than three explanations of the quality of the letter, and in the encoding we should tolerate all of them without attempting to settle its quality. Different schools may explain and use in different ways. Likewise, different views on the letters and should also be tolerated without imposing any restriction on them.
In a word, our encoding will permit the two letters and to be registered in three ways. Once certain academic problems are solved, and not before, we may revise, supplement and improve the Phags-pa encoding.

(6) Experience to be summed up and lessons to be drawn in the Phags-pa script encoding: During the late 1980s and early 1990s, Mongolian IT engineers in Mongolia, China and Germany prepared their respective Mongolian editing devices [Footnote 1: See v Mong (1978) in Germany, Multi-lingual editing device (1989) by Inner Mongolia University, Founder BookMaker 9.1 for Mongolian (1990) in China and Light printing (1990) in Mongolia.]. In China, more than 10 years of practice have yielded rich experience and many lessons for the preparation of the Phags-pa script encoding and for Phags-pa information processing. This is also very valuable for us in developing the encoding. It is on the basis of the experience and lessons of the past 15 years of information processing that we propose to treat and as whole characters, and also to provide a syllable delimiter in the present encoding. Inasmuch as we prepare the Phags-pa script encoding to serve Phags-pa information processing, it is only natural that we consider and handle certain problems in terms of information processing. Devising a syllable delimiter may seem to go against the actual state of the Phags-pa script, but it is in fact entirely advantageous, and does no harm, to information processing in the future.

(7) Two different ways to register letters: Quite a number of Phags-pa letters have isolate, syllable-initial, syllable-medial and syllable-final forms. The isolate form of a vowel and the syllable-initial form of a consonant may be regarded as nominal glyphs, yet the majority of syllable-initial forms and all syllable-medial and syllable-final forms cannot be nominal glyphs.
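
The four positional forms named above can be pictured with a minimal sketch. It assumes, purely for illustration, that a syllable is given as a list of letter names and that the form of each letter depends only on its position within the syllable. The Latin letter names, the `Form` type and the function `positional_forms` are hypothetical and are not taken from any proposal; in the actual script some letters have fewer than four forms, and some forms coincide.

```python
from enum import Enum


class Form(Enum):
    """Positional forms named in the text (the type itself is hypothetical)."""
    ISOLATE = "isolate"
    INITIAL = "syllable-initial"
    MEDIAL = "syllable-medial"
    FINAL = "syllable-final"


def positional_forms(syllable):
    """Pair each letter of a syllable with the form implied by its position.

    `syllable` is a list of placeholder letter names. This is a simplification:
    it ignores letters that lack some forms and forms that share one shape.
    """
    count = len(syllable)
    if count == 1:
        # A letter standing alone takes the isolate form.
        return [(syllable[0], Form.ISOLATE)]
    result = []
    for index, letter in enumerate(syllable):
        if index == 0:
            form = Form.INITIAL
        elif index == count - 1:
            form = Form.FINAL
        else:
            form = Form.MEDIAL
        result.append((letter, form))
    return result
```

Under these assumptions, a one-letter syllable yields the isolate form, and a three-letter syllable yields the initial, medial and final forms in that order.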
Under such conditions, there naturally appear two different ways to register letters:

(a) Registration within a syllable or a word. Since the glyph relies on its preceding and following glyphs, in most cases there is no need to use any control symbol. Thus, the two variant presentation glyphs and in the words and are registered without using the control symbol.

(b) Registration of a single variant presentation glyph, which, having no preceding or following glyphs to rely on, should use the control symbol. Thus, and not within a word should be registered as and.

The UCS does not lay down any rules for the use of the control symbol. For unambiguous interchange between users, however, it is necessary to specify certain rules for users to follow. Such rules should include the two different ways to register letters; for details please see the Reference Table in the Users' Agreement Related to Phags-pa Script Encoding.

(8) The Users' Agreement Related to Encoding of Phags-pa Script: One opinion is that people should be able to register every letter with the help of the Table of Nominal Glyphs of the UCS alone, there being no need to read the regulations in the Users' Agreement for the Phags-pa script encoding. In our view this is mere wishful thinking, for in our experience all encodings prepared in terms of nominal glyphs have variant presentation glyphs and a number of control symbols that are not included in the UCS. To prevent differing interpretations in information interchange among users, there must be an agreement or a set of regulations to unify the number and forms of the variant presentation glyphs and the usage of the various control symbols. According to the UCS, if such an agreement or regulations are not included in it, then the users of a given encoding should arrive at them through consultation.
No doubt the same is true of the Phags-pa script encoding; it will not do to have only the Table of Glyphs for Phags-pa in the UCS. The author of N2719 seems to persist in the view that there is no need for a Users' Agreement, disdaining even to glance at the Users' Agreement Related to the Encoding of Phags-pa Script. That is why, of the 17 example words he listed in Article 8 of N2719, 12 are mistaken [Footnote 2: He misspells for, for, for, for, for, for and for! All this results from his ignoring and not observing the regulations in the Users' Agreement on Mongolian Encoding System.]. If the author had read the Users' Agreement on Mongolian Encoding System, he would not have made such a hopeless mess in spelling a few of the easiest and simplest Mongolian words. Perhaps, from the negative side, his example confirms the very need for having a users' agreement.

(9) Definition of nominal glyphs and variant presentation glyphs of the Phags-pa script: Both N2771 and we agree that Phags-pa script has nominal glyphs, but we differ in the standards by which to define variant presentation glyphs. Since Phags-pa documents were written in historical periods under technical restrictions and without established standards of writing, it is troublesome to define the variant presentation glyphs of certain letters of the script. And it is evident that glyphs with only slight differences between them cannot all be defined as variant presentation glyphs. In the absence of ready standards for definition, we proposed in N2745 a set of standards for defining variant presentation glyphs (see [3] of N2745-1), to the effect that: Strictly speaking, each letter in the Phags-pa script has several variant presentation glyphs. Most consonant letters can be separated into four variants, viz., the isolate, syllable- (or word-) initial, syllable- (or word-) medial and syllable- (or word-) final forms. A few letters have fewer than four variants.
Owing to different styles of writing, certain variants may take the same form. According to the conditions under which each variant presentation glyph appears, the variant presentation glyphs of Phags-pa script are divided into conditional variants and free variants (the latter include positional variants and postpositive variants). and, and which differ evidently in strokes yet have the same pronunciation, are free variants. Certain glyphs are marked as having two sound forms in one language, but having only one sound in another language. The former are actually two different letters, while the latter are two free variants of one and the same letter. Variants which differ slightly in size, thickness, length or turning angle may be called stroke variants. Stroke variants which are not marked as having different pronunciations, or as being variant forms of a letter under certain conditions, are not indicated in the encoding.

Of course, the above standards are open to discussion as to whether they are feasible. However, we have to point out that for the definition of the variant presentation glyphs of letters in the various languages there should be a unified standard. A double standard is not to be used.

Part Two Views on Some Concrete Problems

(1) The vowel letters OE and UE : In the Mongolian language, these two vowels are indispensable basic vowels. Though a few variant presentation glyphs of these two letters are compound letters consisting of two or three lexemes, the majority of scholars list them in their Table of Letters, regarding them as compound yet independent letters, like N. Poppe (1941, 1957), B. Rinchen (1956), L. Ligeti (1964, 1972), D. Čoijilsüreng (1974), Č. Šagdarsüreng (1981, 2001), Bulag (1983), Bao Xiang (1984), A. Damdinsüreng (1985), Tulgaguri (1998), Y. Jančib (2002). Scholars who do not include these two letters in their Table of Letters also point out that in a few cases a double letter shows one sound.
[Footnote 3: Junast, The Phags-pa Script and Mongolian Documents, Vol. I: A Collection of Research Essays, p. 52.]

The formation of these two letters and in Phags-pa script is modeled on the Mongolian vowel letters ᠥ and ᠦ, too. Their pronunciations are not merely the combined sounds of their original lexemes, like A+E+O and A+E+U, but are the single sounds ö and ü, which have nothing to do with them. This point is in perfect conformity with the letter 2864 in N2719: though the letter consists of two lexemes and, it is pronounced F, a single sound having nothing to do with H+U put together. It is quite right that N2719 treats as two nominal
