Monday, February 15, 2010

English, non-English, and the Deseret Alphabet

I still need to write up December’s talk down Provo-way, but I had some thoughts fresh in my mind that I wanted to get down, so we’ll just have to proceed sans that writeup for a bit longer.  

One issue I’ve run into recently is the problem of writing non-English words in a passage of English text using the Deseret Alphabet.  (See, for example, http://tinyurl.com/y8lqkwp and http://tinyurl.com/y93zzyj.)  

Now, the intention of the inventors of the Deseret Alphabet was very clearly that it could be used with other languages (as was pointed out in December’s talk), and so I think that their response would be that non-English words used in a passage of English should be written with the Deseret Alphabet.  

I ended up disagreeing with them, however.  

To begin with, their experience of writing was pretty much limited to Indo-European languages and Biblical Hebrew.  (And among Indo-European languages, it was pretty much limited to Germanic languages, Romance languages, and Welsh.)  By “experience,” I don’t just mean that of George D. Watt and the Regents of the University of Deseret, but that of the Church as a whole.  They would also have been aware of some Native American languages, but, like all but one of the languages Church members were familiar with at the time, these were written in the Latin alphabet if at all.  

In any event, the Deseret Alphabet, which was designed with English in mind and therefore matches the phonetic/phonemic system of English, is going to do less well with other languages.  Good examples would be French nasalized vowels and the French “r,” or the German umlauted vowels and “ch,” or the Welsh “ll.”  Absent modifications of the Deseret Alphabet itself to handle such sounds by adding new letters (à la the IPA), or the development of orthographies for other languages that modify the sound-values of the various letters within the Deseret Alphabet, you’re really not dealing with the other language qua a language, but rather a transliteration of the native orthography into the Deseret Alphabet or the anglicization of the foreign word.  

If you’ve got an anglicization, it’s no longer stricto sensu a foreign word.  As for developing a transliteration or new orthography, that is very tricky and not at all easy to do well.  Anyone familiar with an East Asian language written with hanzi, kana, or hangul will be very much aware of the existence of competing transliteration schemes into the Latin alphabet (usually called romanizations in this context), none of them entirely satisfactory.  

So, for me, just trying to write the foreign word in the Deseret Alphabet is a non-starter.  Leave the foreign word in the appropriate orthography.

Which leaves open a big question:  What is a foreign word?  It also leads to a smaller one:  What is the appropriate orthography?

The general English practice when writing a foreign word is to italicize it if it is genuinely foreign and to leave it unitalicized if it’s gone native.  So “sans” up above is not italicized, because it’s a naturalized English citizen, whereas “stricto sensu” is because it’s not.  

Of course, that really doesn’t answer the first question, since it kind of assumes the answer (i.e., it begs the question in the sense that logicians give that phrase).  And in any event, it’s a solution to a slightly different problem.  

Now, in some cases, there are obvious criteria.  With place names, for example, one has cases where the English name is something very different from the native one (Florence, Munich, Moscow, Wales), or the English pronunciation is distinct from the native one and of long standing (Paris, Seville).  In such cases, you go with the Deseret Alphabet.  

Not that this makes things easy.  In China, if I were to use older English names instead of the ones the mainland Chinese government encourages (Canton for Guangzhou, Peking for Beijing), I’d probably use the Deseret Alphabet.  

Hong Kong places mostly have official English names because of its former status as a British colony, and in any event there is no reasonably universal romanization of Cantonese to use instead.  And if the PRC government were to try to impose new spellings based on Mandarin, the Cantonese-speaking locals would probably object.  We’re not going to see the place called Xianggang anytime soon.  So Hong Kong place names pretty much get transliterated to the Deseret Alphabet.  

Shanghai may seem borderline because that’s still the preferred spelling, but the word has been thoroughly anglicized to the point that “to shanghai” is a recognized verb in English.  It gets transliterated, too.

But what about using the new names, or the names of relatively obscure places which never had a really standard English name per se?

In this case, I opted to use Hanyu pinyin sans tone marks, as is standard English practice, simply because these new names are the native names being written with a very specific romanization.  On an ambitious day, I may leave in the tone marks.  As a result, in the middle of a passage about China, names such as Guangzhou are left as Guangzhou, whereas Canton becomes 𐐗𐐰𐑌𐐻𐐱𐑌.
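For anyone handling this kind of mixed text on a computer, here is a minimal Python sketch of how one might tell the two scripts apart word by word.  It assumes only that Deseret letters occupy the Unicode block U+10400 to U+1044F; the names `is_deseret` and `script_of` are made up for illustration and don't come from any library.

```python
DESERET_FIRST = 0x10400  # first code point of the Unicode Deseret block
DESERET_LAST = 0x1044F   # last code point of the Unicode Deseret block


def is_deseret(ch: str) -> bool:
    """True if the single character ch falls inside the Deseret block."""
    return DESERET_FIRST <= ord(ch) <= DESERET_LAST


def script_of(word: str) -> str:
    """Label a word 'deseret', 'other', 'mixed', or 'none' by its letters.

    Hypothetical helper: 'other' covers Latin and every other script alike.
    """
    letters = [ch for ch in word if ch.isalpha()]
    if not letters:
        return "none"  # nothing but digits, punctuation, or whitespace
    flags = {is_deseret(ch) for ch in letters}
    if flags == {True}:
        return "deseret"
    if flags == {False}:
        return "other"
    return "mixed"
```

Run over a passage about China, a check like this would label Guangzhou as “other” and the Deseret spelling of Canton as “deseret,” which is exactly the kind of mix I end up with.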

Species names (Ornithorhynchus anatinus) are theoretically Latin and written in italics anyway.  Since the genus is part of the species name and is Latin, it stays in the Latin alphabet, too.  So do names up the classification tree from family to order and beyond.  When I’m talking about the family Felidae, that stays in Latin, but if I’m talking about felines, that gets Deseretified.  

If a word keeps its accents (garçon, Māori) or non-English letters (Hawaiʼi), I think of it as foreign and it stays Latin, even if there’s a common English mangling.  If the word is high-falootin’ and pretentious, it ain’t English either and is left alone.  (Pretentious?  Moi?)  And if the word just feels foreign to me, it stays Latin, at least when written by me.  Sorry, folks, cwm is not an English word.  
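The first of those rules is mechanical enough to sketch in Python.  This is only an illustration under my own assumptions: the function name `keeps_foreign_marks` is invented, “accent” is taken to mean any Unicode combining mark, and “non-English letter” is taken to mean any letter outside ASCII.  The pretentiousness test and the “feels foreign to me” test are judgment calls that no code is going to capture.

```python
import unicodedata


def keeps_foreign_marks(word: str) -> bool:
    """True if the word has an accented letter or a letter outside A-Z/a-z."""
    # NFD splits a precomposed letter such as "ç" into "c" plus a combining mark.
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch):
            return True  # a diacritic, as in garçon or Māori
        if ch.isalpha() and not ch.isascii():
            return True  # a non-ASCII letter, such as the U+02BC in Hawaiʼi
    return False
```

By this test alone, “sans” and “cwm” both pass as plain English, so the last word still has to rest with whoever is doing the writing.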

As for the proper orthography, this becomes a big issue with languages whose native speakers don’t use the Latin alphabet when writing.  It was something that the original designers probably never even thought of because, as I say, one would expect them to grok writing in a non-Latin script only in the case of biblical Greek and Hebrew.

That is so nineteenth century.

There is an increasing recognition that global communication is inherently multilingual and, with the proliferation of computers that can handle multiple scripts at once, an increasing willingness to at least show a word in its native orthography (that is, written with the native script), even if it’s immediately followed by an English-based Latin transliteration.  You see this a lot on Wikipedia.  

I like this trend, and particularly in the Deseret Alphabet wikia, it’s quite appropriate to litter the text with sinograms or Cyrillic.  

But beyond that, scholars and the educated have rarely hesitated in the past to leave non-Latin text non-Latin if the reader can be expected to read it that way.  The official libretto for Gilbert and Sullivan’s Iolanthe includes “the οἱ πολλοί [sic].”

I like this trend, too, but realistically, one can no longer assume any familiarity with any specific non-Latin alphabet, even among the educated.

So the answer to the second question starts out, “At least show the word in the native script where reasonable.”

Even in the world in general, speakers of minority languages are being more assertive about the right for the words in their language to be spelled with the “native” spelling simply as a way of legitimizing the use of that language.  (Of course, it helps to have a government buying in.)  As a result, spellings such as “Hawaiʼi” and “Māori” are becoming more common.

And I like legitimizing minority languages as well.  (C’mon, Irish!  Don’t die on us!)  

Given all that, and given my personal reluctance to devise Deseret Alphabet transliterations or orthographies for any non-English language, the answer to the second question ends, “Otherwise, use a standard Latin spelling or transliteration.”

One of the main disadvantages of the Deseret Alphabet was that, even had it succeeded, people would have had to learn both scripts, Deseret and Latin.  Even were the English-speaking world generally to have adopted Deseret, Latin would have had to be learned by everybody for a generation or so in order to read older books, and the educated would continue to need it for non-English European languages (et al.).  And until the English-speaking world itself switched over, Mormon missionaries could hardly be expected to proselyte among English-speakers without knowing English written with the Latin alphabet.  

So the Deseret Alphabet was, at least in the short term, worse for education rather than better.

In any event, because practical concerns would have kept “native readers” of the Deseret Alphabet literate in the Latin one for quite some time, I really have no qualms about littering an ostensibly Deseret Alphabet text with words written in the Latin alphabet, even if it looks kind of weird.  I prefer to err on the side of caution and let non-English words remain non-English.  Even when speaking English, we use a lot of borrowings from different languages, and it’s OK not to pretend otherwise.  

Vive la différence!