Deseret Alphabet: Deseret Alphabet


Tuesday, May 22, 2012

John Morris Redux

Third time’s the charm, I suppose.

On Sunday, I was in the vicinity of Cedar City with assorted relations and friends to see the annular solar eclipse. I was determined to finally get the precise location of the John Morris headstone using the Deseret Alphabet. This was my third attempt. The first time, I took some nice pictures but didn’t have my iPhone’s geotagging turned on, so it didn’t help an awful lot. The second time was a quick wooosh past the cemetery on a tour bus and I couldn’t even spot it. Now I was going to do things right.

As we drove past the cemetery, I was riding shotgun and kept my eyes peeled. I spotted it easily enough and we pulled over so that I could get a picture.

The headstone is very readily visible from the road; no other headstones stand between it and the low cemetery wall just a few yards away. It’s almost exactly at the point where 800 North intersects North Main Street. The latitude and longitude are 37° 41' 25.80" N, 113° 3' 44.40" W. (The altitude is some 1749 m, if that helps.)
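If you want to paste those coordinates into a mapping tool that expects decimal degrees, the conversion is simple arithmetic. Here is a minimal sketch in Python; the function name is made up for illustration, and it assumes the usual convention that South and West are negative.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus a hemisphere letter to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are negative by the usual convention.
    return -value if hemisphere in ("S", "W") else value


# The headstone coordinates quoted above.
lat = dms_to_decimal(37, 41, 25.80, "N")
lon = dms_to_decimal(113, 3, 44.40, "W")
print(f"{lat:.4f}, {lon:.4f}")  # 37.6905, -113.0623
```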

The headstone is visible (barely) on Google Maps, and quite visible (but illegible) on Google Street View.

I did not check the Iron Mission State Park visitors center down the road to see if they sell t-shirts, but considering the fact that the last time I checked nobody there even knew that there was such a thing as the Deseret Alphabet, let alone that there was a replica of a 19th century headstone in the Deseret Alphabet to be found a couple of blocks away—well, I have my doubts.

Wednesday, March 28, 2012

An Oz. of Prevention is Worth a Lb. of Cure

While working on today’s XKCD, I ran across something I hadn’t given a lot of thought to before. The text of the comic uses the venerable abbreviation, “oz.” How to proceed?

By and large, most unit names one runs across are English in the sense that if they’re written out in full, they’re written without italics: ounce, pound, foot, mile. Even SI units are so treated: gram, meter (or metre in the UK), newton, joule. Some units not at all used in the Anglosphere also have English equivalents (catty, talent).

Abbreviations for all these things are therefore English abbreviations. This is a bit more complicated for SI units, because they don’t strictly speaking have abbreviations. They have symbols, which is why we write "km/s" without any periods. (The English would leave out the periods anyway, but that’s their problem.) You’re supposed to use the symbols regardless of the writing system you’re using, so “kilometer” should always be represented with “km,” and never “k,” “κμ,” or “𐐿𐑋,” let alone “公里,” but that doesn’t seem to stop people.

The flies in the ointment are a small number of very, very old units—units so old that the standard English abbreviations used for them are not derived from the English word. The most widely used of these are the two related to weight: ounce and pound, which are abbreviated to “oz.” (from the old Italian onza) and “lb.” (from the Latin libra).

My general policy with regard to abbreviations has been to respect the language of origin. “Common Era” consists of two English words and so is abbreviated to “𐐗.𐐀.”—I pronounce the word /'irə/, after all, even if /'ɛrə/ is preferred. “Anno Domini,” however one may pronounce it, is Latin, not English. It ends up, therefore, as “A.D.”

Initialisms are just one kind of abbreviation, so I tend to treat them similarly: “HTML” gets turned into “𐐐𐐓𐐣𐐢.” XKCD is a special case because “XKCD” isn’t actually an initialism or abbreviation for anything. It’s just a name made up of four Latin letters. Acronyms present a problem of their own, inasmuch as turning “SCUBA” into “𐐝𐐗𐐊𐐒𐐈” gives a rather different result from turning “scuba” into “𐑅𐐿𐐭𐐺𐐲.” “Scuba” has become a naturalized English word, after all; most people probably don’t know that it originally was an acronym, let alone what it was an acronym for. And then there are things like “SAT,” which could either be “𐐝𐐈𐐓” or “𐐝𐐊𐐓,” depending on whether or not one thinks it’s a word and what one thinks it stands for.
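The letter-for-letter treatment of an initialism can be sketched in a few lines of Python. This is only an illustration, not a real conversion table: the dictionary is hypothetical and deliberately partial, covering just the four letters of the “HTML” example, and it looks the Deseret letters up by their Unicode character names rather than trusting anyone’s font.

```python
import unicodedata

# Hypothetical, deliberately partial table: Latin capital -> the last word of
# the Unicode name of the Deseret capital used for it in the "HTML" example.
LETTER_NAMES = {"H": "H", "T": "TEE", "M": "EM", "L": "EL"}


def initialism_to_deseret(text):
    """Map each Latin capital to a Deseret capital, letter for letter."""
    out = []
    for ch in text:
        name = LETTER_NAMES.get(ch)
        if name is None:
            # Anything outside the toy table is refused rather than guessed.
            raise ValueError(f"no mapping assumed for {ch!r}")
        out.append(unicodedata.lookup(f"DESERET CAPITAL LETTER {name}"))
    return "".join(out)


print(initialism_to_deseret("HTML"))
```

A table like this cannot produce the naturalized “scuba” spelling, which depends on how the word is pronounced rather than on which Latin letters it contains. That is exactly the difference between the two results.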

The simple fact is that spoken languages evolve around their written forms. One reason why China has found it impossible to abandon sinograms is that for three thousand years, speakers have modified the way they speak on the assumption that words are written using them. Spoken and written Chinese exist in symbiosis, and neither can change without having an impact on the other.

English, as usual, ups this trend to eleven. Not only has it been stealing words from other languages with careless abandon for centuries and spelling them every which way, but since the mid-20th century, acronyms have become a major way by which its vocabulary is extended. This even goes for the foreign words we acquire. (I’d give obvious examples, but that would end up involving Godwin’s law.) So we have initialisms which are abbreviations, initialisms which are full words, initialisms which are treated like words but pronounced as if they were abbreviations, camel-case words, and every possible combination of the above. I won’t even get into IM-speak (or r u going 2 insist?) and l33t.

Among the barriers the Deseret Alphabet (as well as Shavian et al.) faces in trying to be taken seriously as a writing system, then, is the fact that the language it is intended to write is spoken on the assumption that it’s being written in a completely different script. If you prefer, significant chunks of spoken English don’t make sense unless you’re using the Latin script for writing.

As for our friends “ounce” and “pound,” I decided that since they’re English words, I should give them English abbreviations: “𐐵𐑌.” and “𐐹𐐼.,” respectively. (“𐐍𐑌.” is a pretty useless abbreviation, of course, since the word in full would only have one more letter. It’s like abbreviating “June” as “Jun.” It just seems unnecessary.) If the old Italians or ancient Romans object to either abbreviation, well, I’ll cross that bridge if and when I ever come to it.

Wednesday, December 14, 2011

Font-maker, Font-maker, Make Me a Font

One of my long-term gripes about the Deseret Alphabet is that we seem to be stuck with the letter-forms created in the mid-19th century. Even worse, almost everybody simply recreates the exact glyphs from the font used to print the four books in the 1860’s. Now, the Church commissioned the best font it could afford, but given the state of American typography at the time, the result is somewhat infelicitous.

There are two big gripes with the “standard” shapes of the Deseret Alphabet letters.

Gripe One: There are no ascenders and descenders. It turns out that we don’t really read by looking at word shapes but by actually recognizing letters—but ascenders and descenders are probably a big part of how we distinguish letters when reading. In any event, TYPING IN SOMETHING THAT LOOKS LIKE ALL CAPS FOR ANY LENGTH OF TIME IS RATHER TIRING FOR LATIN-TRAINED READERS.

Gripe Two: The lower-case letters are just smaller versions of the upper-case letters.

Back in the 1990’s, I did two Deseret Alphabet fonts. One (whose glyphs eventually became part of Apple Symbols) uses the exact glyphs from the 1860’s, and the other (which is available if you download Apple’s font tools) was created via an involved process using Metafont and still uses the basic shapes as before.

And as for the real typographers out there, the Hermann Zapfs and Jonathan Hoeflers and their colleagues, there is little interest in making a good-looking Deseret Alphabet font.

A colleague of mine had an excellent idea. Sit a calligrapher down (he suggested my wife), and have them copy the Deseret Alphabet over and over with a steel pen. As they do this repeatedly, they’ll start to change the glyphs in a natural sort of way, and eventually we might have something that actually looks organic. This, after all, is just a compressed version of what happened with the Latin script we all know and love.

My wife, however, does not have the time, and I don’t have any other calligraphers handy, so this past spring I did the next best thing and took the bull by the horns myself.

I’m no artist by any means, but over the years I have developed a certain level of skill in creating glyphs in Font Lab Studio using bits and pieces of other glyphs. I thought I’d try the same here. I therefore started with a freely-available font, Computer Modern Unicode, with as huge a repertoire as I could find. The more glyphs there are with pieces I can use, the better for me. In particular, given the genealogy of the Deseret Alphabet, a full suite of Latin, Greek, and Cyrillic glyphs would probably supply me with everything I needed and look reasonably organic.

The results with CMU weren’t entirely satisfactory, so I tried again with a different font, this time DejaVu Sans. The results were somewhat better this time, but still not quite what I’d like. (It’s the font I used for the PDF of Isaac Asimov’s short story “Youth.”) I made a third effort, therefore, with DejaVu Serif, and that seems to be the best of the bunch. It’s not 100% of the way there, however. My wife caught me proofing something I set with it and asked me what alphabet that was. When I said it was the DA, she said, “Oh, I should have known. The letters don’t look like they belong together.”

A couple of months ago, someone on the Deseret Alphabet group over at Yahoo! expressed an interest in seeing the Proclamation on the Family in the Deseret Alphabet, so I whipped up something and put it online. It’s available here.

The general reaction has been fairly positive, so it would be nice to make the font more freely available. Unfortunately, for various complicated reasons, I can’t do that right now. I can, however, describe the steps I used.

I therefore whipped up a quick spreadsheet with all the letters in the basic Deseret Alphabet in both Apple Symbols and the DejaVu Serif-derivative, and some notes as to how I made it. This is also online as a PDF. And while I was at it, I added a waterfall sheet illustrating the font.

Now if only I can finish proofing the book I set with this puppy…

Monday, February 15, 2010

English, non-English, and the Deseret Alphabet

I still need to write up December’s talk down Provo-way, but I had some thoughts fresh in my mind that I wanted to get down, so we’ll just have to proceed sans that writeup for a bit longer.

One issue I’ve run into recently is the problem of writing non-English words in a passage of English text using the Deseret Alphabet. (See, for example, http://tinyurl.com/y8lqkwp and http://tinyurl.com/y93zzyj.)

Now, the intention of the inventors of the Deseret Alphabet was very clearly that it could be used with other languages (as was pointed out in December’s talk), and so I think that their response would be that non-English words used in a passage of English should be written with the Deseret Alphabet.

I ended up disagreeing with them, however.

To begin with, their experience of writing was pretty much limited to Indo-European languages and Biblical Hebrew. (And among Indo-European languages, it was pretty much limited to Germanic languages, Romance languages, and Welsh.) By “experience,” I don’t just mean that of George D. Watt and the Regents of the University of Deseret, but the Church as a whole. They would also have been aware of some Native American languages, but, like all but one of the languages Church members were familiar with at the time, they were written in the Latin alphabet if at all.

In any event, the Deseret Alphabet, which was designed with English in mind and therefore matches the phonetic/phonemic system of English, is going to do less well with other languages. Good examples would be French nasalized vowels and the French “r,” or the German umlauted vowels and “ch,” or the Welsh “ll.” Absent modifications of the Deseret Alphabet itself to handle such sounds by adding new letters (à la the IPA), or the development of orthographies for other languages that modify the sound-values of the various letters within the Deseret Alphabet, you’re really not dealing with the other language qua a language, but rather a transliteration of the native orthography into the Deseret Alphabet or the anglicization of the foreign word.

If you’ve got an anglicization, it’s no longer stricto sensu a foreign word. As for developing a transliteration or new orthography, that is very tricky and not at all easy to do well. Anyone familiar with an East Asian language written with hanzi, kana, or hangul will be very much aware of the existence of competing transliteration schemes into the Latin alphabet (usually called romanizations in this context), none of them entirely satisfactory.

So, for me, just trying to write the foreign word in the Deseret Alphabet is a non-starter. Leave the foreign word in the appropriate orthography.

Which leaves open a big question: What is a foreign word? It also leads to a smaller one: What is the appropriate orthography?

The general English practice when writing a foreign word is to italicize it if it is genuinely foreign and to leave it unitalicized if it’s gone native. So “sans” up above is not italicized, because it’s a naturalized English citizen, whereas “stricto sensu” is because it’s not.

Of course, that really doesn’t answer the first question, since it kind of assumes the answer (i.e., it begs the question in the sense that logicians give that phrase). And in any event, it’s a solution to a slightly different problem.

Now, in some cases, there are obvious criteria. With place names, for example, one has cases where the English name is something very different from the native one (Florence, Munich, Moscow, Wales), or the English pronunciation is distinct from the native one and of long standing (Paris, Seville). In such cases, you go with the Deseret Alphabet.

Not that this makes things easy. In China, if I were to use older English names instead of the ones the mainland Chinese government encourages (Canton for Guangzhou, Peking for Beijing), I’d probably use the Deseret Alphabet.

Hong Kong places mostly have official English names because of its former status as a British colony, and in any event there is no reasonably universal romanization of Cantonese to use instead. And if the PRC government were to try to impose new spellings based on Mandarin, the Cantonese-speaking locals would probably object. We’re not going to see the place called Xianggang anytime soon. So Hong Kong place names pretty much get transliterated to the Deseret Alphabet.

Shanghai may seem borderline because that’s still the preferred spelling, but the word has been thoroughly anglicized to the point that “to shanghai” is a recognized verb in English. It gets transliterated, too.

But what about using the new names, or the names of relatively obscure places which never had a really standard English name per se?

In this case, I opted to use Hanyu pinyin sans tone marks, as is standard English practice, simply because these new names are the native names being written with a very specific romanization. On an ambitious day, I may leave in the tone marks. As a result, in the middle of a passage about China, names such as Guangzhou are left as Guangzhou, whereas Canton becomes 𐐗𐐰𐑌𐐻𐐱𐑌.

Species names (Ornithorhynchus anatinus) are theoretically Latin and written in italics anyway. Since the genus is part of the species name and is Latin, it stays in the Latin alphabet, too. So do names down the classification tree from family to order and beyond. When I’m talking about the family Felidae, that stays in Latin, but if I’m talking about felines, that gets Deseretified.

If a word keeps its accents (garçon, Māori) or non-English letters (Hawaiʼi), I think of it as foreign and it stays Latin, even if there’s a common English mangling. If the word is high-falootin’ and pretentious, it ain’t English either and is left alone. (Pretentious? Moi?). And if the word just feels foreign to me, it stays Latin, at least when written by me. Sorry, folks, cwm is not an English word.

As for the proper orthography, this becomes a big issue with languages whose native speakers don’t use the Latin alphabet when writing. It was something that the original designers probably never even thought of because, as I say, one would expect them to grok writing in a non-Latin script only in the case of biblical Greek and Hebrew.

That is so nineteenth century.

There is an increasing recognition that global communication is inherently multilingual and, with the proliferation of computers that can handle multiple scripts at once, an increasing willingness to at least show a word in its native orthography (that is, written with the native script), even if it’s immediately followed by an English-based Latin transliteration. You see this a lot on Wikipedia.

I like this trend, and particularly in the Deseret Alphabet wikia, it’s quite appropriate to litter the text with sinograms or Cyrillic.

But beyond that, scholars and the educated have rarely hesitated in the past to leave non-Latin text non-Latin if the reader can be expected to read it that way. The official libretto for Gilbert and Sullivan’s Iolanthe includes “the οἱ πολλοί [sic].”

I like this trend, too, but realistically, one can no longer assume any familiarity with any specific non-Latin alphabet, even among the educated.

So the answer to the second question starts out, “At least show the word in the native script where reasonable.”

Even in the world in general, speakers of minority languages are being more assertive about the right for the words in their language to be spelled with the “native” spelling simply as a way of legitimizing the use of that language. (Of course, it helps to have a government buying in.) As a result, spellings such as “Hawaiʼi” and “Māori” are becoming more common.

And I like legitimizing minority languages as well. (C’mon, Irish! Don’t die on us!)

Given all that, and given my personal reluctance to devise Deseret Alphabet transliterations or orthographies for any non-English language, the answer to the second question ends, “Otherwise, use a standard Latin spelling or transliteration.”

One of the main disadvantages of the Deseret Alphabet was that, even had it succeeded, people would have had to learn both scripts, Deseret and Latin. Even were the English-speaking world generally to have adopted Deseret, Latin would have to be learned by everybody for a generation or so as to read older books, and the educated would continue to need it for non-English European languages (et al.). And until the English-speaking world itself switched over, Mormon missionaries could hardly be expected to proselyte among English-speakers without knowing English written with the Latin alphabet.

So the Deseret Alphabet was, at least in the short term, worse for education rather than better.

In any event, because practical concerns would have kept “native readers” of the Deseret Alphabet literate in the Latin one for quite some time, I really have no qualms about littering an ostensibly Deseret Alphabet text with words written in the Latin alphabet, even if it looks kind of weird. I prefer to err on the side of caution and let non-English words remain non-English. Even when speaking English, we use a lot of borrowings from different languages, and it’s OK not to pretend otherwise.

Vive la différence!

Thursday, September 3, 2009

Oh, and Some Good News

Now that Snow Leopard is released for Macs, there is good news on the keyboard front. There’s been a long-standing bug in Mac keyboard support that prevented the creation of decent keyboards for the Deseret Alphabet, keyboards which would let you type 𐐟, not by some obscene and hard-to-remember key chord, but by typing S-h. (Or S-H.)

That bug has finally been fixed, and I have a keyboard I’ve been using which takes advantage of the bug fix. You still have to cure yourself of some old Latin typing habits, but it’s a vast improvement on what went before.

It’s not quite ready for release yet; I need to write up documentation. But as soon as that’s done, I’ll post it somewhere appropriate.

Friday, May 8, 2009

The Deseret Alphabet Hits the Big Time (Kind Of)!

I was going to blather on a bit about pronunciation issues, since that’s cropped up in my life this past week, but I have something better to talk about instead.

The Deseret Alphabet has been in Unicode since version 3.1 of the standard (March 2001), so it’s hardly new there. And it’s been included in Apple’s Apple Symbols font since Mac OS X 10.3 (October 2003), so it’s hardly new there, either.
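Being in Unicode means any reasonably modern programming language already knows the letters by name and knows which capital goes with which small letter. Here is a minimal sketch using Python’s standard unicodedata module; the only assumption is a Python 3 whose Unicode tables include the Deseret block, which has been true for a long time.

```python
import unicodedata

long_i = "\U00010400"  # first code point of the Deseret block

print(unicodedata.name(long_i))                # DESERET CAPITAL LETTER LONG I
print(unicodedata.name(long_i.lower()))        # DESERET SMALL LETTER LONG I
# Each capital sits 0x28 code points below its small letter.
print(hex(ord(long_i.lower()) - ord(long_i)))  # 0x28
```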

Today, the Deseret Alphabet took the next big step forward. Associated with Unicode is a second project, the Common Locale Data Repository (CLDR). A locale in computer parlance is a linking of a place with a language, and it refers to all the standard names for things or standard ways of doing things in that place/language combination. Locales make it possible for me to specify my place (Salt Lake City) and language (English), and, armed with that information, my computer can set the default names for the months and days of the week, the default format to use for dates and times, the default currency, the default units of measurement, and so on. Of course, I can override these if I choose, but the goal is to make it as unnecessary as possible.
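In practice a locale is named by a short identifier that strings together a language code, an optional four-letter script code, and an optional region code. “Dsrt” is the ISO 15924 script code for Deseret, and “Shaw” is the one for Shavian. Here is a rough sketch in Python of how such an identifier breaks apart and how software might fall back from a specific locale to more general ones. Both function names are invented for illustration, “en_Dsrt_US” is used purely as an example, and the truncation-based fallback is a simplification: real CLDR inheritance has exceptions this ignores.

```python
def parse_locale(tag):
    """Split a simple language[_Script][_REGION] identifier into its parts."""
    parts = tag.replace("-", "_").split("_")
    result = {"language": parts[0].lower(), "script": None, "region": None}
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            result["script"] = part.title()  # four letters: a script such as Dsrt
        elif len(part) == 2 and part.isalpha():
            result["region"] = part.upper()  # two letters: a region such as US
    return result


def fallback_chain(tag):
    """Simplified lookup order: drop trailing subtags one at a time, then 'root'."""
    parts = tag.replace("-", "_").split("_")
    chain = ["_".join(parts[:n]) for n in range(len(parts), 0, -1)]
    return chain + ["root"]


print(parse_locale("en_Dsrt_US"))
print(fallback_chain("en_Dsrt_US"))  # ['en_Dsrt_US', 'en_Dsrt', 'en', 'root']
```

The point of the fallback list is that a Deseret locale only has to supply what actually differs (month names, day names, and so on written in Deseret), while everything else can be inherited from a more general locale.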

Version 1.6 was under development a year or so ago, and I spent a couple of evenings madly typing in Deseret Alphabet (and Shavian) data to make Deseret and Shavian locales possible. Unfortunately, the rules for inclusion in CLDR 1.6 meant that Deseret and Shavian didn’t make it, because I was the only one who had vetted the data. The rules were relaxed somewhat for version 1.7, however, and with its release today, the Deseret Alphabet can now be used in conjunction with locale information to provide standard information for the computer to use in all kinds of interesting places.

Now, I don’t know when CLDR 1.7 will start showing up in shipping products (e.g., Mac OS X Snow Leopard). It is, however, entirely probable that within a year software you and I and other normal people use will actually be able to use the Deseret Alphabet automatically for things like dates and times.

(I am a normal person, aren’t I?)
