Wednesday, December 14, 2011

Font-maker, Font-maker, Make Me a Font

One of my long-term gripes about the Deseret Alphabet is that we seem to be stuck with the letter-forms created in the mid-19th century. Even worse, almost everybody simply recreates the exact glyphs from the font used to print the four books in the 1860’s. Now, the Church commissioned the best font it could afford, but given the state of American typography at the time, the result is somewhat infelicitous.

There are two big gripes with the “standard” shapes of the Deseret Alphabet letters.

Gripe One: There are no ascenders and descenders. It turns out that we don’t really read by looking at word shapes but by actually recognizing letters—but ascenders and descenders are probably a big part of how we distinguish letters when reading. In any event, TYPING IN SOMETHING THAT LOOKS LIKE ALL CAPS FOR ANY LENGTH OF TIME IS RATHER TIRING FOR LATIN-TRAINED READERS.

Gripe Two: The lower-case letters are just smaller versions of the upper-case letters.

Back in the 1990’s, I did two Deseret Alphabet fonts. One (whose glyphs eventually became part of Apple Symbols) uses the exact glyphs from the 1860’s, and the other (which is available if you download Apple’s font tools) was created via an involved process using Metafont and still uses the basic shapes as before.

And as for the real typographers out there, the Hermann Zapfs and Jonathan Hoeflers and their colleagues, there is little interest in making a good-looking Deseret Alphabet font.

A colleague of mine had an excellent idea. Sit a calligrapher down (he suggested my wife), and have them copy the Deseret Alphabet over and over with a steel pen. As they do this repeatedly, they’ll start to change the glyphs in a natural sort of way, and eventually we might have something that actually looks organic. This, after all, is just a compressed version of what happened with the Latin script we all know and love.

My wife, however, does not have the time, and I don’t have any other calligraphers handy, so this past spring I did the next best thing and took the bull by the horns myself.

I’m no artist by any means, but over the years I have developed a certain level of skill in creating glyphs in FontLab Studio using bits and pieces of other glyphs. I thought I’d try the same here. I therefore started with a freely-available font, Computer Modern Unicode, with as huge a repertoire as I could find. The more glyphs there are with pieces I can use, the better for me. In particular, given the genealogy of the Deseret Alphabet, a full suite of Latin, Greek, and Cyrillic glyphs would probably supply me with everything I needed and look reasonably organic.

The results with CMU weren’t entirely satisfactory, so I tried again with a different font, this time DejaVu Sans. The results were somewhat better this time, but still not quite what I’d like. (It’s the font I used for the PDF of Isaac Asimov’s short story “Youth.”) I made a third effort, therefore, with DejaVu Serif, and that seems to be the best of the bunch. It’s not 100% of the way there, however. My wife caught me proofing something I set with it and asked me what alphabet that was. When I said it was the DA, she said, “Oh, I should have known. The letters don’t look like they belong together.”

A couple of months ago, someone on the Deseret Alphabet group over at Yahoo! expressed an interest in seeing the Proclamation on the Family in the Deseret Alphabet, so I whipped up something and put it online. It’s available here.

The general reaction has been fairly positive, so it would be nice to make the font more freely available. Unfortunately, for various complicated reasons, I can’t do that right now. I can, however, describe the steps I used.

I therefore whipped up a quick spreadsheet with all the letters in the basic Deseret Alphabet in both Apple Symbols and the DejaVu Serif-derivative, and some notes as to how I made it. This is also online as a PDF. And while I was at it, I added a waterfall sheet illustrating the font.

Now if only I can finish proofing the book I set with this puppy…

Monday, May 2, 2011

Isaac Asimov in the Deseret Alphabet

I've converted Isaac Asimov's short story "Youth" to the Deseret Alphabet in my usual haphazard way.

The plain text version is available, but at the moment it isn't being shared, so if you want it you'll have to ask first.

A PDF version is available at http://bit.ly/jNMlun. This uses a sans serif Deseret Alphabet font I made using DejaVu as a starting point. I am trying four experiments with this font:

1) The line through the middle of the 𐐔 is removed, so it looks like a reversed Latin D.

2) The curl on 𐐏 is dropped, so it looks like a regular Latin V.

3) The curlicue inside 𐐃, 𐐍, and 𐐘 is turned into a dot, either inside the letter (for upper-case letters) or above it (for lower-case letters).

4) Some of the lower-case letters have had ascenders or descenders added.

I'm not entirely happy with the font. I think the spacing needs work and some of the lower-case letters were made by scaling upper-case letters, so they look a little thin.

As for the story, it's the only thing by Asimov in the public domain, or so Project Gutenberg thinks. I haven't proofed it yet, so it probably has misspellings galore. I'm going to revise my program to convert text to the Deseret Alphabet to try to get it to do a better job. When I do, I may or may not revise this.

Thursday, April 28, 2011

The Tale of the Three Ahs

Once upon a time, there were three Ahs.

One Ah was named 𐐉. She could not be bothered with many of the things her father wanted, so she set off on her own, looking for a spot where she could be happy. In the end, she was lost.

One Ah was named 𐐃. Her fault was that there was naught she wanted from her father, and she also left, looking for a lord who could give her all she wanted. No word ever came back from her.

The third Ah was named 𐐂. She was her father’s favorite and herself wanted no part of her sisters’ unfiliality. Instead, she stayed with him in his cottage. Since she loved the arts, she spent her days creating beautiful things for them both to enjoy; and together they still are, happy and contented.

Enough, say I, is enough.

This whole “ah” business has been gnawing at me, so I decided to try to actually get it straight in my head. To start with, I took the title page of the Book of Mormon and wrote down all the words containing any of the three “ah” letters, 𐐂, 𐐃, and 𐐉. I then looked them up in the New Oxford American Dictionary that comes with Mac OS X and in Wiktionary to get both American and British pronunciations in IPA.

Let’s go through them in reverse alphabetical order. For each letter, I’ll list all the words, then their American pronunciations (Wiktionary first) and their British pronunciations (again, Wiktionary first). For some words, the dictionaries only gave the pronunciations of their roots.

𐐤𐐉𐐓 /nɑt/ |nɑt| /nɒt/ |nɒt|
𐐉𐐚 /ʌv/ |əv| |ə| /ɒv/ |ɒv|
𐐘𐐉𐐔 /ɡɑːd/ |gɑd| /ɡɒd/ |gɒd|
𐐉𐐙 /ɑf/ |ɑf| /ɒf/ |ɒf|
𐐝𐐑𐐉𐐓𐐢𐐇𐐝 /spɑt/ |ˈspɑtləs| /spɒt/ |spɒt|

These words are on the bother end of the bother-father merger. This merger hasn’t taken place in all varieties of British English but has throughout almost all of North America, so the British pronunciations give the intended sound, ɒ.

𐐃𐐢𐐝𐐄 /ˈɔl.soʊ/ |ˈɔlsoʊ| /ˈɔːl.səʊ/ |ˈɔːlsəʊ|
𐐢𐐃𐐡𐐔 /lɔːd/ |lɔrd| /lɔːd/ |lɔːd|
𐐙𐐃𐐡 /fɔɹ/ |fɔ(ə)r| /fɔː(ɹ)/ |fɔː|
𐐃𐐢 /ɔl/ |ɔl| /ɔːl/ |ɔːl|
𐐙𐐃𐐢𐐓𐐝 /fɔlt/ |fɔlt| /fɔːlt/ |fɔːlt|

These words are on the caught end of the cot-caught merger. This merger isn’t considered “standard” in American English, although it has taken place in Utah English, so all the dictionaries agree on what it should sound like, namely ɔ. It’s just not the way I pronounce it.

𐐂𐐡 /ɑɹ/ |ɑr| /ɑː(ɹ)/ |ɑː|
𐐙𐐂𐐛𐐇𐐡𐐞 (NA) |ˈfɑðər| /ˈfɑː.ðə(ɹ)/ |ˈfɑːðə|
𐐗𐐂𐐝𐐓 /kæst/ |kæst| /kɑːst/ |kɑːst|
𐐑𐐂𐐡𐐓 /pɑɹt/ |pɑrt| /pɑːt/ |pɑːt|

This vowel is on the winning side of both mergers, father and cot, and is the only form of “ah” I can pronounce without a conscious effort. It’s ɑ.

(I actually got a bit mixed up when first I tried to coordinate this data with what Wikipedia has to say about the two mergers, but Ken Beesley corrected me.)

This is actually bad news, largely because I somehow got the impression that the vowel of cot was the more O-ish ɔ, and because I somehow got the impression that it was written with 𐐉. As a result, there are entries in the Deseret Alphabet wiki for 𐐝𐐹𐐱𐐿 and 𐐌𐑆𐐲𐐿 𐐈𐑆𐐲𐑋𐐱𐑂, even though my actual pronunciations are 𐐝𐐹𐐪𐐿 and 𐐈𐑆𐐲𐑋𐐪𐑂, respectively. As for how they’re “correctly” spelled in the Deseret Alphabet, well, I’m giving up. A cobbler should stick to his last. From now on, in any wiki entry I work on or any other DA text I compose, every “ah” is 𐐂 unless I know better, having learned the distinct pronunciation from a dictionary or other unimpeachable source.
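For anyone who wants to apply the same rule mechanically, here is a minimal sketch in Python. It assumes you have a British IPA transcription to hand; the function name ah_letter and the lookup table are made up for this illustration and are not part of any existing Deseret Alphabet tool.

```python
# Hypothetical helper: pick the Deseret "ah" letter from a British IPA
# transcription, defaulting to U+10402 when no "ah" vowel is recognized.
AH_LETTERS = {
    "\u0252": "\U00010409",  # IPA turned script a (not, God) -> Deseret 𐐉
    "\u0254": "\U00010403",  # IPA open o (all, lord, faults) -> Deseret 𐐃
    "\u0251": "\U00010402",  # IPA script a (are, father, part) -> Deseret 𐐂
}
DEFAULT_AH = "\U00010402"  # the fallback rule: every "ah" is 𐐂 unless known otherwise


def ah_letter(british_ipa: str) -> str:
    """Return the Deseret letter for the first "ah" vowel in british_ipa."""
    for symbol in british_ipa:
        if symbol in AH_LETTERS:
            return AH_LETTERS[symbol]
    return DEFAULT_AH


if __name__ == "__main__":
    for word, ipa in [("not", "n\u0252t"), ("lord", "l\u0254\u02d0d"), ("part", "p\u0251\u02d0t")]:
        print(word, ah_letter(ipa))
```

The sketch only looks at the first “ah” vowel it finds, which is enough for the short words listed above but would need more care for longer ones.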

(As for our Vulcan and biochemist friends, I don’t know who would be the ultimate authority on how to pronounce “Spock,” but it should by rights either be Gene Roddenberry—who is dead and therefore unavailable—or Leonard Nimoy. Nimoy is from Boston, where they preserve all three vowels, so if one listens very closely, one could probably figure out how “Spock” should be written.

(Asimov was a Russian Jew raised in Brooklyn, and New York is another one of the places that preserves all three “ah” sounds, although that may not include Brooklyn. He is known to have told people that, to pronounce his name, say “has him off” but drop the h's. I could use that to argue that the final vowel was indeed 𐐉, but I think I’m not going to press my luck. First of all, I have audio recordings of the man pronouncing his own name, and he ended it with a /v/, not an /f/. Secondly, he also rhymed it with “stars above” and “mazel tov.” The best thing would be to have someone who can really hear the differences listen to the recordings and figure out which vowel Asimov actually used.)



Wednesday, April 27, 2011

Take the Name Challenge

I was thinking last night about the issue of writing with a phonetic alphabet being not quite as simple as one would think, and I came up with a good way of illustrating this. Take the name challenge: try to write your own name in the Deseret Alphabet. For example, let’s try my name.

John. Starting off with my given name, I’m already in trouble. I speak a dialect of English which has undergone what’s called the caught-cot merger, and that means that whereas some dialects of English distinguish |ɑ| and |ɔ| (that’s 𐐉 and 𐐂, respectively, in the Deseret Alphabet, or is it the other way around?), mine doesn’t. I can hear the distinction if I’m listening for it, and I can make it if I want to, but those are both conscious processes. Since I grew up not making the distinction, I can’t off-hand predict with anything near 100% certainty which one will occur in any given word.

So with the name John, I don’t know whether to spell it 𐐖𐐱𐑌 or 𐐖𐐪𐑌. I’ve been spelling it 𐐖𐐱𐑌, but I have to check either in a dictionary or in some of the extant materials in the Deseret Alphabet (in this case the Deseret Third Reader) to be sure. Fortunately, it says 𐐖𐐱𐑌 and I happen to have done it correctly.

Now, a child growing up with the Deseret Alphabet wouldn’t have this problem. Even if they spoke Utah English as I do, they would simply learn that John is spelled with a 𐐉 and not a 𐐂, the same way that French kids grow up knowing that chat is a boy cat and chatte is a girl cat, or, for that matter, the way an English-speaking kid grows up knowing that John usually has a silent h, but since it can be spelled Jon, you have to learn for any particular John you meet which one it is. Since John is short and common, kids would probably pick up on the proper Deseret Alphabet spelling without even realizing it.

At the same time, this does mean that at some point a lot of the Deseret Alphabet generation will come home and complain to their parents, “I thought we were supposed to spell everything the way we pronounce it? So what’s with all this crap about 𐐉, 𐐂 and 𐐃?”

Howard. Even worse. To start with, is that 'w' there simply as part of the “ow” vowel we start with, or do I actually pronounce a |w| sound? Am I saying “how-ard” or “how-ward”? It sounds to me like there’s a |w| sound in there, that my mouth isn’t just pretending to make a |w| on its way from the one vowel to the other without actually doing so, but I’m not entirely sure.

As for the second vowel, it precedes an |r|, and that always screws up vowels. Since it’s written with an ‘a’, you would assume that there either is or was an “ah” sound there; I’m guessing 𐐪, but I’m not sure. Still, it’s an unstressed vowel, and those tend to turn into schwas, and when I sound the word out, it sounds a bit more schwa-y than not. I’ll go therefore with 𐐐𐐵𐐶𐐲𐑉𐐼, but again I don’t know without checking a dictionary. “Howard” isn’t in any of the published Deseret Alphabet materials so far as I know, but a modern dictionary says |ˈhaʊərd|, which means I’m wrong and it should have been 𐐐𐐵𐐲𐑉𐐼.

Jenkins. There are two ways to pronounce my surname, my way and the wrong way. My way has a long vowel in it, |e|. The wrong way has a short vowel in it, |ɛ|.

The ‘n’ is a bit problematic. The problem is that it precedes a ‘k’, and in English, |n| tends to turn into |ŋ| when this happens. The tongue is in the same position for |ŋ| and |k|, you see, and so it tends to move into that position a bit early when it’s working with an |n| in order to get ready for the |k|.

Historically, the name is Jen-kins or Jan-kins, “Jen/Jan” being one of the many forms of “John” out there and “-kin” or “-kins” being a diminutive (think lamby-kins, and yes, that means my name is “John Johnny”). That means that it was definitely an |n| sound way back when and I do hear people pronounce “Jenkins” with a very distinct |n|. In listening to what I say, however, and paying attention to what my tongue is doing, I’m pretty sure I’ve got a |ŋ|.

(This, by the way, is a major defect of standard English spelling and one place where the Deseret Alphabet has a very distinct advantage. The DA may be missing a letter for schwa, but it has letters for both |ŋ| and |ʒ|, whereas the standard English alphabet has no consistent way of spelling them. |ŋ| is usually “ng”, but sometimes, as here, it isn’t indicated in the spelling at all.)

As for the second vowel, it sounds like a |ɪ| to me, but since it’s unstressed that may be another schwa. The ‘i’ indicates that it was historically |ɪ|-ish, but that’s not a big help. Anyway, I’m going with |ɪ|. The net result here is 𐐖𐐩𐑍𐐿𐐮𐑌𐑆.

And I’m wrong again. The Mac OS X dictionary application says |ˈʤɛŋkənz|, which would be 𐐖𐐯𐑍𐐿𐐲𐑌𐑆. Wikipedia says “Jen-kins & Jon-kins”, which is no help at all (except they think it’s an |n|, apparently), and Dictionary.com says |ˈdʒɛŋkɪnz|, which would still be 𐐖𐐯𐑍𐐿𐐮𐑌𐑆, not 𐐖𐐩𐑍𐐿𐐮𐑌𐑆. All this unanimity on the first vowel surprises me, because it very much sounds like an |e| when I say it. (It goes without saying that the extant Deseret Alphabet publications are no help.)
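Going from a dictionary’s IPA to a Deseret spelling is mechanical once you have settled on a table, which is rather the point: the hard part is deciding which sounds you actually say. Here is a rough Python sketch of that lookup. The table is my own assumption for illustration; it covers only the sounds that come up in this post and follows the spellings used here (schwa written with the short-O letter), so it is not any kind of standard.

```python
# Sketch: transliterate a small subset of IPA into the Deseret Alphabet.
# The table is an illustrative assumption covering only the sounds
# discussed in this post; it is not a standard mapping.
IPA_TO_DESERET = {
    "d\u0292": "\U0001043E",  # IPA d + ezh -> jee 𐐾
    "\u02a4": "\U0001043E",   # IPA dezh ligature -> jee 𐐾
    "a\u028a": "\U00010435",  # IPA a + upsilon -> ow 𐐵
    "e": "\U00010429",        # long e 𐐩
    "\u025b": "\U0001042F",   # IPA open e -> short e 𐐯
    "\u026a": "\U0001042E",   # IPA small capital i -> short i 𐐮
    "\u0259": "\U00010432",   # IPA schwa -> short o 𐐲, standing in for schwa
    "h": "\U00010438",        # h 𐐸
    "w": "\U00010436",        # wu 𐐶
    "k": "\U0001043F",        # kay 𐐿
    "d": "\U0001043C",        # dee 𐐼
    "z": "\U00010446",        # zee 𐑆
    "r": "\U00010449",        # er 𐑉
    "n": "\U0001044C",        # en 𐑌
    "\u014b": "\U0001044D",   # IPA eng -> eng 𐑍
}
IGNORED = {"\u02c8", "\u02cc", "."}  # stress marks and syllable breaks


def to_deseret(ipa: str) -> str:
    """Transliterate ipa letter by letter, capitalizing the first letter."""
    out = []
    i = 0
    while i < len(ipa):
        if ipa[i] in IGNORED:
            i += 1
            continue
        # Try two-symbol sequences (affricates, diphthongs) before single symbols.
        for size in (2, 1):
            chunk = ipa[i:i + size]
            if chunk in IPA_TO_DESERET:
                out.append(IPA_TO_DESERET[chunk])
                i += size
                break
        else:
            raise ValueError(f"no Deseret letter listed for {ipa[i]!r}")
    word = "".join(out)
    # Deseret is a cased script, so the ordinary Unicode case mapping applies.
    return word[:1].upper() + word[1:]


if __name__ == "__main__":
    print(to_deseret("\u02c8d\u0292\u025b\u014bk\u026anz"))  # the Dictionary.com transcription
    print(to_deseret("\u02c8ha\u028a\u0259rd"))               # the dictionary form of "Howard"
```

Anything not in the table raises an error rather than being guessed at, which seemed the safer choice given how often my own guesses turn out wrong.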

Unlike John, though, Jenkins isn’t exactly common. It’s not rare, of course, but it’s rare enough that a child may very well go all the way through school never learning from the school the proper spelling in the Deseret Alphabet. They would simply spell it the way the adults do, and as an adult, I would have said 𐐖𐐩𐑍𐐿𐐮𐑌𐑆 without really thinking and certainly without consulting a dictionary. I think the net result would be that multiple spellings would come into use, and some people would be 𐐖𐐩𐑍𐐿𐐮𐑌𐑆, some would be 𐐖𐐯𐑍𐐿𐐲𐑌𐑆, and some would be 𐐖𐐯𐑍𐐿𐐮𐑌𐑆. (And some would be 𐐖𐐯𐑌𐐿𐐮𐑌𐑆, too.) It’s not unlike the fact that we have Jenkins and Jenkin (or John and Jon). Once this starts showing up on legal records, the spellings tend to get frozen even if in retrospect they’re wrong.

(To name an example near and dear to my heart, the late writer Isaac Asimov had the spelling of his surname fixed when his family moved to the US from the Soviet Union in 1923. His father didn’t know English at the time and so did the best he could coming up with the spelling—but the name was, in practice, pronounced with a final |f|, not a final |v|, and the spelling was incorrect. By the time anybody realized this, though, it was too late.)

Now names are notoriously tricky things. Since they’re part of people’s identities, people get very possessive about them and do insist on certain spellings and pronunciations, even when they don’t really make any sense. I’m willing to bet that had the DA gone into widespread use, we would have pretty quickly been seeing some boys named 𐐖𐐪𐑌 who would get as upset if you spelled it 𐐖𐐱𐑌 as I do if you spell my first name “Jon.”

Realistically, too, my name is unusually fraught with difficulties. My wife’s maiden name, for example, is very unambiguously 𐐐𐐴𐐼𐐨 𐐤𐐯𐑊𐑅𐐲𐑌 in the Deseret Alphabet, and our youngest clearly has 𐐄𐑊𐐮𐑂𐐨𐐲 𐐡𐐬𐑆 for her given names. Our other kids, though, are Mary Catherine and Joseph Richard, and of the four names there, “Joseph” is the only one that’s entirely straightforward.

And I’m not going to claim that standard English spelling is any better, because it is genuinely worse. Not only do you have John-Jon and the like, but you have people who deliberately come up with cutesy spellings like Shellee or you have people with legitimate but rare names, like my sister Maren who has spent her life trying to explain to people that it rhymes with “Karen” and isn’t some bizarre variant of Maureen. (In Deseret Alphabet-land, it would only be a matter of time before you found girls called 𐐟𐐯𐑊𐐨𐐨. That’s just the way people are, more’s the pity.)

My point, however, is something quite different. The Deseret Alphabet spellings would be considerably more straightforward than standard English spellings are, there’s no doubt of it. Nonetheless, even with a nominally “phonetic” alphabet, coming up with a standard, agreed-upon spelling for a word may be more complicated than one would think.

Tuesday, April 26, 2011

In Which I Get Taken Down a Notch (or Two)

Write-up still on its way. I promise.

Meanwhile, a question was raised last week on the Deseret Alphabet discussion group by one Bob Moultrie, who asked, “But that makes me wonder if learning to read and write would be much easier if we used Deseret. I know that this is the reason Brigham Young wanted to develop the Deseret Alphabet, but do you guys think the Deseret Alphabet could really deliver on this goal?”

Well, I could hardly resist a challenge like this and so pontificated on the various shortcomings I think the DA had in practice that would keep it from succeeding in that mission. Among them was the usual spiel about ascenders and descenders.

I was taken to task for that by none other than Joshua Erickson, who has designed some very nice Deseret Alphabet typefaces. He chided me (and rightly so) for being behind the times on the subject and provided a number of helpful links indicating that we do not, in fact, recognize words by their overall shape, but that we do process the individual letters. Interestingly enough, we process a number of letters all at once rather than each by itself.

I don’t think that changes the essential point of the argument, that the lack of ascenders and descenders adds to the difficulty of reading the DA. As Ken Beesley pointed out, one of the things we use to recognize the shape of the individual letters is, in fact, whether or not they have ascenders or descenders, and that lack does hinder our processing.

It does, however, explain very nicely some other things. For example, I have a much harder time reading Shavian than the Deseret Alphabet, despite the fact that the former is systematic, very elegant, and has ascenders and descenders all over the place. The fact is, however, that many of the letters in the Deseret Alphabet either are the same as letters in the Latin alphabet or very like various Latin letters. That is, the shapes of the letters of the Deseret Alphabet are already half-familiar if not completely familiar, so the process of recognizing them as individuals can take advantage of skills we already have. In the case of Shavian, however, not only are the letters almost entirely different from their Latin “counterparts,” but they are designed along entirely different lines, so the process of distinguishing them involves different, well, algorithms, if you will. Instead of being able to leverage the techniques we've already learned to distinguish letters, we have to learn new ones.

The other thing that it explains is how people read various scripts which inherently lack ascenders and descenders. Modern Hebrew is pretty much in this boat, although some letters do have descenders when written in final positions, but the examples I have in mind are naturally East Asian ones. Every character in Chinese, Japanese, Korean, or Yi (among others) is written in a square, and most characters pretty much extend to all four sides of the square. So how do East Asians read?

Well, it would be pretty much the same thing, although applied to parts of individual sinograms rather than individual letters. The issue in this case becomes one of what one learns to look for in order to distinguish the symbols, not the essential process of reading. As is the case with an alphabet, the phonetic content of the characters (vague though it generally is) is an aid to learning, and not an aid to reading. (Unfortunately for me, there are several sets of characters involving the same phonetic element which still confuse me because they just look too similar to one another in my eyes.)

When learning to read a language written with a modern Latin script, we learn that we have to distinguish letters as wholes and that one of the things that's useful in doing that is the presence (or lack) of ascenders and descenders. When learning to read sinograms, you learn to look at parts of individual characters and how to tell, for example, the moon radical from the almost identical meat radical. (Answer: the meat radical tends to be slightly wider.) Presumably, when learning to read the Deseret Alphabet, particularly as a child, one would learn to look for the little curlicue inside some of the letters, the very curlicues which trip up us Latin-readers because they distinguish letters without changing their shape, and it’s largely shape changes we’re looking for.

Interestingly enough, this helps justify the non-phonetic aspects of various scripts. In the case of Chinese, the fact that there are multiple phonetics for the same sound makes it easier to distinguish homophones. All the various forms of Chinese have lots of homophones. One reason why the language developed tones was, apparently, to help distinguish homophones, and one reason why the Chinese haven’t abandoned their writing system is that there are a lot of words that really do sound exactly alike and can’t necessarily be distinguished simply by their pronunciation. For extended texts, context is usually enough to tell them apart, of course, otherwise speech would be pretty much impossible—but this is not necessarily true for shorter texts.

For the record, I’ve also seen people who justify some of the weirder aspects of English spelling because it means we can distinguish homophones like "two," "too," and "to," as if context weren’t enough to do that.

I’m not convinced that this is a significant advantage, but it would be unfair to deny that it does exist.

Meanwhile, I get to kick myself not only for being condescending to Joshua, but for actually praising his fonts in the third person without even noticing that I was talking to the man himself.