In my misspent youth, I would occasionally run across something odd: a couple of dots on top of a vowel, as in zoölogy, Phaëthon, Boötes, noël, Nausicaä. I had no idea what was going on, and it never occurred to me to ask somebody. Either I already knew how to pronounce the word, or, like Linus van Pelt reading Dostoyevski, I just “bleeped” over it.
(I also ran across the mysterious symbol “æ”, which I found impossibly cool, in words like Cæsar and hæmoglobin. It was rather more obviously just an “a” and “e” smashed together for some unknown reason. Its capitalization, “Æ”, is also pretty cool.)
Eventually, I discovered the solution to the mystery. The double dots are called a diæresis—sorry, dieresis. In English and other languages, the dieresis serves the purpose of indicating that two vowels are to be pronounced separately in cases where one would naturally be inclined to pronounce them together.
The standard English alphabet, among its other quirks, has too few letters. In particular, we have more vowels than letters to represent them, so we resort to various tricks like the silent-e, vowel doubling, and so on. (And as usual, we end up with multiple ways of spelling the same sound.) So “goose” is [gus], “too” is [tu], “coop” is [kup], and “zoo” is [zu]. The temptation would be to pronounce “zoology,” therefore, as [zuləʤi] and not [zoʊˈɑləʤi], and so we put the two dots to avoid confusion.
There is another diacritic which looks very similar to the dieresis: the umlaut. It originated as a small “e” written on top of a vowel (seriously) and eventually became a pair of dots, just like the dieresis. It’s used in German and other languages to indicate a modified vowel sound, so you have “Mutter” (mother) and “Mütter” (mothers), which are pronounced with a very different initial vowel. Since it originated as an “e,” it can still be written using an “e,” as in “Muetter.”
Technically, there is a very subtle difference in visual appearance between a dieresis and an umlaut. Or, more correctly, a font designed by a German for use in German text will have umlauts ever so slightly closer to their vowels than a font designed by a Frenchman for French text will have its diereses. For all practical intents and purposes, though, they’re identical. Even if a particular typeface does distinguish them (and most of the ones I’m familiar with don’t), you’d have to know what to look for to tell them apart. Character encoding standards have traditionally used the same character for both. Unicode does this: both dieresis and umlaut are U+0308 COMBINING DIAERESIS. There is a standard way to distinguish them if you really want to, but almost nobody does.
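For the skeptical, the shared code point is easy to check. What follows is only a minimal sketch of my own, using Python’s standard unicodedata module; the helper name combining_marks is made up for the illustration. It decomposes a German word and an English one and lists the combining marks that fall out.

```python
import unicodedata


def combining_marks(text):
    """Return the Unicode names of the combining marks in the NFD form of text."""
    decomposed = unicodedata.normalize("NFD", text)
    return [unicodedata.name(ch) for ch in decomposed if unicodedata.combining(ch)]


# "Mütter" carries a German umlaut; "Boötes" carries an English dieresis.
# Both are written with escapes so the precomposed letters are unambiguous.
print(combining_marks("M\u00fctter"))  # ['COMBINING DIAERESIS']
print(combining_marks("Bo\u00f6tes"))  # ['COMBINING DIAERESIS']
```

Both words yield the same mark, so nothing in the text itself says which of the two diacritics was meant.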
The use of dieresis was never prominent in the United States. Even in Britain it’s rather archaic. For one thing, there are a lot of places where it should strictly be used and isn’t. You know, like in “diëresis” or “archaïc”, in case anybody was tempted to say [daɪrəsɪs] or [ɑrˈkaɪk]. For another, there’s the whole problem of the impact of typewriters on English typography. Moreover, as a rule, a fluent reader of English is almost never going to be thrown by the lack of a dieresis. Worst comes to worst, you can use a hyphen (as in “co-operate”). As for names like Boötes, you can bleep over it, pronounce it like “booties”, or not go in for stargazing. They’re all foreign names, anyway, except Brontë, and why would one ever want to talk about someone named Brontë?
(On an unrelated note, expect a Deseret Alphabet version of Jane Eyre sometime in 2015.)
There is one other diacritic used in English-only text: the grave accent. This one I was able to figure out on my own, since I encountered it first when reading Shakespeare. Sonnet 78, for example, includes the lines, “In others’ works thou dost but mend the style,/And arts with thy sweet graces gracèd be.” A cursory examination of the scansion makes it clear that “gracèd” is intended to be pronounced with two syllables for the sake of the meter, rather than the usual one.
So what does all this have to do with the Deseret Alphabet?
English has, among its many quirks, a number of diphthongs: two vowel sounds smoothly combined into one syllable. In fact, most long vowels in English are actually diphthongs. If, for example, you say “may” very slowly, you can tell that you’re saying [meɪ] and not [me]. Even some of our short vowels can be pronounced as diphthongs. I know a lot of people who pronounce “and” as [æɪnd] instead of [ænd]. We generally don’t think of such vowels that way, of course. We generally reserve the word “diphthong” for sounds like “ow” and “ay”, where there can be no doubt.
This is reflected in the organization of the Deseret Alphabet, which begins with twelve vowels: six long and six short. Next are two diphthongs: “ow” and “ay”. Two glides come next, then H, then sixteen consonants (a combination of stops, affricates, and fricatives), then two liquids, and finally three nasals. That accounts for the thirty-eight letters of the standard Deseret Alphabet.
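The order is easy to inspect for yourself, since Unicode encodes the capital letters in alphabet order starting at U+10400. Here is a rough sketch, assuming only Python’s standard unicodedata module; the function name deseret_capitals and the constants are my own inventions for the example.

```python
import unicodedata

FIRST_CAPITAL = 0x10400   # DESERET CAPITAL LETTER LONG I
STANDARD_LETTERS = 38     # size of the standard alphabet
PREFIX = "DESERET CAPITAL LETTER "


def deseret_capitals(count=STANDARD_LETTERS):
    """Return (letter, short name) pairs for the first count capital letters."""
    pairs = []
    for code_point in range(FIRST_CAPITAL, FIRST_CAPITAL + count):
        letter = chr(code_point)
        pairs.append((letter, unicodedata.name(letter).replace(PREFIX, "")))
    return pairs


# Print the standard alphabet in order, one letter per line.
for letter, name in deseret_capitals():
    print(letter, name)
```

Asking for forty letters instead, with deseret_capitals(40), tacks on the two extra letters discussed below, which Unicode names OI and EW.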
Some sounds are still missing, however. In addition to having no schwa, the Deseret Alphabet is missing one common diphthong: [ɔɪ] as in “boy”. There is a forty-letter version of the Deseret Alphabet for which some evidence exists, one that was never part of the standard description or printed materials. It appends the letters OY and EW at the end. The former is a distinct phoneme in English; the latter could arguably be one, too.
If one restricts oneself to the standard thirty-eight letters, a problem occurs. There can be an ambiguity when something like [ɔɪ] or [oɪ] occurs in the middle of a word, because that could be either one syllable or two. I ran into this when trying to transcribe the word “prohibition”. I pronounce it something like [ˌproʊəˈbɪʃən] or [ˌproʊɪˈbɪʃən]. (The problem of reduced vowels in English is another blog entry or two.) Because of my spelling conventions, I would spell it 𐐹𐑉𐐬𐐮𐐺𐐮𐑇𐐲𐑌, but is that proy-bishun or pro-ibishun?
Now, there are sensible solutions to the problem. One would be to abandon my spelling rule and go with 𐐹𐑉𐐬𐐲𐐺𐐮𐑇𐐲𐑌. Another would be to abandon the standard Deseret Alphabet and go with 𐑎 for words like “𐐺𐑎” so that 𐐹𐑉𐐬𐐮𐐺𐐮𐑇𐐲𐑌 is unambiguous. Or we could just assume that fluent English speakers would know that proy-bishun isn’t a word, just like we know that coop-erate isn’t a word.
Naturally, I want to be cool, not sensible. I use a dieresis and spell it 𐐹𐑉𐐬𐐮̈𐐺𐐮𐑇𐐲𐑌.
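Mechanically, this just means putting U+0308 COMBINING DIAERESIS right after the vowel that starts the new syllable. Below is a small sketch of how one might do that in Python; add_dieresis is a hypothetical helper of my own, not part of any library, and it assumes Python 3, where strings are indexed by code point rather than by UTF-16 unit.

```python
import unicodedata

COMBINING_DIAERESIS = "\u0308"  # the same mark Unicode uses for umlauts


def add_dieresis(word, index):
    """Return word with a combining dieresis placed after word[index]."""
    return word[: index + 1] + COMBINING_DIAERESIS + word[index + 1 :]


plain = "𐐹𐑉𐐬𐐮𐐺𐐮𐑇𐐲𐑌"          # the ambiguous spelling of "prohibition"
marked = add_dieresis(plain, 3)   # mark the fourth letter, the second vowel

print(marked)
# Deseret has no precomposed letter-plus-dieresis characters,
# so NFC normalization leaves the sequence exactly as written.
print(unicodedata.normalize("NFC", marked) == marked)  # True
```

Since there is nothing for the mark to compose with, the dieresis stays a separate code point, and whether it sits nicely over the letter is entirely up to the font.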
Now, there is a vaguely similar problem, one with a vaguely similar solution. There are three reasons why we double vowels when writing English. One is to spell a certain vowel sound in a frankly unobvious way, as in “goose.” Another is when we want to indicate a vowel stretched out unnaturally, as in, “I am your faaaaaaaather’s ghooooooost.” Finally, there are cases where you genuinely have two separate vowels, as in “zoology.”
For me, this came up with the word “medieval” (or “mediæval” or “mediëval”). I pronounce it [ˌmidiˈivəl]. (I believe that [ˌmɛdiˈivəl] is more likely to be considered correct, but let’s ignore that for the moment.) Just to make absolutely sure that people don’t try to pronounce it as if they were a ghooooooost from the Middle Aaaaaages, it would be good to indicate the fact that there are four syllables there, not three. And so I opted for 𐑋𐐨𐐼𐐨𐐨̈𐑂𐐲𐑊.
Now, there are some nay-sayers who will say “nay,” and that I’m being rather silly. Nobody will seriously be confused by spellings like 𐐹𐑉𐐬𐐮𐐺𐐮𐑇𐐲𐑌 or 𐑋𐐨𐐼𐐨𐐨𐑂𐐲𐑊. I have to confess that they’re right, but I nonetheless have two points in response.
Firstly, I’m being silly enough to be churning out Deseret Alphabet content of any sort. A little extra silliness won’t hurt.
Secondly, diereses are cool.
Oh, and does anybody know where I can get a fez in my size?