Wednesday, March 28, 2012

An Oz. of Prevention is Worth a Lb. of Cure

While working on today’s XKCD, I ran across something I hadn’t given a lot of thought to before. The text of the comic uses the venerable abbreviation, “oz.” How to proceed?

By and large, most unit names one runs across are English in the sense that if they’re written out in full, they’re written without italics: ounce, pound, foot, mile. Even SI units are so treated: gram, meter (or metre in the UK), newton, joule. Some units not at all used in the Anglosphere also have English equivalents (catty, talent).

Abbreviations for all these things are therefore English abbreviations. This is a bit more complicated for SI units, because they don’t strictly speaking have abbreviations. They have symbols, which is why we write "km/s" without any periods. (The English would leave out the periods anyway, but that’s their problem.) You’re supposed to use the symbols regardless of the writing system you’re using, so “kilometer” should always be represented with “km,” and never “k,” “κμ,” or “𐐿𐑋,” let alone “公里,” but that doesn’t seem to stop people.

The flies in the ointment are a small number of very, very old units—units so old that the standard English abbreviations used for them are not derived from the English word. The most widely used of these are the two related to weight: ounce and pound, which are abbreviated to “oz.” (from the old Italian onza) and “lb.” (from the Latin libra).

My general policy with regard to abbreviations has been to respect the language of origin. “Common Era” consists of two English words and so is abbreviated to “𐐗.𐐀.”—I pronounce the word /'irə/, after all, even if /'ɛrə/ is preferred. “Anno Domini,” however one may pronounce it, is Latin, not English. It ends up, therefore, as “A.D.”

Initialisms are just one kind of abbreviation, so I tend to treat them similarly: “HTML” gets turned into “𐐐𐐓𐐣𐐢.” XKCD is a special case because “XKCD” isn’t actually an initialism or abbreviation for anything. It’s just a name made up of four Latin letters. Acronyms present a problem of their own, inasmuch as turning “SCUBA” into “𐐝𐐗𐐊𐐒𐐈” gives a rather different result from turning “scuba” into “𐑅𐐿𐐭𐐺𐐲.” “Scuba” has become a naturalized English word, after all; most people probably don’t know that it originally was an acronym, let alone what it was an acronym for. And then there are things like “SAT,” which could either be “𐐝𐐈𐐓” or “𐐝𐐊𐐓,” depending on whether or not one thinks it’s a word and what one thinks it stands for.
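
The letter-for-letter treatment of initialisms can be sketched in a few lines of Python. This is only an illustration, not a tool I actually use: the names `INITIALISM_LETTERS` and `convert_initialism` are made up for the example, and the table holds only the letters that can be read off the “HTML” and “SCUBA” examples above. Every other letter is left out rather than guessed.

```python
# Hypothetical table, built only from the "HTML" and "SCUBA" examples in
# this post. Letters not attested there are omitted rather than guessed.
INITIALISM_LETTERS = {
    "H": "𐐐", "T": "𐐓", "M": "𐐣", "L": "𐐢",
    "S": "𐐝", "C": "𐐗", "U": "𐐊", "B": "𐐒", "A": "𐐈",
}


def convert_initialism(text: str) -> str:
    """Replace each Latin capital with its Deseret counterpart, one for one."""
    try:
        return "".join(INITIALISM_LETTERS[ch] for ch in text)
    except KeyError as err:
        # Fail loudly instead of inventing a mapping for an unknown letter.
        raise ValueError(f"no Deseret letter recorded for {err.args[0]!r}") from None


print(convert_initialism("HTML"))
print(convert_initialism("SCUBA"))
```

A table like this only covers the initialism case. The naturalized word “scuba” is spelled by sound, so “𐑅𐐿𐐭𐐺𐐲” cannot come out of a letter-for-letter lookup, and something like “SAT” still needs a human to decide which reading applies.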

The simple fact is that spoken languages evolve around their written forms. One reason why China has found it impossible to abandon sinograms is that for three thousand years, speakers have modified the way they speak on the assumption that words are written using them. Spoken and written Chinese exist in symbiosis, and neither can change without having an impact on the other.

English, as usual, ups this trend to eleven. Not only has it been stealing words from other languages with careless abandon for centuries and spelling them every which way, but since the mid-20th century, acronyms have become a major way by which its vocabulary is extended. This even goes for the foreign words we acquire. (I’d give obvious examples, but that would end up involving Godwin’s law.) So we have initialisms which are abbreviations, initialisms which are full words, initialisms which are treated like words but pronounced as if they were abbreviations, camel-case words, and every possible combination of the above. I won’t even get into IM-speak (or r u going 2 insist?) and l33t.

Among the barriers the Deseret Alphabet (as well as Shavian et al.) faces in trying to be taken seriously as a writing system, then, is the fact that the language it is intended to write is spoken on the assumption that it’s being written in a completely different script. If you prefer, significant chunks of spoken English don’t make sense unless you’re using the Latin script for writing.

As for our friends “ounce” and “pound,” I decided that since they’re English words, I should give them English abbreviations: “𐐵𐑌.” and “𐐹𐐼.,” respectively. (“𐐍𐑌.” is a pretty useless abbreviation, of course, since the word in full would only have one more letter. It’s like abbreviating “June” as “Jun.” It just seems unnecessary.) If the old Italians or ancient Romans object to either abbreviation, well, I’ll cross that bridge if and when I ever come to it.