Monday, August 20, 2007

Slavic script swapping

Since it's largely a matter of historical circumstances (as opposed, for example, to phonological appropriateness) whether a given Slavic language ended up being represented by the Roman or Cyrillic script, I've often wondered how difficult it would be to switch things around. How hard could it be to replace one Slavic language's orthography with that of another?

Actually, it turns out to be quite a lot harder than I had, at first, imagined. If I intend to keep letter values more or less as they are in the modern languages, I'm faced with the fact that there is not a one-to-one correspondence between Roman and Cyrillic characters (except in Croatian and Serbian, obviously, but that's a whole other story). The two script families also have rather different ways of conceptualizing various morphophonological processes, particularly palatalization.

To illustrate the problem, take the Russian word for "seven," семь. The onset and nucleus are unproblematic in a Roman orthography: se-. Note that we don't have to do anything special to indicate the palatalization preceding /e/, as back jers in Russian changed to o instead of e as in West Slavic, which means that there are very, very few /e/s without initial palatalization. How to represent the final palatalized /m/, though? There is nothing in the Roman script like the Russian soft sign, a letter in its own right which indicates that the preceding letter is palatalized. The analogy with other Roman characters in Slavic languages would be "m" with an acute accent -- but that would be inventing, which I'm trying not to do. Any wise thoughts on this point would be thoroughly appreciated.

Luckily, my sample text has no words with final palatalized segments absent from Western and Southern Slavic. This is from War and Peace by Leo Tolstoy:

«Что это? я падаю? у меня ноги подкашиваются», подумал он и упал на спину. Он раскрыл глаза, надеясь увидать, чем кончилась борьба французов с артиллеристами, и желая знать, убит или нет рыжий артиллерист, взяты или спасены пушки. Но он ничего не видал. Над ним не было ничего уже, кроме неба — высокого неба, не ясного, но все-таки неизмеримо высокого, с тихо ползущими по нем серыми облаками.

I've done my best to make an elegant conversion of this into Roman script. There are quite a few issues that I'm still not sure how best to handle. For instance, should the genitive ending -ого be changed to -ogo to match traditional spelling in Cyrillic, or -ovo to match pronunciation? In the same vein, should что end up as čto or što? I need some Russian speakers to help me with this one. An aesthetic question is what to do with palatalized consonants followed by back vowels. Should I follow the Polish route with e.g. sia, or the more Croatianish śa? Here's my first attempt:

"Čto eto? ja padaju? u menia nogi podkašivajutsia", podumal on i upal na spinu. On raskryl glaza, nadejaś uvidać, čem končilaś boŕba francuzov s artilleristami, i želaja znać, ubit ili net ryžij artillerist, vziaty ili spaseny puški. No on ničevo ne vidal. Nad nim ne bylo ničevo uže, krome neba — vysokovo neba, ne jasnovo, no vsio-taki neizmerimo vysokovo, s tiho polzuščimi po nem serymi oblakami.

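For anyone who wants to play along, here is a rough Python sketch of the letter-by-letter part of the conversion above. It is only an approximation under stated assumptions, not a finished scheme: the names `romanize`, `PLAIN`, `IOTATED`, `SOFT` and `CONSONANTS` are invented for illustration, the "je" for е at the start of a word or after a vowel is a guess (the sample has no such case), and a soft sign after any consonant other than с, т or р is left untouched, since that question is still open. It also keeps the spelling-based -ogo rather than -ovo.

```python
# Letters with a fixed Roman value in this sketch.
PLAIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ж": "ž", "з": "z",
    "и": "i", "й": "j", "к": "k", "л": "l", "м": "m", "н": "n", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "у": "u", "ф": "f", "х": "h",
    "ц": "c", "ч": "č", "ш": "š", "щ": "šč", "ы": "y", "э": "e", "ъ": "",
}
# Vowel letters that carry a glide or palatalization: Polish-style "ia"
# after a consonant, "ja" elsewhere.
IOTATED = {"я": "a", "ю": "u", "ё": "o"}
# Consonant + soft sign pairs attested in the sample (assumption: nothing
# else is decided yet, so other soft signs are passed through unchanged).
SOFT = {"с": "ś", "т": "ć", "р": "ŕ"}
CONSONANTS = set("бвгджзклмнпрстфхцчшщ")


def romanize(text: str) -> str:
    """Convert Russian Cyrillic to the experimental Roman spelling."""
    out = []
    prev = ""  # previous character, lowercased
    for i, ch in enumerate(text):
        low = ch.lower()
        nxt = text[i + 1].lower() if i + 1 < len(text) else ""
        if low in SOFT and nxt == "ь":
            piece = SOFT[low]                 # сь -> ś, ть -> ć, рь -> ŕ
        elif low == "ь":
            piece = "" if prev in SOFT else ch  # already absorbed, or unresolved
        elif low in IOTATED:
            piece = ("i" if prev in CONSONANTS else "j") + IOTATED[low]
        elif low == "е":
            piece = "e" if prev in CONSONANTS else "je"  # "je" is an assumption
        else:
            piece = PLAIN.get(low, ch)        # punctuation etc. passes through
        if ch.isupper():
            piece = piece.capitalize()
        out.append(piece)
        prev = low
    return "".join(out)


print(romanize("Что это? я падаю? у меня ноги подкашиваются"))
print(romanize("семь"))  # the soft sign after м is left as-is
```

Running it on семь leaves the soft sign in place, which makes the unsolved cases easy to spot in a longer text.
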
Polish presents far more problems in an artful Cyrillic transcription. For one thing, it is practically the only remaining Slavic language which still has nasal vowels (its close relative Kashubian keeps them too). Happily, Old Church Slavonic does provide us with ready symbols for these, though they make a text containing them look thoroughly medieval. Both front and back jers became /e/ in Polish, which means that there are oodles of nonprepalatalized /e/s floating around, and I really don't much care for the Cyrillic grapheme э. I've decided to use the good ol' back jer for these ("hard sign" for you Russian speakers), which is historically accurate anyway. My sample text is from The Peasants by Władysław Reymont:

Mateusz się porwał w ten mig do niego, ale nim mógł zmiarkować co bądź, już Antek skoczył jak ten wilk wściekły, chycił go jedną ręką za orzydle, przydusił aż tamten dech i głos stracił, drugą ujął za pas, wyrwał z miejsca jak kierz, nogą drzwi wywalił na dwór, i poniósł go prędko za tartak, do rzeki ogrodzonej płotem i cisnął z całej mocy, aż cztery żerdki trzasły kiej słomki, a Mateusz niby kloc ciężki padł we wodę.

The presence of those back jers makes me want to throw all the jers back in, just like in pre-reform Russian. I might as well recall the jat while I'm at it, though I'm not quite sure where this ideally belongs etymologically...actually, this is a problem with the jers as well. I can deduce certain ones -- for example, from okno "window" and okien "of the windows," I can tell that there was once a back jer between the "k" and the "n," so okъno and okъnъ. In many other cases, though, I've just had to guess until I can get my hands on a Polish etymological dictionary someday. Below is my initial attempt, looking as old-fashioned as possible. I hope that the current state of our Unicode-enabled browsers is sophisticated enough to display all these weird characters.

Матэушь сѩ поръвалъ въ тънъ мигъ до него, але нимъ муглъ змярковать цо бѫдь, южь Антъкъ скочилъ якъ тънъ вилкъ вьстеклы, хытилъ го еднѫ рѧкѫ за оридле, придусилъ ажь тамътънъ дъхъ и глосъ сътратилъ, другѫ уѭлъ за пасъ, выръвалъ зъ мейсца якъ керь, ногѫ дрьви вывалилъ на двуръ, и понюслъ го прѧдъко за тартакъ, до рѣки огроѕонэй плотъмъ и тисънѫлъ зъ цалэй моцы, ажь чьтъры жердъки трясълы кей сломъки, а Матэушь нибы клёць тѩжьки падлъ въ водѧ.

If I take out all the unpronounced jers and other bits of creative anachronism, I'm left with

Матэуш сѩ порвал в тън миг до него, але ним мугл змярковать цо бѫдь, юж Антък скочил як тън вилк встеклы, хытил го еднѫ рѧкѫ за оридле, придусил аж тамтън дъх и глос стратил, другѫ уѭл за пас, вырвал з мейсца як керь, ногѫ дрьви вывалил на двур, и понюсл го прѧдко за тартак, до реки огродзонэй плотъм и тиснѫл з цалэй моцы, аж чтъры жердки тряслы кей сломки, а Матэуш нибы клёц тѩжки падл въ водѧ.

It's kind of weird, but I don't think it looks thoroughly ridiculous. The only thing I'm seriously dissatisfied with is the treatment of the Polish ó/o distinction, which I've simply transcribed phonetically, thereby losing its morphological logic. One final possibility would be to do away with the back jer for /e/ in favor of "e," and then find a new symbol for prepalatalized /e/. Ukrainian does this with є, so maybe it's worth a shot:

Матеуш сѩ порвал в тен миг до нєго, алє ним мугл змярковать цо бѫдь, юж Антек скочил як тен вилк встєклы, хытил го єднѫ рѧкѫ за оридлє, придусил аж тамтен дех и глос стратил, другѫ уѭл за пас, вырвал з мєйсца як кєрь, ногѫ дрьви вывалил на двур, и понюсл го прѧдко за тартак, до рєки огродзоней плотем и тиснѫл з цалей моцы, аж чтеры жердки тряслы кєй сломки, а Матеуш нибы клёц тѩжки падл ве водѧ.
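
The step from the jer-based spelling to this є-based one is mechanical enough to sketch in Python. This is a minimal illustration, assuming the input has already had its unpronounced jers removed, so that every remaining ъ stands for /e/; the names `TO_UKRAINIAN_STYLE` and `respell` are made up for the example. The three substitutions have to happen at once (ъ and э become е while the old е becomes є), which is exactly what `str.translate` does.

```python
# Simultaneous letter swap: hard /e/ (written ъ or э) becomes е,
# while the old prepalatalized е becomes є.
TO_UKRAINIAN_STYLE = str.maketrans({
    "ъ": "е", "э": "е", "е": "є",
    "Э": "Е", "Е": "Є",
})


def respell(text: str) -> str:
    """Rewrite the jer-based spelling in the є-based one."""
    return text.translate(TO_UKRAINIAN_STYLE)


print(respell("Матэуш сѩ порвал в тън миг до него, але ним мугл"))
```

As far as I can tell, running this over the whole jer-based passage reproduces the hand-made version except for one word, жердки, which the table turns into жєрдки.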

I'm really very curious whether the above attempts make readers of Cyrillic-script languages feel more comfortable, or just laugh really hard.

I should probably note that I'm not actually advocating any of these systems, lest anyone think that I am once again completely insane. But I do think it's a fascinating exercise.
