Pohela Baishak or Pahela Baishakh?

How do you write Bengali New Year’s Day, পহেলা বৈশাখ, in Roman script: pohela boishak or pahela baishakh? The former comes arguably close to the original pronunciation, but the latter reflects the original orthography as spelt in the Bengali script: pa-he-la bai-sha-kh(a). Which one to choose? Writing Bengali names or phrases in Roman script is often problematic because the Bengali script differs in kind from the Roman one. The Roman script, employed to write English, is alphabetic: characters placed one after another add to the pronunciation in a linear fashion. The script used to write Bengali is alphasyllabic: characters first form syllables, and the same syllables are then pronounced in different ways depending on the context. Therein lies the problem.

We often encounter such anomalies when Bengali names or phrases are converted into English text; they show up in print and create confusion. But the problem is not peculiar to conversion from the Bengali to the Roman script. The former Soviet premier Хрущёв, whose name is a six-letter word in Russian, comes out as Khrushchev, Xrushchev, Xrushchef and other such combinations in English. The same name is written as Chruschtschow by a German or Hrusjtsjev by a Swede. Chaikovsky (Чайковский) is spelt as Tchaikofsky, Tschaikowskij, etc. This shows that while the same word or phrase can be spelt in many different ways by the speakers of one language, speakers of different languages spell it in even more ways because of their grounding in their mother tongues.

Well, is there any standard? Should there be any? Or is it, as the French say, chacun à son goût (everyone to his taste)? There are in fact two methods in use in script conversion: transcription and transliteration. Transcription tries to depict the pronunciation of the source text using the sound system of the target language, without considering the original orthography. Transliteration represents the orthography of the source script, using one-to-one correspondence with the target script, without considering pronunciation.
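The one-to-one idea behind transliteration can be sketched in a few lines of Python. The letter table below is a hypothetical fragment, loosely based on the diacritic scheme and covering only the letters of the example; the names CONSONANTS, VOWEL_SIGNS, INDEPENDENT_VOWELS and transliterate are made up for illustration. The sketch assumes that a consonant written without a vowel sign carries an implied vowel, rendered here as a.

```python
# Hypothetical mini-table: only the letters needed for the example,
# not a complete or authoritative standard.
CONSONANTS = {"প": "p", "হ": "h", "ল": "l", "ব": "b", "শ": "ś", "খ": "kh"}
VOWEL_SIGNS = {
    "\u09be": "ā",   # aa-sign, attached to a consonant
    "\u09c7": "e",   # e-sign
    "\u09c8": "ai",  # ai-sign
}
INDEPENDENT_VOWELS = {"অ": "a", "আ": "ā"}
HASANTA = "\u09cd"  # sign that cancels the vowel implied by a consonant


def transliterate(text: str) -> str:
    """Map each written letter to a Roman equivalent, ignoring pronunciation."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            if nxt in VOWEL_SIGNS:
                # An explicit vowel sign replaces the implied vowel.
                out.append(VOWEL_SIGNS[nxt])
                i += 1
            elif nxt == HASANTA:
                # The implied vowel is cancelled in the spelling itself.
                i += 1
            else:
                # Assumption of this sketch: write the implied vowel as "a".
                out.append("a")
        elif ch in INDEPENDENT_VOWELS:
            out.append(INDEPENDENT_VOWELS[ch])
        else:
            # Spaces and anything outside the mini-table pass through as-is.
            out.append(ch)
        i += 1
    return "".join(out)


print(transliterate("পহেলা বৈশাখ"))  # pahelā baiśākha
```

Every written letter is accounted for in the result, including the final a of baiśākha, whether or not it is pronounced.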

Because transcription depends on how the sound system of the source language relates to the sound system and phonics (the simple pronunciation rules) of the target language, it fails to work in most cases. Bengali has an inventory of about 45, some say 46, sounds and (British) English has 46 sound segments; English as spoken in the US has about 37 sounds. But the vowel sounds of Bengali do not correlate with the sounds of English. The same is true of many consonant sounds. This makes it hard to represent the Bengali sounds even approximately in Roman script using English phonics.

Bengali has two distinct t-letters (ট and ত): writing ‘tal’ can mean two different words in Bengali, ‘palm’ (তাল) and ‘tipsy’ (টাল). The language has three s-letters (শ, ষ and স): ‘asa’ can mean both ‘to come’ (আসা) and ‘hope’ (আশা). If we write ‘asha’ (আশা) to mean hope and differentiate it from ‘asa’ (আসা), to come, we might run into trouble differentiating the third s-letter, as in ‘bhasha’ (ভাষা), language. The vowel sound scenario is more complicated. Speakers of Bengali who are initiated into the English language are often prone to represent the final sound in ‘labh’ (লাভ), ‘profit’ in Bangla, with v as in ‘love’; this again may lead to objectionable transcriptions of some Bangla words.
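The under-differentiation described above can be made concrete with a small Python sketch. It assumes that each word has a diacritic romanisation (the forms in WORDS are illustrative choices, not taken from any one standard) and that the plain spelling is obtained simply by dropping the diacritics; the helper names strip_diacritics and collisions are hypothetical.

```python
import unicodedata
from collections import defaultdict

# Illustrative word list: Bengali spelling -> assumed diacritic romanisation.
WORDS = {
    "তাল": "tāl",  # palm
    "টাল": "ṭāl",  # tipsy
    "আসা": "āsā",  # to come
    "আশা": "āśā",  # hope
}


def strip_diacritics(s: str) -> str:
    """Decompose accented letters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def collisions(words: dict) -> dict:
    """Group Bengali spellings that share one diacritic-free Roman form."""
    groups = defaultdict(list)
    for bengali, roman in words.items():
        groups[strip_diacritics(roman)].append(bengali)
    return {plain: forms for plain, forms in groups.items() if len(forms) > 1}


for plain, forms in collisions(WORDS).items():
    print(plain, forms)
```

Under these assumptions, tal and asa each come back with two Bengali spellings, so the plain forms cannot be converted back without knowing the language.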

Bengali has seven vowel sounds and English has 22. The first letter of the Bengali alphabet produces a sound that cannot be unambiguously coded in the Roman script. It is something like the vowel sound in ‘hot’ or, more specifically, the vowel sound in ‘horse’, but of half the duration. The Bangla sound can be expressed with a series of letters or letter clusters, such as a as in falcon, o as in hot, au as in aura or aw as in awkward. But if we use a, someone might relate it to the sound in bat; if o, to the sound in polar; and if au or aw, to the sound in house. Another point: if there is the bit- or put-sound in the syllable next to the one with the hot-sound, the hot-sound changes to the vowel that Americans pronounce in boat; we write kari (করি), but pronounce kori /kori/.

While such a scheme attempts to represent the pronunciation of the source text, it very often fails to let someone who does not know Bengali very well get back to the original. Moreover, all that we have discussed is with reference to the English sound pattern. If we do the conversion for speakers of French or German, who also use the Roman script, the picture becomes altogether different. What ‘ch’ symbolises to an English speaker is usually represented by ‘tch’ for a French speaker or ‘tsch’ for a German one. The French relate ‘ch’ to what the English understand by ‘sh’.

A general form of transcription was long in use in the early period of such conversion, from 1667 to 1894. The earliest example of such transcription is found in a book called China monumentis by Athanasius Kircher, published in 1667, which printed a specimen of the Bangla alphabet along with corresponding Roman letters. Over the next century and a quarter, numerous schemes of conversion evolved, one differing from another. William ‘Oriental’ Jones, who founded the Asiatic Society, drew up a transliteration scheme in 1788 in his Dissertation on the Orthography of Asiatick words in Roman Letters, for the Sanskrit alphabet that uses Devanagari characters. The scheme uses diacritics to differentiate two t-letters, two n-letters, three s-letters, and long and short vowels. Since the Bengali alphabet is modelled on Panini’s Devanagari alphabet, the same scheme worked well for Bengali and other Indic languages for some time.

In September 1894, a transliteration committee was set up at the Geneva Oriental Congress to work out a standard for writing Sanskrit in Roman characters. A scheme with some minor modifications to Jones’s was agreed upon, and researchers of Indic languages and literature have broadly adhered to it to date.

The scheme has been in use for a long time. But after Ishwarchandra Vidyasagar normalised the Bangla alphabet, with some supplementary characters, in his most famous primer, Varna Parichaya, in 1855, anomalies cropped up. One of the d-letters came to be used for an r-variant (ড়); the letter y was used to mean both the dot-less version (য), which is pronounced in the same manner as j (as in judge) in ‘yayabar’ (nomad), and the dotted version (য়), which is pronounced as y, as in ‘maya’ (illusion). In earlier times, Bengali had two b-letters, one of which was represented by v; although they still look and are pronounced differently in many Indic languages, including Sanskrit, they are identical in Bengali, in both shape and pronunciation.

The Geneva-based international standards body, the International Organisation for Standardisation, or ISO for short, worked out a transliteration scheme a few years ago, covering 10 Indic languages, to address such problems of Modern Indo-Aryan languages that are in some way linked to Sanskrit and use the Paninian scheme of the Sanskrit alphabet. The scheme, formally known as ISO 15919, Transliteration of Devanagari and related Indic scripts into Latin characters, addressed such problems unambiguously, using diacritics.

But this is pure transliteration and goes by the orthography of Bengali. Although it works fine with scholarly research publications or with academia, it fails to apply to popular print media or common publications because it disregards the pronunciations of the texts or phrases and it is hard to employ unless the printers have the required diacritic typefaces.

How then has everything been managed through all these years? A Briton named William Wilson Hunter, who joined the Indian Civil Service in 1862 and was posted to Birbhum in the then lower provinces of Bengal, collected local traditions and records that formed the materials for his work The Annals of Rural Bengal, a book which did much to stimulate public interest in the details of Indian administration.

Hunter adopted a transliteration scheme for vernacular placenames, by which means the correct pronunciation is ordinarily indicated. The Hunterian system is the national system of Romanisation, that is, transcription or transliteration into Roman characters, in India. He dispensed with diacritics in transcribing consonants, using letter clusters in some cases, at the risk of ambiguity in back-transliteration.

The popular Romanisation scheme currently in use in the print media is basically, knowingly or unknowingly, modelled on the Hunterian system, with one bold step: dropping the inherent a that goes with each of the consonants in Bengali. Since the Bengali alphabet is alphasyllabic, each consonant on its own subsumes the first vowel of the Bengali alphabet, called the inherent a in linguistic parlance. In strict transliteration, every inherent a is coded using the standard (diacritic) scheme. But Bengali by its very nature drops this inherent a syllable-finally in speech, although it remains hidden in the orthography. In the popular relaxed scheme, the silent but orthographically important inherent a is dropped syllable-finally. This is also the convention we use when we name our people or places in Roman script.
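As a rough Python sketch of this relaxed convention, the function below takes a strictly transliterated phrase, drops a word-final inherent a, and removes the diacritics. It assumes that the inherent vowel is written as a plain a in the strict form, and the sh digraph for ś and ṣ is an illustrative choice in the spirit of the Hunterian system; relax, DIGRAPHS and VOWELS are hypothetical names. The rule is deliberately crude: it looks only at the end of each word, and it would wrongly clip words in which the final vowel is kept.

```python
import unicodedata

# Illustrative digraphs for two of the s-letters (assumption of this sketch).
DIGRAPHS = {"\u015b": "sh", "\u1e63": "sh"}  # ś and ṣ
VOWELS = "aāiīuūeo"


def relax(strict: str) -> str:
    """Turn a strict diacritic romanisation into a relaxed popular form."""
    relaxed = []
    for word in unicodedata.normalize("NFC", strict).split():
        # A plain final "a" after a consonant is taken to be the inherent
        # vowel; a written long vowel would appear as "ā" and is kept.
        if len(word) > 1 and word.endswith("a") and word[-2] not in VOWELS:
            word = word[:-1]
        # Replace marked letters that have a digraph before stripping marks.
        for marked, plain in DIGRAPHS.items():
            word = word.replace(marked, plain)
        # Drop the remaining diacritics.
        decomposed = unicodedata.normalize("NFD", word)
        relaxed.append(
            "".join(c for c in decomposed if not unicodedata.combining(c))
        )
    return " ".join(relaxed)


print(relax("pahelā baiśākha"))  # pahela baishakh
```

The order of the steps matters in this sketch: once the diacritics are gone, the inherent a and the written long ā can no longer be told apart, so the final vowel has to be dropped first.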

Jones Geneva ISO Hunter | Jones Geneva ISO Hunter
a a a a | ā ā ā a/ā
i i i i | ī ī ī i/ī
u u u u | ū ū ū u/ū
ri | e e e/ē e
ai ai ai ai | o o o/ō o
au au au au |
k k k k | kh kh kh kh
g g g g | gh gh gh gh
n/ng | ch c c ch
chh ch ch chh | j j j j
jh jh jh jh | ñ ñ ñ ny
t | ṭh ṭh ṭh th
d | ḍh ḍh ḍh dh
n | t t t t
th th th th | d d d d
dh dh dh dh | n n n n
p p p p | ph ph ph ph
b b b b | bh bh bh bh
m m m m | y y y y
r r r r | l l l l
v v v/w | ś ś sh
sh sh | s s s s
h h h h | d
b b b d y y y/j
m n/m
n ~ n/m


There is no problem for scholarly publications; they all use the diacritic scheme, which is fully reversible even without good knowledge of either the source or the target language or script. But people who do not like to be saddled with lots of diacritics, or printers who do not have a repository of diacritic typefaces or font files, prefer the relaxed Romanisation scheme, which is basically the same diacritic scheme without the diacritics, with a little modification. This is more of a transliteration and less of a transcription, and is considered a standard in popular print. But problems emerge when the scheme becomes less of a transliteration and more of a transcription, and that too in a very inconsistent way.

Preference for either of the systems has its merits and demerits. It is up to the people to decide the system. Of course, 10 different schemes for the same thing are certainly worse than a single universal scheme intelligible to all. Moreover, a charge of being politically correct or incorrect naturally goes with either of the choices. Someone writing ‘varna’ for the Bengali word বর্ণ, ‘letter’, might be labelled pedantic or pro-Sanskrit by someone oblivious of the fact that although Bengali has a script of its own, it has borrowed the Sanskrit alphabet, or the Sanskrit alphabet has been imposed on Bengali, as some might prefer to say. Again, if someone writes ‘barna’, he might be accused of under-differentiating the two b-letters, as far as grammar is concerned. Inspired to keep to the pronunciation, someone else may write ‘borno’, and thus leave the more linguistically attuned people wondering what the vowels in the word are: o as in hot, which is written a, or o as in note, written o. A larger section of the people might label him ‘crazy’. So, pohela boishak or pahela baishakh? Choose what you may, but do not forget the labels you may be stamped with.


Modified on what was published
in Holiday, p 5, 30 August 2002


Revised: 5 April 2011