
7-bit ASCII to ISO translator for Bengali transliteration

Download the source file or the Windows executable, both available as zipped files. The program translates 7-bit ASCII into 8-bit printable code for Bangla text transliterated according to ISO 15919. Compilation instructions are given in the header of the source file. Then download nialang.tex, also zipped, which Plain TeX needs to render the printable transliteration of modern Indic languages.

The lexer takes a plain text file as input, either in the normal way or through the redirection symbol, and writes its output as a plain TeX file through redirection.

Only the 7-bit transliteration flagged by \bgnbg ... \endbg is translated. Currently the program does not allow any TeX code inside the flagged stream of text; this is left to be accomplished in the future.
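The idea of translating only the flagged stream can be sketched in Python as follows. This is not the lexer's source: the mapping table is inferred solely from the sample on this page, the names TABLE, translate and convert are hypothetical, the flags are kept in the output as an assumption, and the output is Unicode for readability rather than the 8-bit codes the lexer writes for TeX.

```python
import re
import unicodedata

# Mappings inferred from the sample on this page only (an assumption);
# the lexer's own table is in its source file and may differ.
TABLE = {
    "aa": "ā", "ii": "ī",
    ".t": "ṭ", ".n": "ṇ", ".s": "ṣ", ".r": "ṛ", ".h": "ḥ",
    ";s": "ś", ";n": "ṅ", ";y": "ẏ",
    ",r": "r\u0325",  # r followed by a combining ring below
}
TOKEN = re.compile("|".join(re.escape(k) for k in TABLE))
# In the sample, "~" precedes the vowel it nasalises (d~aa.re).
NASAL = re.compile(r"~(aa|ii|[aeiou])")
# Non-greedy match of one flagged stream, flags captured separately.
FLAGGED = re.compile(r"(\\bgnbg)(.*?)(\\endbg)", re.DOTALL)

def translate(text):
    """Translate one stretch of 7-bit transliteration to Unicode."""
    # Move the nasal mark behind its vowel as a combining tilde.
    text = NASAL.sub(lambda m: m.group(1) + "\u0303", text)
    # Replace each two-character code with its printable form.
    text = TOKEN.sub(lambda m: TABLE[m.group(0)], text)
    return unicodedata.normalize("NFC", text)

def convert(source):
    """Translate only the text between \\bgnbg and \\endbg."""
    return FLAGGED.sub(
        lambda m: m.group(1) + translate(m.group(2)) + m.group(3),
        source,
    )
```

With this table, translate("ata ;sobhaa") gives "ata śobhā", and text outside the flags passes through convert unchanged. Like the program described here, the sketch makes no attempt to protect TeX code inside the flagged stream.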

For Bangla text transliterated into 8-bit printable code, TeX needs a macro file to render some characters with diacritics, especially the nasal vowels and two rarely used vowels.

The lexer can mark the long vowels e and o with a bar, although they are normally unambiguous without it. It can also transcribe the intermediate v if you say \strict after \bgnbg. Switching between normal and strict transliteration is not yet incorporated in the program, but that will be accomplished soon.
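The effect of \strict on e and o can be illustrated with a small Python sketch. The function name mark_long_vowels and its strict parameter are hypothetical, the output is Unicode rather than the lexer's 8-bit codes, and the handling of the intermediate v is left out because this page does not spell out its rule.

```python
import unicodedata

def mark_long_vowels(text, strict=False):
    """Put a bar over every e and o when strict is true.

    In normal mode the text is returned unchanged, since e and o
    are normally unambiguous without the bar.
    """
    if not strict:
        return text
    # Append a combining macron, then compose to single characters.
    barred = text.replace("e", "e\u0304").replace("o", "o\u0304")
    return unicodedata.normalize("NFC", barred)
```

For example, mark_long_vowels("ye tumi karo", strict=True) gives "yē tumi karō", while the default call returns the input as it is. The sketch assumes it is applied to the 7-bit text, where no code in the sample uses e or o as part of a longer sequence.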

Suggestions and modifications are welcome. I only ask those who modify the source to send a copy back to me.

A sample input file and the output

% loads Neo Indo-Aryan Language macros
\input nialang.tex

% following line is a plain TeX command
\obeylines

% flags beginning of conversion
\bgnbg

% flags strict conversion, bar over o or e, to mean length
\strict

ni.hsa;ngataa
(ye tumi hara.na karo, aabula haasaana)

% flags end of conversion
\endbg

\vskip1pc

\bgnbg
ata.tuku caa;yani baalikaa!
ata ;sobhaa, ata svaadhiinataa!
ce;yechila aaro kichu kama,
aa;yanaara d~aa.re deha mele
base thaakaa saba.taa dupura, ce;yechila
maa bakuka, baabaa taara bedanaa dekhuka!
ata.tuku caa;yani baalikaa!
ata hai rai loka, ata bhii.ra, ata samaagama!
ce;yechila aaro kichu kama!
eka.ti jalera khani
taake dika t,r.s.naa ekhani, ce;yechila
eka.ti puru.sa taake baluka rama.nii!
\endbg

% plain TeX command
\bye

niḥsaṅgatā
(ye tumi haraṇa karo, ābula hāsāna)

The poem, by Abul Hasan, is in Bangla script below:

ataṭuku cāẏni bālikā!
ata śobhā, ata sbādhinatā!
ceẏechila āro kichu kama,
āẏnāra dā̃ṛe deha mele
base thākā sabaṭā dupura, ceẏechila
mā bakuka, bābā tāra bedanā dekhuka!
ata hai rai loka, ata bhīṛa, ata samāgama!
ceẏechila āro kichu kama!
ekaṭi jalera khani
tāke dika tr̥ṣṇā ekhani, ceẏechila
ekaṭi puruṣa tāke baluka ramaṇī

The \strict option will render a strict print transliteration as in the following:

niḥsaṅgatā
(yē tumi haraṇa karō, ābula hāsāna)

ataṭuku cāẏni bālikā!
ata śōbhā, ata sbādhinatā!
cēẏēchila ārō kichu kama,
āẏnāra dā̃ṛē dēha mēlē
basē thākā sabaṭā dupura, cēẏēchila
mā bakuka, bābā tāra bēdanā dēkhuka!
ata hai rai lōka, ata bhīṛa, ata samāgama!
cēẏēchila ārō kichu kama!
ēkaṭi jalēra khani
tākē dika tr̥ṣṇā ēkhani, cēẏēchila
ēkaṭi puruṣa tākē baluka ramaṇī

Without this option, the lexer by default transliterates the poem into normal print transliteration.

Run lexer < infile.txt > outfile.tex in a command shell, then run TeX on outfile.tex. Do not forget to load nialang.tex, the transliteration macro file for Neo Indo-Aryan languages, in either the input file or the output file.

 

Revised: 5 April 2011