

Bangla computing: follies all the way

Characters are to the code page what letters are to the alphabet. A set of letters makes an alphabet, which is used to write a certain language. A set of characters, or numbers assigned to characters, in a block or table makes a code page that allows the writing of more than one language using similar scripts on a computer. Characters, which can in an extreme generalisation be called letters, are visible only when they become manifest in their glyphs. A glyph, the visible form of a single character, can be variously manifest: in letters that are upright, slanted or emboldened. Glyphs can also look different depending on the style and genre, or the typeface for that matter.

Poor understanding of the issues, or the absence of any understanding at all, appears to have marked the development of, and marred the efforts for, a code page that could have made Bangla computing universal across programs on different operating systems. When the issues first came up in public discussion, one group of people, basically technologists, struggled to work out something effective; another group, basically traders of technological devices, opposed what the technologists said for fear of losing their business; and yet another group, basically experts in other areas but hardly attuned to computing, floundered in between, without the standing to say anything resolutely in favour of or against what the two other groups said. This was how it all began in the early 1990s.

The first computer in what now constitutes Bangladesh was, however, an IBM 1620 mainframe, a general-purpose, stored-program data processing system, that Pakistan received under the Colombo Plan. It was installed at the Dhaka centre of the Pakistan Atomic Energy Commission, later renamed the Bangladesh Atomic Energy Commission, in the last quarter of 1964. The computer, a second-generation mainframe, came to be installed in Dhaka because only one person in the whole of Pakistan knew how to run a computer: Hanifuddin Mia, who had trained in analogue computer technique and digital computer programming in Prague in 1960. He was offered a position in West Pakistan, but he resolved not to leave East Pakistan. The computer, thus installed in Dhaka, catered to universities and government and semi-government research and development agencies in the whole of Pakistan. The machine, which IBM announced on October 21, 1959 and withdrew on November 19, 1970, was declared dysfunctional on July 5, 1980 and has been housed in the Science and Technology Museum since 2001.

Although it was not until the mid-1980s that computers could process texts in Bangla in a significant way, the proof-of-concept work on Bangla computing is reported to have begun roughly in 1977, after the microprocessor that had arrived in 1971 paved the way for personal computers; the IBM Personal Computer followed in 1981, the IBM compatibles became available in 1982 and the Apple Macintosh arrived in 1984. Efforts of some individuals came to be reported while efforts of some others may have gone unnoticed. A Sweden-based Bangladeshi student is reported to have first used the computer to publish a newsletter in Bangla in 1977. In July 1982, the Bangladesh University of Engineering and Technology came up, after six years of work, with a partially completed hardware-based solution in which a low-cost microcomputer could write Bangla. The letters, which were fixed-pitch, hardly looked good, and any further work was thwarted by the compounding problems of conjunct letters. The project remained at the research level and was finally shelved.

Efforts in this direction progressed on the Apple Macintosh front. The first workable program to write Bangla text came about in January 1985, reportedly after two years of work. Saifuddoha Shahid, an employee of Beximco Ltd, began the work to create a font on Mac in 1983 and finished designing a bitmapped font in Apple’s Resource Editor in 1984. Zafar Iqbal developed a system to process Bangla script on Mac at CalTech in the United States about the same time. Saifuddoha Shahid released Shahid Lipi, which could process Bangla text on Mac with a keyboard overlay and some TrueType fonts, in January 1985. The program was later ported to Windows, but it is reported not to have worked properly with the newer versions of computers and operating systems that came in much later, and it gradually fell out of use. The National Mass Media Institute, which had had plans since 1984 to develop a fully-fledged word processor, began work in 1986 with two Macs, two image-writers or dot-matrix printers, a laser printer and a copy of the Shahid Lipi program under a scheme funded by UNESCO. But it did not proceed further.

A Kolkata-based Indian enterprise named Rahul Commerce is reported to have, meanwhile, designed a font named Bankim, a Postscript font, for use on computer. A Bangladeshi named Gautam Sen is reported to have designed the font. Saifuddoha Shahid is reported to have been asked to create a laser font, but Bankim had reached Dhaka before he could finish his work. A flight engineer named Mahmud Hossain is reported to have developed the second laser font. Saifuddoha Shahid had already completed designing his font, but it could be of little use in the absence of a keyboard driver.

Next came a system devised by Abdul Mottalib said to have been talked about in a presentation. Mainul Lipi, developed by Mainul Islam, came out in 1987. It was used in a publication in May 1987. But many thought that the typeface was not good to look at. Some improvements were made to the font by September 1987, but it basically remained the same. It was then Mustafa Jabbar and Gholam Faruque Ahmed who developed a system named Bijoy, which was released in December 1988 — a set of fonts and a keyboard overlay with the software reportedly written by an Indian programmer by the name of Devendra Joshi, then working as a programmer in the Apple distributor’s office in Delhi.

Mustafa Jabbar developed a font called Ananda in December 1987 and it began to be used in a Bangla-language newspaper called the Azad. This font was better than the earlier ones. But with the Jabbar overlay, which Mustafa Jabbar developed in 1982 mostly modelled on the typewriter’s Munier overlay, it was very difficult to type fast because the keyboard assigned four letters to a single key, in four tiers: normal, shift, option (which is the same as the control key on PC) and shift-option. The Bijoy interface system and fonts, released on December 16, 1988, made keyboard manipulation simple and fast. A horde of fonts coming out earned the system a huge market share. Another interface named Basundhara came out on Mac in 1992. Until towards the end of the 1980s, it was all an affair running in graphical windows on Macs, with fonts, the keyboard overlay and the driver bundled as add-ons running on top of word processors or layout software.

The tables had by then turned for the IBM personal computers and compatibles. People started work on writing Bangla on personal computers. A company called Compico made the first attempt at a hardware-based approach in 1987. But low-quality printing and a lack of ease of use kept it from gaining ground. Another firm called Computer Land started marketing DuangJan, a full-screen international word processing program for the IBM PC and compatibles that allowed users to edit text concurrently in English and another language, and it had Bangla on offer. Computer Land started marketing the program, which had been in use in the United States around 1986, in 1988. It was developed by the Philadelphia-based Megachomp Company. A few copies of the program were sold, but it began to be heavily copied soon after.

Two local firms, Computer Solutions Ltd and Graphics Information System, tried their hand at the localisation process that was developed in India. Although the fonts created using the technology produced better results, the whole affair was expensive and both the companies failed to gain any market share. Shamsul Haque Chowdhury, of Automation Engineers, worked in the same direction. He started working in 1987 and could use Bangla on computers with the help of a hardware card and a program called Abaha, which was released in November 1988. Immediately after Abaha, Unidev Computer Solutions tried to market the Indian GIST card-based technology. Graphics Intelligent Script Technology used an indigenously designed special-purpose VLSI chip called GIST 9000. But they both failed in a Mac-dominated publishing industry. Onirban and Barna came next. Onirban, not a basic word processor in itself, was released in March 1990. It was developed in the Pascal programming language.

Barna came out to be the first basic fully-fledged word processor on DOS. Work on Barna began in 1988 and the first release came out in 1990. The program offered three Bangla and three Roman typefaces and an ‘easy’ overlay in addition to three popular overlays in use. The developer duo of Barna later founded the SAfeworks to take over the marketing of the software. Barna came to ship a spell-checker called Pandit with 60,000 basic word entries and the word-processor was ported to Windows in 1993 as Barnana. A Bangladeshi student in China, Maruf Hassan, is learnt to have been developing a Bangla script text-processing software using multi-byte codes around 1992; but nothing of it was heard thereafter. DOS-based programs could offer a good means for office and personal use. They had commanded the market for a long time, but they failed to offer any appreciable means for desktop publication.

With people having more access to graphical user interfaces such as Windows on the IBM and compatibles, efforts on fonts and overlays, which included the driver and the layout, got going. Several add-ons that could work on Windows entered the scene in a short span. Basundhara became available for Windows in 1993, a year after it had been released for Mac. Barnana, a Windows version add-on of Barna, came out in 1993. Onirban was upgraded to version 3.0 for Windows that year. Proshika Computer Systems brought out Proshika Shabda in 1994 and Lekhani, both on Windows and Mac, was out in December 1994. Two more programs, Asha and Prabartan, came out about the same time.

A few years after, in 1997, Hi-Tech professionals developed Anmana. Before the turn of the century, there were a number of options, with varied capabilities, on offer for people to choose one from — Orcosoft Borno, Adarsha Bangla, Bhasha Sainik, Natural Bangla developed by CDS-IT, Lekhak, Bangla 2000 and the like, coupled with some programs that allowed users to type texts with mouse clicks; 8 Phalgun, released in 1998 by Microtek, and Duranta Bangla, which came out in 1999, were two examples of this kind. But almost all of the late entrants failed to gain any notable market share as did other add-ons developed by Access Ltd, Micrologic, Flora Limited, etc. None of them could make any dent in the market already dominated by Bijoy, Lekhani, Proshika and Barnana.

While all this that happened on the user side left the sphere of Bangla computing with a horde of programs and more than a hundred fonts, what became the most troublesome issue is the various font encodings that the developers used in laying out their fonts, mostly TrueType. There were more than a dozen font encodings, which are basically graphical forms of Bangla letters or part of letters placed in a certain order in a file that had 256 places, but the use of 190 or so of them was allowed. This soon made texts processed with individual programs unintelligible to one another, creating a Tower of Babel. The problem of interoperability of text or electronic data interchange could have been effectively solved if the government had taken the right approach when it set up a committee for the implementation of the Bangla language on computers in 1987.

But, by the mid-1990s, the situation had curdled enough to create effective hurdles to a meaningful resolution of code page issues. Indolence, and even inadequacy, on the part of the government, the emergence of traders dealing in electronic typewriters, early birds in the word-processing and add-on trade, and the formation of a software services group lent enough strength to the opposition to anything that could prove good in Bangla computing. The government set up the National Computer Committee in 1983, which was meant more to control the use of computers. With some deregulation, the entity was turned into the National Computer Board in 1988. It became the National Computer Council in 1990 with an ordinance.

The committee on the implementation of the Bangla language on computers kept floundering until 1992. All the while the committee flustered about how to design a standard Bangla keyboard overlay or a code set for Bangla characters, agencies in India, especially the Department of Electronics and the Centre for Development of Advanced Computing, did the research and work on laying out a common code page for most of the Indian, or Indic, languages. They had begun work in 1986, a year before Bangladesh set up the committee. The Bureau of Indian Standards published the first code page for Indic languages, which included Bangla, in 1991 after the work that spanned from 1986 to 1988. It was formally named IS 13194:1991, or the Indian Script Code for Information Interchange, modelled on the name of ASCII, or the American Standard Code for Information Interchange, which has been in existence and use since 1963, was initially named ASA X3.4-1963 and underwent two minor revisions.

The ISCII code page, laid out in two versions of 7- and 8-bit, mainly dealt with Devanagari letters employed to write Hindi, Marathi, Nepali and some other languages. Letters of nine other Indic languages are mapped onto the Devanagari characters. This is the first instance of an official Bangla code page. The code page had a limited, regional use, yet it formed the basis for the Unicode Bangla as it came out as an international standard in October 1991. As Unicode started to gain ground, across international boundaries and across platforms, it gave rise to opposition from software vendors, who goaded the policymakers, and the opposition still continues on a limited but ineffectual scale. It was in April 1992 that the then head of the computer science and engineering department at the Bangladesh University of Engineering and Technology was entrusted with laying out a Bangla code page or coded character set. A draft could be readied by August that year. The draft was approved in June 1993 and sent to the Bangladesh Computer Council in July. The Computer Council sent the documents to the Bangladesh Standards and Testing Institution for approval towards the end of July.

The opposition to Unicode on the premise that it was based on the Indian Script Code for Information Interchange carried no meaning in that it is all about the early bird catching the worm. Unicode people had ISCII before them and they modelled the Bangla code block on ISCII but with noticeable changes, modifications and additions, especially in the ligation control method and the formation of extended characters. The proverb is also true of the dominant keyboard overlay called Bijoy. It came first, with ease, and gained ground.

Bangladesh so far has had five code page happenings, including the 1993 efforts that fell through. The first standard, which is the second effort, that came out in 1995 was something that technologists here devised by cudgelling their brains. But it was the worst of all by any definition of a code page and it suggested that the people involved in the process had neither any understanding of what a code page should be nor a proper understanding of how Unicode works. The three other efforts, in 2000, 2011 and 2018, only rubber-stamped the international Unicode standard, with some minor suggestions that the Unicode Consortium has never cared to bother about.

A Bangla keyboard implementation committee, meanwhile, could come up with a keyboard overlay and a font scheme under the stewardship of the Bangla Academy in 1992. It was a private computer company, CiTech Co Ltd, that designed the overlay for the academy free and it hardly warranted any efforts from other members on the committee. The ‘ideal’ keyboard was advertised in early December that year, but it failed to take off primarily because of the objection of a vendor against the overlay with allegations of copying the key distribution scheme from the vendor’s overlay. Some of the vendors wanted the keyboard to have 96 keys while some wanted the number to be 94 keeping to the number of keys available in electronic typewriters depending on the make. The committee had been largely sandwiched between technologists, on the one hand, and vendors of electronic typewriters and Bangla add-on developers, on the other hand. Media reports of the time suggest that all the quarters stood their ground so as not to lose the market share that they gained. And, such quarters included typewriter vendors who feared that once a different overlay was officially decided, it could keep them out of the market.

No keyboard overlay efforts in Bangladesh have so far trodden any scientific path although there is a set course for this. The process is simple. All it takes is a frequency list of all the characters, or letters, coupled with punctuation marks or any other signs needed to write text in the script. Such a list gives the most used character and the least used character with all others in between in an ascending or descending order. There is research that establishes the efficiency of the fingers in relation to the distribution of keys on a physical keyboard. The next job is to assign the characters to the keys in keeping with the efficiency of the fingers that strike them, so that the most frequent characters fall to the most efficient fingers. An analysis of bigrams, or the most frequent associations of two characters, and even trigrams, or the most frequent associations of three characters, could make the overlay more efficient as this keeps the same finger from being used twice consecutively, more so in the case of fingers that are less efficient in relation to the physical distribution of the keys, to ensure a minimum typing speed. The scientific principle has largely not been adhered to in designing keyboard overlays although developers off and on claim to have done this. An efficient keyboard overlay based on this scientific approach is still missing from the scene.
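The course described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the key list, the effort scores and the finger assignments are invented for the example, the corpus is a toy string, and the function names `assign_layout` and `same_finger_bigrams` are hypothetical. A real study would take its effort scores from ergonomic research and its frequencies from a large body of Bangla text.

```python
from collections import Counter

# Hypothetical keys: (key label, effort score, finger). Lower effort = easier to reach.
# Real scores would come from ergonomic research, not from this table.
KEYS = [
    ("F", 1.0, "left index"), ("J", 1.0, "right index"),
    ("D", 1.2, "left middle"), ("K", 1.2, "right middle"),
    ("S", 1.5, "left ring"), ("L", 1.5, "right ring"),
    ("G", 1.6, "left index"), ("H", 1.6, "right index"),
]

def assign_layout(corpus):
    """Give the most frequent characters to the lowest-effort keys."""
    freq = Counter(ch for ch in corpus if not ch.isspace())
    chars = [ch for ch, _ in freq.most_common(len(KEYS))]
    keys = sorted(KEYS, key=lambda key: key[1])
    return dict(zip(chars, keys))

def same_finger_bigrams(text, layout):
    """Count adjacent pairs of different characters struck by the same finger."""
    hits = 0
    for a, b in zip(text, text[1:]):
        if a != b and a in layout and b in layout and layout[a][2] == layout[b][2]:
            hits += 1
    return hits

# Toy corpus with distinct counts so that the frequency ranking is unambiguous.
corpus = "ক" * 7 + "র" * 6 + "ন" * 5 + "ত" * 4 + "ব" * 3 + "ল" * 2 + "ম"
layout = assign_layout(corpus)
```

A designer would compute the same-finger count over a bigram list for each candidate layout and keep the candidate with the lowest count, weighting the weaker fingers more heavily if the research supports it.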

The Bangladesh Computer Council came about in 1990. The Bangladesh Association of Software and Information Services was set up in 1997. Vendors and technologists had by then had rounds of fights over the standardisation of a keyboard overlay that could be made official. All the pressure groups were there and interests of all quarters were at play when the Unicode Consortium released its first version, which included the Bangla standard, described as Bengali, modelled on ISCII.

While the fight against and the opposition to the ‘Bengali’ block of Unicode continued, the committee on the implementation of the Bangla language on computer under the stewardship of the then head of the computer science and engineering department of the Bangladesh University of Engineering and Technology, given the task in April 1993, came up with the draft layout of the Bangla code page on June 30. The code page was called Bangla Standard Code for Information Interchange, or BSCII in short. On the approval of the committee, the layout, along with a keyboard overlay, was sent to the Bangladesh Computer Council on July 13 and the council sent it to the Bangladesh Standards and Testing Institution on August 4. The layout is reported to have been sent to the International Organisation for Standardisation for adoption on August 24, but it later failed to earn the approval of the Standards and Testing Institution. An acknowledgement of the receipt of the layout by the ISO on September 7 that year was also reported.

The 256-code table had ASCII characters in ASCII positions and Bangla characters in the higher ASCII block, the last 128 positions, with the table beginning with numerals in hex positions of 80–89, symbols in hex positions of 8A–8F, with only three occupied, canonical characters in hex positions of 90–C1, vowel markers in hex positions of C2–CF, with 10 occupied, consonants that could occur after another consonant in hex positions of D0–F2 for conjunct formation, and the hasanta, or invocaliser, in the hex position of F3, and hex positions of F4–FE left reserved for future use. The code page, with its inherent flaws, was better than what came up later in 1995. The main flaw of the code page was the repetition of consonants that could occur after another consonant having been given separate code points. This could have been solved with a marker instead, as has been done in Unicode.
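The layout described above can be written down as a small range table. The sketch below encodes only the ranges that the description gives; it does not know which individual positions inside a range were occupied, and position FF is not described, so the function reports it as such. The names `BSCII_RANGES` and `classify` are made up for the illustration.

```python
# Ranges as described in the text; the occupied slots within each range are not known here.
BSCII_RANGES = [
    (0x00, 0x7F, "ASCII"),
    (0x80, 0x89, "Bangla numeral"),
    (0x8A, 0x8F, "symbol (three occupied)"),
    (0x90, 0xC1, "canonical character"),
    (0xC2, 0xCF, "vowel marker (10 occupied)"),
    (0xD0, 0xF2, "consonant occurring after another consonant"),
    (0xF3, 0xF3, "hasanta"),
    (0xF4, 0xFE, "reserved"),
]

def classify(code):
    """Return the block of the 1993 draft that a byte value falls in."""
    for low, high, label in BSCII_RANGES:
        if low <= code <= high:
            return label
    return "not described in the source"
```

The block from D0 to F2 is where the flaw shows: those 35 positions repeat consonants that already exist in the block from 90 to C1, only because they follow another consonant.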

The efforts stalled, understandably, in the face of opposition by vendors and traders of the time. The Bangladesh Computer Samity, the trade association of computer and accessories vendors that was founded in 1987, had already been there. Every vendor appears to have feared losing the labour spent on the systems that they had so far devised or sold. The code page that was prepared also appears to have exposed the vendors and many of the technologists of the time to an uncharted area as most of them were comfortable with a font encoding that they could manipulate through a keyboard driver. Adherence to the code page could mean beginning anew, which they mostly shied away from.

While the first of the efforts lay with ISO, and, of course, without any chance for an official approval, the Standards and Testing Institution asked the Bangladesh Computer Council to work out a coded character set for Bangla again in December 1994. A new table came up in 1995, accepted on July 15, as a standard called BDS 1520:1995, which was, in fact, a repertoire of glyphs, set in a certain scheme, but not a set of characters, which a code page is. This was a marked deviation from the idea of code page and a jump into font encoding that vendors of the time perhaps liked because it provided all that they needed out of the box, the glyphs that can be employed to write Bangla text, and the table included several instances of a single character with no difference in behaviour. The set that had 224 glyphs was hardly adhered to by most of the add-on vendors. The Standards and Testing Institution also sent the standard to a working group of the International Organisation for Standardisation in June 1997, requesting the ‘incorporation’ of the code table of BDS 1520:1995 ‘Bangla Coded Character Set’ into ISO/IEC 10646, which, in parallel with Unicode, defines the Universal Character Set.

The International Organisation for Standardisation in an internal document dated 1998-03-18 for internal discussion, which erroneously refers to ‘BDS’ as ‘BSD’, says, ‘The WG2 has made a thorough review of the code table in N1634 [the contribution from Bangladesh Standards and Testing Institution, dated 1997-06-29] and has compared it to the Bengali code table in ISO/IEC 10646 and has found that BSD 1520 is mappable to 10646 as it currently stands, that no change to 10646 is required and that with an adequate table-lookup it will be possible for data coded in BSD 1520 to be transformed into 10646.’ The document further says, ‘Almost all of the characters from the range are glyph representations of an underlying coding compatible with 10646 coding for this script. The character appears (because of the characters CA, and D0-D6) to be based on an Apple Macintosh implementation for Bengali.’ The document also contains a four-page list of glyphs or glyph groups, as contained in BDS 1520:1995, mapped onto ISO/IEC 10646. The document ends by saying, ‘If careful analysis of BSD 1520 shows that one or more characters cannot be mapped directly (or with reasonable, local, context analysis) to 10646, then those characters may be candidates to be added to the standard.’ A reading of the text suggests that the document adequately exposed a poor understanding of the Universal Character Set on the part of at least the Bangladesh Standards and Testing Institution, and perhaps of all the people involved in the process.
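What the working group calls a table-lookup can be pictured with a short Python sketch. The glyph codes on the left are invented for the illustration and are not the actual assignments of BDS 1520:1995; only the Unicode code points on the right are real. The point it makes is that one glyph code in a font encoding may stand for a sequence of several characters in ISO/IEC 10646.

```python
# Hypothetical glyph codes; the real BDS 1520:1995 assignments are not reproduced here.
GLYPH_TO_UNICODE = {
    0xA1: "\u0995",              # glyph for ka
    0xA2: "\u09a4",              # glyph for ta
    0xB0: "\u0995\u09cd\u09a4",  # one conjunct glyph = ka + hasanta + ta
}

def to_unicode(data):
    """Map each glyph code to its character sequence; unknown codes become U+FFFD."""
    return "".join(GLYPH_TO_UNICODE.get(code, "\ufffd") for code in data)

converted = to_unicode(bytes([0xA1, 0xB0]))
```

Two glyph codes go in and four characters come out, which is why a glyph repertoire can be mapped onto the character set without the character set having to change.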

The science and information and technology ministry on October 5, 1998 yet again set up a committee involving the Bangla Academy on the formulation of a Bangla character set for Unicode. A lecturer in computer science in the University of Dhaka was also appointed a fellow on October 28 on the committee who was to review BDS 1520:1995 in the light of Unicode and ISO/IEC 10646 and submit a report to the committee. The fellow submitted the report on June 23 next year and the report was sent back to the fellow on August 1 with opinions of committee members and a note of dissent by a member. The final report was submitted on November 18, with the recommendation for the inclusion of ‘khanda-ta’, or the broken ‘ta’, a letter composed of the canonical Bangla ‘ta’, the sixteenth consonant of the Bangla alphabet, invocalised by a hasanta. The committee gave its approval to the report on January 17, 2000 and sent it to the ministry for its subsequent handling by the Standards and Testing Institution and, later, the International Organisation for Standardisation.

Interviews of almost all the people involved in the process that time showed that the final report on the validation of BDS 1520:1995 in the light of Unicode and ISO/IEC 10646 was geared towards the adoption of the Unicode Bengali block, with the inclusion of ‘khanda-ta’ and the word ‘Bangla’ in place of ‘Bengali’ as Unicode then said and still says. Blame was traded enough between the technologists, who perhaps could see no other option but to conform to the Unicode standard, and the vendors, who blamed the absence of industry experience in the academy and desperately sought a way out from their poor understanding of the Universal Character Set with the help of the fellow who was appointed for the review. The Bangladesh Standards and Testing Institution subsequently came up with the ‘first revision’ of its 1995 standard, called BDS 1520:2000 ‘Bangla Coded Character Set for Information Interchange’ on July 25, 2000.

The 2000 standard, which the Standards and Testing Institution preferred to call the ‘first revision’, was a complete reversal of what the institution had till then batted for. It completely negated its 1995 version. The national standards agency sent the standard to the International Organisation for Standardisation on August 23, 2000, additionally seeking the incorporation of ‘khanda-ta’. Michael Everson in an internal posting on September 20, 2000, still listed on the Unicode Mail Archive, sought opinions about the ‘khanda-ta’ while he also sought to know: ‘Another question, is does BDS 1520:2000 completely replace BDS 1520:1997, or is the old standard still valid (and being implemented)?’ The posting had the year of the 1995 standard wrong, but it suggested that the consortium had thought about the contribution that the Standards and Testing Institution had made. Another poster, in reply, noted that the ‘khanda-ta’ was, in fact, ‘ta’ + virama (hasanta) and said that BDS 1520:2000 presented the table without any ligation control characters other than virama (hasanta). As for the replacement issue, the poster noted that BDS 1520:1997, erroneously mentioned to refer to BDS 1520:1995, was ‘based on a font encoding…. It is also the encoding used in many web sites. BDS 1520:2000 is a complete replacement, being based on the ISO/IEC10646 character encoding model.’

The Unicode Consortium explained the ‘khanda-ta’, a letter unique to the Bangla script system, in the fourth version of the standard released in April 2003 saying that ‘ta’ + ‘hasanta’ + zero-width joiner would produce a ‘khanda-ta’ and a zero-width non-joiner, instead of the zero-width joiner, would produce a visible hasanta below the canonical ‘ta’. Although the consortium had not incorporated the ‘khanda-ta’ in the Unicode block until its version 4.1.0, released in March 2005, several programs, which included a word processor developed by Duke University, allowed users to write ‘khanda-ta’ with the mechanism that the consortium explained in version 4. A few technologists involved in Bangladesh efforts also argued, not so vehemently though, against the inclusion of the ‘khanda-ta’ as a canonical form. It appeared that vendors and developers of add-ons for the Bangla script system wanted a solution from the fellow appointed to the committee on how to develop software using the Bangla code block in Unicode — a hands-on guide of a sort on the Unicode system.
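The sequences discussed above can be spelt out with Python's standard `unicodedata` module. The sketch only lists code points and their official character names; which shape appears on screen depends on the font and the rendering engine, which the code does not touch. The variable and function names are chosen for the illustration.

```python
import unicodedata

TA = "\u09a4"         # canonical ta
HASANTA = "\u09cd"    # named VIRAMA in the standard
ZWJ = "\u200d"        # zero-width joiner
ZWNJ = "\u200c"       # zero-width non-joiner
KHANDA_TA = "\u09ce"  # the separately encoded letter added in version 4.1.0

khanda_ta_sequence = TA + HASANTA + ZWJ         # the version 4 way of asking for khanda-ta
visible_hasanta_sequence = TA + HASANTA + ZWNJ  # ta with a visible hasanta below

def describe(text):
    """List each code point in the text with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]
```

Calling `describe(khanda_ta_sequence)` lists three code points, whereas `describe(KHANDA_TA)` lists one, so text stored the older way and text stored with the later letter are different strings even where they look alike.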

As Bangla glyphs had till then been directly called from the font file, most of the developers had no idea that with a set of canonical characters running in the background for text storing and interpretation, developers needed to use a lookup table to call all forms of letters, allographs, or alternative forms of one or more characters, and conjunct characters, which are contained in the Private Use Area in a Unicode font file. Most of the people involved in working out the code page for Bangla in Bangladesh that time were attuned to a natural sort order and a direct glyph representation, without having to know that they all should be done by way of a third-party interference keeping to Unicode, which has decided not to offer these functions since the very beginning. The fully-fledged word-processor named Barna that came out to run on DOS in the pre-Unicode era also employed a similar look-up table for all the glyphs with characters directly handled from the keyboard.
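The kind of look-up described above, from stored characters to the glyphs that are drawn, can be pictured with a toy substitution table in Python. It is a simplified sketch and not how any particular font or rendering engine is built: the glyph names are invented, only one conjunct is listed, and the matching is a plain left-to-right scan.

```python
# Hypothetical glyph names; a real font carries its own substitution rules.
CONJUNCTS = {
    ("\u0995", "\u09cd", "\u09a4"): "ka_ta.conjunct",  # ka + hasanta + ta
}
DEFAULT_GLYPHS = {"\u0995": "ka", "\u09a4": "ta", "\u09cd": "hasanta"}

def shape(text):
    """Turn a string of canonical characters into a list of glyph names."""
    glyphs = []
    i = 0
    while i < len(text):
        trio = tuple(text[i:i + 3])
        if trio in CONJUNCTS:
            # Three stored characters are drawn as one conjunct glyph.
            glyphs.append(CONJUNCTS[trio])
            i += 3
        else:
            glyphs.append(DEFAULT_GLYPHS.get(text[i], "missing"))
            i += 1
    return glyphs

glyph_run = shape("\u0995\u0995\u09cd\u09a4")
```

The stored text stays as four canonical characters, which is what makes searching and interchange possible, while the display shows two glyphs.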

With Unicode having started to become all-pervasive, it was difficult for Bangla add-on developers to set aside the reality on the ground. The opposition to Unicode took another turn. Some demanded that the Bangla Unicode block should have its own punctuation marks, especially the ‘danda’ that works as a full stop, because Unicode lays out that the ‘danda’ in the Devanagari block should be employed as Bangla ‘danda’, too, as is the case with all Indic languages. The opposition did not appear logical as the same people conveniently use other punctuation marks such as the comma, the semicolon, the colon, the quotation marks, the exclamation mark and the question mark from the Latin, extended Latin and general punctuation blocks. Some of the opposition was related to the inclusion of two characters, broadly known as Assamese ‘va’ with a diagonal below the Bangla ‘ba’ and the Assamese ‘ra’ with a diagonal intersecting the counter of the Bangla ‘ba’. But all appear to have been oblivious to the fact that the Assamese ‘ra’ was also the Bangla ‘ra’ in the middle Bengali period, extensively used in handwritten codices and much throughout the early days of Bangla printing.

While all this went on, the Computer Council gave a committee involving people from the Bangladesh University of Engineering and Technology yet another task, under a project named ‘Standard Coding for Bangla Characters and Conversion’, to make BDS 1520 a standard code set, to deal with the limitation of characters in laying them out in ascending and descending orders and to make its conversion to and from other code sets. It could not be known whether the report was submitted at all. But this was a work that warranted attention because all the while the developers of script system add-ons opposed Unicode saying that it has not provided for any mechanism for sorting. They all believed that the characters as laid out in the Unicode block could not ensure a dictionary sort order. This is true. But they all appear to have ignored that an alphabetical sort order of the Bangla letters is quite different from the sort order that has traditionally been employed in dictionaries.

The Unicode Standard in its first version that came out in October 1991 in Chapter II, General Principles of the Unicode Standard, says, ‘The design of the Unicode encoding scheme is independent of the design of basic text processing algorithms… Unicode implementations are assumed to contain suitable text processing and/or rendering algorithms… In particular, sorting and string comparison algorithms cannot assume that the assignment of Unicode character code numbers provide an alphabetical ordering for lexicographic string comparison.… The expected sort sequence for the same characters differs across languages, so in general no single linear ordering exists.’ And, it says, ‘There is no reason to expect text processors in general to be as simple as they are for English.’ The developers of add-ons and technologists, attuned mostly to ASCII and to Bangla font encodings designed in an alphabetical order, appear to have made a fuss about the code table without reading the accompanying text and without knowing that Unicode does not prescribe sorting or any sort order.
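The difference between code point order and a sort order supplied by the application can be shown with a small Python sketch. The ordering list below is a hand-made fragment written for this example, covering only the characters in the four sample words, and it assumes that ‘khanda-ta’ sorts next to ‘ta’ and that ‘ya’ with a dot below sorts next to ‘ya’. It is not an official collation table.

```python
# Illustrative fragment of a dictionary-style order; not an official collation table.
DICTIONARY_ORDER = [
    "\u0985",  # a
    "\u0986",  # aa
    "\u0989",  # u
    "\u09a4",  # ta
    "\u09ce",  # khanda-ta, placed next to ta
    "\u09aa",  # pa
    "\u09ac",  # ba
    "\u09af",  # ya
    "\u09df",  # ya with a dot below, placed next to ya
    "\u09b0",  # ra
    "\u09b2",  # la
    "\u09b8",  # sa
    "\u09cb",  # vowel sign o
]
RANK = {ch: position for position, ch in enumerate(DICTIONARY_ORDER)}

def dictionary_key(word):
    """Replace each character with its rank; unlisted characters sort last."""
    return [RANK.get(ch, len(RANK)) for ch in word]

UPOR = "\u0989\u09aa\u09b0"          # u + pa + ra
UTSOB = "\u0989\u09ce\u09b8\u09ac"   # u + khanda-ta + sa + ba
ALO = "\u0986\u09b2\u09cb"           # aa + la + o
AY = "\u0986\u09df"                  # aa + ya with a dot below

words = [UPOR, UTSOB, ALO, AY]
by_code_point = sorted(words)
by_dictionary = sorted(words, key=dictionary_key)
```

Sorting by code point puts the word with ‘la’ before the word with the dotted ‘ya’ and the word with ‘pa’ before the word with ‘khanda-ta’, because of where those characters sit in the block; the key function reverses both pairs. That is the kind of work the standard leaves to the text processor.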

The next two standards, BDS 1520:2011, adopted on February 15, 2011, and BDS 1520:2018, adopted on February 26, 2018, which the Standards and Testing Institution calls the second and the third revisions of the BDS 1520:1995 standard, are basically updates of the BDS 1520:2000 standard conforming to the Unicode standards of the time: Version 6 in the case of BDS 1520:2011 and Version 10 in the case of BDS 1520:2018. Unicode in its version 11, released in June 2018, encoded three more signs in the Bangla block. The characters are used in old manuscripts, but the Standards and Testing Institution needs to issue another update if it wants to keep abreast of the Unicode standard.

The latest three BSTI standards on the coded character set for Bangla appear to conform to ISO/IEC 10646, which only assigns code points to characters, as they all deal with the character codes in a table and have to date gone without any additional instructions such as the conjunct formation method, the ligation control mechanism or the behaviour and properties of the characters, the way the Unicode standard has them. It would be wise for the agency not to issue any further ISO/IEC 10646-like standard, which without the rules cannot be put to work, if it cannot issue the instructions that Unicode does. Or it should adopt what Unicode does, to be, again, wise.


Akkas, Abu Jar M. (2021 Dec. 16). Bangla Computing: Follies all the way. New Age. 9


Rev.: vii·xi·mmxxii