Bengali alphabet

Bengali abugida (Bangla abugida)
Languages: Bengali, Meithei, Bishnupriya Manipuri, Kokborok
Time period: 11th century to the present[1]
Sister systems: Assamese, Tibetan
Direction: Left-to-right
ISO 15924: Beng, 325

The Bengali alphabet or Bangla alphabet (Bengali: বাংলা লিপি Bangla lipi) is the writing system for the Bengali language and is the 6th most widely used writing system in the world. The script is shared by Assamese with minor variations, and is the basis for other writing systems such as those used for Meithei and Bishnupriya Manipuri. Historically, the script has also been used to write Sanskrit in the region of Bengal.

From a classificatory point of view, the Bengali script is an abugida, i.e. its vowel graphemes are mainly realized not as independent letters but as diacritics attached to its consonant letters. It is written from left to right and lacks distinct letter cases. Like other Brahmic scripts, it is recognizable by a distinctive horizontal line, known as মাত্রা matra, that runs along the tops of the letters and links them together. The Bengali script is, however, less blocky and presents a more sinuous shape.


The Bengali script evolved from the Siddhaṃ script, which belongs to the Brahmic family of scripts. In addition to differences in how the letters are pronounced in the different languages, there are some typographical differences between the version of the script used for the Assamese language and that used for the Bengali language.

The version of the script used for Manipuri is also a different variation; it writes the sound /r/ as র, as in the Bengali script, rather than with the different representation ৰ used in the Assamese script. It also uses the Assamese script character for the sound /w/, represented as ৱ, which is absent in the Bengali script.

The Bengali script was originally not associated with any particular language but was often used in the eastern regions of the Middle kingdoms of India and then in the Pala Empire. It later continued to be used specifically in the Bengal region. It was standardized into the modern Bengali script by Ishwar Chandra Vidyasagar under the reign of the East India Company. Today, the script holds official script status in Bangladesh and India, and it is associated with the daily life of Bengalis.


The Bengali script can be divided into vowel letters and vowel diacritics, consonant letters (including consonant conjuncts), modifiers, digits, and punctuation marks.


The Bengali script has a total of 11 vowel graphemes, each of which is called a স্বরবর্ণ sbôrôbôrnô "vowel letter". The sbôrôbôrnôs represent six of the seven main vowel sounds of Bengali, along with two vowel diphthongs. All of them are used in both Bengali and Assamese, the two main languages using the script.

The table below shows the vowels present in the modern (since the late nineteenth century) inventory of the Bengali alphabet, which has abandoned three historical vowels, rri, li, and lli, traditionally placed between ri and e.

Bengali vowels (স্বরবর্ণ sbôrôbôrnô)
Name of full form | Name of diacritic form | Romanization | IPA
স্বর অ (sbôrô ô) | - | ô and o | /ɔ/ and /o/[2]
স্বর আ (sbôrô a) | আ কার (a kar) | a | /a/
হ্রস্ব ই (hrôsbô i) | ি হ্রস্ব ই কার (hrôsbô i kar) | i | /i/
দীর্ঘ ই (dirghô i) | দীর্ঘ ই কার (dirghô i kar) | i and ee | /i/
হ্রস্ব উ (hrôsbô u) | হ্রস্ব উ কার (hrôsbô u kar) | u | /u/
দীর্ঘ উ (dirghô u) | দীর্ঘ উ কার (dirghô u kar) | u and oo | /u/
হ্রস্ব ঋ (hrôsbô ri) | হ্রস্ব ঋ কার (hrôsbô ri kar) | ri | /ṛi/
দীর্ঘ ঋ (dirghô ri) | দীর্ঘ ঋ কার (dirghô ri kar) | rii | /ṛii/
হ্রস্ব ঌ (hrôsbô li) | হ্রস্ব ঌ কার (hrôsbô li kar) | li | /ḷi/
দীর্ঘ ঌ (dirghô li) | দীর্ঘ ঌ কার (dirghô li kar) | lii | /ḷii/
স্বর এ (sbôrô e) | এ কার (e kar) | e and ê | /e/ and /æ/[3]
স্বর ঐ (sbôrô ôi) | ঐ কার (ôi kar) | ôi and oi | /ɔi/ and /oi/
স্বর ও (sbôrô u/o) | ও কার (u/o kar) | u and o | /ʊ/[4] and /o/
স্বর ঔ (sbôrô ôu) | ঔ কার (ôu kar) | ôu and ou | /ɔu/ and /ou/
The consonant () along with the diacritic form of the vowels অ, আ, ই, ঈ, উ, ঊ, ঋ, এ, ঐ, ও and ঔ.
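In encoded text, the attachment of vowel diacritics to a consonant can be sketched as plain string concatenation: the consonant's code point is followed by the code point of a dependent vowel sign, and the font places the sign before, after, above or below the base. The following is a minimal illustration only, assuming the consonant ক (U+0995) as the base letter and the vowel-sign code points of the Unicode Bengali block; the helper name attach_signs is hypothetical.

```python
import unicodedata

KA = "\u0995"  # ক, chosen here only as an example base consonant

# Dependent vowel signs for আ ই ঈ উ ঊ ঋ এ ঐ ও ঔ, in that order.
# অ has no sign: it is the inherent vowel of the bare consonant.
VOWEL_SIGNS = [
    "\u09be", "\u09bf", "\u09c0", "\u09c1", "\u09c2",
    "\u09c3", "\u09c7", "\u09c8", "\u09cb", "\u09cc",
]

def attach_signs(consonant: str) -> list[str]:
    """Return the bare consonant followed by consonant + each vowel sign."""
    return [consonant] + [consonant + sign for sign in VOWEL_SIGNS]

for syllable in attach_signs(KA):
    # Show each syllable with the Unicode names of its code points.
    print(syllable, [unicodedata.name(ch) for ch in syllable])
```

Note that the sign is always stored after the consonant, even for signs such as ি that are drawn to its left.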


Consonant letters are called ব্যঞ্জনবর্ণ bænjônbôrnô "consonant letter" in Bengali. The names of the letters are typically just the consonant sound plus the inherent vowel ô. Since the inherent vowel is assumed and not written, most letters' names look identical to the letter itself (for example, the name of the letter ঘ is itself ghô, not gh).

Bengali consonants (ব্যঞ্জনবর্ণ bænjônbôrnô)
বর্গীয় বর্ণ (Generic sounds), grouped by voicing, অঘোষ (voiceless) or ঘোষ (voiced), and by aspiration, অল্পপ্রাণ (unaspirated) or মহাপ্রাণ (aspirated)
Rows: Vocal [nc 1]; Palatal [nc 2], with /dzɔ~ɔ/ [nc 3] and /ɕɔ~ʃɔ/ [nc 4]; Postalveolar/Alveolar [nc 5], with /nɔ/ [nc 6] and /ʂɔ/ [nc 4]; further rows with /sɔ/ [nc 4] and /ɸɔ/ [nc 7]
Other letters: ড় ṛô


  1. Although in modern Bengali the letters ক, খ, গ, ঘ, ঙ are actually জিভামূলীয় (velar consonants) and হ is actually a glottal consonant, texts still describe them with the Sanskrit name "কণ্ঠ্য" (vocal).
  2. Palatal letters phonetically represent palato-alveolar sounds, but in Eastern dialects they are mostly depalatalised, or depalatalised and deaffricated.
  3. In Sanskrit, "য" represented the voiced palatal approximant /j/. In modern Bengali, it represents two different sounds, the voiced palato-alveolar affricate /dʒɔ/ (merging with জ) and the semivowel /e̯ɔ/. When reforming the script, Ishwar Chandra Vidyasagar introduced য় to represent /e̯ɔ/ and reserved য for /dʒɔ/. In words, য is now pronounced like জ /dʒɔ/ and also represents the voiced alveolar sibilant affricate /dzɔ/.
  4. In Bengali, there are three letters for sibilants: শ, ষ, স. Originally all three had distinctive sounds. In modern Bengali, the most common sibilant varies between /ʃ~ɕ/; it was originally represented by শ, but today স and ষ in words are often pronounced as /ɕ~ʃ/ as well. The other sibilant in Bengali is /s/, originally represented by স, but today শ and ষ in words can sometimes be pronounced as /s/. Another, now extinct, sibilant was /ʂ/, originally represented by ষ. Today ষ is mostly pronounced as /ɕ~ʃ/, but in conjunction with apical alveolar consonants the /ʂ/ sound can sometimes be found.
  5. In modern texts, the name পশ্চাৎ দন্ত্যমূলীয় (post-dental) is often used to describe more precisely the letters previously described as retroflex.
  6. The original sound of ণ was /ɳɔ/, but in modern Bengali it is almost always pronounced /nɔ/, the same as ন; the original sound can occasionally be found in ligatures with other retroflex letters.
  7. Under Sanskrit influence, "ফ" is also pronounced /pʰɔ/.

Consonant conjuncts

The consonant ligature ndrô (ন্দ্র): ন (nô) in green, দ (dô) in blue and র (rô) in maroon.

Up to four consecutive consonants not separated by vowels can be orthographically represented as a typographic ligature called a "consonant conjunct" (Bengali: যুক্তাক্ষর juktakkhôr or যুক্তবর্ণ juktôbôrnô). Typically, the first consonant in the conjunct is shown above and/or to the left of the following consonants. Many consonants appear in an abbreviated or compressed form when serving as part of a conjunct. Others simply take exceptional forms in conjuncts, bearing little or no resemblance to the base character.

Often, consonant conjuncts are not actually pronounced as would be implied by the pronunciation of the individual components. For example, adding ল lô underneath শ shô in Bengali creates the conjunct শ্ল, which is not pronounced shlô but slô in Bengali. Many conjuncts represent Sanskrit sounds that were lost centuries before modern Bengali was ever spoken, as in জ্ঞ, which is a combination of জ jô and ঞ ñô but is not pronounced jnô; instead, it is pronounced ggô in modern Bengali. Thus, as conjuncts often represent (combinations of) sounds that cannot be easily understood from the components, the following descriptions are concerned only with the construction of the conjunct, and not the resulting pronunciation.
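In Unicode-encoded text, the construction of a conjunct can be sketched independently of its visual form: the component consonants are stored in order, joined by the vowel-suppressing sign (hôsôntô, U+09CD), and the font decides whether the result is drawn as a fused, compressed or variant shape. The snippet below is an illustrative sketch under that assumption; the function name conjunct is hypothetical.

```python
HASANTA = "\u09cd"  # ্, the sign that suppresses the inherent vowel

def conjunct(*consonants: str) -> str:
    """Join consonant letters with the hasanta so that a font can
    render them as a single conjunct."""
    return HASANTA.join(consonants)

ndro = conjunct("\u09a8", "\u09a6", "\u09b0")  # ন + দ + র
print(ndro)                        # rendered as one ligature by most fonts
print([hex(ord(ch)) for ch in ndro])  # but stored as five code points

gyo = conjunct("\u099c", "\u099e")  # জ + ঞ
print(gyo)
```

The encoding records only the construction; as the paragraph above notes, the pronunciation of the result has to be learned separately.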

Fused forms

Some consonants fuse in such a way that one stroke of the first consonant also serves as a stroke of the next.

Approximated forms

Some consonants are written closer to one another simply to indicate that they are in a conjunct together.

Compressed forms

Some consonants are compressed (and often simplified) when appearing as the first member of a conjunct.

Abbreviated forms

Some consonants are abbreviated when appearing in conjuncts and lose part of their basic shape.

Variant forms

Some consonants have forms that are used regularly but only within conjuncts.


When serving as a vowel sign, উ u, ঊ u, and ঋ ri take on many exceptional forms.

Modifiers and others

Modifier and other graphemes in Bengali
Symbol and name | Function | Romanization | IPA
্ হসন্ত hôsôntô | Suppresses the inherent vowel [ɔ] (ô) | - | -
ং অনুস্বার ônusbar | Final velar nasal (ng sound) | ng | /ŋ/
ঃ বিসর্গ bisôrgô | (1) Doubles the next consonant sound without the vowel (a spelling feature): in দুঃখ dukkhô, the k of খ khô is repeated before the whole khô. (2) An "h" sound at the end of a word, as in এঃ eh! and উঃ uh! (3) Silent in spellings like অন্তঃনগর ôntônôgôr "inter-city". (4) Also used as an abbreviation mark: কিঃমিঃ kimi shortens কিলোমিটার "kilometer", similar to "km" in English, and ডাঃ dôh stands for ডক্টর dôktôr "doctor". | h | /ḥ/
ঁ চন্দ্রবিন্দু chôndrôbindu | Vowel nasalization | ñ | /ñ/
্য যফলা | Has two pronunciations in modern Bengali: in spellings like এ্যাকাডেমী êkademi it is pronounced /ækademi/, but in spellings like লক্ষ্য lôkkhyô it is pronounced /lɔkkhe̯ɔ/. It is sometimes used as a diacritic to indicate non-Bengali vowels of various kinds in transliterated foreign words. For example, the schwa is indicated by a jôfôla, the French u and the German umlaut ü by উ্য uyô, and the German umlaut ö by ও্য oyô or এ্য eyô. | ê / yô | /æ/ or /e̯ɔ/
্ব বফলা | Always silent in modern Bengali. It is used only in spellings adopted from Sanskrit, where it is preserved in writing but not pronounced. Example 1: স্ব sbô is pronounced /ʃɔ/ rather than /sbɔ/. Example 2: ত্ব tbô is pronounced /tɔ/ rather than /tbɔ/. With any other consonant the ্ব is likewise silent, and the b sound is omitted. | - | -
ঽ অবগ্রহ ôbôgrôhô | Used for prolonging vowel sounds. Example 1: শুনঽঽঽ shunôôôô "listennnn..." (listen), where the inherent vowel ô of ন is prolonged. Example 2: কিঽঽঽ? kiiii? "Whatttt...?" (What?), where the vowel i attached to the consonant ক is prolonged. | - | -
৺ ঈশ্বার ishshar | Represents the name of a deity; also written before the name of a deceased person | - | -
আঞ্জী / সিদ্ধিরস্তু anji / siddhirôstu | Used at the beginning of texts as an invocation | - | -

-h and -ng are also often used as abbreviation marks in Bengali, with -ng used when the next sound following the abbreviation would be a nasal sound, and -h otherwise. For example, ডঃ dôh stands for ডক্টর dôktôr "doctor" and নং nông stands for নম্বর nômbôr "number". Some abbreviations have no marking at all, as in ঢাবি dhabi for ঢাকা বিশ্ববিদ্যালয় Dhaka Bishbôbidyalôy "University of Dhaka". The full stop can also be used when writing out English letters as initials, such as ই.ইউ. i.iu "E.U.".

The apostrophe, known in Bengali as ঊর্ধ্বকমা urdhbôkôma "upper comma", is sometimes used to distinguish between homographs, as in পাটা pata "plank" and পা'টা pa'ta "the leg". Sometimes, a hyphen is used for the same purpose (as in পা-টা, an alternative of পা'টা).

ৎ (called খণ্ড-ত khôndô tô "broken tô") is always used syllable-finally and is always pronounced /t̪/. It is predominantly found in loanwords from Sanskrit such as ভবিষ্যৎ bhôbishyôt "future", সত্যজিৎ sôtyôjit (a proper name), etc. It is also found in some onomatopoeic words (such as থপাৎ thôpat "sound of something heavy that fell", মড়াৎ môrat "sound of something breaking", etc.), as the first member of some consonant conjuncts (such as ৎস tsô, ৎপ tpô, ৎক tkô, etc.), and in some foreign loanwords (e.g. নাৎসি natsi "Nazi", জুজুৎসু jujutsu "Jujutsu", ৎসুনামি tsunami "Tsunami", etc.) which contain the same conjuncts. This is an overproduction inconsistency, as the sound /t̪/ is realized by both ত and ৎ, and it creates confusion among inexperienced writers of Bengali. There is no simple way of telling which symbol should be used; usually, the contexts where ৎ is used need to be memorized, as they are less frequent. In native Bengali words, syllable-final ত /t̪ɔ/ is pronounced /t̪/, as in নাতনি /nat̪ni/ "grand-daughter", করাত /kɔrat̪/ "saw", etc.

Digits and numerals


The Bengali script has ten numerical digits (graphemes or symbols indicating the numbers from 0 to 9). Bengali numerals have no horizontal headstroke or মাত্রা "matra".

Bengali numerals
Arabic numerals: 0 1 2 3 4 5 6 7 8 9
Bengali numerals: ০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯

Numbers larger than 9 are written in Bengali using a positional base 10 numeral system (the decimal system). A period or dot is used to denote the decimal separator, which separates the integral and the fractional parts of a decimal number. When writing large numbers with many digits, commas are used as delimiters to group digits, indicating the thousand (হাজার hazar), the hundred thousand or lakh (লাখ lakh or লক্ষ lôkkhô), and the ten million or hundred lakh or crore (কোটি koti) units. In other words, leftwards from the decimal separator, the first grouping consists of three digits, and the subsequent groupings always consist of two digits.

For example, the English number 17,557,345 will be written in traditional Bengali as ১,৭৫,৫৭,৩৪৫.
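The grouping rule described above (three digits first, then groups of two, counting leftwards) can be sketched as a small function. This is an illustrative sketch for non-fractional integers only, and the function name to_bengali_grouped is hypothetical.

```python
# The ten Bengali digits occupy U+09E6 (০) to U+09EF (৯).
BENGALI_DIGITS = "".join(chr(0x09E6 + i) for i in range(10))
TO_BENGALI = str.maketrans("0123456789", BENGALI_DIGITS)

def to_bengali_grouped(n: int) -> str:
    """Format an integer with thousand / lakh / crore grouping
    and Bengali digits."""
    digits = str(abs(n))
    head, tail = digits[:-3], digits[-3:]  # rightmost group has three digits
    groups = []
    while head:                            # every further group has two
        groups.insert(0, head[-2:])
        head = head[:-2]
    groups.append(tail)
    text = ",".join(groups).translate(TO_BENGALI)
    return ("-" if n < 0 else "") + text

print(to_bengali_grouped(17557345))  # the example from the text
```

For 17557345 the groups are 1, 75, 57 and 345, which reproduces the traditional form given above.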

Punctuation marks

Bengali punctuation marks, apart from the downstroke দাড়ি dari (|), the Bengali equivalent of a full stop, have been adopted from western scripts and their usage is similar: Commas, semicolons, colons, quotation marks, etc. are the same as in English. Capital letters are absent in the Bengali script so proper names are unmarked.

Characteristics of the Bengali text

An example of handwritten Bengali script. Part of a poem written by Nobel Laureate Rabindranath Tagore in 1926 in Hungary.

Bengali text is written and read horizontally, from left to right. The consonant graphemes and the full form of vowel graphemes fit into an imaginary rectangle of uniform size (uniform width and height). The size of a consonant conjunct, regardless of its complexity, is deliberately maintained the same as that of a single consonant grapheme, so that diacritic vowel forms can be attached to it without any distortion. In a typical Bengali text, orthographic words, words as they are written, can be seen as being separated from each other by an even spacing. Graphemes within a word are also evenly spaced, but that spacing is much narrower than the spacing between words.

Unlike in western scripts (Latin, Cyrillic, etc.) for which the letter-forms stand on an invisible baseline, the Bengali letter-forms instead hang from a visible horizontal left-to-right headstroke called মাত্রা matra. The presence and absence of this matra can be important. For example, the letter ত and the numeral ৩ "3" are distinguishable only by the presence or absence of the matra, as is the case between the consonant cluster ত্র trô and the independent vowel এ e. The letter-forms also employ the concepts of letter-width and letter-height (the vertical space between the visible matra and an invisible baseline).

Grapheme Percentage

According to Bengali linguist Munier Chowdhury, about nine graphemes are the most frequent in Bengali texts; they are shown with their percentages of appearance in the table on the right.[6]


In the script, clusters of consonants are represented by different and sometimes quite irregular forms; thus, learning to read is complicated by the sheer size of the full set of letters and letter combinations, numbering about 350. While efforts at standardizing the alphabet for the Bengali language continue in such notable centres as the Bangla Academy at Dhaka (Bangladesh) and the Pôshchimbônggô Bangla Akademi at Kolkata (West Bengal, India), the alphabet is still not quite uniform, as many people continue to use various archaic forms of letters, resulting in concurrent forms for the same sounds. Among the various regional variations within this script, only the Assamese and Bengali variations exist today in the formalized system.

It seems likely that standardization of the alphabet will be greatly influenced by the need to typeset it on computers. The large alphabet can be represented, with a great deal of ingenuity, within the ASCII character set, omitting certain irregular conjuncts. Work has been underway since around 2001 to develop Unicode fonts, and it seems likely that it will split into two variants, traditional and modern. In this and other articles on Wikipedia dealing with the Bengali language, a Romanization scheme used by linguists specializing in Bengali phonology is included along with IPA transcription. A recent effort by the Government of West Bengal focused on simplifying the Bengali orthography in primary school texts.

There is yet to be a uniform standard collating sequence (sorting order of graphemes to be used in dictionaries, indices, computer sorting programs, etc.) of Bengali graphemes. Experts in both Bangladesh and India are currently working towards a common solution for the problem.
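In the absence of an agreed collating sequence, software often falls back on the order of the code points themselves. The sketch below shows that fallback only; it is not a proposed standard, and the names codepoint_key and naive_sort are hypothetical. Normalizing to NFC first is an assumption made here so that differently encoded forms of the same letter compare equal.

```python
import unicodedata

def codepoint_key(word: str) -> str:
    """Sort key: the NFC-normalized word, compared code point by code point."""
    return unicodedata.normalize("NFC", word)

def naive_sort(words: list[str]) -> list[str]:
    """Sort words by code point order, a stand-in for a real collation."""
    return sorted(words, key=codepoint_key)

words = ["খবর", "কলম", "গান", "আম"]
print(naive_sort(words))
```

This ordering puts independent vowels before consonants, but it also places the signs ঁ, ং and ঃ (U+0981 to U+0983) ahead of every letter, which is one reason a purpose-built collating sequence is still needed.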


Romanization of Bengali is the representation of the Bengali language in the Latin script. Various Romanization systems for Bengali have been created in recent years, but they have failed to represent the true Bengali phonetic sound. While different standards for romanization have been proposed for Bengali, they have not been adopted with the degree of uniformity seen in languages such as Japanese or Sanskrit.[nb 2] The Bengali alphabet has often been included with the group of Brahmic scripts for romanization, in which the true phonetic value of Bengali is never represented. Some of these systems are the International Alphabet of Sanskrit Transliteration or "IAST system",[7] "Indian languages Transliteration" or ITRANS (which uses upper-case letters suited for ASCII keyboards),[8] and the extension of IAST intended for non-Sanskrit languages of the Indian region, called the National Library at Kolkata romanization.[9]

Sample texts

Article 1 of the Universal Declaration of Human Rights

Bengali in Bengali alphabet

ধারা ১: সমস্ত মানুষ স্বাধীনভাবে সমান মর্যাদা এবং অধিকার নিয়ে জন্মগ্রহণ করে। তাঁদের বিবেক এবং বুদ্ধি আছে; সুতরাং সকলেরই একে অপরের প্রতি ভ্রাতৃত্বসুলভ মনোভাব নিয়ে আচরণ করা উচিৎ।

Bengali in phonetic Romanization

Dhara æk: Šomosto manush šadhynbhabe šoman morjada æbong odhikar niye jonmogrohon kore. Tãder bibek æbong buddhi achhe; šutôrang sokoleri æke oporer proti bhratritbošulobh mono̊bhab niye achoron kora uchit.

Bengali in IPA

d̪ʱara æk ʃɔmɔst̪ɔ manuʃ ʃad̪ʱinbʱabe ʃɔman mɔrdʒad̪a ebɔŋ ɔd̪ʱikar nie̯e dʒɔnmɔɡrɔhɔn kɔre. t̪ãd̪er bibek ebɔŋ budd̪ʱːi atʃʰe; sut̪ɔraŋ sɔkɔleri æke ɔpɔrer prɔt̪i bʱrat̪rit̪ːɔsulɔbʱ mɔnobʱab nie̯e atʃɔrɔn kɔra utʃit̪.


Gloss

Clause 1: All human free-manner-in equal dignity and right taken birth-take do. Their reason and intelligence exist; therefore everyone-indeed one another's towards brotherhood-ly attitude taken conduct do should.

Translation

Article 1: All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience. Therefore, they should act towards one another in a spirit of brotherhood.


Bengali script was added to the Unicode Standard in October 1991 with the release of version 1.0.

The Unicode block for Bengali is U+0980–U+09FF. In the official Unicode Consortium code chart (PDF), the block is laid out in rows of sixteen code points (columns 0 to F); as of Unicode version 9.0, some code points in the block are not assigned.
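The block range given above is enough to test whether a character belongs to the Bengali block. The following is a minimal sketch using only that range; the helper names in_bengali_block and bengali_share are hypothetical, and membership in the block does not guarantee that a code point is assigned.

```python
BENGALI_START, BENGALI_END = 0x0980, 0x09FF  # block range from the text

def in_bengali_block(ch: str) -> bool:
    """True if the single character ch lies in U+0980..U+09FF."""
    return BENGALI_START <= ord(ch) <= BENGALI_END

def bengali_share(text: str) -> float:
    """Fraction of non-whitespace characters that fall in the block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(in_bengali_block(c) for c in chars) / len(chars)

print(in_bengali_block("ক"), in_bengali_block("a"))
print(bengali_share("বাংলা lipi"))
```

One caveat for such checks: the sentence-final stroke । is encoded at U+0964, outside this block, so running Bengali text will normally contain a few characters that the range test rejects.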



  1. Different Bengali linguists give different numbers of Bengali diphthongs in their works depending on methodology, e.g. 25 (Chatterji 1939: 40), 31 (Hai 1964), 45 (Ashraf and Ashraf 1966: 49), 28 (Kostic and Das 1972:6-7) and 17 (Sarkar 1987).
  2. In Japanese, there is some debate as to whether to accent certain distinctions, such as Tōhoku vs Tohoku. Sanskrit is well-standardized because the speaking community is relatively small, and sound change is not a large concern.


  1. Ancient Scripts
  2. The natural pronunciation of the grapheme অ, whether in its independent (visible) form or in its "inherent" (invisible) form in a consonant grapheme, is /ɔ/. But its pronunciation changes to /o/ in the following contexts:
    • if the অ is in the first syllable and there is a ই /i/ or উ /u/ in the next syllable, as in অতি ôti "much" /ot̪i/, বলছি bôlchhi "(I am) speaking" /ˈboltʃʰi/
    • if the অ is the inherent vowel in a word-initial consonant cluster ending in rôfôla "rô ending" /r/, as in প্রথম prôthôm "first" /prot̪ʰɔm/
    • if the next consonant cluster contains a jôfôla "jô ending", as in অন্য ônyô "other" /onːo/, জন্য jônyô "for" /dʒonːo/
  3. Even though the near-open front unrounded vowel [æ] is one of the seven main vowel sounds in the standard Bengali language, no distinct vowel symbol has been allotted for it in the script, though এ is used.
  4. /ʊ/ is the original pronunciation of the vowel ও, though a secondary pronunciation /o/ entered Bengali phonology through Sanskrit influence. In modern Bengali, both the ancient and the adopted pronunciation of ও can be heard in speech. Example: the word নোংরা (meaning "foul") is pronounced both as /nʊŋra/ and as /noŋra/ (Romanized as nungra and nongra respectively).
  5. Mazumdar, Bijaychandra (2000). The history of the Bengali language (Repr. [d. Ausg.] Calcutta, 1920. ed.). New Delhi: Asian Educational Services. p. 57. ISBN 8120614526. yet it is to be noted as a fact, that the cerebral letters are not so much cerebral as they are dental in our speech. If we carefully notice our pronunciation of the letters of the 'ট' class we will see that we articulate 'ট' and 'ড,' for example, almost like English T and D without turning up the tip of the tongue much away from the region of the teeth.
  6. See Chowdhury 1963
  7. "Learning International Alphabet of Sanskrit Transliteration". Sanskrit 3 - Learning transliteration. Gabriel Pradiipaka & Andrés Muni. Archived from the original on 12 February 2007. Retrieved 2006-11-20.
  8. "ITRANS — Indian Language Transliteration Package". Avinash Chopde. Retrieved 2006-11-20.
  9. "Annex-F: Roman Script Transliteration" (PDF). Indian Standard: Indian Script Code for Information Interchange — ISCII. Bureau of Indian Standards. 1 April 1999. p. 32. Retrieved 2006-11-20.


This article is issued from Wikipedia - version of the 11/17/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.