B-ToBI: words tier

In B-ToBI, the text of Bengali utterances is represented on two tiers. The words tier represents the text according to the phonological form of its pronunciation and can be considered a phonemic transcription. The English tier can be considered a semi-morphological gloss.

"Phonemic" means that person-, environment-, or situation-specific variation will not be reflected in the transcription, i.e. free variation and environmentally-conditioned allophony should be suppressed. So, whether a person pronounces the word পড়তে 'to read' as [poɽt̪e], [poɹt̪e], or [pot̪ːe], it should always be transcribed poRte.

Furthermore, "transcription" rather than "transliteration" means that the silent letters and other idiosyncrasies of Bengali spelling will not be represented here. Rather, the pronunciation alone should guide transcription. So, স্বপ্ন 'dream' should be transcribed šôpno or SOpno, as it is pronounced [ʃɔpno], not swapna or sbOpnO.

Transcription of consonants

The transcription of consonants is fairly straightforward, with only a few peculiarities:

  • Lower-case t, th, d, and dh are used for dental sounds (দন্ত্য বর্ণ) /t̪ t̪ʰ d̪ d̪ʱ/, whereas upper-case T, Th, D, and Dh are used for "cerebral" sounds (মূর্ধন্য বর্ণ). Cerebral sounds can be pronounced as alveolar /t tʰ d dʱ/ or retroflex /ʈ ʈʰ ɖ ɖʱ/ depending on the speaker, the register, and the phonological environment. Note that Bengali has no "cerebral" nasal or fricative in its phoneme inventory, even though the letters ণ and ষ continue to be called মূর্ধন্য ণ and মূর্ধন্য ষ.
  • Lower-case s is used for /s/ as in আস্তে /ast̪e/ aste 'softly', whereas upper-case S is used for /ʃ/ as in আসতে /aʃt̪e/ aSte 'to come'. Some speakers (especially those in West and North Bengal) do not consistently distinguish these sounds.
  • Lower-case r is used for /ɹ/ as in করা /kɔɹa/ kOra 'doing', whereas upper-case R is used for /ɽ/ as in কড়া /kɔɽa/ kORa 'intense'. Some speakers (especially those in East Bengal) do not consistently distinguish these sounds.
  • Lower-case n is used for ন, ণ, and ঞ /n/ as in আন্টি /anti~anʈi/ anTi 'auntie', whereas upper-case G is used for ঙ and ং /ŋ/, as in আংটি /aŋti~aŋʈi/ aGTi 'ring'. Note that the combination /ŋɡ/ as in বঙ্গ /bɔŋɡo/ bOGgo has both the G and the g.
  • Lower-case h is used to represent aspiration of a stop or affricate, while upper-case H is used for the consonant হ [h] on its own. This helps distinguish the single aspirated consonant ঝ /dʑʱ/ jh in মাঝে /madʑʱe/ majhe from the sequence of two consonants জহ /dʑh/ jH in রাজহাঁস /ɹadʑhãʃ/ rajHaMS 'swan'.
  • Not all speakers distinguish /pʰ/ ph from /f/ f, or /bʱ/ bh from /v/ v, but the contrast can arise especially in unscripted speech, where borrowings are frequent. It is up to the discretion of the transcriber whether to show these distinctions.

| Bengali letter(s) | Pronunciation(s) in IPA (Khan 2010) | B-ToBI representation(s) | Example |
|---|---|---|---|
| ক | k | k | কাশ kaS 'wild sugarcane' |
| খ | kʰ, x | kh | খাস khaS 'you eat' |
| গ | ɡ | g | গাস gaS 'you sing' |
| ঘ | ɡʱ | gh | ঘাস ghaS 'grass' |
| ঙ, ং | ŋ | G (ṅ) | বাংলা baGla 'Bengali' |
| চ | tɕ~cɕ~tʃ~ts~c; s | c; s | চাল cal 'rice grains' |
| ছ | tɕʰ~cɕʰ~tʃʰ~tsʰ~cʰ; s | ch; s | ছাল chal 'skin' |
| জ, য | dʑ~ɟʑ~dʒ~dz~ɟ; z | j; z | জাল jal 'net' |
| ঝ | dʑʱ~ɟʑʱ~dʒʱ~dzʱ~ɟʱ; z | jh; z | ঝাল jhal 'spicy' |
| ট | t~ʈ | T (ṭ) | টাক Tak 'bald spot' |
| ঠ | tʰ~ʈʰ | Th (ṭh) | ঠাকুর Thakur 'lord' |
| ড | d~ɖ | D (ḍ) | ডাক Dak 'call' |
| ঢ | dʱ~ɖʱ | Dh (ḍh) | ঢাক Dhak 'cover!' |
| ত | t̪ | t | তাক tak 'shelf' |
| থ | t̪ʰ | th | থাক thak 'let it be' |
| দ | d̪ | d | দাগ dag 'stain' |
| ধ | d̪ʱ | dh | ধাক্কা dhakka 'push' |
| ন, ণ, ঞ | n | n | নাক nak 'nose' |
| প | p~ɸ | p | পাটি paTi 'grass mat' |
| ফ | pʰ; f | ph; f | ফাটি phaTi, faTi 'I burst' |
| ব | b | b | বাটি baTi 'bowl' |
| ভ | bʱ; v | bh; v | ভাটি bhaTi 'kiln' |
| ম | m | m | মাটি maTi 'ground' |
| র | ɹ~ɾ | r | হ্রাস raS 'reduction' |
| ল | l | l | লাশ laS 'dead body' |
| শ, স, ষ | ʃ | S (š) | শ্বাস SaS 'breath' |
| শ, স, ষ, চ, ছ | s | s | সাফ saf 'clean' |
| হ, ঃ | h~V̤ | H | হাঁস HaMS 'duck' |
| ড়, ঢ় | ɽ~ɹ~ɾ | R (ṛ) | বড় bORo 'big' |
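
The consonant conventions above imply that a words-tier string can be split into symbols by treating a consonant followed by lower-case h as one aspirated unit, while upper-case H always stands alone. The following is a minimal sketch of that idea, not part of B-ToBI itself. The names `CONSONANTS`, `VOWELS_AND_MARKS`, and `tokenize` are hypothetical, and each symbol is mapped to a single broad IPA value as a simplifying assumption.

```python
# Consonant symbols from the conventions above, each mapped to ONE broad IPA
# value. Picking a single value per symbol is an assumption for illustration;
# the guide lists several possible pronunciations for many of them.
CONSONANTS = {
    "k": "k", "kh": "kʰ", "g": "ɡ", "gh": "ɡʱ", "G": "ŋ",
    "c": "tɕ", "ch": "tɕʰ", "j": "dʑ", "jh": "dʑʱ",
    "T": "t", "Th": "tʰ", "D": "d", "Dh": "dʱ",
    "t": "t̪", "th": "t̪ʰ", "d": "d̪", "dh": "d̪ʱ",
    "n": "n", "p": "p", "ph": "pʰ", "f": "f",
    "b": "b", "bh": "bʱ", "v": "v", "m": "m",
    "r": "ɹ", "l": "l", "S": "ʃ", "s": "s", "z": "z",
    "H": "h", "R": "ɽ",
}

# Vowel letters, off-glides, and the nasalization mark M (ASCII variants only).
VOWELS_AND_MARKS = set("aiueoEOywYWM")


def tokenize(word):
    """Split an ASCII B-ToBI word into symbols, preferring two-letter units."""
    tokens = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in CONSONANTS:
            # Aspirated unit such as "jh" or "Th" (lower-case h = aspiration).
            tokens.append(pair)
            i += 2
        elif word[i] in CONSONANTS or word[i] in VOWELS_AND_MARKS:
            tokens.append(word[i])
            i += 1
        else:
            raise ValueError(f"unexpected symbol {word[i]!r} in {word!r}")
    return tokens


if __name__ == "__main__":
    for w in ["majhe", "rajHaMS", "aGTi", "bOGgo"]:
        print(w, tokenize(w))
```

Because only lower-case h forms a two-letter unit, majhe yields a single jh symbol while rajHaMS keeps j and H apart, which is the distinction the upper-case H convention is meant to preserve.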

Transcription of vowels, diphthongs, and nasalization

The transcription of vowels, diphthongs, and nasalization can differ depending on the availability of special diacritics.

  • Lower mid vowels অ্যা /ɛ~æ/ E and অ /ɔ/ O use upper-case letters; if desired and available, ê can be used in place of E, and ô in place of O.
  • Nasalization ঁ /Ṽ/ is represented by upper-case M following the nasalized vowel if ṃ is unavailable.
  • Note that nasalization can be transcribed even when it is not perceived in the recording. This is up to the discretion of the transcriber, if they want to convey that the vowel is underlyingly nasalized in the speaker's phonology.
  • Letters y, w, Y, and W represent off-glides of i, u, e, and o respectively within diphthongs.

| Bengali letter(s) | Pronunciation(s) in IPA (following Khan 2010) | B-ToBI representation(s) | Example |
|---|---|---|---|
| অ | ɔ | O (ô) | দশ dOS 'ten' |
| আ | a | a | দাস daS 'slave' |
| ই, ঈ | i | i | দিস diS 'give-2i' |
| উ, ঊ | u | u | দু'শ duSo 'two hundred' |
| এ | e | e | দেশ deS 'country' |
| অ্যা, এ | ɛ~æ | E (ê) | দেখ্‌ dEkh 'look!' |
| ও, অ | o | o | দোষ doS 'blame', 'guilt' |
| অয় | ɔe̯ | OY (ôŷ) | হয় HOY 'become-3' |
| অও | ɔo̯ | OW (ôŵ) | হও HOW 'become-2' |
| আই | aj | ay | পাই pay 'get-1' |
| আয় | ae̯ | aY (aŷ) | পায় paY 'get-3' |
| আউ | aw | aw | পাউরুটি pawruTi '(Western) bread' |
| আও | ao̯ | aW (aŵ) | পাও paW 'get-2' |
| ইউ | iw | iw | পিউ piw 'Piu' (name) |
| উই | uj | uy | পুঁইশাক puyMSak 'Malabar spinach' |
| এই | ej | ey | নেই ney 'take-1' |
| এউ | ew | ew | ঢেউ Dhew 'wave' |
| ওই, ঐ, অই | oj | oy | বই boy 'book' |
| ওয় | oe̯ | oY (oŷ) | শোয় SoY 'lie.down-3' |
| ওউ, ঔ, অউ | ow | ow | বউ bow 'bride' |
| অ্যায়, এয় | ɛe̯~æe̯ | EY (êŷ) | দেয় dEY 'give-3' |
| অ্যাও, এও | ɛo̯~æo̯ | EW (êŵ) | নেও nEW 'take-2' |
| ঁ | Ṽ | M (ṃ) | হাঁস HaMS 'duck' |
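
Since the upper-case ASCII symbols and their diacritic alternatives stand in a one-to-one relationship, converting between the two variants can be done with a simple character mapping. The sketch below illustrates this under that assumption. The names `ASCII_TO_DIACRITIC`, `to_diacritic`, and `to_ascii` are hypothetical and not part of any B-ToBI tool. Upper-case H has no diacritic alternative listed above, so it is left unchanged.

```python
import unicodedata

# Upper-case ASCII symbols and the diacritic alternatives listed in this guide.
ASCII_TO_DIACRITIC = {
    "O": "ô", "E": "ê",      # lower mid vowels
    "Y": "ŷ", "W": "ŵ",      # off-glides of e and o
    "M": "ṃ",                # nasalization
    "G": "ṅ", "T": "ṭ", "D": "ḍ", "S": "š", "R": "ṛ",  # consonants
}


def _nfc(text):
    """Normalize to composed form so each diacritic letter is one code point."""
    return unicodedata.normalize("NFC", text)


# Translation tables for both directions, built from the same mapping.
_TO_DIACRITIC = str.maketrans({a: _nfc(d) for a, d in ASCII_TO_DIACRITIC.items()})
_TO_ASCII = str.maketrans({_nfc(d): a for a, d in ASCII_TO_DIACRITIC.items()})


def to_diacritic(word):
    """Convert an ASCII B-ToBI transcription to its diacritic variant."""
    return word.translate(_TO_DIACRITIC)


def to_ascii(word):
    """Convert a diacritic transcription back to the ASCII variant."""
    return _nfc(word).translate(_TO_ASCII)


if __name__ == "__main__":
    for w in ["SOpno", "HaMS", "Thakur", "bORo"]:
        print(w, "->", to_diacritic(w))
```

Normalizing to NFC before the reverse mapping is a defensive choice: text typed with combining marks would otherwise fail to match the single-code-point keys.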
