InTraSAL: basic tones

The primary contributions to InTraSAL are the annotation system and descriptive model of the basic tones used in transcribing Bengali intonation (B-ToBI). As in B-ToBI, these basic tones of InTraSAL interact with f-marked tones used to convey focus. Also like B-ToBI, InTraSAL provides an annotation scheme for the text of utterances as well.

InTraSAL proposes a system of tones, some of which are aligned to "stressed" or "accented" syllables and some of which are aligned with the right edges of phrases. Each of these types of tones is described below, along with tune-meaning mappings and phonological characteristics. The current description and examples are drawn from Bengali, but can be extended with minor adjustments to other South Asian languages, e.g. Tamil, Hindi-Urdu, Assamese, and Malayalam.

Stress or accent

"Stress", "accent", "prominence", and "focus" are messy terms, and many people use them interchangeably. The terms are particularly controversial when it comes to the phonology of South Asian languages, e.g. Bengali. Some linguists claim Bengali has fixed initial stress/accent, others say it is predictable but weight-sensitive, while yet others claim there is no stress/accent at all.

For the purposes of intonation, words in Bengali (and some other South Asian languages) can be seen as accented on the first syllable. This "accent" is simply a special phonological status that the first syllable has, manifested in the fact that only the initial syllable can host the oral-nasal vowel contrast, the lax-tense mid vowel contrast, and (optionally) phonetic characteristics that resemble stress in other languages. (This last part - the phonetic characteristics of stress, e.g. high intensity, longer duration - need not be a feature of accented syllables in South Asian languages, and should not be used as the primary cue to accent.)

A pitch accent is simply a tone that highlights the accented syllable of a word, which in Bengali can only be the initial syllable. Placing a pitch accent on the initial syllable provides more "prominence" to that word.

In Bengali, this accented syllable can bear a low pitch accent (L*), high pitch accent (H*), rising pitch accent (L*H), or no pitch accent at all. Not all underlyingly-accented syllables will bear a pitch accent.

Adding in "focus" complicates things a bit further.

Prosodic units or phrases

Bengali has three types of phrases relating to intonation. These phrases, or "prosodic units", are Accentual Phrases (AP), Intermediate Phrases (ip), and Intonational Phrases (IP). Each utterance in Bengali is made of one or more IPs, each of which is made of one or more ips, each of which is made of one or more APs. Each of these is described below, from the smallest phrase (AP) to the largest (IP).
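The nesting of prosodic units (utterance, then IP, then ip, then AP) can be sketched as a small data structure. This is only an illustration and not part of InTraSAL or B-ToBI: the class names, field names, and the helper count_aps are hypothetical, and the example parse uses placeholder words rather than a real Bengali sentence.

```python
from dataclasses import dataclass
from typing import List


def _require_nonempty(items, label):
    # Each level must contain at least one unit of the level below it.
    if not items:
        raise ValueError(f"{label} must contain at least one element")


@dataclass
class AccentualPhrase:  # AP
    words: List[str]

    def __post_init__(self):
        _require_nonempty(self.words, "AP")


@dataclass
class IntermediatePhrase:  # ip
    aps: List[AccentualPhrase]

    def __post_init__(self):
        _require_nonempty(self.aps, "ip")


@dataclass
class IntonationalPhrase:  # IP
    intermediate_phrases: List[IntermediatePhrase]

    def __post_init__(self):
        _require_nonempty(self.intermediate_phrases, "IP")


@dataclass
class Utterance:
    intonational_phrases: List[IntonationalPhrase]

    def __post_init__(self):
        _require_nonempty(self.intonational_phrases, "utterance")


def count_aps(utterance):
    """Count the APs in an utterance by walking down the hierarchy."""
    return sum(
        len(ip.aps)
        for big in utterance.intonational_phrases
        for ip in big.intermediate_phrases
    )


# ASSUMPTION: a made-up parse with placeholder words (one IP, two ips).
example = Utterance([
    IntonationalPhrase([
        IntermediatePhrase([AccentualPhrase(["w1"]), AccentualPhrase(["w2", "w3"])]),
        IntermediatePhrase([AccentualPhrase(["w4"])]),
    ])
])
print(count_aps(example))  # 3
```

The non-empty checks mirror the "one or more" wording: an ip with no APs, for instance, raises a ValueError.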

Prenuclear accentual phrase (AP)

Almost every word is its own Accentual Phrase (AP), composed of a pitch accent (low L* or high H*) and an AP boundary tone (high Ha or low La) on the right edge. One pitch accent and one AP boundary tone make up a prenuclear AP, which comes in two types: rising (L*...Ha) and falling (H*...La).

Rising AP (L*...Ha)

The rising AP (L*...Ha) is by far the most commonly seen pattern for prenuclear APs in formal registers of Bengali. Example [Na19] below shows low pitch on the first syllable and high pitch at the end of each of the first two words, mônoara 'Monoara' (a name) and make 'mother-ACC'.


[Na19] mônoara maake nie elo. 'Monoara brought (her) mother.'

Falling AP (H*...La)

The falling AP (H*...La) is less common in formal registers, but still observed in examples like [By37] below. (Note that the extreme changes in the pitch contour on gelen are due to pitch tracking error during creaky voice.)


[By37] mirar nana mara gelen. 'Mira's grandfather passed away.'

The falling AP is more common in less formal registers, especially when conveying sarcasm or surprise, the second of which is found in the unscripted speech of example [ByS134].


[ByS134] êkṭa pukure mone hôŷ cheleṭa pore gêlo. '[It's] in a pond [that] it seems the boy fell.'


The two tones in an AP cannot be both H or both L, due to the Obligatory Contour Principle (OCP). This means that every prenuclear AP will show a rise or fall, even if the contour is subtle.
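The OCP restriction on prenuclear APs can be expressed as a simple check on the two tone labels. The sketch below is illustrative only: the function name prenuclear_ap_contour and the two lookup tables are hypothetical, and only the four labels named in this section (L*, H*, La, Ha) are covered.

```python
# Tone level (H or L) of each label used in prenuclear APs.
PITCH_ACCENT_LEVEL = {"L*": "L", "H*": "H"}
AP_BOUNDARY_LEVEL = {"La": "L", "Ha": "H"}


def prenuclear_ap_contour(pitch_accent, boundary_tone):
    """Return 'rising' or 'falling' for a prenuclear AP.

    Raises ValueError if both tones have the same level (OCP violation).
    Unknown labels raise KeyError.
    """
    start = PITCH_ACCENT_LEVEL[pitch_accent]
    end = AP_BOUNDARY_LEVEL[boundary_tone]
    if start == end:
        raise ValueError(f"OCP violation: {pitch_accent}...{boundary_tone}")
    return "rising" if start == "L" else "falling"


print(prenuclear_ap_contour("L*", "Ha"))  # rising
print(prenuclear_ap_contour("H*", "La"))  # falling
```

Of the four possible combinations, only L*...Ha and H*...La pass the check, which matches the two prenuclear AP types described above.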

Some APs are composed of multiple words. This is especially common in faster speech rates, when function words group together with nearby content words, or when two content words group together into a tight grammatical unit. For example, the speaker in [A18] puts the whole postpositional phrase into one rising AP, combining the noun complement dhakka '(a) shove' with its deverbalized postpositional head die 'with/given'.


[A18] ora dhakka die dôrja khullo. 'They opened the door with a shove.'

AP downtrend

Note that the H tone of each AP (H* or Ha) is lower than its preceding H tone. This is called AP downtrend.


[Fa50] rumu nepaler ranir malider namgulo mone rakhte pare ni. 'Rumu couldn't remember the names of the gardeners of the queen of Nepal.'

AP downtrend can only be violated in three instances: (1) the H tone of a longer word is higher than the H tone of a preceding shorter word, as in malider 'of the gardeners' vs. ranir 'of the queen' in [Fa50], (2) the H tone of a content word is higher than the H tone of a preceding function word as in mirar 'of Mira' vs. karon 'because' in [Fa37], or (3) the H tone is conveying focus.


[Fa37] ...karon mirar nana mara gelen. '...because Mira's grandfather passed away.'
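AP downtrend and its three exceptions can be sketched as a check over the H-tone heights of successive APs. Everything specific here is an assumption made for illustration: the names APPeak and unlicensed_rises are hypothetical, word length is approximated by syllable count, and the Hz values are invented rather than measured from the recordings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class APPeak:
    word: str
    h_peak_hz: float              # height of the AP's H tone (H* or Ha)
    n_syllables: int              # ASSUMPTION: proxy for "longer/shorter word"
    is_function_word: bool = False
    focused: bool = False


def unlicensed_rises(aps: List[APPeak]) -> List[int]:
    """Return indices of APs whose H tone is higher than the preceding
    AP's H tone without falling under one of the three exceptions."""
    flagged = []
    for i in range(1, len(aps)):
        prev, cur = aps[i - 1], aps[i]
        if cur.h_peak_hz <= prev.h_peak_hz:
            continue  # no rise relative to the previous H tone
        longer_word = cur.n_syllables > prev.n_syllables          # exception (1)
        content_after_function = (
            prev.is_function_word and not cur.is_function_word    # exception (2)
        )
        if longer_word or content_after_function or cur.focused:  # exception (3)
            continue
        flagged.append(i)
    return flagged


# Invented Hz values; the rise on the fourth word is licensed by exception (1).
fa50_like = [
    APPeak("rumu", 260.0, 2),
    APPeak("nepaler", 245.0, 3),
    APPeak("ranir", 225.0, 2),
    APPeak("malider", 235.0, 3),
]
print(unlicensed_rises(fa50_like))  # []

# Placeholder words: a rise with no licensing exception is flagged.
hypothetical = [APPeak("wordA", 220.0, 2), APPeak("wordB", 240.0, 2)]
print(unlicensed_rises(hypothetical))  # [1]
```

Equal peaks are not flagged in this sketch, which is a simplification; a real check would also need a tolerance for measurement noise.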

Nuclear accentual phrase (AP)

After a sequence of prenuclear APs comes the nuclear AP, which is composed of only a pitch accent, and no AP boundary tone. The three nuclear pitch accents are low (L*), high (H*), and rising (L*H). This is the final AP within the sequence (=intermediate phrase).

Low pitch accent (L*)

The low pitch accent (L*) can be considered the default. It is often not all that "low", but it is lower than a H* would be in this position.


[Tu01] mônoara romilake nie elo. 'Monoara brought Romila.'

High pitch accent (H*)

The high pitch accent (H*) is also fairly common. As the nuclear AP is typically the head of the verb phrase, the use of nuclear H* can often signal that the verb head is important or unexpected, but not exactly under "focus".

In example [Ba51], the speaker seems to convey that the conjunct verb ভুলে গেলেন bhule gêlen 'forgot' is unexpected or noteworthy. As seen in this example, a nuclear H* typically cooccurs with a preceding falling AP, in accordance with the OCP.


[Ba51] šey namgulo bhule gêlen. '(He/she/they) forgot those (aforementioned) names.'

Rising pitch accent (L*H)

The rising pitch accent (L*H) is a third option. Like H*, it conveys a level of emphasis, but not to the level of "focus". It is normally transcribed L*+H in B-ToBI (with the +), with no change in interpretation.

The speaker in [Re57] uses the rising pitch accent at the end of each of the two sentences, presumably highlighting that it might not have been obvious that the mirrors are মুনিমার munimar 'Munima's', and that the Aunty পছন্দ করেন না pôchondo kôren na 'does not like' the mirrors / this fact.


[Re57] ey aŷnagulo munimar. mami kintu pôchondo kôren na. 'These mirrors are Munima's. (Be aware that) Aunty doesn't like (it/that fact/them).'

In unscripted example [FaS158], the speaker uses the rising pitch accent (L*H) presumably to indicate that the appearance of a বিরাট হরিণ biraṭ horin 'huge deer' might be unexpected.


[FaS158] ...ber hoe ešlo êkṭa biraṭ horin. '...(there) came out a huge deer.'

Intermediate phrase (ip)

The intermediate phrase (ip, note the lack of capitalization here) is a group of one or more APs, often (but not always) part of a tight syntactic unit, e.g. a topic, subject, or postpositional phrase. It is marked on its right edge by lengthening of the final syllable, an optional pause, the interruption of AP downtrend, and one of two boundary tones: high (H-) or low (L-).

Of these, the high ip boundary tone (H-) is by far the more commonly used.


[Sh35] amar naraŷongônje jaŵa holo na. 'I didn't get to go to Narayanganj.'

Notice that the high ip boundary tone (H-) differs from a high AP boundary tone (Ha) in terms of height (H- is higher than Ha), contour (H- has a final elbow while Ha does not), and length (the final syllable is lengthened when it bears H-, but not when it bears Ha).

Concurrent boundary tone overriding

Recall that the last AP in the ip does not have an AP boundary tone. Instead, it has an ip boundary tone. The B-ToBI and InTraSAL models assume the expected AP tone is overridden by the presence of an ip boundary tone in the same position. This phenomenon is known as concurrent boundary tone overriding, and it is also seen in Hindi, Tamil, and Korean. It differs from the boundary tone "stacking" pattern seen in American English, German, and Japanese.

Intonation phrase (IP)

The intonation phrase (IP) is the largest unit that is marked by intonation. Whole sentences can be a single IP, but smaller chunks (even a single word) can serve as an IP as well. The IP is marked on its right edge by lengthening of the final syllable, an optional pause, and one of five boundary tones: low (L%), high (H%), low rising (LH%), high falling (HL%), or high dipping (HLH%).

Like the ip boundary tone, the IP boundary tone overrides other tones aiming for the same location. That means that in IP-final position, both the ip boundary tone and AP boundary tone are overridden, leaving only the IP boundary tone.
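Boundary tone overriding amounts to "the highest prosodic level wins" whenever several boundary tones target the same position. A minimal sketch follows, with a hypothetical function name (surface_boundary_tone) and a hypothetical ranking table; it models only the overriding pattern described here, not the "stacking" pattern.

```python
# Prosodic levels ordered from lowest to highest.
LEVEL_RANK = {"AP": 0, "ip": 1, "IP": 2}


def surface_boundary_tone(candidates):
    """Given the boundary tones competing for one position, keyed by level,
    return the (level, tone) pair that surfaces: the highest level wins."""
    if not candidates:
        raise ValueError("no boundary tone at this position")
    level = max(candidates, key=LEVEL_RANK.__getitem__)
    return level, candidates[level]


# ip-final position: the ip tone overrides the expected AP tone.
print(surface_boundary_tone({"AP": "Ha", "ip": "H-"}))              # ('ip', 'H-')
# IP-final position: the IP tone overrides both lower tones.
print(surface_boundary_tone({"AP": "Ha", "ip": "H-", "IP": "L%"}))  # ('IP', 'L%')
```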

Low IP boundary tone (L%)

The low IP boundary tone (L%) can be seen as a default, as it is the most frequent IP boundary tone, and it is seen in the widest range of constructions. These include declaratives (as in [Fa24]), direct imperatives, exclamations, and "plain" wh-questions.


[Fa24] mônoara lina mamike nie elo. 'Monoara brought Aunt Lina.'

High IP boundary tone (H%)

The high IP boundary tone (H%) is also common, but only within a narrower range of constructions. These include requests (which can be interpreted as indirect imperatives), confirmation questions (typically with sentence particle na in second or final position as in [Fa06], or naki in second position), echo wh-questions, and the first of two conjoined clauses.


[Fa06] mônoara romilake nie elo na? 'Didn't Monoara bring Romila?'

High falling IP boundary tone (HL%)

The high falling IP boundary tone (HL%) is primarily seen in two constructions: yes/no questions and topicalized phrases (as in [FaS90]). Despite its name, note that this contour involves rising pitch until the final syllable, and the final sharp fall is concentrated within that final syllable.


[FaS90] ey dike or kukurṭa... '(And) over here his dog...'

Low rising IP boundary tone (LH%)

The low rising IP boundary tone (LH%) is similar to the L-H% tone of Mainstream American English and German ToBI models in shape and use. It involves a steady drop in the pitch until the final syllable, where a sharp upturn is seen. The tone conveys "polite" or "softened" wh-questions (as in [SB47] below), as well as continuation (i.e. the speaker has more to say).


[SB47] rumu nepaler ranir malider ki jiniš mone rakhte pare ni? 'What thing could Rumu not remember of the gardeners of the queen of Nepal?'

High dipping IP boundary tone (HLH%)

Lastly, the high dipping IP boundary tone (HLH%) involves a steady rise in pitch until the second-to-last syllable, in which there is a drop in pitch, followed by a rise in the final syllable. In some instances, the fall and final rise are both realized in the final syllable. The meaning conveyed is largely equivalent to the "continuation" use of LH%.


[Fa35] jehetu mirar nana mara gelen... 'Because Mira's grandfather passed away ...'
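The tune-meaning mappings for the five IP boundary tones can be collected into a lookup table. This is a rough summary for illustration, not an official InTraSAL resource: the table name, the function tones_for, and the short construction labels are hypothetical shorthand for the constructions listed in the sections above, and HLH% is entered under "continuation" because its meaning is described as largely equivalent to that use of LH%.

```python
# Constructions associated with each IP boundary tone (shorthand labels).
IP_BOUNDARY_TONE_USES = {
    "L%": ["declarative", "direct imperative", "exclamation", "plain wh-question"],
    "H%": ["request", "confirmation question", "echo wh-question",
           "first of two conjoined clauses"],
    "HL%": ["yes/no question", "topicalized phrase"],
    "LH%": ["polite wh-question", "continuation"],
    "HLH%": ["continuation"],
}


def tones_for(construction):
    """Return the IP boundary tones listed for a given construction label."""
    return [
        tone
        for tone, uses in IP_BOUNDARY_TONE_USES.items()
        if construction in uses
    ]


print(tones_for("continuation"))     # ['LH%', 'HLH%']
print(tones_for("yes/no question"))  # ['HL%']
```

A table like this captures only typical associations; it does not predict which tone a speaker will choose in a given context.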