Next: Hindi Speech Database
Up: Hindi Synthesis
Previous: Letter to Sound Rules
In Hindi, a word may consist of basic characters (for example, samay
[time]) or of complex C*VC* clusters (for example, san'sthaa
[organization]). For the latter, rules are needed to break the word
into syllables. We derived a few simple syllabification rules, i.e.
rules for grouping C*VC* clusters, from a heuristic analysis of
several words in Telugu and Hindi.
- When a nasal such as /n'/ (a half-pronounced /m/ or /n/ sound; see
Figure 1, where Hindi characters are represented in ITRANS-3, a
transliteration scheme) immediately follows a vowel, it is treated as
part of that vowel and therefore of the same syllable. For example,
/n'/ in san'sthaa is part of the syllable containing /sa/.
- When there are three or more consonants between two consecutive
vowels, the first consonant becomes part of the coda of the previous
syllable, while the remaining consonants form the onset of the next
syllable. Applying these rules to san'skrit [sanskrit], /n'/ attaches
to the preceding vowel, leaving the three consonants /s/ /k/ /r/
between the vowels, so the resulting syllable sequence is
/san's/ /krit/.
- When there are exactly two consonants between two vowels, the
first consonant becomes part of the coda of the previous syllable and
the second becomes the onset of the next syllable. For example, dharti
[earth] is split as /dhar/ /ti/. The exception to this rule is the
following case.
  - When the second consonant is a member of the set { /r/
  /s/ /sh/ /shh/ }, both consonants become part of the onset
  of the next syllable. For example, yaatra [tour] is split
  as /yaa/ /tra/.
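
The rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used for the voice: the phone inventory is a simplified ITRANS-like subset chosen for the example, the names tokenize and syllabify are hypothetical, and the handling of zero or one consonant between vowels (the consonant goes to the onset of the next syllable) is an assumption, since the rules above do not cover those cases.

```python
# Simplified ITRANS-like inventory (an assumption for this sketch, not the full scheme).
VOWELS = {"a", "aa", "i", "ii", "u", "uu", "e", "ai", "o", "au"}
NASAL = "n'"
CONSONANTS = {
    "k", "kh", "g", "gh", "ch", "chh", "j", "jh",
    "T", "Th", "D", "Dh", "N", "t", "th", "d", "dh", "n",
    "p", "ph", "b", "bh", "m", "y", "r", "l", "v",
    "s", "sh", "shh", "h",
}
# Second consonants that pull a two-consonant cluster into the next onset.
ONSET_JOINERS = {"r", "s", "sh", "shh"}
# Longest symbols first, so that "shh" is tried before "sh" and "s".
SYMBOLS = sorted(VOWELS | CONSONANTS | {NASAL}, key=len, reverse=True)


def tokenize(word):
    """Split a transliterated word into phone symbols by greedy longest match."""
    tokens = []
    i = 0
    while i < len(word):
        for sym in SYMBOLS:
            if word.startswith(sym, i):
                tokens.append(sym)
                i += len(sym)
                break
        else:
            raise ValueError(f"unknown symbol at {word[i:]!r}")
    return tokens


def syllabify(word):
    """Return the syllables of a word according to the heuristic rules."""
    units = []  # list of (text, is_nucleus)
    for tok in tokenize(word):
        if tok == NASAL and units and units[-1][1]:
            # Nasal rule: a nasal right after a vowel joins that vowel.
            units[-1] = (units[-1][0] + tok, True)
        else:
            units.append((tok, tok in VOWELS))

    nuclei = [i for i, (_, is_nuc) in enumerate(units) if is_nuc]
    if not nuclei:
        return [word]

    starts = [0]  # index of the first unit of each syllable
    for prev, nxt in zip(nuclei, nuclei[1:]):
        cluster = [text for text, _ in units[prev + 1:nxt]]
        if len(cluster) >= 3:
            coda_len = 1  # first consonant is coda, the rest are onset
        elif len(cluster) == 2:
            # Exception: both consonants go to the onset if the second joins.
            coda_len = 0 if cluster[1] in ONSET_JOINERS else 1
        else:
            coda_len = 0  # assumption: zero or one consonant goes to the onset
        starts.append(prev + 1 + coda_len)

    bounds = starts + [len(units)]
    return ["".join(text for text, _ in units[a:b])
            for a, b in zip(bounds, bounds[1:])]


for w in ["san'skrit", "dharti", "yaatra"]:
    print(w, syllabify(w))
```

Merging the nasal into the preceding vowel before counting consonants keeps the cluster rules simple: the cluster between two nuclei then contains only true consonants, which is how san'skrit comes to have three consonants between its vowels.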
Alan W Black
2003-10-20