Budapest University of Technology and Economics, Department of Telecommunications and Media Informatics



Hungarian Speech Databases


MTBA - Hungarian Telephone Speech Database

  Project coordinator: Klara Vicsi

  Project members:
  K. Vicsi, Z. Valyon, Cs. Teleki, G. Gordos  
  Budapest University of Technology and Economics

  L. Toth, A. Kocsor, J. Csirik  
  University of Szeged


This project created a Hungarian speech database of fixed-line and mobile telephone recordings.

The goal of the project was to collect a telephone speech database in which major dialectal variants are represented. The database provides a realistic basis both for training and testing present-day teleservices and, because of its phonetic richness, for training truly speaker-independent recognisers.

The recordings follow the SpeechDatE definitions for dialect, age and sex balance and for vocabulary. An important difference from the SpeechDatE database is that the phonetically rich sentences and words have been segmented and labelled at the phoneme level, so the database can be used to train phoneme-based recognisers. In planning the corpus we took into account not only dialectal variety but also the special characteristics of the Hungarian language. Since Hungarian is an agglutinative language, some categories needed a larger vocabulary than the mandatory minimum. We paid extra attention to the category 'phonetically rich sentences and words' in order to create a phonetically well-balanced speech database for text-independent speech recognisers. A detailed statistical analysis was prepared of the distribution of phonemes, diphones, triphones and syllables.
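The kind of phoneme, diphone and triphone statistics mentioned above can be illustrated with a minimal sketch. It assumes each utterance is already available as a list of phoneme symbols; the function names and the toy transcriptions are hypothetical and are not taken from the MTBA tools or data.

```python
from collections import Counter


def ngram_counts(phonemes, n):
    """Count contiguous sequences of n phonemes within one utterance."""
    return Counter(
        tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)
    )


def corpus_stats(utterances, max_n=3):
    """Accumulate counts for n = 1 (phonemes) up to max_n (triphones by default)."""
    stats = {n: Counter() for n in range(1, max_n + 1)}
    for phonemes in utterances:
        for n in stats:
            stats[n].update(ngram_counts(phonemes, n))
    return stats


# Hypothetical toy transcriptions, for illustration only.
utterances = [
    ["s", "i", "a"],
    ["s", "i", "v"],
]
stats = corpus_stats(utterances)

# List the most frequent units of each order.
for n, counter in stats.items():
    print(n, counter.most_common(3))
```

N-grams are counted within an utterance only, so no diphone or triphone spans two separate recordings. Syllable statistics would additionally need a syllabification step, which is not sketched here.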

The voices of 500 speakers were recorded from all over the country, which ensured a balanced distribution of dialects.

The speakers had to read a given text into the telephone.

After recording, we carried out the annotation and segmentation process: we listened to every recording and created label files containing information about the speaker and the speech according to the database definitions.

An automatic labelling system was developed to assist the manual segmentation.


The database has two parts. The first part (2 CDs) contains labelled application words, numbers, dates, spelling and names; the second part (1 CD) contains labelled and segmented phonetically rich sentences and words.
Both parts are available to everybody.

For examples, click here.

The MTBA database is available according to the following licence: Licence

For further information, please contact us!