
Budapest University of Technology and Economics, Department of Telecommunications and Media Informatics
 


 

 
 

Hungarian Speech Databases


 

Tesztel – Hungarian Telephone Speech Test Database

  Project coordinator: Klara Vicsi

 
  Staff:
 
  Cs. Teleki, Gy. Szaszák, Z. Valyon  
  Budapest University of Technology and Economics

 
 

The aim of this project was to create Tesztel, a Hungarian speech database of mobile phone recordings made in noisy environments and intended for testing purposes.

The database contains the voices of 100 speakers, recorded over mobile telephones in noisy environments.

The main goal of creating this database was to test phoneme-based recognizers that have already been trained, so the corpus had to be compact and had to cover the specific characteristics of the Hungarian language as well as possible.
The text that each speaker had to say was designed to contain at least one instance of every Hungarian phoneme, taking into consideration the statistics of phonemes, diphones, triphones and syllables in Hungarian.
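As an illustration of the coverage requirement, the following minimal sketch checks whether a set of phonemically transcribed prompts contains every phoneme of a given inventory and counts the diphones that occur. The function names, the toy inventory and the toy transcriptions are hypothetical and are not taken from the Tesztel design; the actual procedure used to design the text is not described here.

```python
from collections import Counter

def missing_phonemes(transcriptions, inventory):
    """Return the phonemes of `inventory` that never occur in the prompts.

    `transcriptions` is a list of prompts, each a list of phoneme symbols.
    """
    covered = set()
    for phones in transcriptions:
        covered.update(phones)
    return set(inventory) - covered

def diphone_counts(transcriptions):
    """Count adjacent phoneme pairs (diphones) within each prompt."""
    counts = Counter()
    for phones in transcriptions:
        counts.update(zip(phones, phones[1:]))
    return counts

# Toy data, purely illustrative (not the real Hungarian inventory).
toy_inventory = ["a", "b", "k", "o", "t"]
toy_prompts = [["b", "a", "k"], ["t", "a", "k"]]

print(missing_phonemes(toy_prompts, toy_inventory))   # phonemes still to cover
print(diphone_counts(toy_prompts).most_common(1))     # most frequent diphone
```

A prompt set satisfies the "at least one of every phoneme" condition when `missing_phonemes` returns an empty set.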

The corpus contains not only continuously spoken sentences but also command words, spelled forenames, numbers, dates, different currency types, city names, questions with yes/no answers, and phonetically rich words. The database contains mostly spontaneous speech.

Since the whole database contains speech recorded in noisy environments, we wanted to determine an average signal-to-noise ratio for the recorded speech. However, this parameter depends on several factors, such as the type and intensity of the background noise and the parameters of the channel. This is probably why the measured signal-to-noise ratio varies over a very wide range, between 5 dB and 25 dB. The lowest values (approximately 5 dB) were measured in recordings made near busy highways or on public transport (mainly on old trams) during rush hour. The highest values (approximately 25 dB) were measured in recordings made on side streets, on public transport or indoors (mainly late at night).
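For readers unfamiliar with the measure, the sketch below shows one common way to estimate a signal-to-noise ratio in decibels from a speech segment and a noise-only segment of the same recording. It assumes that the noise is uncorrelated with the speech, so that the speech power can be approximated as the power of the noisy segment minus the noise power. The function names and sample values are hypothetical, and the page does not state which method was actually used for the Tesztel measurements.

```python
import math

def mean_power(samples):
    """Mean power of a segment: the average of the squared sample values."""
    return sum(x * x for x in samples) / len(samples)

def snr_db(noisy_speech, noise_only):
    """Estimate the SNR in dB from a speech+noise segment and a noise-only segment.

    Assumption: speech and noise are uncorrelated, so
    P_speech is approximately P_noisy - P_noise.
    """
    p_noise = mean_power(noise_only)
    p_noisy = mean_power(noisy_speech)
    # Guard against a non-positive estimate when the segment is mostly noise.
    p_speech = max(p_noisy - p_noise, 1e-12)
    return 10.0 * math.log10(p_speech / p_noise)

# Toy samples, purely illustrative.
noise = [1, -1, 1, -1]     # mean power 1
noisy = [5, -3, 3, -1]     # mean power 11, so estimated speech power is 10
print(round(snr_db(noisy, noise), 2))
```

On this scale, 5 dB corresponds to a speech power of roughly 3 times the noise power, and 25 dB to roughly 316 times.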


For further information, contact us!