Budapest University of Technology and Economics, Department of Telecommunications and Media Informatics



Hungarian Speech Databases


BABEL - a Multi-Language Database

Project No. 1304


  Project coordinator: Klara Vicsi
  Project Assistant: Attila Vig




1. Recording Protocol

1.1 Recording Environment
1.2 Recording Equipment
1.3 System Calibration
1.4 Recording Features

2. Text material

3. Orthographic Prompting Text

3.1 Passages
3.2 Numbers
3.3 CVC Words
3.4 Context Words

4. SAMPA Phonotypical Transcription

4.1 Passages
4.2 Numbers
4.3 CVC Words
4.4 Context Words

5. Translation of the First 30 Passages

6. Distribution of Talker Set

6.1 Distribution of Few Talkers
6.2 Distribution of Many Talkers

7. Sound material

7.1 Speaker Selection
7.2 Distribution of Talkers on Prompting Texts
7.3 Recording Mode and Prompting Style
7.4 Recording Control

8. Recording Procedure

9. Segmentation and labelling

10. Collation of Recordings

11. Structure of the CD-ROMs

12. Attachment

13. Prosodic and syntactic annotation, phoneme level segmentation

A part of the BABEL database (approx. 75% of the phonetically rich paragraphs) is available for free research use under the following licence: Licence

Please note that this add-on for the database does not contain the speech samples, which must be obtained from the BABEL Hungarian database distributed by ELRA.