Arabic Automatic Speech Recognition (AASR)
Automatic recognition of Arabic speech
using Sphinx tools
We use Sphinx tools, developed at Carnegie Mellon University
B- Creating the AASR project environment
C- Arabic Phoneme set (AASR.phone)
D- Arabic phonetic dictionary (AASR.dic) (23,841 entries)
E- Simple Filler Dictionary (AASR.filler)
F- Sample speech corpus (10 news stories, 140 files)
G- Feature vectors for the 4.5-hour corpus (95 MB), with corresponding text
I- Trained model parameters (16 MB; ready for use, place in the project directory)
J- Recognition results, and Error analysis
K- Generating a phonetic dictionary from a text file (with full tashkeel)
M- Arabic news corpus (4.5 Hours) (Send email to: elshafei at kfupm.edu.sa)
--------------------------------------------------------------------------------------------------------
A- How to install Sphinx tools
1- Download and install a Perl interpreter such as ActivePerl
http://www.activestate.com/Products/activeperl/index.mhtml
2- Install Cygwin on your PC. Cygwin provides a Linux-like environment under MS Windows operating systems. Copy the directory Cygwin_raw from the CD to your PC and install it using the setup.exe file in the cygwin directory. Cygwin can also be downloaded (580 MB) from http://www.cygwin.com/
3- Now you can open Cygwin by clicking the Cygwin Bash Shell icon on the desktop. You will see a window similar to the MS Windows command window.
4- Cygwin creates a directory structure that contains a directory carrying your username, as follows: c:\cygwin\home\username\. Copy the compressed files of SphinxTrain, Sphinx3, Sphinx4, and cmuclmtk (the CMU language modeling toolkit) into this directory.
You can download all Sphinx tools in a zipped tar format from
http://cmusphinx.sourceforge.net/html/cmusphinx.php
5- Open the Cygwin command window and go to your Cygwin home directory. Unpack the compressed files one by one using the tar command from the Cygwin window:
>> tar -xvzf filename
This will create a subdirectory in your Cygwin home directory for each unpacked tar file. The directory names may carry suffixes indicating the release and build number, e.g. “sphinx3-0.6”. If this is the case, rename the directories to remove the release and build number, e.g. “sphinx3”.
6- To install Sphinx3 execute the following commands
>> cd sphinx3
>> ./configure
>> make
>> make install
7- To install the SphinxTrain tools, execute the following commands from your home directory
>> cd SphinxTrain
>> ./configure
>> make
8- To install Sphinx4 you first need to install the Java run-time environment from http://www.java.com/en/download/index.jsp . You also need Apache Ant from http://ant.apache.org/.
9- To install Sphinx4, go to the sphinx4 subdirectory and execute the command ant instead of the “make” command used in the previous steps.
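The renaming described in step 5 can be tedious when several packages are unpacked at once. The following is a minimal sketch of a helper that strips a trailing "-<release>" suffix from directory names; the function name strip_release_suffix is hypothetical and not part of any Sphinx package, and the sketch assumes the suffix always starts with a hyphen followed by a digit, as in “sphinx3-0.6”.

```shell
# Hypothetical helper (not part of the Sphinx tools): rename directories such as
# "sphinx3-0.6" to "sphinx3" inside the directory given as the first argument.
# Assumption: the release suffix starts with "-" followed by a digit.
strip_release_suffix() {
    local parent="$1" dir base plain
    for dir in "$parent"/*-[0-9]*/; do
        [ -d "$dir" ] || continue        # the glob matched nothing
        base=$(basename "$dir")
        plain="${base%%-[0-9]*}"         # drop everything from the first "-<digit>"
        if [ -e "$parent/$plain" ]; then
            # Never overwrite an existing directory
            echo "skip $base: $plain already exists" >&2
        else
            mv "$parent/$base" "$parent/$plain"
        fi
    done
}
```

From the Cygwin home directory it could be called as: strip_release_suffix . The helper leaves directories without a release suffix untouched.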
B- Creating the AASR project environment
The Sphinx training tools come with Perl scripts that automate the lengthy training procedure. However, the user must prepare the project according to a specific format and directory structure, and the Perl package requires the developer to create a directory for the training process. The Perl script setup_SphinxTrain.pl can be used to create this structure. Here is an explanation of the directory structure created by this script:
Table 2: Directory structure created by setup_SphinxTrain.pl.
Directory | Description
bin/ | Contains the executables that will be used in the training process.
bwaccumdir/ | A buffer directory used by the Baum-Welch algorithm to store intermediate results.
etc/ | Contains the configuration file (.cfg). Other files supplied by the user will be added here (explained later).
feat/ | The output of feature extraction should be added here.
gifs/ | Some images used in the HTML log file.
logdir/ | Contains the log for each operation done during the training process.
model_architecture/ | Contains the model definition files. The linguistic questions file is also included here.
model_parameters/ | Contains binary files representing the HMM models.
scripts_pl/ | Contains Perl scripts used in the training process.
wav/ | Contains the audio files used for training.
For instance, to create the directory structure for the AASR project, the following commands should be executed starting from the user home directory, assuming the Sphinx training package is located in the directory SphinxTrain:
>> mkdir AASR
>> cd AASR
>> ../SphinxTrain/scripts_pl/setup_SphinxTrain.pl -task AASR
These commands will create a directory structure under the new project directory “AASR”. One of the important files created by this script is the configuration file, called AASR.cfg, located under the “etc/” directory. This file contains many configuration parameters, some of which have a major effect on the training process and the resulting models. These parameters are mostly numerical; for example, the structure of the acoustic models, the number of iterations of the Baum-Welch algorithm, the target number of senones, and the type and length of the feature vectors. We will refer to this directory structure during the subsequent discussions on data preparation.
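After running the setup script, it can be useful to confirm that the expected directories are in place before preparing data. The following is a minimal sketch; the function name check_project_layout is hypothetical, and the directory names are written in lowercase to match the scripts_pl path used in the setup command, which is an assumption about the installed release.

```shell
# Hypothetical helper: report which of the expected project subdirectories
# are missing under the project directory given as the first argument.
# Assumption: directory names are lowercase, as in ../SphinxTrain/scripts_pl/.
check_project_layout() {
    local project="$1" missing=0 sub
    for sub in bin bwaccumdir etc feat gifs logdir \
               model_architecture model_parameters scripts_pl wav; do
        if [ ! -d "$project/$sub" ]; then
            echo "missing: $sub/"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]    # succeed only if nothing is missing
}
```

From the user home directory it could be called as: check_project_layout AASR . The function prints one line per missing directory and returns a non-zero status if any are absent.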
WORD: %Correct=90.13 , %Accuracy=88.29 (WER =11.71) [H=8371, D=85, S=832, I=171, N=9288]
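In this summary line, $H$ is the number of correctly recognized labels, and $S$, $D$ and $I$ are the numbers of substitution errors, deletion errors and insertion errors, respectively. The percentage correct is then
$$\text{Percent Correct} = \frac{N - D - S}{N} \times 100\% = \frac{H}{N} \times 100\%$$
where $N$ is the total number of labels in the reference transcriptions. Notice that this measure ignores insertion errors. For many purposes, the percentage accuracy, which also counts insertion errors, is used instead; it is defined as
$$\text{Percent Accuracy} = \frac{N - D - S - I}{N} \times 100\% = \frac{H - I}{N} \times 100\%$$
The WER reported in this work is
$$\text{WER} = 100 - \text{Percent Accuracy}$$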
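The figures in the WORD summary line can be rechecked from its counts: percent correct is H/N, percent accuracy is (H - I)/N, and WER is 100 minus the accuracy. The following is a small sketch using awk for the floating-point arithmetic; the function name score_summary and its argument order are assumptions made for illustration, not part of the Sphinx or scoring tools.

```shell
# Hypothetical helper: recompute %Correct, %Accuracy and WER
# from the counts H (correct), I (insertions) and N (reference labels).
score_summary() {
    awk -v h="$1" -v i="$2" -v n="$3" 'BEGIN {
        if (n <= 0) exit 1                  # avoid division by zero
        correct  = 100 * h / n              # ignores insertion errors
        accuracy = 100 * (h - i) / n        # penalizes insertion errors
        printf "%%Correct=%.2f %%Accuracy=%.2f WER=%.2f\n", correct, accuracy, 100 - accuracy
    }'
}

# Counts taken from the WORD line above: H=8371, I=171, N=9288
score_summary 8371 171 9288
# prints: %Correct=90.13 %Accuracy=88.29 WER=11.71
```

The output agrees with the reported values, which confirms that the WER here is derived from the accuracy figure and therefore includes insertion errors.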