Arabic Automatic Speech Recognition  (AASR)

Automatic recognition of Arabic speech

using the Sphinx tools

We use the Sphinx tools developed at Carnegie Mellon University.

A- How to install Sphinx tools

B- Creating the AASR project environment

C- Arabic Phoneme set (AASR.phone)

D- Arabic phonetic dictionary (AASR.dic) (23,841 entries)

E- Simple filler dictionary (AASR.filler)

F- Sample speech corpus (10 news stories, 140 files)

G- Feature vectors for the 4.5-hour corpus (95 MB), with the corresponding text

H- Language Model

I- Trained model parameters (16 MB; ready for use, place in the project directory)

J- Recognition results and error analysis

K- Generating a phonetic dictionary from a text file (with full tashkeel)

L- Other useful tools

M- Arabic news corpus (4.5 Hours) (Send email to: elshafei at kfupm.edu.sa)

 

--------------------------------------------------------------------------------------------------------


A- How to install Sphinx tools 

1-      Download and install a Perl interpreter such as ActivePerl:

      http://www.activestate.com/Products/activeperl/index.mhtml

2-      Install Cygwin on your PC. Cygwin creates a Linux-like window under the MS Windows operating systems. Copy the directory Cygwin_raw from the CD to your PC and install it using the setup.exe file in the cygwin directory. Cygwin can also be downloaded (580 MB) from http://www.cygwin.com/

3-      You can now open Cygwin by clicking the Cygwin Bash Shell icon located on the desktop. You will see a window similar to the MS Windows command window.

4-      Cygwin creates a directory structure that contains a directory carrying your username, as follows: c:\cygwin\home\username\. Copy the compressed files of SphinxTrain, Sphinx3, Sphinx4, and cmuclmtk (the CMU language modeling toolkit) into this directory.

You can download all Sphinx tools in a zipped tar format from

http://cmusphinx.sourceforge.net/html/cmusphinx.php

5-      Open the Cygwin command window and go to your Cygwin home directory. Unpack the compressed files one by one using the tar command from the Cygwin window:

>> tar -xvzf filename

This will create a subdirectory in your Cygwin home directory for each unpacked tar file. The directory names may carry suffixes indicating the release and build number, e.g. “sphinx3-0.6”. If this is the case, rename the directories to remove the release and build number, e.g. “sphinx3”.
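The renaming in step 5 can also be scripted. The following is only a minimal sketch, not part of the Sphinx tools: it assumes the suffix always starts with a hyphen followed by a digit (as in “sphinx3-0.6”), and the function names strip_version_suffix and rename_unpacked_dirs are hypothetical helpers introduced here for illustration.

```python
import re
from pathlib import Path

# Assumption: a release/build suffix starts with "-<digit>", e.g. "-0.6" or "-1.0beta".
_VERSION_SUFFIX = re.compile(r"-\d[\w.\-]*$")


def strip_version_suffix(name):
    """Return the directory name without a trailing release/build suffix."""
    return _VERSION_SUFFIX.sub("", name)


def rename_unpacked_dirs(home):
    """Rename versioned subdirectories of `home`; return a list of (old, new) names."""
    renamed = []
    # Materialize the listing first so renames do not disturb the iteration.
    for entry in sorted(Path(home).iterdir()):
        if not entry.is_dir():
            continue
        new_name = strip_version_suffix(entry.name)
        # Skip names without a suffix, and names that would become empty.
        if not new_name or new_name == entry.name:
            continue
        target = entry.with_name(new_name)
        # Never overwrite a directory that already exists.
        if target.exists():
            continue
        entry.rename(target)
        renamed.append((entry.name, new_name))
    return renamed
```

Calling rename_unpacked_dirs on the Cygwin home directory (for example c:\cygwin\home\username) would rename “sphinx3-0.6” to “sphinx3” and leave directories such as “SphinxTrain” untouched. Check the returned list before continuing, since a release that uses a different naming pattern will simply be skipped.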

6-      To install Sphinx3, execute the following commands:

>> cd sphinx3
>> ./configure
>> make
>> make install

7-      To install the SphinxTrain tools, execute the following commands from your home directory:

>> cd SphinxTrain
>> ./configure
>> make

 

8-      To install Sphinx4, you first need to install the Java runtime environment from http://www.java.com/en/download/index.jsp . You also need Apache Ant from http://ant.apache.org/.

9-      To install Sphinx4, go to the sphinx4 subdirectory and run the command ant in place of the “make” command used in the previous steps.

 

B- Creating the AASR project environment

     The Sphinx training tools come with Perl scripts that automate the lengthy training procedure. However, the user must prepare the project according to a specific format and directory structure, and the Perl package requires a dedicated directory for the training process. The Perl script setup_SphinxTrain.pl can be used to create this structure. The directories created by this script are described below:

 

Table 2: Directory structure created by setup_SphinxTrain.pl.

Directory              Description
bin/                   Contains the executables used in the training process.
bwaccumdir/            A buffer directory used by the Baum-Welch algorithm to store intermediate results.
etc/                   Contains the configuration file (.cfg). Other files supplied by the user are added here (explained later).
feat/                  The output of feature extraction should be placed here.
gifs/                  Images used in the HTML log file.
logdir/                Contains the log of each operation performed during the training process.
model_architecture/    Contains the model definition files. The linguistic questions file is also kept here.
model_parameters/      Contains binary files representing the HMM models.
scripts_pl/            Contains the Perl scripts used in the training process.
wav/                   Contains the audio files used for training.

 

   For instance, to create the directory structure for the AASR project, execute the following commands starting from the user home directory, assuming the Sphinx training package is located in the directory SphinxTrain:

>> mkdir AASR
>> cd AASR
>> ../SphinxTrain/scripts_pl/setup_SphinxTrain.pl -task AASR

 

   These commands create a directory structure under the new project directory “AASR”. One of the important files created by this script is the configuration file, called AASR.cfg, located under the “etc/” directory. This file contains many configuration parameters, some of which have major effects on efficiency. These parameters are mostly numerical and control the training process: for example, the structure of the acoustic models, the number of iterations of the Baum-Welch algorithm, the target number of senones, and the type and length of the feature vectors. We will refer to this directory structure in the subsequent discussions on data preparation.
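After running the setup script, it can be useful to confirm that the project directory looks as expected before preparing any data. The sketch below is an illustration only and is not part of SphinxTrain: the directory names are those of Table 2, written in lowercase to match the scripts_pl path used in the command above (an assumption; adjust them if your release differs), and missing_project_dirs and config_path are hypothetical helper names.

```python
from pathlib import Path

# Subdirectories listed in Table 2. The lowercase spelling is an assumption.
EXPECTED_DIRS = (
    "bin", "bwaccumdir", "etc", "feat", "gifs", "logdir",
    "model_architecture", "model_parameters", "scripts_pl", "wav",
)


def missing_project_dirs(project_root, expected=EXPECTED_DIRS):
    """Return the expected subdirectories that are absent under project_root."""
    root = Path(project_root)
    return [name for name in expected if not (root / name).is_dir()]


def config_path(project_root, task="AASR"):
    """Path of the configuration file as described in the text: etc/<task>.cfg."""
    return Path(project_root) / "etc" / (task + ".cfg")


if __name__ == "__main__":
    # Run from the user home directory, where the AASR project was created.
    missing = missing_project_dirs("AASR")
    print("missing directories:", ", ".join(missing) if missing else "none")
    print("configuration file present:", config_path("AASR").is_file())
```

An empty list from missing_project_dirs means every directory of Table 2 is in place; any name it returns points to a step of the setup that did not complete.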

 

 

J- Error Analysis

WORD: %Correct=90.13 , %Accuracy=88.29 (WER =11.71) [H=8371, D=85, S=832, I=171, N=9288]

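Here $H$ is the number of correctly recognized words, and $S$, $D$, and $I$ are the numbers of substitution errors, deletion errors, and insertion errors. The percentage correct is then

$$\text{Percent Correct} = \frac{N - D - S}{N} \times 100\% = \frac{H}{N} \times 100\%$$

where $N$ is the total number of labels in the reference transcriptions. Notice that this measure ignores insertion errors. For many purposes, the percentage accuracy, which also penalizes insertions, is more representative; it is defined as

$$\text{Percent Accuracy} = \frac{N - D - S - I}{N} \times 100\% = \frac{H - I}{N} \times 100\%$$

The WER reported in this work is

$$\text{WER} = 100 - \text{Percent Accuracy}$$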
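As a check on the figures in the WORD line, the measures can be recomputed from the raw counts. This is a minimal sketch, assuming the usual relations N = H + D + S, Percent Correct = 100 H / N and Percent Accuracy = 100 (H - I) / N; the function name word_scores is a hypothetical helper, not part of any scoring tool.

```python
def word_scores(hits, deletions, substitutions, insertions):
    """Return (N, percent_correct, percent_accuracy, wer) from the raw counts."""
    # Every reference word is either correct, deleted, or substituted.
    n = hits + deletions + substitutions
    percent_correct = 100.0 * hits / n
    # Accuracy additionally penalizes inserted words.
    percent_accuracy = 100.0 * (hits - insertions) / n
    wer = 100.0 - percent_accuracy
    return n, percent_correct, percent_accuracy, wer


# Counts taken from the WORD line: H=8371, D=85, S=832, I=171.
n, correct, accuracy, wer = word_scores(8371, 85, 832, 171)
print("N=%d  %%Correct=%.2f  %%Accuracy=%.2f  WER=%.2f" % (n, correct, accuracy, wer))
```

With these counts the sketch reproduces the reported values: N=9288, 90.13% correct, 88.29% accuracy, and a WER of 11.71.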