Text
Syllable-based Speech Recognition System Using Pitch Detection on Time–Frequency Domain Feature Extraction
This research presents the segmentation of single-syllable sounds for speech recognition using an artificial neural network. The network combines key features of the speech signal from the time and frequency domains. Speech signals are first divided into frames using the short-time energy waveform. Pitch markers are then extracted from the frames and used as reference points to split the frames into sections. The sections are further analyzed using window searching to identify positions, amplitudes, local minimum and maximum values, and maximum slope values, which serve as the key features in the time domain. In the frequency domain, Mel-frequency cepstral coefficients (MFCCs) are used as additional key features. The two types of key features are combined as input to the artificial neural network. The study also compares recognition performance when the time-domain and frequency-domain features are fed into the network separately and in combination. The results show that a network with two input layers (one for the Mel-frequency cepstral coefficients, one for the time-domain features) feeding the same hidden layers yields the highest recognition accuracy: 96.97%, and 88.43% in blind tests.
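The first steps of the pipeline described in the abstract (short-time energy framing, then time-domain features per section) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the frame length, the hop size, the relative energy threshold, and the choice of one minimum and one maximum per section are all assumptions, since the record does not state them. Pitch-marker extraction, MFCC computation, and the neural network are left out.

```python
import numpy as np

def short_time_energy(signal, frame_len=256, hop=128):
    """Energy of each frame: sum of squared samples (frame_len and hop are assumed values)."""
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    return np.array([
        np.sum(signal[i * hop : i * hop + frame_len] ** 2)
        for i in range(n_frames)
    ])

def find_syllable_span(energy, rel_threshold=0.1):
    """First and last frame whose energy exceeds a fraction of the peak (assumed rule)."""
    active = np.flatnonzero(energy > rel_threshold * energy.max())
    if active.size == 0:
        return None
    return int(active[0]), int(active[-1])

def section_features(section):
    """Position and amplitude of the minimum and maximum, plus the largest absolute slope."""
    i_max = int(np.argmax(section))
    i_min = int(np.argmin(section))
    slope = np.diff(section)  # sample-to-sample difference as a slope estimate
    return {
        "max_pos": i_max, "max_amp": float(section[i_max]),
        "min_pos": i_min, "min_amp": float(section[i_min]),
        "max_slope": float(np.max(np.abs(slope))) if slope.size else 0.0,
    }

# Synthetic stand-in for a syllable: silence, a 200 Hz tone, silence (8 kHz sampling).
fs = 8000
tone = 0.5 * np.sin(2 * np.pi * 200 * np.arange(4000) / fs)
signal = np.concatenate([np.zeros(2000), tone, np.zeros(2000)])

frame_len, hop = 256, 128
energy = short_time_energy(signal, frame_len, hop)
first, last = find_syllable_span(energy)
start, end = first * hop, last * hop + frame_len  # frame indices back to sample indices
features = section_features(signal[start:end])
print(start, end, features)
```

In a fuller version, the detected span would be split at the pitch markers and `section_features` applied to each section, with the resulting vectors joined to the MFCC vector before being passed to the network.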
Availability
No copy data
Detail Information
Series Title | - |
---|---|
Call Number | - |
Publisher | International Journal of Computing and Digital Systems : Bahrain., 2023 |
Collation | 006 |
Language | English |
ISBN/ISSN | 2210-142X |
Classification | NONE |
Content Type | - |
Media Type | - |
Carrier Type | - |
Edition | - |
Subject(s) | |
Specific Detail Info | - |
Statement of Responsibility | - |
Other Information
Accreditation | Scopus Q3 |
---|---|
Other version/related
No other version available