In recent years, voice terminals and image retrieval products have become increasingly intelligent, but mature commercial products remain rare. This paper presents an embedded design for an intelligent voice terminal based on pattern recognition. The design uses a Samsung S3C2410 ARM-based target board, a Philips UDA1341TS audio codec, and embedded Linux as the software platform; speech recognition is implemented through small-vocabulary voice training. To improve recognition accuracy, image retrieval serves as an auxiliary tool that helps the speech recognition module create, or more accurately locate, a personal voice-training library. Experimental results show that, with the aid of image recognition, speech recognition performance improves by 10 percent on average.
Published in: Journal of Electrical and Electronic Engineering (Volume 2, Issue 6)
DOI: 10.11648/j.jeee.20140206.12
Page(s): 89-93
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright: Copyright © The Author(s), 2015. Published by Science Publishing Group
Keywords: Speech Recognition, Embedded Development, Image Retrieval, DTW Algorithm, ARM Development
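The keywords name the DTW (dynamic time warping) algorithm, and the abstract describes small-vocabulary recognition against a personal voice-training library that image retrieval helps to select. The following is a minimal sketch of how template matching of that kind could work, not the authors' implementation. It assumes that each utterance has already been converted to a sequence of numeric feature vectors and that the image step yields a user identifier; the function names, the `libraries` layout, the `user_a` key, and the toy one-dimensional features are all hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def dtw_distance(a, b):
    """Accumulated DTW cost between two non-empty sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(a[i - 1], b[j - 1])
            # Allowed steps: advance in a only, in b only, or in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(utterance, library):
    """Return the word whose stored template is closest to the utterance."""
    return min(library, key=lambda word: dtw_distance(utterance, library[word]))


# Hypothetical per-user template libraries. In the design described above,
# image retrieval would help pick which user's library to search.
libraries = {
    "user_a": {
        "open": [(0.0,), (1.0,), (2.0,)],
        "close": [(2.0,), (1.0,), (0.0,)],
    },
}

# A slower rendition of "open": the first frame is repeated.
spoken = [(0.0,), (0.0,), (1.0,), (2.0,)]
result = recognize(spoken, libraries["user_a"])
```

Because DTW lets one frame align with several frames of the other sequence, the slower utterance still matches the `open` template despite the length difference. Restricting the search to one user's library is what keeps the vocabulary small and the templates speaker-specific.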
APA Style
Zhengxi Wei, Jinming Liang. (2015). Embedded System for Speech Recognition and Image Processing. Journal of Electrical and Electronic Engineering, 2(6), 89-93. https://doi.org/10.11648/j.jeee.20140206.12
@article{10.11648/j.jeee.20140206.12,
  author = {Zhengxi Wei and Jinming Liang},
  title = {Embedded System for Speech Recognition and Image Processing},
  journal = {Journal of Electrical and Electronic Engineering},
  volume = {2},
  number = {6},
  pages = {89-93},
  doi = {10.11648/j.jeee.20140206.12},
  url = {https://doi.org/10.11648/j.jeee.20140206.12},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.jeee.20140206.12},
  year = {2015}
}
TY  - JOUR
T1  - Embedded System for Speech Recognition and Image Processing
AU  - Zhengxi Wei
AU  - Jinming Liang
Y1  - 2015/02/06
PY  - 2015
N1  - https://doi.org/10.11648/j.jeee.20140206.12
DO  - 10.11648/j.jeee.20140206.12
T2  - Journal of Electrical and Electronic Engineering
JF  - Journal of Electrical and Electronic Engineering
JO  - Journal of Electrical and Electronic Engineering
SP  - 89
EP  - 93
PB  - Science Publishing Group
SN  - 2329-1605
UR  - https://doi.org/10.11648/j.jeee.20140206.12
VL  - 2
IS  - 6
ER  -