Embedded System for Speech Recognition and Image Processing

Zhengxi Wei; Jinming Liang

doi:10.11648/j.jeee.20140206.12

Peer-Reviewed

Published in Journal of Electrical and Electronic Engineering (Volume 2, Issue 6)

Received: 16 December 2014 Accepted: 23 December 2014 Published: 6 February 2015

Abstract

In recent years, voice-terminal and image-retrieval products have become increasingly intelligent, but mature commercial products remain rare. This paper presents an embedded design for an intelligent voice terminal based on pattern recognition. The design uses a Samsung S3C2410 ARM board as the target platform, the Philips UDA1341TS as the audio codec, and embedded Linux as the software platform; speech recognition is implemented through small-vocabulary voice training. To improve recognition performance, image retrieval serves as an auxiliary tool that helps the speech recognition module create, or more accurately locate, a personal voice-training library. Experimental results show that, with the aid of image recognition, speech recognition performance improves by an average of 10 percent.
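
As a rough illustration of the approach summarized above, the sketch below matches an utterance against a small vocabulary of trained templates using dynamic time warping (DTW), which the article lists among its keywords. It is a minimal sketch under stated assumptions, not the authors' implementation: the feature vectors, the function names (`dtw_distance`, `recognize`, `select_library`), and the idea of passing a user identifier as the outcome of image retrieval are all hypothetical, and the abstract does not specify the features or distance measure actually used.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    # cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local distance: Euclidean (an assumption, not from the article).
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # step both
    return cost[n][m]


def recognize(utterance, library):
    """Return (word, distance) for the closest template in the library."""
    best_word, best_dist = None, math.inf
    for word, templates in library.items():
        for template in templates:
            d = dtw_distance(utterance, template)
            if d < best_dist:
                best_word, best_dist = word, d
    return best_word, best_dist


def select_library(user_id, libraries):
    """Pick a personal training library.

    Hypothetical stand-in: user_id represents whatever identity the
    image-retrieval step produces; that step is not modeled here.
    """
    return libraries[user_id]


# Toy one-dimensional "features"; real systems would use per-frame vectors.
libraries = {
    "user_a": {
        "yes": [[(1.0,), (2.0,), (3.0,)]],
        "no": [[(3.0,), (2.0,), (1.0,)]],
    },
}
utterance = [(1.0,), (1.0,), (2.0,), (3.0,)]  # "yes" spoken more slowly
word, dist = recognize(utterance, select_library("user_a", libraries))
```

In this toy example the slower utterance still aligns with the "yes" template at zero cost, which is the property that makes DTW suitable for small-vocabulary, speaker-dependent matching. Restricting the search to one speaker's library, as `select_library` does, is one plausible reading of how a per-user library could help recognition.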

Published in	Journal of Electrical and Electronic Engineering (Volume 2, Issue 6)
DOI	10.11648/j.jeee.20140206.12
Page(s)	89-93
Creative Commons	This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright	Copyright © The Author(s), 2015. Published by Science Publishing Group

Keywords

Speech Recognition, Embedded Development, Image Retrieval, DTW Algorithm, ARM Development

Cite This Article

APA Style

Zhengxi Wei, Jinming Liang. (2015). Embedded System for Speech Recognition and Image Processing. Journal of Electrical and Electronic Engineering, 2(6), 89-93. https://doi.org/10.11648/j.jeee.20140206.12

ACS Style

Zhengxi Wei; Jinming Liang. Embedded System for Speech Recognition and Image Processing. J. Electr. Electron. Eng. 2015, 2(6), 89-93. doi: 10.11648/j.jeee.20140206.12

AMA Style

Zhengxi Wei, Jinming Liang. Embedded System for Speech Recognition and Image Processing. J Electr Electron Eng. 2015;2(6):89-93. doi: 10.11648/j.jeee.20140206.12

@article{10.11648/j.jeee.20140206.12,
  author = {Zhengxi Wei and Jinming Liang},
  title = {Embedded System for Speech Recognition and Image Processing},
  journal = {Journal of Electrical and Electronic Engineering},
  volume = {2},
  number = {6},
  pages = {89-93},
  doi = {10.11648/j.jeee.20140206.12},
  url = {https://doi.org/10.11648/j.jeee.20140206.12},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.jeee.20140206.12},
  year = {2015}
}

TY  - JOUR
T1  - Embedded System for Speech Recognition and Image Processing
AU  - Zhengxi Wei
AU  - Jinming Liang
Y1  - 2015/02/06
PY  - 2015
N1  - https://doi.org/10.11648/j.jeee.20140206.12
DO  - 10.11648/j.jeee.20140206.12
T2  - Journal of Electrical and Electronic Engineering
JF  - Journal of Electrical and Electronic Engineering
JO  - Journal of Electrical and Electronic Engineering
SP  - 89
EP  - 93
PB  - Science Publishing Group
SN  - 2329-1605
UR  - https://doi.org/10.11648/j.jeee.20140206.12
VL  - 2
IS  - 6
ER  -

Author Information

Zhengxi Wei

School of Computer Science, Sichuan University of Science & Engineering, Zigong, Sichuan 643000, PR China

Jinming Liang

School of Computer Science, Sichuan University of Science & Engineering, Zigong, Sichuan 643000, PR China
