%0 Journal Article
%T Design of Automatic Speech Recognition System Based on Throat Vibration
%A 陆心怡
%A 卜朝晖
%J Modeling and Simulation
%P 365-376
%@ 2324-870X
%D 2024
%I Hans Publishing
%R 10.12677/MOS.2024.131035
%X Existing, mature speech recognition systems are largely limited to healthy speakers and mainstream languages, and are not suitable for patients with impaired vocal cords. This paper therefore designs an automatic speech recognition system based on throat vibration, aiming to provide a feasible solution for the rehabilitation training and daily life of patients with impaired vocal cords and people with speech disabilities. The Mintti Smartho-D2 smart digital stethoscope is used to detect laryngeal cartilage vibration signals. Three mainstream deep learning models for speech recognition, a convolutional neural network (CNN), a convolutional long short-term memory neural network, and a convolutional recurrent neural network (CRNN), are each trained multiple times on the laryngeal vibration signal dataset, with the goal of converting laryngeal cartilage vibration signals into normal speech signals. Comparison experiments give test word error rates of 0.1572, 0.2018, and 0.06787 for the three models, respectively; the CRNN model performs best, achieving a word error rate below 0.07 in a quiet environment. These results provide preliminary validation of the design's feasibility and indicate that the CRNN model performs well in both efficiency and recognition accuracy.
%K Laryngeal Cartilage Vibration
%K Deep Learning
%K Speech Recognition
%K Convolutional Neural Network
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=79251