
Thesis Details

A Study Combining Fractal Dimension and Sound Signal Features used in Recognition Problem : A Study on Techniques for Combining Fractal Dimension and Speech Signal Features Used in Speech Recognition Problems

  • Author

    LE TRAN SU

  • Degree-granting institution

    University of Ulsan

  • Degree type

    Doctoral (domestic)

  • Department

    Information and Communication Engineering

  • Advisor

    이종수

  • Year of publication

    2014

  • Total pages

    114

  • Keywords

    emotion recognition; engine fault diagnosis

  • Language

    eng

  • Full-text URL

    http://www.riss.kr/link?id=T13540283&outLink=K  

  • Abstract

    Sound signal features such as formant frequencies, pitch, and fundamental frequency are commonly used in recognition problems. This dissertation focuses on two such problems: Speech Emotion Recognition (SER) and Engine Fault Diagnosis (EFD). The results obtained with existing approaches indicate that achieving high recognition rates is still a challenge. The research in this dissertation therefore addresses the question: how can high and robust performance be achieved? To answer it, I first review existing approaches to the SER and EFD problems and analyze the advantages and disadvantages of each type. I then focus on the proposed methods, which combine sound signal features (MFCC, fundamental frequency, ...) with the fractal dimension to address both problems. An important issue in this study is the design of an appropriate classification scheme, so various classification techniques are reviewed and analyzed to find suitable classifiers.

    1. Emotion Recognition from Speech Signals

    Speech emotion recognition is a young interdisciplinary research field. The methodology described above was first tried no more than a decade ago. The first systems worked in laboratory conditions and were quite different from what can be used in real-life applications. Since then, mathematicians, engineers, psychologists, and various speech experts have joined forces to design working speech emotion recognition systems, and significant progress has been made. Nowadays, emotion recognition from speech is used to solve many problems in human-computer interaction applications. Some old challenges have been met successfully; many others remain. New applications keep appearing and in turn pose new open problems and challenges:

    1. How can high and robust performance be achieved (the open system challenge)?
    2. New high-level, preferably perceptually adequate, features are sought (the open feature challenge).
    3. How can classifiers be crafted for SER that go beyond the mainstream classification libraries (the classifier challenge)?

    In this dissertation, we propose answers to these problems. A novel system for emotion recognition from speech is presented, in which traditional speech features are combined with a fractal feature to achieve high accuracy. The main results show that the performance of our system is very similar to that obtained by human judges in a subjective evaluation on the same database.

    2. Engine Fault Diagnosis from Sound Signals

    Faults that occur inside a motor engine are serious. Traditional engine fault diagnosis depends heavily on the skills of the technical engineer and has a high failure rate. Hence, an improved method for diagnosing engine faults is much needed. In this research, we propose two approaches.

    In the first approach, in order to increase the accuracy of fault detection, we apply the SURF algorithm to signals. A sound or vibration signal is translated into the two-dimensional (image) domain, and the SURF algorithm is applied to the resulting image to generate local features. A classification technique in the fault diagnosis process then uses these local features as the input fault symptoms. Neural networks [1], support vector machines [2], etc. were proposed for use as the diagnosis model.

    In the second approach, the noisy sound from faulty engines is represented by Mel Frequency Cepstral Coefficients (MFCC), zero crossing rate, mean square, and fundamental frequency features, which are used in a Hidden Markov Model (HMM) for diagnosis. Our experiments cover eight fault types, each of which exhibits various noisy sound signals. Experimental results indicate that the proposed method performs the diagnosis with a high accuracy rate of about 98% for the eight fault types.

    Based on the good results obtained, we also suggest future work on applying the proposed methods in smartphone applications.
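To make the feature names in the abstract concrete, the following is a minimal Python sketch of three per-frame features: zero crossing rate, mean square, and a fractal dimension estimate. The abstract does not say which fractal dimension estimator the dissertation uses, so the Katz estimator below is an assumption chosen only for illustration. The function names and the choice to treat a frame as a curve of points (sample index, amplitude) are likewise hypothetical and not taken from the dissertation.

```python
import math
from typing import Sequence


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def mean_square(frame: Sequence[float]) -> float:
    """Average of the squared samples (power of the frame)."""
    if not frame:
        return 0.0
    return sum(x * x for x in frame) / len(frame)


def katz_fractal_dimension(frame: Sequence[float]) -> float:
    """Katz estimate for the curve of points (i, frame[i]).

    Assumption: Katz's estimator is used here for illustration only;
    the source does not name a specific estimator.
    """
    n = len(frame) - 1  # number of steps along the curve
    if n < 2:
        raise ValueError("need at least three samples")
    # Total curve length: sum of distances between successive points.
    length = sum(math.hypot(1.0, b - a) for a, b in zip(frame, frame[1:]))
    # Extent: largest distance from the first point to any other point.
    extent = max(
        math.hypot(i, x - frame[0]) for i, x in enumerate(frame[1:], start=1)
    )
    # FD = log10(n) / log10(extent / mean_step), with mean_step = length / n.
    denominator = math.log10(n * extent / length)
    if denominator <= 0:
        raise ValueError("estimate undefined: extent does not exceed mean step")
    return math.log10(n) / denominator


if __name__ == "__main__":
    # Hypothetical frame: a short alternating signal.
    frame = [0.0, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, 0.0]
    features = [
        zero_crossing_rate(frame),
        mean_square(frame),
        katz_fractal_dimension(frame),
    ]
    print(features)
```

In a pipeline like the one the abstract outlines, a feature vector of this kind would be computed for each frame and passed, together with features such as MFCC, to the classifier. A straight-line frame gives a Katz estimate of 1, and more irregular frames give larger values, which is the property that makes the estimate usable as a roughness feature.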

