Results (27)
Search Parameters:
Keyword: Speech
Integrating Speech and Gesture for Generating Reliable Robotic Task Configuration
This paper presents a system that combines speech and pointing gestures along with four distinct hand gestures to precisely identify both the object of interest and parameters for robotic tasks. We utilized skeleton landmarks to detect pointing gestures and determine their direction, while a pre-trained model, trained on 21 hand landmarks from 2D images, was…
Bangla Speech Emotion Detection using Machine Learning Ensemble Methods
Emotion is the most important component of being human, and very essential for everyday activities, such as the interaction between people, decision making, and learning. In order to adapt to the COVID-19 pandemic situation, most of the academic institutions relied on online video conferencing platforms to continue educational activities. Due to low bandwidth in many…
Emotion Mining from Speech in Collaborative Learning
Affective states, a dimension of attitude, have a critical role in the learning process. In the educational setting, affective states are commonly captured by self-report tools or based on sentiment analysis on asynchronous textual chats, discussions, or students’ journals. Drawbacks of such tools include: distracting the learning process, demanding time and commitment from students to…
An Alternative Approach for Thai Automatic Speech Recognition Based on the CNN-based Keyword Spotting with Real-World Application
An automatic speech recognition (ASR) is a key technology for preventing an ongoing global coronavirus epidemic. Due to the limited corpus database and the morphological diversity of the Thai language, Thai speech recognition is still difficult. In this research, the automatic speech recognition model was built differently from the traditional Thai NLP systems by using…
A Model for the Application of Automatic Speech Recognition for Generating Lesson Summaries
Automatic Speech Recognition (ASR) technology has the potential to improve the learning experience of students in the classroom. This article addresses some of the key theoretical areas identified in the pursuit of implementing a speech recognition system, capable of lesson summary generation in the educational setting. The article discusses: some of the applications of…
Deaf Chat: A Speech-to-Text Communication Aid for Hearing Deficiency
Hearing impairments have a negative impact in the lives of individuals living with them and those around such individuals. Different applications and technological tools have been developed to help reduce this negative impact. Most mobile applications that have been developed that use Speech-to-Text technology have been inconsistent such that they are not inclusive of all…
Distributed Microphone Arrays, Emerging Speech and Audio Signal Processing Platforms: A Review
Given ubiquitous digital devices with recording capability, distributed microphone arrays are emerging recording tools for hands-free communications and spontaneous tele-conferencings. However, the analysis of signals recorded with diverse sampling rates, time delays, and qualities by distributed microphone arrays is not straightforward and entails important considerations. The crucial challenges include the unknown/changeable geometry of distributed arrays,…
Noise Cancellation Algorithm Based on Air- and Bone-Conducted Speech Signals by Considering an Unscented Transformation Method
Noise control is essential when applying speech recognition in noisy environments such as factories. In this study, a signal processing for noise cancellation is proposed by using a noise-insensitive bone-conducted speech signal together with an air-conducted speech signal. The speech signal is generally expressed by a nonlinear model. The extended Kalman filter is very famous…
Difference in Speech Analysis Results by Coding
Mental health disorder is becoming a social problem, and there is a need for technology that can easily check for states of stress and depression as a countermeasure. Conventional methods of diagnostic support and screening include self-administered psychological tests and use of biomarkers. However, there are problems such as burden on subjects, examination costs, dedicated…
Amplitude-Frequency Analysis of Emotional Speech Using Transfer Learning and Classification of Spectrogram Images
Automatic speech emotion recognition (SER) techniques based on acoustic analysis show high confusion between certain emotional categories. This study used an indirect approach to provide insights into the amplitude-frequency characteristics of different emotions in order to support the development of future, more efficiently differentiating SER methods. The analysis was carried out by transforming short 1-second…
Emotional state recognition in speech signal
The matters regarding speech signal processing and analyzing in terms of emotional states recognition were presented in this paper. An experiment was conducted to perform both objective and subjective emotional states recognition tests for Polish language.
Analysis of Emotions and Movements of Asian and European Facial Expressions
The aim of this study is to develop an advanced framework that not only recognizes the dominant facial emotion, but also contains modules for gesture recognition and text-to-speech recognition. Each module is meticulously designed and integrated into a unified system. The implemented models have been revised, with the results presented through graphical representations, providing prevalent emotions…
Machine Learning Algorithms for Real Time Blind Audio Source Separation with Natural Language Detection
The Conv-TasNet and Demucs algorithms can differentiate between two mixed signals, such as music and speech; the mixing operation proceeds without any support information. The network of convolutional time-domain audio separations is used in the Conv-TasNet algorithm, while there is a new waveform-to-waveform model in the Demucs algorithm. The Demucs algorithm utilizes a procedure like the audio…
The Design and Implementation of Intelligent English Learning Chabot based on Transfer Learning Technology
Chatbot operates task-oriented customer services in special and open domains at different mobile devices. Its related products such as knowledge base Question-Answer System also benefit daily activities. Chatbot functions generally include automatic speech recognition (ASR), natural language understanding (NLU), dialogue management (DM), natural language generation (NLG) and speech synthesis (SS). In this paper, we proposed…
Dependency Head Annotation for Myanmar Dependency Treebank
Complete manual annotation of dependency treebank needs resources like annotators and annotation tools and takes long time and has high possibility of inconsistent annotations for free word order languages such as Myanmar. This paper describes a dependency head annotation scheme with Universal part-of-speech and Universal Dependencies for Myanmar dependency treebank. Currently 22,810 sentences and 680,218…
A Study on Intelligent Dialogue Agent for Older Adults’ Preventive Care – Towards Development of a Comprehensive Preventive Care System
Preventive care approaches have attracted much attention in Japan, which is one of the world’s most super-aged societies. These approaches aim to decrease the number of people who require nursing care or other human support. Our research group has developed several kinds of preventive care systems, including a fall prevention system, a cognitive training system,…
Interactive Virtual Rehabilitation for Aphasic Arabic-Speaking Patients
Objective: Individuals with aphasia often experience significant problems in their daily lives and social participation. Technologies that address speech and language disorders deficit in merging between therapist’s major role and reinforcing the training between sessions at home. It also lacks the Arabic language attention; however, current systems are typically expensive and lack amusement. Moreover, cumulative…
Advances in Optimisation Algorithms and Techniques for Deep Learning
In the last decade, deep learning (DL) has witnessed excellent performances on a variety of problems, including speech recognition, object recognition, detection, and natural language processing (NLP) among many others. Of these applications, one common challenge is to obtain ideal parameters during the training of the deep neural networks (DNN). These typical parameters are obtained by…
Human-Robot Multilingual Verbal Communication – The Ontological knowledge and Learning-based Models
In their verbal interactions, humans are often afforded with language barriers and communication problems and disabilities. This problem is even more serious in the fields of education and health care for children with special needs. The use of robotic agents, notably humanoids integrated within human groups, is a very important option to face these limitations.…
The Sound of Trust: Towards Modelling Computational Trust using Voice-only Cues at Zero-Acquaintance
Trust is essential in many interdependent human relationships. Trustworthiness is measured via the effectiveness of the relationships involving human perception. The decision to trust others is often made quickly (even at zero acquaintance). Previous research has shown the significance of voice in perceived trustworthiness. However, the listeners’ characteristics were not considered. A system has yet…
Bilateral Communication Device for Deaf-Mute and Normal People
Communication is a bilateral process and being understood by the person you are talking to is a must. Without the ability to talk nor hear, a person would endure such handicap. Given that hearing and speech are missing, many have ventured to open new communication methods for them through sign language. This bilateral communication device…
ROS Based Multimode Control of Wheeled Robot
This research work mainly presents the design and development of a small-scaled wheeled robot, which can be controlled using multiple controlling interfaces using some new technological trends. Raspberry Pi 3 as the main controller, Python as the programming language integrated with the Robot Operating System (ROS) and Virtual Network Computing (VNC) for screen sharing are…
Smart Ambulance: Speed Clearance in the Internet of Things paradigm using Voice Chat
In recent years, researchers have focused on the development of many information and communication applications that could enhance human life. Congestion and road traffic are among the main problems facing ambulance transportation in providing fast healthcare services for patients. In this work, a tracking and data transfer system has…
Vowel Classification Based on Waveform Shapes
Vowel classification is an essential part of speech recognition. In classical studies, this problem is mostly handled by using spectral domain features. In this study, a novel approach is proposed for vowel classification based on the visual features of speech waveforms. In sound vocalizing, the position of certain organs of the human vocal system such…
Machine Learning Applied to GRBAS Voice Quality Assessment
Voice problems are routinely assessed in hospital voice clinics by speech and language therapists (SLTs) who are highly skilled in making audio-perceptual evaluations of voice quality. The evaluations are often presented numerically in the form of five-dimensional ‘GRBAS’ scores. Computerised voice quality assessment may be carried out using digital signal processing (DSP) techniques which process…
