AI-Powered Intelligent Speech Processing: Evolution, Applications and Future Directions

Bibliographic details
Published in:	International Journal of Advanced Computer Science and Applications vol. 16, no. 2 (2025)
Main author:	PDF
Published:	Science and Information (SAI) Organization Limited
Subjects:	Macao; Augmented reality; Internet of Things; Big Data; Artificial neural networks; Recurrent neural networks; Speech processing; Smart buildings; Algorithms; Semi-supervised learning; Deep learning; Machine learning; Real time; Ethical standards; Synthesis; Statistical models; Speech recognition; Language; Personality traits; Computer science; Personality; Text editing; Speaking; Human-computer interaction; Technology; Computers; Artificial intelligence; Voice recognition; Neural networks; Natural language processing; Linguistics; Consciousness; Speech
Online access:	Citation/Abstract; Full Text - PDF

MARC


LEADER	00000nab a2200000uu 4500
001	3180200463
003	UK-CbPIL
022			\|a 2158-107X
022			\|a 2156-5570
024	7		\|a 10.14569/IJACSA.2025.0160291 \|2 doi
035			\|a 3180200463
045	2		\|b d20250101 \|b d20251231
100	1		\|a PDF
245	1		\|a AI-Powered Intelligent Speech Processing: Evolution, Applications and Future Directions
260			\|b Science and Information (SAI) Organization Limited \|c 2025
513			\|a Journal Article
520	3		\|a This paper provides an overview of the historical evolution of speech recognition, synthesis, and processing technologies, highlighting the transition from statistical models to deep learning-based models. Firstly, the paper reviews the early development of speech processing, tracing it from the rule-based and statistical models of the 1960s to the deep learning models, such as deep neural networks (DNN), convolutional neural networks (CNN), and recurrent neural networks (RNN), which have dramatically reduced error rates in speech recognition and synthesis. It emphasizes how these advancements have led to more natural and accurate speech outputs. Then, the paper examines three key learning paradigms used in speech recognition: supervised, self-supervised, and semi-supervised learning. Supervised learning relies on large amounts of labeled data, while self-supervised and semi-supervised learning leverage unlabeled data to improve generalization and reduce reliance on manually labeled datasets. These paradigms have significantly advanced the field of speech recognition. Furthermore, the paper explores the wide-ranging applications of AI-driven speech processing, including smart homes, intelligent transportation, healthcare, and finance. By integrating AI with technologies like the Internet of Things (IoT) and big data, speech technology is being applied in voice assistants, autonomous vehicles, and speech-controlled devices. The paper also addresses the current challenges facing intelligent speech processing, such as performance issues in noisy environments, the scarcity of data for low-resource languages, and concerns related to data privacy, algorithmic bias, and legal responsibility. Overcoming these challenges will be crucial for the continued progress of the field. Finally, the paper looks to the future, predicting further improvements in speech processing technology through advancements in hardware and algorithms. 
It anticipates increased focus on personalized services, real-time speech processing, and multilingual support, along with growing integration with other technologies such as augmented reality. Despite the technical and ethical challenges, AI-driven speech processing is expected to continue its transformative impact on society and industry.
651		4	\|a Macao
653			\|a Augmented reality
653			\|a Internet of Things
653			\|a Big Data
653			\|a Artificial neural networks
653			\|a Recurrent neural networks
653			\|a Speech processing
653			\|a Smart buildings
653			\|a Algorithms
653			\|a Semi-supervised learning
653			\|a Deep learning
653			\|a Machine learning
653			\|a Real time
653			\|a Ethical standards
653			\|a Synthesis
653			\|a Statistical models
653			\|a Speech recognition
653			\|a Language
653			\|a Personality traits
653			\|a Computer science
653			\|a Personality
653			\|a Text editing
653			\|a Speaking
653			\|a Human-computer interaction
653			\|a Technology
653			\|a Computers
653			\|a Artificial intelligence
653			\|a Voice recognition
653			\|a Neural networks
653			\|a Natural language processing
653			\|a Linguistics
653			\|a Consciousness
653			\|a Speech
773	0		\|t International Journal of Advanced Computer Science and Applications \|g vol. 16, no. 2 (2025)
786	0		\|d ProQuest \|t Advanced Technologies & Aerospace Database
856	4	1	\|3 Citation/Abstract \|u https://www.proquest.com/docview/3180200463/abstract/embedded/6A8EOT78XXH2IG52?source=fedsrch
856	4	0	\|3 Full Text - PDF \|u https://www.proquest.com/docview/3180200463/fulltextPDF/embedded/6A8EOT78XXH2IG52?source=fedsrch