Chinese Text Readability Assessment Based on the Integration of Visualized Part-of-Speech Information with Linguistic Features

Bibliographic Details
Published in: Algorithms vol. 18, no. 12 (2025), p. 777-793
Main Author: Chi-Yi, Hsieh
Other Authors: Jing-Yan, Lin; Chi-Wen, Hsieh; Bo-Yuan, Huang; Yi-Chi, Huang; Yu-Xiang, Chen
Published: MDPI AG
Online Access: Citation/Abstract
Full Text + Graphics
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3286250259
003 UK-CbPIL
022 |a 1999-4893 
024 7 |a 10.3390/a18120777  |2 doi 
035 |a 3286250259 
045 2 |b d20250101  |b d20251231 
084 |a 231333  |2 nlm 
100 1 |a Chi-Yi, Hsieh  |u The Institute of Chinese Language Education, National Kaohsiung Normal University, Kaohsiung 80201, Taiwan; t4136@mail.nknu.edu.tw 
245 1 |a Chinese Text Readability Assessment Based on the Integration of Visualized Part-of-Speech Information with Linguistic Features 
260 |b MDPI AG  |c 2025 
513 |a Journal Article 
520 3 |a The assessment of Chinese text readability plays a significant role in Chinese language education. Because Chinese characters differ intrinsically from alphabetic scripts, readability assessment is more challenging, given the language's inherent complexity in vocabulary, syntax, and semantics. This article proposes a conceptual analogy between Chinese readability assessment and the rhythm and tempo patterns of music, in which the syntactic structures of Chinese sentences are transformed into an image. The Chinese Knowledge and Information Processing Tagger (CkipTagger), developed by Academia Sinica, Taiwan, is used to decompose the Chinese text into a set of tokens. These tokens are then refined through a user-defined token pool to retain meaningful units. An image carrying part-of-speech (POS) information is generated from the alignment of tokens against syntax. A discrete cosine transform (DCT) is then applied to extract the temporal characteristics of the text. In addition, the study integrates four categories of linguistic features for readability assessment: type-token ratio, average sentence length, total word count, and vocabulary difficulty level. These features are fed into a Support Vector Machine (SVM) classifier, and a bidirectional long short-term memory (Bi-LSTM) network is adopted for quantitative comparison. In the simulation, a total of 774 Chinese texts aligned with the Taiwan Benchmarks for the Chinese Language were selected and graded by Chinese language experts, with equal numbers at the basic, intermediate, and advanced levels. The findings indicate that the proposed POS representation combined with the linguistic features works well with the SVM, and that its performance matches that of more complex architectures such as the Bi-LSTM network in Chinese readability assessment. 
651 4 |a Taiwan 
653 |a Readability 
653 |a Information processing 
653 |a Data processing 
653 |a Language instruction 
653 |a Politics 
653 |a Simulation 
653 |a Insurance policies 
653 |a Chinese languages 
653 |a Rhythm 
653 |a Discrete cosine transform 
653 |a Linguistics 
653 |a Speech 
653 |a Machine learning 
653 |a Semantics 
653 |a Syntax 
653 |a Short term memory 
653 |a Support vector machines 
653 |a Syntactic structures 
653 |a Neural networks 
653 |a Reading comprehension 
653 |a Complexity 
653 |a Syntactic complexity 
653 |a Sentences 
700 1 |a Jing-Yan, Lin  |u Department of Electrical Engineering, National Chiayi University, Chiayi City 600325, Taiwan; jimmy1000310@gmail.com 
700 1 |a Chi-Wen, Hsieh  |u Department of Electrical Engineering, National Chung Cheng University, Minhsiung 621301, Taiwan; gary890825@gmail.com (B.-Y.H.); candy474189407@gmail.com (Y.-C.H.); b0967025078@gmail.com (Y.-X.C.) 
700 1 |a Bo-Yuan, Huang  |u Department of Electrical Engineering, National Chung Cheng University, Minhsiung 621301, Taiwan; gary890825@gmail.com (B.-Y.H.); candy474189407@gmail.com (Y.-C.H.); b0967025078@gmail.com (Y.-X.C.) 
700 1 |a Yi-Chi, Huang  |u Department of Electrical Engineering, National Chung Cheng University, Minhsiung 621301, Taiwan; gary890825@gmail.com (B.-Y.H.); candy474189407@gmail.com (Y.-C.H.); b0967025078@gmail.com (Y.-X.C.) 
700 1 |a Yu-Xiang, Chen  |u Department of Electrical Engineering, National Chung Cheng University, Minhsiung 621301, Taiwan; gary890825@gmail.com (B.-Y.H.); candy474189407@gmail.com (Y.-C.H.); b0967025078@gmail.com (Y.-X.C.) 
773 0 |t Algorithms  |g vol. 18, no. 12 (2025), p. 777-793 
786 0 |d ProQuest  |t Engineering Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3286250259/abstract/embedded/6A8EOT78XXH2IG52?source=fedsrch 
856 4 0 |3 Full Text + Graphics  |u https://www.proquest.com/docview/3286250259/fulltextwithgraphics/embedded/6A8EOT78XXH2IG52?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3286250259/fulltextPDF/embedded/6A8EOT78XXH2IG52?source=fedsrch
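
Note on the abstract (520): two of the linguistic features it names, type-token ratio and average sentence length, together with the DCT step, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the text has already been tokenized and POS-tagged (for example by CkipTagger), uses a made-up two-sentence sample, encodes tags as integers in order of first appearance (an arbitrary choice), and applies a one-dimensional unnormalized DCT-II to the tag sequence in place of the POS image described in the article.

```python
import math


def type_token_ratio(tokens):
    """Distinct tokens divided by total tokens (0.0 for empty input)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def average_sentence_length(sentences):
    """Mean number of tokens per sentence (0.0 for empty input)."""
    return sum(len(s) for s in sentences) / len(sentences) if sentences else 0.0


def dct2(signal):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_samples = len(signal)
    return [
        sum(x * math.cos(math.pi / n_samples * (n + 0.5) * k)
            for n, x in enumerate(signal))
        for k in range(n_samples)
    ]


# Hypothetical pre-tagged input: one list of (token, POS tag) pairs per sentence.
sentences = [
    [("我", "Nh"), ("喜歡", "VK"), ("音樂", "Na")],
    [("音樂", "Na"), ("很", "Dfa"), ("美", "VH")],
]
tokens = [tok for sent in sentences for tok, _ in sent]
tags = [tag for sent in sentences for _, tag in sent]

# Assumption: each POS tag gets an integer code in order of first appearance.
tag_ids = {}
pos_signal = [tag_ids.setdefault(tag, len(tag_ids)) for tag in tags]

ttr = type_token_ratio(tokens)            # lexical diversity
asl = average_sentence_length(sentences)  # tokens per sentence
coeffs = dct2(pos_signal)                 # frequency view of the POS sequence

print(ttr, asl, coeffs[:3])
```

The resulting values could be concatenated into one feature vector and passed to a classifier such as an SVM. How the article weights or normalizes the features is not stated in this record.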