Ethics code: IR.IAU.KERMAN.REC.1402.124
1- Ph.D. in Computer Engineering, Kerman Branch, Islamic Azad University, Kerman, Iran
2- Assistant Professor in Computer Engineering, Department of Computer Engineering, Kerman Branch, Islamic Azad University, Kerman, Iran, a.khatibi@srbiau.ac.ir
3- Assistant Professor in Electrical Engineering, Department of Electrical Engineering, Kerman Branch, Islamic Azad University, Kerman, Iran
Abstract:
Background and Aim: Medical reports and electronic health records are critically important for diagnosis, treatment, patient protection, and medical research. Correcting spelling errors in medical texts is essential to ensure accurate interpretation of information. This research was conducted to automatically correct spelling mistakes in Persian medical texts using neural networks.
Materials and Methods: In this study, conducted in 2023, a computational model based on artificial neural networks and dual embedding techniques was developed in Python in a Windows environment. The dual embedding model was fine-tuned to correct spelling errors in Persian sonography texts. The proposed model detects errors automatically using several techniques, including a dictionary lookup approach and contextual similarity coefficients. Text-processing features such as edit distance, together with the similarity coefficients, were then used to automatically select the most appropriate substitute for a misspelled word. The training and testing data were drawn from a collection of sonography texts from the sonography clinic of Imam Khomeini Hospital in Tehran.
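The detection and selection steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the lexicon, the two-dimensional vectors, the English toy words, the weighting parameter `alpha`, and the function names are all hypothetical, and a plain cosine similarity stands in for the similarity coefficients that the study derives from its dual embedding model.

```python
import math

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # replacement or match
        prev = curr
    return prev[-1]

def cosine(u, v) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def is_non_word(token: str, lexicon) -> bool:
    """Dictionary lookup: a token missing from the lexicon is a non-word error."""
    return token not in lexicon

def rank_candidates(token, context_vec, lexicon, embeddings, max_dist=2, alpha=0.5):
    """Score each nearby lexicon word by string closeness and contextual similarity."""
    scored = []
    for word in lexicon:
        d = edit_distance(token, word)
        if d == 0 or d > max_dist:
            continue  # skip the token itself and words that are too far away
        string_sim = 1.0 - d / max(len(token), len(word))
        context_sim = cosine(context_vec, embeddings[word])
        scored.append((word, alpha * string_sim + (1.0 - alpha) * context_sim))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy data (assumption): English stand-ins with 2-d vectors.
LEXICON = {"liver", "lived", "river", "kidney"}
EMBEDDINGS = {"liver": [1.0, 0.1], "lived": [0.1, 1.0],
              "river": [0.2, 0.9], "kidney": [0.9, 0.3]}
CONTEXT_VEC = [0.9, 0.2]  # assumed vector for the surrounding medical context

ranked = rank_candidates("livar", CONTEXT_VEC, LEXICON, EMBEDDINGS)
print(is_non_word("livar", LEXICON), ranked[0][0])
```

In this sketch the same ranking function could also be applied to a token that is in the lexicon, which is how a real-word error might be handled once the context marks the token as suspicious; how the study weights the two signals is not stated in the abstract.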
Results: The proposed model, which is based on artificial neural networks, uses a novel dual embedding architecture to select the best candidate words for correcting both non-word and real-word errors. In the evaluation on Persian sonography texts, the model achieved an F-measure of 90.5% in detecting real-word errors and an accuracy of 90% in automatically correcting them. It also achieved 90.8% accuracy in correcting non-word errors.
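For readers unfamiliar with the F-measure reported for error detection, the following sketch shows how it is commonly computed from true positives, false positives, and false negatives. The counts are invented for illustration and are not the study's data; the abstract does not state which variant of the measure was used, so the balanced form (beta = 1) is an assumption.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from detection counts; beta = 1 gives the balanced F1."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical counts (assumption): 90 errors correctly flagged,
# 10 correct words wrongly flagged, 10 real errors missed.
score = f_measure(tp=90, fp=10, fn=10)
print(round(score, 3))
```

Unlike plain accuracy, the F-measure ignores the large number of correctly spelled words that are correctly left alone, which is why it suits detection tasks in which errors are rare.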
Conclusion: Based on the evaluation results, the proposed method is robust to variation in word forms and can handle a wide range of morphological and semantic errors, including replacements, transpositions, insertions, and deletions, in medical texts. Combining edit distance with textual similarity coefficients extracted from the dual embedding model significantly improved the accuracy of spelling correction in Persian sonography texts, which increases the validity of such documents. The authors believe that the proposed model represents a significant advance in the detection and correction of spelling errors in Persian sonography texts.
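The four error types named in the conclusion correspond to the single-character operations that are commonly used to generate correction candidates. The sketch below enumerates them for one word; the function name, the Latin alphabet, and the example word are assumptions made for illustration, and the abstract does not describe how the study itself generates candidates.

```python
def single_edit_variants(word: str, alphabet: str) -> dict:
    """Group every string reachable from `word` by one edit operation."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletions = {left + right[1:] for left, right in splits if right}
    transpositions = {left + right[1] + right[0] + right[2:]
                      for left, right in splits if len(right) > 1}
    replacements = {left + ch + right[1:]
                    for left, right in splits if right for ch in alphabet}
    insertions = {left + ch + right for left, right in splits for ch in alphabet}
    return {
        "deletion": deletions,
        "transposition": transpositions - {word},  # swapping equal letters is a no-op
        "replacement": replacements - {word},      # replacing a letter with itself is a no-op
        "insertion": insertions,
    }

# Hypothetical example (assumption): lowercase Latin alphabet and an English word.
variants = single_edit_variants("liver", "abcdefghijklmnopqrstuvwxyz")
print({name: len(group) for name, group in variants.items()})
```

For Persian text the alphabet argument would be the Persian character set, and the generated strings would then be filtered against a lexicon before ranking.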
Type of Study: Original Research | Subject: Health Information Technology
ePublished: 1399/07/23