Transformer-Based Natural Language Processing Model for Automated Clinical Diagnosis and Electronic Health Record Summarization

Authors

  • Rashmi Jain, Department of Computer Science and Engineering, S.B. Jain Institute of Technology, Management and Research, Nagpur, Maharashtra-441501, India.
  • Dr. Shikhar Verma, Professor, MSOPS, Maharishi University of Information Technology, Lucknow, Uttar Pradesh, India.
  • Dr. Jimmy Narayan, Professor, Department of Gastroenterology, IMS and SUM Hospital, Siksha 'O' Anusandhan (Deemed to be University), Bhubaneswar, Odisha, India.
  • Manish M Goswami, Department of Computer Science and Engineering, S.B. Jain Institute of Technology, Management and Research, Nagpur, Maharashtra-441501, India.
  • Sachin U. Balvir, Department of Computer Science and Engineering, S.B. Jain Institute of Technology, Management and Research, Nagpur, Maharashtra-441501, India.
  • Anup Gade, Department of Computer Science and Engineering, S.B. Jain Institute of Technology, Management and Research, Nagpur, Maharashtra-441501, India.
  • Dr. Komal Patel, Consultant, Department of Gynaecology, Parul University, PO Limda, Tal. Waghodia, District Vadodara, Gujarat, India.
  • Pushpalatha P, Computer Science, Assistant Professor, Meenakshi College of Arts and Science, Meenakshi Academy of Higher Education and Research, Chennai, Tamil Nadu, India.

Keywords:

Transformer-based NLP, Clinical diagnosis prediction, Electronic health records, Medical text summarization, Healthcare informatics, Deep learning in healthcare

Abstract

Electronic Health Records (EHRs) hold vast amounts of unstructured clinical notes that are difficult to analyze efficiently within traditional healthcare information systems. Correct interpretation of these notes is crucial for prompt diagnosis, adequate treatment planning, and a lighter documentation workload for physicians. Clinical text, however, contains ambiguous terminology, abbreviations, and varied writing styles, all of which pose challenges for automated medical text processing. This research develops a transformer-based Natural Language Processing (NLP) framework for automated clinical diagnosis prediction and EHR summarization. The proposed model uses contextual embeddings, multi-head attention mechanisms, and multitask learning to classify diseases and to condense a patient's raw records into a concise summary. Experiments used benchmark healthcare datasets such as the Medical Information Mart for Intensive Care (MIMIC-III) clinical notes, which were preprocessed with tokenization, normalization, and medical entity extraction. The framework was evaluated with diagnosis prediction metrics (accuracy, precision, recall, F1-score, and AUROC) and summarization metrics (ROUGE and BLEU scores). Experimental results show that the model outperforms existing deep learning models in both prediction accuracy and summarization quality. The proposed framework shows good potential for intelligent clinical decision support and healthcare documentation automation.
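
As an illustration of the evaluation metrics named in the abstract, the following minimal Python sketch computes precision, recall, and F1 for a binary diagnosis label, and a simplified unigram-overlap ROUGE-1 F1 for a generated summary. It is not the authors' evaluation code: the function names `binary_prf` and `rouge1_f1`, the whitespace tokenization, and the toy inputs are assumptions made for this example, and published ROUGE implementations typically add stemming and other normalization.

```python
from collections import Counter


def binary_prf(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = diagnosis present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def rouge1_f1(reference, candidate):
    """Simplified ROUGE-1 F1: clipped unigram overlap on whitespace tokens."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy labels and texts, invented for illustration only.
    print(binary_prf([1, 0, 1, 1, 0], [1, 1, 0, 1, 0]))
    print(rouge1_f1("patient has acute chest pain",
                    "patient reports chest pain"))
```

AUROC and BLEU are omitted here because they need ranking over predicted scores and n-gram brevity handling, respectively, which an established library is better suited to provide.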

Published

2026-04-15

How to Cite

Jain, R., Verma, S., Narayan, J., Goswami, M. M., Balvir, S. U., Gade, A., … P, P. (2026). Transformer-Based Natural Language Processing Model for Automated Clinical Diagnosis and Electronic Health Record Summarization. International Journal of Artificial Intelligence and Machine Learning, 6(1s), 332–344. Retrieved from https://svedbergopen.com/index.php/ijaiml/article/view/122
