Improving diagnosis recording for better patient care: case study in heart failure using natural language processing (project complete)
Dr Anoop D. Shah
Area of study: Data and AI
Host university: Institute of Health Informatics, University College London
The ability to record diagnoses in a detailed, accurate way is essential for both clinical care and healthcare research. But current electronic health record (EHR) systems store much of this information in free text.
Free-text EHRs can be analysed using natural language processing (NLP) to extract information for research purposes. But for the purpose of supporting safe clinical decision making, features of patients’ diagnoses need to be organised in an agreed way, according to an information model.
Standardised diagnosis information models can help to ensure consistent care throughout the NHS and reduce the need for duplicate data collection for audit or research.
This project aims to generate an evidence base for generalisable improvements in diagnosis recording in the NHS by applying natural language processing methods, with a focus on heart failure as a clinical example.
Patient journeys will be constructed from initial symptoms to detailed diagnosis, and proposed information models will be evaluated on how well they accommodate information that currently exists only in text.
Then, the study will develop information models and recommendations for systems. The proposed models will be evaluated in pilot implementations such as a local heart failure clinic.
The overall learning from the process will be disseminated through academic publications and via clinical academic networks, aiming to engage specialist societies to develop models for recording diagnoses in their domains.
Watch and listen to find out more about Anoop’s fellowship project
Anoop gave a lightning talk about his research project at our 2020 annual event, THIS Space.
Shah, A. D. et al. (2019) Natural language processing for disease phenotyping in UK primary care records for research: a pilot study in myocardial infarction and death. Journal of Biomedical Semantics.
Shah, A. D. et al. (2019) Recording problems and diagnoses in clinical care: developing guidance for healthcare professionals and system designers. BMJ Health & Care Informatics.
Shah, A. D. et al. (2021) Data gaps in electronic health record (EHR) systems: An audit of problem list completeness during the COVID-19 pandemic. International Journal of Medical Informatics.
Shah, A. D. et al. (2023) Translating and evaluating historic phenotyping algorithms using SNOMED CT. Journal of the American Medical Informatics Association.
Shah, A. D. et al. (2023) Long Covid symptoms and diagnosis in primary care: a cohort study using structured and unstructured data in The Health Improvement Network primary care database. medRxiv.
Case study: Using SNOMED CT to define patient phenotypes for research
UCL-THIN Long Covid Study: Studying symptoms of Long COVID using the THIN primary care database
MiADE: making it easier for clinicians to enter structured information in electronic health records
DataTools4Heart: federated analysis of electronic health record data across Europe
Software tools for using SNOMED CT: Rdiagnosislist R package