Client Background
Our client is a US-based healthcare technology and services company.
Client Need
Our client sought to develop a risk adjustment tool to improve the efficiency of the medical coding process by leveraging natural language processing, machine learning, and deep learning. Key challenges included:
Extracting clinical insights from unstructured and non-standard document formats, including scanned charts, lab reports, and patient history
Building a medical record review workflow that uses OCR and NLP to help coders identify PHI and extract patient and provider information
Automating the manual review process for consistent and accurate results
Solution
The solution comprised the following components:
Custom OCR Model: Developed an OCR module, training a custom Tesseract model to extract text from patient medical charts
NLP Engine Development: Built an NLP engine comprising an entity extraction module and an ICD extraction module
Entity Extraction: Created a pipeline to extract patient information (name, age, gender, etc.) and provider information (name, title, facility, etc.) using regular expressions and deep learning
ICD Extraction: Developed a pipeline, using a state-of-the-art multi-label classification model, to extract ICD codes and corresponding annotations
ICD Code Enhancement: Implemented regex logic to capture ICD codes already written in the chart (AS-IS codes), in addition to the ICD codes derived from annotations
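As an illustration of the regex-based steps above, the following is a minimal sketch and not the production pipeline. It assumes ICD-10-CM style codes (the case study only says "ICD") and assumes charts carry labels such as "Patient Name:", "Age:", and "Gender:". The pattern names, function names, and sample text are hypothetical. A shape-only pattern will also match look-alike tokens (for example "B12" in "vitamin B12"), so a real system would likely validate matches against an official code list.

```python
import re

# Assumed ICD-10-CM shape: letter, digit, alphanumeric, then an optional dot
# followed by one to four alphanumerics (e.g. I10, E11.9, S72.001A).
ICD10_PATTERN = re.compile(r"\b[A-Z][0-9][0-9A-Z](?:\.[0-9A-Z]{1,4})?\b")

# Hypothetical field labels; real charts vary widely in layout and wording.
FIELD_PATTERNS = {
    "name": re.compile(r"Patient Name:\s*([A-Za-z][A-Za-z .'-]*)"),
    "age": re.compile(r"Age:\s*(\d{1,3})"),
    "gender": re.compile(r"(?:Gender|Sex):\s*(Male|Female|M|F)\b", re.IGNORECASE),
}


def extract_icd_codes(text):
    """Return ICD-10-shaped codes found in the text, unique, in order of first appearance."""
    codes = []
    for match in ICD10_PATTERN.finditer(text):
        code = match.group(0)
        if code not in codes:
            codes.append(code)
    return codes


def extract_patient_fields(text):
    """Return a dict of the labeled patient fields that were found in the text."""
    fields = {}
    for key, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            fields[key] = match.group(1).strip()
    return fields


# Made-up OCR output used only to exercise the functions.
sample = """Patient Name: Jane Doe
Age: 54
Gender: F
Assessment: Type 2 diabetes mellitus without complications (E11.9).
Essential hypertension, I10. Follow-up for E11.9 in 3 months."""

print(extract_icd_codes(sample))
print(extract_patient_fields(sample))
```

Regular expressions suit codes and labeled fields because both follow a predictable shape. Free-text mentions without a label or code are where the deep learning components described above would take over.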
Realized Benefits
We achieved a 90% success rate across more than 5,000 processed documents. Key outcomes included:
Reduced the time required for clinical guideline synthesis by 60%
Increased output from 10-11 charts/day with manual review to 15-16 charts/day with NLP assistance
Decreased administrative costs
Tools & Technologies
Python
scikit-learn
RabbitMQ
PyTorch
Sci-Hub
spaCy
Tesseract OCR
OpenCV