This project aims to improve the extraction of structured data from free-text clinical notes in Electronic Medical Records (EMRs) using Large Language Models (LLMs). It focuses on generating synthetic clinical notes, which make it possible to develop extraction methods and share data while maintaining patient privacy. The synthetic notes are built from curated templates and LLMs so that they mimic real notes without exposing Protected Health Information (PHI). The project also fine-tunes LLMs to improve data extraction accuracy, and plans to validate the synthetic notes through a Turing Test-style experiment. Planned future work includes extending the tool to more clinical note types and disease sites, and building a web-based tool for customization.
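
As a rough illustration of the template-based approach described above, the following minimal sketch fills a note template with fictional values and keeps the structured fields alongside the generated text. It is not the project's actual pipeline: the template wording, the `FIELD_VALUES` pools, and the `generate_note` function are all hypothetical, and the LLM step is left out entirely.

```python
import random
from string import Template

# Hypothetical template; the project's curated templates are not shown in this README.
NOTE_TEMPLATE = Template(
    "Patient is a $age-year-old $sex with $diagnosis, stage $stage. "
    "Plan: $plan."
)

# Entirely fictional value pools (assumed for illustration), so no real PHI is involved.
FIELD_VALUES = {
    "age": [52, 61, 68, 74],
    "sex": ["male", "female"],
    "diagnosis": ["prostate adenocarcinoma", "non-small cell lung cancer"],
    "stage": ["I", "II", "III"],
    "plan": ["external beam radiation therapy", "surgical consultation"],
}


def generate_note(rng):
    """Return (note_text, fields): a synthetic note and the structured data behind it."""
    # Pick one fictional value per field, then substitute them into the template.
    fields = {name: rng.choice(values) for name, values in FIELD_VALUES.items()}
    return NOTE_TEMPLATE.substitute(fields), fields


# A seeded generator makes the sketch reproducible from run to run.
note, fields = generate_note(random.Random(0))
print(note)
print(fields)
```

Because each generated note comes paired with the fields used to build it, pairs like these could in principle serve as labeled examples when evaluating or fine-tuning an extraction model, though whether the project does this is an assumption here.
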
| Folder | Description |
|---|---|
| Documentation | All documentation the project team has created to describe the architecture, design, installation, and configuration of the project |
| Notes and Research | Background information that helps explain the tools and techniques used in the project |
| Project Deliverables | Final PDF versions of all Fall and Spring major deliverables |
| Status Reports | Project management documentation: weekly reports, milestones, etc. |
| scr | Source code, organized into as many subdirectories as needed |
- Rishabh Kapoor - iHealth Solutions - Sponsor
- Preetam Ghosh - Department of Computer Science - Faculty Advisor
- Shashank Sinha - Computer Science - Student Team Member
- Connor Holden - Computer Science - Student Team Member
- August Moses - Computer Science - Student Team Member
- Sawiya Aidarus - Computer Science - Student Team Member