Samy-mri/medicalNLP

Medical NLP project

The goal is to predict the medical speciality from a medical transcript.

The dataset was found on Kaggle. I cannot find the URL, so I uploaded it here.

The code implements a simple bag-of-words model, following the scikit-learn tutorial (https://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html).
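As a rough sketch of what the bag-of-words step does, the snippet below turns a few transcripts into a word-count matrix with scikit-learn's `CountVectorizer`. The three example sentences are invented for illustration and are not taken from the dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Toy transcripts, invented for illustration (not from the dataset).
docs = [
    "patient reports headache and dizziness",
    "mri of the brain shows no lesion",
    "patient underwent craniotomy for brain tumor",
]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)  # sparse matrix: documents x vocabulary terms

# vocabulary_ maps each term to its column index in X.
brain_col = vectorizer.vocabulary_["brain"]
brain_total = int(X[:, brain_col].sum())  # total occurrences of "brain"

print(X.shape)
print(brain_total)
```

Each row of `X` is one document and each column is one vocabulary term, so word order is discarded and only counts remain.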

df.columns: Index(['index', 'description', 'medical_specialty', 'sample_name', 'transcription', 'keywords'])

There are 39 medical specialities:

array(['Allergy / Immunology', 'Bariatrics', 'Cardiovascular / Pulmonary',
       'Dentistry', 'Urology', 'General Medicine', 'Surgery',
       'Speech - Language', 'SOAP / Chart / Progress Notes',
       'Sleep Medicine', 'Rheumatology', 'Radiology',
       'Psychiatry / Psychology', 'Podiatry', 'Physical Medicine - Rehab',
       'Pediatrics - Neonatal', 'Pain Management', 'Orthopedic',
       'Ophthalmology', 'Office Notes', 'Obstetrics / Gynecology',
       'Neurosurgery', 'Neurology', 'Nephrology', 'Letters',
       'Lab Medicine - Pathology', 'IME-QME-Work Comp etc.',
       'Hospice - Palliative Care', 'Hematology - Oncology',
       'Gastroenterology', 'ENT - Otolaryngology', 'Endocrinology',
       'Emergency Room Reports', 'Discharge Summary',
       'Diets and Nutritions', 'Dermatology',
       'Cosmetic / Plastic Surgery', 'Consult - History and Phy.',
       'Chiropractic'])	   

I'd like to classify the medical speciality based on keywords. It's easy to change which specialities to keep by filtering the DataFrame:

df = df[df['medical_specialty'].isin(['Neurosurgery','Neurology'])]
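To show the effect of that filter, here is a minimal sketch on a tiny stand-in DataFrame. The rows and the `keep` and `subset` names are made up for illustration. Only the `medical_specialty` and `keywords` column names come from the dataset.

```python
import pandas as pd

# Tiny stand-in for the real DataFrame; rows are invented for illustration.
df = pd.DataFrame({
    "medical_specialty": ["Neurology", "Urology", "Neurosurgery", "Neurology"],
    "keywords": ["seizure, eeg", "kidney, stone", "craniotomy, tumor", "migraine, headache"],
})

# Hypothetical list of specialities to keep; edit this to change the task.
keep = ["Neurosurgery", "Neurology"]
subset = df[df["medical_specialty"].isin(keep)].reset_index(drop=True)

# Check how many samples remain per class before training.
print(subset["medical_specialty"].value_counts())
```

Checking the class counts after filtering is a cheap way to spot a heavily imbalanced pair of specialities before training.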

I used Gaussian NB, Multinomial NB, a linear SVM, and both an SVM and a logistic regression trained with SGDClassifier.

The input data consists of discrete word counts, so Gaussian NB should perform badly because it assumes continuous, normally distributed features. Multinomial NB is the counterpart designed for discrete count features.
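The difference between the two models can be seen in what each one learns. The sketch below fits both on an invented word-count matrix: Gaussian NB stores a per-class mean for every feature, while Multinomial NB stores a per-class probability distribution over the terms.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB

# Invented word-count matrix: 4 documents x 3 vocabulary terms.
X = np.array([
    [3, 0, 1],
    [2, 0, 0],
    [0, 4, 1],
    [0, 2, 2],
])
y = np.array([0, 0, 1, 1])

gnb = GaussianNB().fit(X, y)
mnb = MultinomialNB().fit(X, y)

# Gaussian NB: per-class mean of each feature (a normal density is fit per feature).
print(gnb.theta_)

# Multinomial NB: per-class probability of each term; each row sums to 1.
term_probs = np.exp(mnb.feature_log_prob_)
print(term_probs.sum(axis=1))
```

Word counts are non-negative integers and mostly zero, which a normal density describes poorly. The multinomial model treats them as draws from a distribution over the vocabulary instead.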

Example 1: Classify between Neurosurgery and Neurology:

Example 2: Classify between all specialities:
