CareSight Dashboard
CareSight Model Results

Overview

CareSight predicts hospital excess readmission ratios across approximately 18,000 US facilities using CMS public data. Readmission rates are a direct quality metric: CMS financially penalizes hospitals with high rates of avoidable readmissions. This project builds and compares multiple regression models to surface which facilities are at risk and why.
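
For context, CMS reports the excess readmission ratio as a facility's predicted readmission rate divided by its expected rate, so a value above 1.0 indicates more readmissions than expected. A minimal sketch of that calculation follows; the helper name, facility labels, and rates are hypothetical and are not taken from the project or from CMS data.

```python
def excess_readmission_ratio(predicted_rate: float, expected_rate: float) -> float:
    """Return predicted / expected readmission rate (hypothetical helper)."""
    if expected_rate <= 0:
        raise ValueError("expected_rate must be positive")
    return predicted_rate / expected_rate


# Made-up illustrative rates as (predicted, expected) pairs, not CMS data.
facilities = {
    "facility_a": (0.18, 0.15),
    "facility_b": (0.12, 0.15),
}

ratios = {
    name: excess_readmission_ratio(predicted, expected)
    for name, (predicted, expected) in facilities.items()
}

# Facilities with a ratio above 1.0 have more readmissions than expected.
flagged = [name for name, ratio in ratios.items() if ratio > 1.0]
print(flagged)
```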

What was built

  • Multi-model benchmark: Ridge regression, Lasso regression, XGBoost, and a neural network (TensorFlow/Keras) trained on CMS hospital-level data.
  • Best model: the neural network (R² = 0.975), followed by XGBoost (R² = 0.952). Both results are reported on a held-out test set.
  • SHAP-based feature importance analysis to explain which discharge and patient-volume features drive predicted readmission risk.
  • Saved model artifacts: .pkl (XGBoost) and .h5 (neural network) for reproducible inference.
  • Clean data pipeline: raw CMS CSV ingestion, preprocessing, feature engineering, and train/test split in a single notebook.
  • Well-structured README with problem statement, methodology, results table, and setup instructions.
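
The benchmark pattern in the list above (fit several regressors, then compare R² on a held-out split) might look roughly like the following. This is a sketch under stated assumptions, not the project's notebook: it uses synthetic data in place of the CMS CSV, covers only the Ridge and Lasso models to stay dependency-light, and the hyperparameters, split size, and variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the engineered CMS features (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
true_weights = np.array([1.5, -2.0, 0.0, 0.7, 0.0])
y = X @ true_weights + rng.normal(scale=0.1, size=500)

# Hold out 20% of rows so that R^2 is measured on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Scale features inside each pipeline so the scaler is fit on training rows only.
models = {
    "ridge": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
    "lasso": make_pipeline(StandardScaler(), Lasso(alpha=0.01)),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = r2_score(y_test, model.predict(X_test))

# Report models from best to worst held-out R^2.
for name, score in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: R^2 = {score:.3f}")
```

Wrapping the scaler and estimator in one pipeline keeps test-set statistics out of preprocessing, which is what makes the held-out score a fair comparison across models.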

Why it matters

Healthcare ML has real regulatory stakes. Identifying high-risk facilities before CMS penalties are issued gives administrators actionable lead time. SHAP explanations make the model's signals interpretable to non-technical stakeholders, which is a requirement in any healthcare decision-support context.

Project Info

  • Category: Healthcare ML
  • Best model: Neural Network (R² = 0.975)
  • Dataset: CMS hospital readmission data (~18,000 facilities)
  • Stack: Python, XGBoost, TensorFlow, SHAP, Scikit-learn, Pandas
  • GitHub: Hospital-Readmission-Prediction