CDCC — Clinical Decision Confidence Consistency
Weight in Healthcare Readiness Score: 15%
Tests whether the AI knows what it doesn't know. Measures calibration (are confidence scores accurate?), demographic consistency (same quality across patient groups), and context robustness (stable answers despite irrelevant changes).
Evaluation Components
- Confidence calibration accuracy
- Demographic response consistency
- Context perturbation robustness
- Uncertainty acknowledgment rate
Test Cases
- Confidence score vs actual accuracy correlation
- Same case across different demographics
- Adding irrelevant context to clinical scenarios
- Edge case uncertainty detection