Executive Summary
The decision-maker view
Business question
Which customer behaviors best explain credit default risk, how predictive is the dataset across different ML tasks, and can the same data also support meaningful customer segmentation for risk management?
Main conclusion
The strongest practical value of the dataset lies in classification and behavioral segmentation. Regression is useful for understanding nonlinear dynamics, but default prediction and customer clustering produce the clearest operational insights.
Strategic takeaway
Banks should prioritize repayment behavior and utilization stress indicators in early-warning systems. These signals are more actionable than static demographic variables and better aligned with operational credit-risk decisions.
Project Context
Dataset scope and analytical ambition
Dataset overview
The project uses the UCI “Default of Credit Card Clients” dataset, covering 30,000 customers in Taiwan and combining customer characteristics, credit line allocation, monthly payment status, billing amounts, payment amounts, and next-month default labels.
Analytical objective
The analysis intentionally approaches the data from three angles: regression for financial behavior understanding, supervised classification for default prediction, and unsupervised clustering for customer segmentation and anomaly-oriented diagnostics.
Core Behavioral Drivers
What the data says matters most
1. Payment history
Repayment status variables emerge as the clearest risk signal. Even small delays materially increase default probability, and later-stage delinquency sharply shifts customers into the high-risk segment.
2. Utilization as a stress signal
Credit utilization functions as a strong secondary indicator of financial pressure. High utilization does not act alone, but it becomes highly informative when combined with repayment patterns.
3. Spending is informative, not decisive
Average billing and payment amounts add value, but on their own they do not separate risk classes sharply. Their strength comes from interaction effects, not isolated linear relationships.
Regression Findings
Useful for behavior understanding, not the strongest final use case
Why regression was tested
The project explored whether behavioral features such as average bill amount, average payment amount, payment power, and utilization could predict credit limit allocation (LIMIT_BAL). This was meant to test how far behavior-only variables can explain financial capacity.
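To make the engineered features concrete, the following is a minimal sketch of how they might be derived from a single customer record. The column names (`LIMIT_BAL`, `BILL_AMT1` to `BILL_AMT6`, `PAY_AMT1` to `PAY_AMT6`) follow the public UCI dataset. The formulas are assumptions, since the report does not define them: utilization is taken as average bill over credit limit, and payment power as average payment over average bill. The function name and the example record are hypothetical.

```python
from statistics import mean


def behavioral_features(row):
    """Derive behavior features from one customer record (a dict).

    Assumed definitions (not confirmed by the report):
      utilization   = average bill amount / LIMIT_BAL
      payment_power = average payment amount / average bill amount
    """
    avg_bill = mean(row[f"BILL_AMT{i}"] for i in range(1, 7))
    avg_pay = mean(row[f"PAY_AMT{i}"] for i in range(1, 7))
    limit = row["LIMIT_BAL"]

    utilization = avg_bill / limit if limit > 0 else 0.0
    # Average bills can be zero or negative (credits, overpayments),
    # so guard the ratio instead of dividing blindly.
    payment_power = avg_pay / avg_bill if avg_bill > 0 else 0.0

    return {
        "avg_bill": avg_bill,
        "avg_payment": avg_pay,
        "utilization": utilization,
        "payment_power": payment_power,
    }


# Hypothetical customer, for illustration only.
example = {"LIMIT_BAL": 100_000}
example.update({f"BILL_AMT{i}": 40_000 for i in range(1, 7)})
example.update({f"PAY_AMT{i}": 10_000 for i in range(1, 7)})

features = behavioral_features(example)
print(features)
```

Under this assumed definition, utilization contains `LIMIT_BAL` in its denominator, so using it as a predictor of `LIMIT_BAL` would leak information about the target. The report does not state whether the project's definition works this way.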
Main lesson
Linear regression delivered moderate performance (R² ≈ 0.59–0.61), indicating that the relationship is only partially linear. Nonlinear ensemble methods captured the structure much better, which suggests that customer financial behavior is driven by interactions and nonlinear effects that a linear model cannot represent.
| Model | Observed performance | Interpretation | Assessment |
|---|---|---|---|
| Linear Regression | R² ≈ 0.59–0.61 | Provides interpretable but limited explanatory power. Generalizes reasonably, yet misses nonlinear dynamics and leaves high error in monetary terms. | Baseline only |
| Extra Trees Regressor | R² ≈ 0.96 | Strongest regression performance in the report comparison. Captures nonlinear interactions effectively and aligns closely with actual credit limit values. | Best regression model |
| Strategic implication | Behavior-only prediction remains incomplete | Even with a strong nonlinear fit, real-world credit limit decisions still require external variables such as income, employment, debt-to-income ratio, and broader financial history. | Needs richer data |
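The gap between a linear baseline and a tree ensemble can be reproduced on toy data. The sketch below uses scikit-learn on a synthetic target that mixes a linear term with an interaction term. It is not the credit dataset and is not expected to reproduce the R² values in the table. The data-generating formula and all hyperparameters are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data (NOT the credit dataset): one linear effect plus an
# interaction that a purely linear model cannot represent.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(2000, 3))
y = (
    2.0 * X[:, 0]
    + 8.0 * (X[:, 1] - 0.5) * (X[:, 2] - 0.5)
    + rng.normal(0.0, 0.1, size=2000)
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

linear = LinearRegression().fit(X_train, y_train)
trees = ExtraTreesRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)

# Score both models on held-out data so the comparison reflects generalization.
r2_linear = r2_score(y_test, linear.predict(X_test))
r2_trees = r2_score(y_test, trees.predict(X_test))
print(f"Linear R^2: {r2_linear:.2f} | Extra Trees R^2: {r2_trees:.2f}")
```

Evaluating on a held-out split matters here: tree ensembles can fit training data almost perfectly, so only out-of-sample R² says anything about real explanatory power.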
Classification Findings
The strongest operational value of the project
Best model
The Gradient Boosting Classifier emerged as the best-performing model in the comparison workflow, balancing discrimination ability with a manageable number of false positives.
What drove performance
The model benefited from combining recent payment status variables with engineered financial behavior features such as payment power and utilization rate.
Why it matters
This part of the project demonstrates real potential for credit-risk scoring, preemptive intervention, and threshold-based decision strategies in banking.
| Metric | Value | What it means |
|---|---|---|
| AUC | ~0.81 | Good ability to distinguish defaulters from non-defaulters. |
| Accuracy | ~0.72 | Moderate overall correctness across both classes. |
| Precision | ~0.79 | Positive flags are fairly reliable, which limits unnecessary interventions. |
| Recall | ~0.61 | The model captures roughly 60% of actual defaulters, leaving meaningful false-negative risk. |
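For readers who want to see how accuracy, precision, and recall relate to confusion-matrix counts, here is a small sketch. The counts are hypothetical round numbers chosen for easy arithmetic. They are not the project's confusion matrix and do not reproduce the values in the table. AUC is omitted because it depends on the full ranking of predicted scores, not on a single set of counts.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from confusion-matrix counts.

    tp: defaulters correctly flagged      fp: non-defaulters wrongly flagged
    fn: defaulters missed                 tn: non-defaulters correctly cleared
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Of all customers flagged, how many actually default?
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of all actual defaulters, how many were flagged?
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=60, fp=20, fn=40, tn=80)
print(metrics)
```

With these illustrative counts, precision is 0.75 and recall is 0.60: most flags are correct, yet 40 of 100 defaulters go undetected. This is the same qualitative pattern the table describes.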
Confusion matrix message
- Most non-defaulters are correctly identified.
- A substantial number of actual defaulters are also captured.
- The most costly error remains the false negative: a customer who will default but is not flagged.
- False positives occur but are materially fewer than missed defaulters (false negatives).
Executive interpretation
This is a usable risk model, not a perfect one. It is well suited for probability-based screening and threshold tuning, especially in settings where the business wants to trade off missed defaulters against customer friction.
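Threshold tuning can be sketched as a cost minimization over candidate cut-offs. In the example below, the labels, scores, and cost weights are all hypothetical. A real deployment would use validated model probabilities and cost estimates supplied by the business.

```python
def expected_cost(y_true, scores, threshold, cost_fn, cost_fp):
    """Total cost of errors when flagging every score >= threshold."""
    # Missed defaulters: true label 1 but scored below the threshold.
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    # Unnecessary flags: true label 0 but scored at or above the threshold.
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return fn * cost_fn + fp * cost_fp


def best_threshold(y_true, scores, cost_fn, cost_fp, candidates):
    """Return the candidate threshold with the lowest total cost.

    Ties are resolved in favor of the earliest candidate.
    """
    return min(
        candidates,
        key=lambda t: expected_cost(y_true, scores, t, cost_fn, cost_fp),
    )


# Hypothetical labels (1 = default) and predicted default probabilities.
y_true = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.55, 0.70, 0.60, 0.90]
candidates = [i / 10 for i in range(1, 10)]

equal_costs = best_threshold(y_true, scores, cost_fn=1, cost_fp=1, candidates=candidates)
fn_heavy = best_threshold(y_true, scores, cost_fn=5, cost_fp=1, candidates=candidates)
print(equal_costs, fn_heavy)
```

On this toy data, equal error costs select a threshold of 0.5, while weighting a missed defaulter five times as heavily as a false alarm lowers it to 0.3. The direction of the shift is the point: the more expensive a false negative, the more customers the bank should be willing to flag.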
Segmentation Findings
Where clustering adds business value
K-Means outcome
The elbow analysis and PCA projection support a 3-cluster structure. These clusters map well to interpretable behavior-based profiles: low-risk, medium-risk, and higher-volatility customer groups.
- Cluster 0: low utilization, low payment stress, stable borrowers
- Cluster 1: moderate utilization and mixed risk patterns
- Cluster 2: high capacity, high activity, higher payment stress
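The elbow analysis and PCA projection mentioned above could look roughly like the following scikit-learn sketch. It runs on synthetic four-dimensional data with three planted groups, not on the credit dataset. The feature count, the cluster centers, and the range of k values are all assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for behavioral features (NOT the credit dataset):
# three groups of 200 points each in four dimensions.
rng = np.random.default_rng(0)
centers = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [4.0, 4.0, 0.0, 0.0],
    [0.0, 4.0, 4.0, 4.0],
])
X = np.vstack([c + rng.normal(0.0, 1.0, size=(200, 4)) for c in centers])

# K-Means is distance-based, so features are standardized first.
X_scaled = StandardScaler().fit_transform(X)

# Elbow analysis: record within-cluster sum of squares (inertia) per k.
inertias = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    inertias[k] = model.inertia_

# Fit the chosen 3-cluster solution and project to 2D for plotting.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled)
projection = PCA(n_components=2).fit_transform(X_scaled)

for k, value in inertias.items():
    print(k, round(value, 1))
```

The elbow is the value of k after which inertia stops dropping sharply. Plotting `projection` colored by `labels` gives the kind of two-dimensional view the report uses to check that the clusters are visually separable.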
DBSCAN diagnostic
DBSCAN showed that the dataset does not contain sharply separated density-based clusters. Instead, customer behavior follows broad, continuous gradients with overlapping regions and a meaningful amount of noise.
- Two dominant DBSCAN clusters cover most observations
- ~17% of observations behave like noise or irregular transition cases
- Small micro-clusters reflect niche financial behaviors rather than stable market segments
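A noise share like the one reported above is typically read off the DBSCAN labels, where scikit-learn marks noise points with `-1`. The sketch below shows that calculation on synthetic data with two dense groups plus scattered points. The data, `eps`, and `min_samples` are assumptions and are not the project's settings.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# Synthetic data (NOT the credit dataset): two dense groups plus
# uniformly scattered points that should mostly end up as noise.
rng = np.random.default_rng(0)
dense_a = rng.normal([0.0, 0.0], 0.3, size=(300, 2))
dense_b = rng.normal([3.0, 3.0], 0.3, size=(300, 2))
scattered = rng.uniform(-2.0, 5.0, size=(100, 2))
X = StandardScaler().fit_transform(np.vstack([dense_a, dense_b, scattered]))

# eps and min_samples are illustrative; real values need tuning per dataset.
labels = DBSCAN(eps=0.2, min_samples=10).fit_predict(X)

# DBSCAN labels noise as -1; every other label is a cluster id.
noise_share = float(np.mean(labels == -1))
n_clusters = len(set(labels.tolist()) - {-1})
print(f"clusters: {n_clusters}, noise share: {noise_share:.1%}")
```

Because the noise share is sensitive to `eps` and `min_samples`, a figure such as ~17% is best read alongside the parameter values that produced it.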
Business Implications
Where these results could be applied
Risk scoring
Recent payment behavior and utilization can strengthen early-warning models for delinquency and credit-risk monitoring.
Credit line strategy
Behavioral clusters can help distinguish clients who may qualify for limit increases from those who require tighter risk controls.
Targeted action
Segmentation supports differentiated communication, loyalty incentives, refinancing offers, and intervention strategies by customer profile.
Limitations & Next Steps
What would make the system stronger
Current limitations
- The dataset lacks external variables such as income, employment, and bureau-based credit history.
- Behavior-only models can be operationally useful, but not fully sufficient for production-grade lending decisions.
- Classification still leaves a meaningful false-negative cost that would matter in real financial settings.
Next analytical steps
- Add richer socioeconomic and credit-history features.
- Tune decision thresholds according to business risk tolerance.
- Extend explainability for executive use and governance.
- Test deployment use cases such as early-warning dashboards or risk tiering workflows.