
Gradient boosting is a machine learning technique for regression and classification that builds a strong predictive model by combining many weak learners, typically shallow decision trees. Each new model in the sequence is trained to correct the errors of the ensemble built so far by fitting the negative gradient of a loss function, a form of gradient descent in function space. This iterative refinement makes gradient boosting highly effective on structured, tabular datasets.
Gradient boosting is widely used in industries such as finance, healthcare, and marketing because it provides high accuracy, handles complex feature relationships, and performs well even with noisy or imbalanced data. It is considered one of the most powerful classical machine learning methods for problems where deep learning is not required.
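As a starting point, the sketch below shows one common way to apply gradient boosting to a tabular classification task using scikit-learn's GradientBoostingClassifier; the synthetic dataset and hyperparameter values are illustrative choices, not prescriptions.

```python
# Minimal gradient boosting example with scikit-learn (illustrative sketch).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic tabular data standing in for a real structured dataset.
X, y = make_classification(n_samples=2000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Shallow trees (weak learners) are combined sequentially with a small learning rate.
model = GradientBoostingClassifier(
    n_estimators=200,    # number of boosting rounds
    learning_rate=0.05,  # how much each tree contributes
    max_depth=3,         # keep individual trees weak
    random_state=42,
)
model.fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```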
Weak learners: Small, simple models (often decision tree stumps) that perform only slightly better than random guessing.
Sequential learning: Each weak learner is added to the ensemble in turn, adjusting for the prediction errors made so far.
Loss function: Measures prediction error; common choices include squared error for regression and log loss for classification. Each iteration trains the next model on the negative gradient of this loss (the pseudo-residuals).
Gradient descent: Updates the ensemble in the direction that shifts predictions to minimize the loss.
Learning rate (shrinkage): Controls how much each new model contributes. Smaller values typically generalize better but require more boosting rounds.
Subsampling (stochastic gradient boosting): Uses random sampling of training rows in each round to reduce overfitting and improve generalization, as illustrated in the sketch below.
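To tie these components together, here is a minimal from-scratch sketch of gradient boosting for squared-error regression; the synthetic data, hyperparameter values, and the ensemble_predict helper are illustrative assumptions rather than a production implementation.

```python
# From-scratch sketch of gradient boosting for squared-error regression.
# Illustrative only; real libraries add regularization, line search, and more.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.2, size=500)

n_rounds, learning_rate, subsample = 100, 0.1, 0.8
prediction = np.full_like(y, y.mean())   # start from a constant model
trees = []

for _ in range(n_rounds):
    # Negative gradient of squared error = residuals of the current ensemble.
    residuals = y - prediction
    # Stochastic gradient boosting: fit each tree on a random row subsample.
    idx = rng.choice(len(X), size=int(subsample * len(X)), replace=False)
    tree = DecisionTreeRegressor(max_depth=2)   # a weak learner
    tree.fit(X[idx], residuals[idx])
    # Shrink each tree's contribution by the learning rate.
    prediction += learning_rate * tree.predict(X)
    trees.append(tree)

def ensemble_predict(X_new, base=y.mean()):
    # Sum the shrunken contributions of all trees on top of the base value.
    return base + learning_rate * sum(t.predict(X_new) for t in trees)

print("train MSE:", np.mean((y - prediction) ** 2))
```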
XGBoost: Highly optimized implementation with built-in regularization, parallel training, and efficient handling of missing data.
LightGBM: Uses histogram-based splits and leaf-wise tree growth for faster training and better scaling to large datasets.
CatBoost: Optimized for datasets with categorical variables, handling them natively and reducing preprocessing effort.
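The snippet below sketches how these three libraries expose broadly similar interfaces, assuming the xgboost, lightgbm, and catboost packages are installed; the hyperparameter values and categorical column names are placeholders.

```python
# Roughly equivalent classifiers from the three libraries above (illustrative sketch).
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
from catboost import CatBoostClassifier

xgb_model = XGBClassifier(
    n_estimators=300, learning_rate=0.05, max_depth=4,
    reg_lambda=1.0,          # built-in L2 regularization
)                            # missing values are handled natively during split finding

lgbm_model = LGBMClassifier(
    n_estimators=300, learning_rate=0.05,
    num_leaves=31,           # leaf-wise growth controlled by leaf count
)

cat_model = CatBoostClassifier(
    iterations=300, learning_rate=0.05, depth=4,
    cat_features=["country", "device_type"],   # hypothetical categorical columns
    verbose=0,
)

# Each model exposes the familiar fit/predict interface,
# e.g. xgb_model.fit(X_train, y_train) followed by xgb_model.predict(X_test).
```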
A bank builds a fraud detection system using gradient boosting. The model iteratively learns from previously misclassified transactions, improving fraud prediction accuracy over time.
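One plausible way to set up such a fraud detector is sketched below, using XGBoost on synthetic, heavily imbalanced data; the scale_pos_weight weighting and all hyperparameters are illustrative assumptions rather than a reference implementation.

```python
# Hedged sketch of the fraud-detection use case: gradient boosting on an
# imbalanced transaction dataset (synthetic stand-in data).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from xgboost import XGBClassifier

# Roughly 1% positive rate to mimic the class imbalance typical of fraud data.
X, y = make_classification(n_samples=20000, n_features=30,
                           weights=[0.99, 0.01], random_state=7)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=7)

# scale_pos_weight upweights the rare fraud class so errors on it matter more.
imbalance_ratio = (y_train == 0).sum() / (y_train == 1).sum()
model = XGBClassifier(
    n_estimators=400,
    learning_rate=0.05,
    max_depth=4,
    scale_pos_weight=imbalance_ratio,
)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test), digits=3))
```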