Horse Racing Outcome Prediction — Deep Neural Network vs Random Forest
This group project analyzes Hong Kong horse racing data and compares several machine learning methods for predicting race outcomes. The work combines data preparation, feature engineering, model comparison, and result interpretation.
Highlights
- Worked with a dataset containing more than 130,000 Hong Kong horse racing records.
- Compared deep learning and classical machine learning models on the same prediction task.
- Examined which features contributed most strongly to predictive performance.
Methods
- Built a dataset combining horse demographics, race characteristics, jockey and trainer information, track conditions, and sectional time variables.
- Evaluated Deep Neural Network, Random Forest, XGBoost, and Logistic Regression models.
- Compared models on out-of-sample predictive accuracy and on their feature importance patterns.
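The comparison step above can be sketched in plain Python. This is a minimal illustration, not the project's actual pipeline. The field names `race_id`, `win_odds`, and `won` are hypothetical, and the two "models" are simple stand-in rules in place of the trained networks and tree ensembles. The sketch assumes that all runners of a race are held out together, so that no race contributes rows to both the training and the test set.

```python
import random


def split_by_race(rows, test_fraction=0.2, seed=0):
    """Hold out whole races so runners of one race never straddle the split."""
    race_ids = sorted({row["race_id"] for row in rows})
    rng = random.Random(seed)
    rng.shuffle(race_ids)
    n_test = max(1, int(len(race_ids) * test_fraction))
    test_ids = set(race_ids[:n_test])
    train = [r for r in rows if r["race_id"] not in test_ids]
    test = [r for r in rows if r["race_id"] in test_ids]
    return train, test


def accuracy(predict, rows):
    """Share of rows where the predicted label equals the observed one."""
    return sum(1 for r in rows if predict(r) == r["won"]) / len(rows)


def compare_models(models, rows):
    """Score every model (a row -> label function) on the same held-out rows."""
    return {name: accuracy(predict, rows) for name, predict in models.items()}


# Toy data (hypothetical): 10 races with 4 runners each; the shortest-priced
# runner wins every race.
rows = [
    {"race_id": race, "win_odds": 2.0 + 3.0 * horse, "won": int(horse == 0)}
    for race in range(10)
    for horse in range(4)
]

train, test = split_by_race(rows)

# Stand-ins for fitted models: any object reduced to a row -> label function.
models = {
    "odds_rule": lambda r: int(r["win_odds"] < 3.0),
    "always_lose": lambda r: 0,
}
scores = compare_models(models, test)
print(scores)
```

Scoring every model on the same held-out races keeps the accuracy figures directly comparable. A real run would replace the stand-in rules with the fitted models' prediction functions.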
Findings
- Random Forest achieved the strongest out-of-sample accuracy among the main models tested.
- Deep Neural Networks were competitive but needed more tuning and architectural adjustment to reach that level.
- Odds-related and horse-form variables contributed substantially to predictive performance.
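One common way to measure how much a feature contributes is permutation importance: shuffle one feature column and record how far accuracy drops. The sketch below is a plain-Python illustration of that general technique and is not necessarily the method used in this project. The two-column feature layout (`win_odds`, then a noise column) and the stand-in `predict` rule are assumptions made only for the example.

```python
import random


def score(predict, X, y):
    """Accuracy of a row -> label function on feature rows X and labels y."""
    return sum(1 for xi, yi in zip(X, y) if predict(xi) == yi) / len(y)


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean accuracy drop per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = score(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by shuffled values.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - score(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy data (hypothetical): column 0 is win odds, column 1 is unrelated noise.
X = [[1.5 + 0.5 * i, float(i % 3)] for i in range(20)]
y = [int(row[0] < 5.0) for row in X]


def predict(row):
    # Stand-in for a fitted model; it looks only at the odds column.
    return int(row[0] < 5.0)


importances = permutation_importance(predict, X, y)
print(importances)
```

A feature the model ignores gets an importance of exactly zero, while a feature the model relies on shows a positive accuracy drop. Importances from this procedure describe the fitted model, not a causal effect of the feature on race results.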