Numerai: Model Deployment Pipeline on Azure
What is Numerai?
- Numerai is a quantitative hedge fund run by data scientists around the world. It holds a weekly (now daily) tournament to crowdsource machine learning models that predict the stock market.
- Data scientists can download the data, build models, upload predictions, and stake on their models' performance. Models are evaluated on live market data, and the best models are rewarded. See Numerai for more details.
What is this project about? Why does it matter?
- Numerai's daily submission window is only about 1 hour long and its opening time fluctuates, so staying up late to submit predictions manually is impractical.
- This project helps Numerai participants automate daily submissions on Azure at very low cost (under $1 USD per month).
Link to repo: https://github.com/eses-wk/numerai-azure-example
Detailed submission pipeline:
- Train models locally and prepare a bare-bones submission script
- Build, test, and push a Docker image to Docker Hub or any other container registry
- Set up an Azure Container Instance (ACI) that pulls the Docker image from the registry
- Set up an Azure Function that triggers an ACI run when it receives Numerai's daily submission API call (via the submission webhook); the container then submits predictions to Numerai automatically
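The last step of the pipeline can be sketched as a small handler. This is a minimal illustration, not the code from the repo: the function name `handle_submission_webhook`, the `start_container` callable, and the `roundNumber` payload field are all assumptions made for this example. In a real Azure Function, `start_container` would wrap whatever Azure SDK call starts the container group.

```python
import json
from typing import Callable, Tuple


def handle_submission_webhook(
    body: bytes, start_container: Callable[[], None]
) -> Tuple[int, str]:
    """Return an (HTTP status, message) pair for an incoming webhook request.

    `start_container` is a hypothetical callable that starts the ACI
    container group; it is injected so the logic can be tested locally.
    """
    try:
        # An empty body is treated as an empty JSON object.
        payload = json.loads(body or b"{}")
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return 400, "invalid JSON body"

    if not isinstance(payload, dict):
        return 400, "expected a JSON object"

    # Assumption: the payload may carry a round number; it is used only
    # in the response message, never to decide whether to run.
    round_number = payload.get("roundNumber", "unknown")

    # Kick off the container run; the container itself uploads predictions.
    start_container()
    return 200, f"container start requested for round {round_number}"


if __name__ == "__main__":
    # Local dry run with a stand-in for the real container start call.
    print(handle_submission_webhook(b'{"roundNumber": 1}', lambda: None))
```

Keeping the container start behind a callable means the handler can be exercised without Azure credentials, and the Azure-specific code stays in one place.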
Other Numerai tournament experiments done:
- Dimension reduction and feature generation using UMAP (Uniform Manifold Approximation and Projection) and PCA
- Feature neutralization and feature selection (SHAP, ANOVA)
- Trained XGBoost, LightGBM, CatBoost, and neural network models, with hyperparameter search using RandomizedSearchCV
- Work in progress: a portfolio optimization toolkit using weighted-average-return and minimum-variance methods
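To make the feature neutralization item concrete, here is a simplified sketch for a single feature. It is an illustration under stated assumptions, not the implementation used in the experiments: the function name `neutralize` and the `proportion` parameter are chosen for this example, both series are centered first, and only linear exposure to one feature is removed. Neutralizing against many features at once would typically use a least-squares projection instead.

```python
from typing import List


def neutralize(
    predictions: List[float], feature: List[float], proportion: float = 1.0
) -> List[float]:
    """Remove `proportion` of the linear exposure of predictions to one feature.

    Both inputs are centered, then the projection of the predictions onto
    the feature is subtracted (scaled by `proportion`).
    """
    n = len(predictions)
    if n == 0 or n != len(feature):
        raise ValueError("predictions and feature must be non-empty and equal length")

    p_mean = sum(predictions) / n
    f_mean = sum(feature) / n
    p = [x - p_mean for x in predictions]
    f = [x - f_mean for x in feature]

    f_norm_sq = sum(x * x for x in f)
    if f_norm_sq == 0.0:
        # A constant feature carries no linear exposure to remove.
        return p

    # Least-squares slope of centered predictions on the centered feature.
    beta = sum(a * b for a, b in zip(p, f)) / f_norm_sq
    return [a - proportion * beta * b for a, b in zip(p, f)]


if __name__ == "__main__":
    preds = [0.1, 0.4, 0.35, 0.8]
    feat = [0.0, 0.25, 0.5, 1.0]
    print(neutralize(preds, feat, proportion=0.5))
```

With `proportion=1.0` the result has zero covariance with the feature; smaller values keep part of the exposure, which trades some risk reduction for retained signal.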