About
I'm a Machine Learning Engineer with a passion for deep learning and optimization algorithms. I have 2+ years of experience designing and deploying ML systems in fintech (invoice automation, Bayesian ML, and time series forecasting), plus research experience in optimization and uncertainty quantification. I bridge deep learning research and scalable production systems, with hands-on experience building end-to-end ML pipelines, from model development through deployment (Azure). I currently work at the ELLIS Institute as a research assistant, with an interest in optimization for pretraining large transformer-based architectures.
Projects
Muon Research
I'm interested in understanding the dynamics of the Muon optimizer, particularly when pretraining language models.
Below are some repos of interest:
Bias Correction in AdamW
I'm studying whether bias correction is necessary in the AdamW optimizer. I argue that bias correction serves merely as a form of learning rate scheduling and is typically absorbed into the scheduler, which renders it redundant (I claim :))
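To make the claim concrete, here is a minimal sketch of the algebra behind it, not the proof from the repo. It assumes a scalar parameter, `eps = 0`, and no weight decay; the function names and hyperparameter values are illustrative choices. Under those assumptions, the bias-corrected Adam step equals the uncorrected step with the learning rate multiplied by `sqrt(1 - beta2**t) / (1 - beta1**t)`, a factor that tends to 1 as `t` grows.

```python
import math
import random


def adam_with_bias_correction(grads, lr, beta1, beta2):
    """Scalar Adam updates with the standard bias correction (assumes eps = 0)."""
    m, v, updates = 0.0, 0.0, []
    for t, g in enumerate(grads, start=1):
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        updates.append(lr * m_hat / math.sqrt(v_hat))
    return updates


def adam_with_scheduled_lr(grads, lr, beta1, beta2):
    """Same moments, no bias correction; the factor is moved into the step size."""
    m, v, updates = 0.0, 0.0, []
    for t, g in enumerate(grads, start=1):
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # Time-dependent learning rate that absorbs both correction terms.
        lr_t = lr * math.sqrt(1 - beta2 ** t) / (1 - beta1 ** t)
        updates.append(lr_t * m / math.sqrt(v))
    return updates


random.seed(0)
# Synthetic nonzero gradients (hypothetical data, only to exercise the identity).
grads = [random.choice([-1.0, 1.0]) * random.uniform(0.5, 1.5) for _ in range(200)]
corrected = adam_with_bias_correction(grads, lr=1e-3, beta1=0.9, beta2=0.999)
scheduled = adam_with_scheduled_lr(grads, lr=1e-3, beta1=0.9, beta2=0.999)
max_gap = max(abs(a - b) for a, b in zip(corrected, scheduled))
print(f"largest difference between the two update sequences: {max_gap:.2e}")
```

With a nonzero `eps` the same rewrite still goes through, but `eps` ends up multiplied by `sqrt(1 - beta2**t)`, so the match with an unmodified `eps` is only approximate. The decoupled weight decay term in AdamW does not involve the moment estimates, which is why it is left out here.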